For links to all papers, see the Google Scholar page. Selected papers are listed below:

Large Models
  • Y. Huang, F. Ma #, Y. Shao, J. Guo, Z. Yu, L. Cui, Q. Tian.
    Nüwa: Mending the Spatial Integrity Torn by VLM Token Pruning
    ICLR 2026
  • Z. Chen, H. Lin, Y. Nie, F. Ma #, X. Xu, F. Yu, C. Long.
    Invert4TVG: A Temporal Video Grounding Framework with Inversion Tasks for Enhanced Action Understanding
    ICLR 2026
  • Z. Lian, L. Sun, L. Chen, H. Chen, Z. Cheng, F. Zhang, Z. Jia, Z. Ma, F. Ma #, X. Peng, J. Tao.
    EmoPrefer: Can Large Language Models Understand Human Emotion Preferences?
    ICLR 2026
  • S. Chen, T. Zhao, Y. Bin, F. Ma #, W. Shao, Z. Wang.
    D-GARA: A Dynamic Benchmarking Framework for GUI Agent Robustness in Real-World Anomalies
    AAAI 2026 CCF A
  • C. Zhang, J. Peng, Z. Wang, Y. Lai, H. Sun, H. Chang, F. Ma #, W. Yu.
    VReST: Enhancing Reasoning in Large Vision-Language Models through Tree Search and Self-Reward Mechanism
    ACL 2025 CCF A
AIGC
  • H. Xue, X. Luo, Z. Hu, X. Zhang, X. Xiang, Y. Dai, J. Liu, Z. Zhang, M. Li, J. Yang, F. Ma #, Z. Wu, C. Yang, Z. Dai, F. Yu.
    Human Motion Video Generation: A Survey
    IEEE TPAMI, 2025 IF: 20.8
  • Y. Xie, R. Min, Z. Qin, F. Ma #, L. Shen, F. Yu, X. Cao.
    RoMa: A Robust Model Watermarking Scheme for Protecting IP in Diffusion Models
    NeurIPS 2025 CCF A
  • Y. Xie, T. Feng, X. Zhang, X. Luo, Z. Guo, W. Yu, H. Chang, F. Ma #, F. Yu.
    PointTalk: Audio-Driven Dynamic Lip Point Cloud for 3D Gaussian-based Talking Head Synthesis
    AAAI 2025 CCF A
  • H. Xue, Z. Zhang, M. Li, Z. Dai, F. Yu, F. Ma #, Z. Wu.
    VideoHumanMIB: Unlocking Appearance Decoupling for Video Human Motion In-betweening
    IJCAI 2025 CCF A
  • F. Ma #, Y. Xie, Y. Li, Y. He, Y. Zhang, H. Ren, Z. Liu, W. Yao, F. Ren, F. Yu, S. Ni.
    A review of human emotion synthesis based on generative technology
    IEEE Transactions on Affective Computing, 2025 IF: 9.6
Embodied Intelligence
  • Y. Xie, M. Li, S. Li, X. Li, G. Chen, F. Ma #, F. Yu, W. Ding.
    Universal Visuo-Tactile Video Understanding for Embodied Interaction
    NeurIPS 2025 CCF A
  • Y. Xie, B. Ou, F. Ma #, Y. Liu.
    Observation-Graph Interaction and Key-Detail Guidance for Vision and Language Navigation
    IROS 2025
  • Z. Guo, Y. Xie, W. Xie, P. Huang, C. Wang, F. Ma #, F. Yu.
    GaussianPU: Color Point Cloud Upsampling via 3D Gaussian Splatting
    IROS 2025
  • S. Chen, Z. Wu, K. Zhang, C. Li, B. Zhang, F. Ma #, F. Yu, Q. Li.
    Exploring embodied multimodal large models: Development, datasets, and future directions
    Information Fusion, 2025 IF: 14.8
  • Z. Zhong, Y. He, P. Li, F. Yu, F. Ma #.
    A Language-Driven Navigation Strategy Integrating Semantic Maps and Large Language Models
    IROS 2024