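Shengyu Zhang (张圣宇)
Doctoral Supervisor
Laboratory:
Future Imaging Laboratory
Position:
Researcher, Zhejiang University "Hundred Talents Program" (platform track)
Phone:
Email:
sy_zhang@zju.edu.cn
Address:
Future Imaging Laboratory
Research Interests:
Large-small model collaboration, cross-media generation, and large-model agents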


Education


Work Experience


Research Interests
Large-small model collaboration, cross-media generation, and large-model agents
Professional Service


Research and Achievements
  • Fine-tuning optimization for multimodal large models and GUI agents

  1. InfiGUIAgent series (InfiGUIAgent, InfiGUI-R1, InfiGUI-G1)

    Yuhang Liu, Zeyu Liu, Shuanghe Zhu, Pengxiang Li, Congkai Xie, Jiasheng Wang, Xueyu Hu, Xiaotian Han, Jianbo Yuan, Xinyao Wang, Shengyu Zhang*, Hongxia Yang, Fei Wu:

    InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization. AAAI 2026

  2. Wenkai Wang, Hongcan Guo, Zheqi Lv, Shengyu Zhang*:

    A Rolling Stone Gathers No Moss: Adaptive Policy Optimization for Stable Self-Evaluation in Large Multimodal Models. AAAI 2026

  3. Yurun Chen, Xueyu Hu, Yuhan Liu, Ziqi Wang, Zeyi Liao, Lin Chen, Feng Wei, Yuxi Qian, Bo Zheng, Keting Yin, Shengyu Zhang*:

    Graph2Eval: Automatic Multimodal Task Generation for Agents via Knowledge Graphs. CVPR 2026

  4. Wenkai Wang, Xiyun Li, Hongcan Guo, Wenhao Yu, Tianqing Fang, Haitao Mi, Dong Yu, Shengyu Zhang*:

    Measure Twice, Click Once: Co-evolving Proposer and Visual Critic via Reinforcement Learning for GUI Grounding. ACL 2026

  5. Yuqing Zhang, Honghui Sheng, Xueyu Hu, Shengyu Zhang*, Fei Wu:

    DAC-Bench: A Decision-Aware Benchmark for Compositional Mobile GUI Tasks. ACL 2026


  • Inference optimization for multimodal large models and GUI agents

  1. Biao Yi, Xueyu Hu, Yurun Chen, Shengyu Zhang*, Hongxia Yang, Fan Wu:

    EcoAgent: An Efficient Device-Cloud Collaborative Multi-Agent Framework for Mobile Automation. AAAI 2026

  2. Kunxi Li, Zhonghua Jiang, Zhouzhou Shen, Zhaode Wang, Chengfei Lv, Shengyu Zhang*, Fan Wu, Fei Wu:

    MadaKV: Adaptive Modality-Perception KV Cache Eviction for Efficient Multimodal Long-Context Inference. ACL 2025

  3. Zhonghua Jiang, Kui Chen, Kunxi Li, Keting Yin, Yiyun Zhou, Zhaode Wang, Chengfei Lv, Shengyu Zhang*:

    AccKV: Towards Efficient Audio-Video LLMs Inference via Adaptive-Focusing and Cross-Calibration KV Cache Optimization. AAAI 2026

  4. Sihao Liu, Yufan Xiong, Zhonghua Jiang, Zhaode Wang, Chengfei Lv, Shengyu Zhang*:

    RetentiveKV: State-Space Memory for Uncertainty-Aware Multimodal KV Cache Eviction. ACL 2026

  5. Yurun Chen, Xavier Hu, Keting Yin, Juncheng Li, Shengyu Zhang:

    Evaluating the Robustness of Multimodal Agents Against Active Environmental Injection Attacks. ACM MM 2025


  • Visual content generation (AIGC)

  1. Keming Ye, Zhou Zhao, Fan Wu, Shengyu Zhang*:

    CIAR: Interval-based Collaborative Decoding for Image Generation Acceleration. ICLR 2026

  2. Keming Ye, Zhipeng Huang, Canmiao Fu, Qingyang Liu, Jiani Cai, Zheqi Lv, Chen Li, Jing LYU, Zhou Zhao, Shengyu Zhang*:

    UnicEdit-10M: A Dataset and Benchmark Breaking the Scale-Quality Barrier via Unified Verification for Reasoning-Enriched Edits. CVPR 2026

  3. Jiajian Xie, Shengyu Zhang*, Mengze Li, Chengfei Lv, Zhou Zhao, Fei Wu:

    EcoFace: Audio-Visual Emotional Co-Disentanglement Speech-Driven 3D Talking Face Generation. ICLR 2025

  4. Zhan Qu, Shengyu Zhang*, Mengze Li, Zhuo Chen, Chengfei Lv, Zhou Zhao, Fei Wu:

    ExpTalk: Diverse Emotional Expression via Adaptive Disentanglement and Refined Alignment for Speech-Driven 3D Facial Animation. IJCAI 2025


Industry Service
Honors and Awards