Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.
Pages
Posts
awards
Outstanding Achievement Award
Recognition for exceptional academic performance
publications
Skeleton-Guided Spatial-Temporal Feature Learning for Video-Based Visible-Infrared Person Re-Identification
Published in arXiv preprint, 2024
Video-based visible-infrared person re-identification (VVI-ReID) is challenging due to significant modality feature discrepancies. This work proposes a novel Skeleton-guided spatial-Temporal feAture leaRning (STAR) method that uses skeleton information to improve spatial-temporal features in videos of both modalities.
Recommended citation: Jiang, W., Zhu, X., Gao, J., & Liao, D. (2024). Skeleton-Guided Spatial-Temporal Feature Learning for Video-Based Visible-Infrared Person Re-Identification. arXiv preprint arXiv:2411.11069.
AppAgentX: Evolving GUI Agents as Proficient Smartphone Users
Published in arXiv preprint, 2025
This work proposes a novel evolutionary framework for GUI agents that enhances operational efficiency while retaining intelligence and flexibility through a memory mechanism that records task execution history and evolves high-level actions.
Recommended citation: Jiang, W., Zhuang, Y., Song, C., Yang, X., Zhou, J. T., & Zhang, C. (2025). AppAgentX: Evolving GUI Agents as Proficient Smartphone Users. arXiv preprint arXiv:2503.02268.
Learning to Be A Doctor: Searching for Effective Medical Agent Architectures
Published in ACM International Conference on Multimedia (ACM MM), 2025
This paper introduces a novel framework for the automated design of medical agent architectures, defining a hierarchical and expressive agent search space that enables dynamic workflow adaptation through structured modifications at multiple levels.
Recommended citation: Zhuang, Y., Jiang, W., Zhang, J., Yang, Z., Zhou, J. T., & Zhang, C. (2025). Learning to Be A Doctor: Searching for Effective Medical Agent Architectures. In Proceedings of ACM International Conference on Multimedia (ACM MM) 2025.
Adaptive Mobile Agent for Dynamic Interactions
Published in IEEE International Conference on Multimedia and Expo (ICME), 2025
This work presents a novel LLM-based multimodal agent framework for mobile devices, designed to enhance interaction and adaptive capabilities in dynamic mobile environments through autonomous navigation and human-like behaviors.
Recommended citation: Li, Y., Zhang, C., Yang, W., Fu, B., Cheng, P., Chen, X., Chen, L., & Wei, Y. (2025). Adaptive Mobile Agent for Dynamic Interactions. In Proceedings of IEEE International Conference on Multimedia and Expo (ICME) 2025.