Published in KDD 2025 Datasets and Benchmarks Track, 2025
We extend MLLM-as-a-Judge across modalities and present two benchmarks, TaskAnything and JudgeAnything, which reveal that MLLMs used as judges excel at evaluating multimodal understanding (MMU) but struggle with multimodal generation (MMG) tasks.
Recommended citation:
@article{pu2025judge,
  title={Judge Anything: MLLM as a Judge Across Any Modality},
  author={Pu, Shu and Wang, Yaochen and Chen, Dongping and Chen, Yuhang and Wang, Guohao and Qin, Qi and Zhang, Zhongyi and Zhang, Zhiyuan and Zhou, Zetong and Gong, Shuang and others},
  journal={arXiv preprint arXiv:2503.17489},
  year={2025}
}
In submission, 2025
We introduce Paper2Web, the first benchmark for assessing academic webpage generation, and PWAgent, an autonomous system designed to bridge the gap between static PDFs and interactive project sites. By leveraging an iterative refinement process and MCP-driven tools, PWAgent generates layout-aware, multimedia-rich homepages that prioritize both aesthetics and information density. Experimental results demonstrate that this agent-driven approach achieves superior performance over end-to-end LLM generation and existing web-conversion templates, offering a high-quality, low-cost solution for researchers.
Recommended citation:
@misc{chen2025paper2webletsmakepaper,
  title={Paper2Web: Let's Make Your Paper Alive!},
  author={Yuhang Chen and Tianpeng Lv and Siyi Zhang and Yixiang Yin and Yao Wan and Philip S. Yu and Dongping Chen},
  year={2025},
  eprint={2510.15842},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2510.15842}
}
Published in CVPR 2026 Findings, 2026
We introduce GapEval, a symmetric evaluation framework that measures the bidirectional inference consistency of Unified Multimodal Models. The findings challenge the assumption of true model integration, demonstrating that understanding and generation capabilities are often unaligned. By revealing that knowledge within these models is disjointed and unsynchronized, the research provides a critical perspective on the limitations of current "unified" architectures and calls for deeper cognitive convergence in future AI development.
Recommended citation:
@misc{wang2026quantifyinggapunderstandinggeneration,
  title={Quantifying the Gap between Understanding and Generation within Unified Multimodal Models},
  author={Chenlong Wang and Yuhang Chen and Zhihan Hu and Dongping Chen and Wenhu Chen and Sarah Wiegreffe and Tianyi Zhou},
  year={2026},
  eprint={2602.02140},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2602.02140}
}