📝 Publications

CVPR 2026

Omni-MMSI: Towards Identity-attributed Social Interaction Understanding
Xinpeng Li, Bolin Lai, Hardy Chen, Shijian Deng, Cihang Xie, Yuyin Zhou, James M. Rehg, Yapeng Tian.

Omni-MMSI-R is a reference-guided pipeline for multimodal social interaction understanding from raw audio-video input. It extracts tool-augmented, identity-attributed cues and conducts chain-of-thought reasoning.

TMLR 2025

Towards Online Multi-Modal Social Interaction Understanding
Xinpeng Li, Shijian Deng, Bolin Lai, Weiguo Pian, James M. Rehg, Yapeng Tian.

Online-MMSI-VLM is a novel framework for the newly proposed online MMSI setting that leverages multi-party conversation forecasting and social-aware visual prompting with multimodal large language models.

ACM-MM 2024

Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer
Xinpeng Li, Teng Wang, Jian Zhao, Shuyi Mao, Jinbao Wang, Feng Zheng, Xiaojiang Peng, Xuelong Li.

DSCT is a single-stage emotion recognition approach that uses subject-context decoupling to perform subject localization and emotion classification simultaneously.

TMM 2024

Facial Action Units as a Joint Dataset Training Bridge for Facial Expression Recognition
Shuyi Mao, Xinpeng Li, Fan Zhang, Xiaojiang Peng, Yang Yang.

AU-ViT improves performance on a target dataset by jointly training on auxiliary datasets with off-the-shelf or pseudo AU labels.

NeurIPS 2023

Real3D-AD: A Dataset of Point Cloud Anomaly Detection
Jiaqi Liu, Guoyang Xie, Ruitao Chen, Xinpeng Li, Jinbao Wang, Yong Liu, Chengjie Wang, Feng Zheng.

Real3D-AD is a new dataset and benchmark of high-resolution 3D point clouds for anomaly detection tasks in real-world scenes.

ACM-MM 2022

Rail Detection: An Efficient Row-based Network and a New Benchmark
Xinpeng Li, Xiaojiang Peng.

Rail-Detection includes Rail-DB, a new real-world railway dataset, and Rail-Net, an efficient row-based rail detection method.