📝 Publications

Towards Online Multi-Modal Social Interaction Understanding
Xinpeng Li, Shijian Deng, Bolin Lai, Weiguo Pian, James M. Rehg, Yapeng Tian.
Online-MMSI-VLM is a novel framework for the newly proposed online MMSI setting that leverages multi-party conversation forecasting and social-aware visual prompting with multimodal large language models.

Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer
Xinpeng Li, Teng Wang, Jian Zhao, Shuyi Mao, Jinbao Wang, Feng Zheng, Xiaojiang Peng, Xuelong Li.
DSCT is a single-stage emotion recognition approach that uses subject-context decoupling to perform subject localization and emotion classification simultaneously.

Facial Action Units as a Joint Dataset Training Bridge for Facial Expression Recognition
Shuyi Mao, Xinpeng Li, Fan Zhang, Xiaojiang Peng, Yang Yang.
AU-ViT improves performance on a target dataset by jointly training on auxiliary datasets with off-the-shelf or pseudo AU labels.

Real3D-AD: A Dataset of Point Cloud Anomaly Detection
Jiaqi Liu, Guoyang Xie, Ruitao Chen, Xinpeng Li, Jinbao Wang, Yong Liu, Chengjie Wang, Feng Zheng.
Real3D-AD is a new dataset and benchmark of high-resolution 3D point clouds for anomaly detection tasks in real-world scenes.

Rail Detection: An Efficient Row-based Network and a New Benchmark
Xinpeng Li, Xiaojiang Peng.
Rail-Detection includes Rail-DB, a new real-world railway dataset, and Rail-Net, an efficient row-based rail detection method.