Abstract
Eye contact detection in group conversations is key to developing artificial mediators that can understand and interact with a group. In this paper, we propose to model a group's appearance and behavioral features to perform eye contact detection for each participant in the conversation. Specifically, we extract the participants' appearance features at the detection moment, and extract their behavioral features from motion history images, which encode the participants' body movements within a small time window before the detection moment. To obtain powerful representative features from these images, we train a Convolutional Neural Network (CNN) to model them. The resulting features achieve an accuracy of 0.60 on the validation set of the eye contact detection challenge at ACM MM 2021. Furthermore, our experimental results demonstrate that using both the participants' appearance and behavioral features leads to higher accuracy in eye contact detection than using either alone. Copyright © 2021 Association for Computing Machinery.
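The abstract's behavioral features are built on motion history images (MHIs), a standard technique in which each pixel's brightness reflects how recently motion occurred there. The sketch below is an illustrative implementation of the general MHI update rule (frame differencing, decay, and restamping), not the authors' actual code; the function name, `tau`, and `threshold` are assumed parameters for this example.

```python
import numpy as np

def update_mhi(mhi, frame, prev_frame, tau=30, threshold=25):
    """Update a motion history image (MHI) with one new grayscale frame.

    Pixels where motion is detected are stamped with the value tau
    (the length of the time window, in frames); all other pixels decay
    by 1 toward zero. Brighter pixels therefore indicate more recent
    body movement, which is what the CNN consumes as a behavioral cue.
    """
    # Frame differencing: flag pixels whose intensity changed noticeably.
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    motion = diff > threshold
    # Decay old motion history, then stamp fresh motion at full value tau.
    mhi = np.maximum(mhi - 1, 0)
    mhi[motion] = tau
    return mhi
```

In use, the MHI is initialized to zeros and updated once per frame over the window preceding the detection moment; the final image summarizes the participant's recent body movement in a single 2D array.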
Original language | English |
---|---|
Title of host publication | Proceedings of the 29th ACM International Conference on Multimedia, MM '21 |
Place of Publication | USA |
Publisher | Association for Computing Machinery |
Pages | 4873-4877 |
ISBN (Electronic) | 9781450386517 |
DOIs | 10.1145/3474085.3479230 |
Publication status | Published - 2021 |
Citation
Fu, E. Y., & Ngai, M. W. (2021). Using motion histories for eye contact detection in multiperson group conversations. In Proceedings of the 29th ACM International Conference on Multimedia, MM '21 (pp. 4873-4877). Association for Computing Machinery. https://doi.org/10.1145/3474085.3479230
Keywords
- Eye contact detection
- Motion history image
- Group behavior
- Deep learning