Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/104347
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Paggio, Patrizia | - |
dc.contributor.author | Navarretta, Costanza | - |
dc.contributor.author | Jongejan, Bart | - |
dc.date.accessioned | 2022-12-12T14:58:11Z | - |
dc.date.available | 2022-12-12T14:58:11Z | - |
dc.date.issued | 2017 | - |
dc.identifier.citation | Paggio, P., Navarretta, C., & Jongejan, B. (2017, April). Automatic identification of head movements in video-recorded conversations: can words help? Proceedings of the Sixth Workshop on Vision and Language (pp. 40-42). | en_GB |
dc.identifier.isbn | 9781945626517 | - |
dc.identifier.uri | https://www.um.edu.mt/library/oar/handle/123456789/104347 | - |
dc.description.abstract | Head movements are the most frequent gestures in face-to-face communication, and important for feedback giving (Allwood, 1988; Yngve, 1970; Duncan, 1972) and turn management (McClave, 2000). Their automatic recognition has been addressed by many multimodal communication researchers (Heylen et al., 2007; Paggio and Navarretta, 2011; Morency et al., 2007). The method for automatic head movement annotation described in this paper is implemented as a plugin to the freely available multimodal annotation tool ANVIL (Kipp, 2004), using OpenCV (Bradski and Kaehler, 2008), combined with a command line script that performs a number of file transformations and invokes the LibSVM software (Chang and Lin, 2011) to train and test a support vector classifier. The script then produces a new annotation in ANVIL containing the learned head movements. The present method builds on (Jongejan, 2012) by adding jerk to the movement features and by applying machine learning. In this paper we also conduct a statistical analysis of the distribution of words in the annotated data to understand whether word features could be used to improve the learning model. Research aimed at the automatic recognition of head movements, especially nods and shakes, has addressed the issue in essentially two different ways. A number of studies use data in which the face, or a part of it, has been tracked via various devices and typically train HMM models on such data (Kapoor and Picard, 2001; Tan and Rong, 2003; Wei et al., 2013). The accuracy reported in these studies is in the range 75-89%. Other studies, on the contrary, try to identify head movements from raw video material using computer vision techniques (Zhao et al., 2012; Morency et al., 2005). Different results are obtained depending on a number of factors, such as video quality, lighting conditions, and whether the movements are naturally occurring or rehearsed.
The best results so far are probably those in (Morency et al., 2007), where an LDCRF model achieves an accuracy from 65% to 75% at a false positive rate of 20-30%, outperforming earlier SVM and HMM models. Our work belongs to the latter strand of research in that we also work with raw video data. | en_GB |
dc.language.iso | en | en_GB |
dc.publisher | The Association for Computational Linguistics | en_GB |
dc.rights | info:eu-repo/semantics/restrictedAccess | en_GB |
dc.subject | Body language -- Research | en_GB |
dc.subject | Speech and gesture | en_GB |
dc.subject | Facial expression -- Data processing | en_GB |
dc.subject | Conversation analysis | en_GB |
dc.subject | Speech processing systems | en_GB |
dc.subject | Modality (Linguistics) | en_GB |
dc.subject | Machine learning | en_GB |
dc.title | Automatic identification of head movements in video-recorded conversations : can words help? | en_GB |
dc.type | conferenceObject | en_GB |
dc.rights.holder | The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder. | en_GB |
dc.bibliographicCitation.conferencename | The 6th Workshop on Vision and Language | en_GB |
dc.bibliographicCitation.conferenceplace | Valencia, Spain. 04/04/2017. | en_GB |
dc.description.reviewed | peer-reviewed | en_GB |
dc.identifier.doi | 10.18653/v1/W17-2006 | - |
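The abstract above describes a pipeline that derives kinematic features, up to jerk (the third derivative of position), from tracked head movement and feeds them to a support vector classifier. The snippet below is a minimal illustrative sketch of that idea only, not the authors' implementation: it uses synthetic head-position traces and scikit-learn's `SVC` (which wraps libsvm) in place of the ANVIL plugin and the LibSVM command-line tools, and every name, window size, and frame rate here is an assumption.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
FPS = 30  # assumed video frame rate

def synthetic_window(is_nod, n=30):
    """One second of vertical head position: oscillation for a nod, slow drift otherwise."""
    t = np.linspace(0, 1, n)
    if is_nod:
        return 5.0 * np.sin(2 * np.pi * 4 * t)   # ~4 Hz nodding oscillation
    return np.cumsum(rng.normal(0.0, 0.3, n))    # random drift, no gesture

def kinematic_features(pos):
    """Mean absolute velocity, acceleration, and jerk via finite differences."""
    vel = np.gradient(pos) * FPS
    acc = np.gradient(vel) * FPS
    jerk = np.gradient(acc) * FPS
    return [np.abs(vel).mean(), np.abs(acc).mean(), np.abs(jerk).mean()]

# Train a support vector classifier on labelled one-second windows.
labels = [1] * 50 + [0] * 50                     # 1 = nod, 0 = no gesture
X = [kinematic_features(synthetic_window(lab)) for lab in labels]
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, labels)
```

A sliding-window version of the same idea would run `kinematic_features` over the full position track and write the predicted movement spans back as an annotation tier, which is roughly the role the ANVIL plugin plays in the paper.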
Appears in Collections: | Scholarly Works - InsLin |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Automatic_identification_of_head_movements_in_video-recorded_conversations_Can_words_help(2017).pdf (Restricted Access) | - | 82.69 kB | Adobe PDF | View/Open Request a copy |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.