Title (Persian listing): Recognizing Actions and Events in Videos by Learning from Heterogeneous Web Sources
Original title: Action and Event Recognition in Videos by Learning From Heterogeneous Web Sources
Authors: Li Niu; Xinxing Xu; Lin Chen; Lixin Duan; Dong Xu
Number of pages: 14
Publication year: 2017
Language: English
Abstract:
In this paper, we propose new approaches for action and event recognition by leveraging a large number of freely available Web videos (e.g., from the Flickr video search engine) and Web images (e.g., from the Bing and Google image search engines). We formulate this task as a new multi-domain adaptation problem in which heterogeneous Web sources are provided. Specifically, we are given different types of visual features (e.g., the DeCAF features from Bing/Google images and the trajectory-based features from Flickr videos) from heterogeneous source domains, and all types of visual features from the target domain. Considering that the target domain is more relevant to some source domains than to others, we propose a new approach named multi-domain adaptation with heterogeneous sources (MDA-HS) to make effective use of the heterogeneous sources. In MDA-HS, we simultaneously seek the optimal weights of multiple source domains, infer the labels of target domain samples, and learn an optimal target classifier. Moreover, as textual descriptions are often available for both Web videos and images, we propose a novel approach called MDA-HS using privileged information (MDA-HS+), which incorporates the valuable textual information into our MDA-HS method based on the recent learning using privileged information paradigm. MDA-HS+ can be further extended by using a new elastic-net-like regularization. We solve our MDA-HS and MDA-HS+ methods by using the cutting-plane algorithm, in which a multiple kernel learning problem is derived and solved. Extensive experiments on three benchmark data sets demonstrate that our proposed approaches are effective for action and event recognition without requiring any labeled samples from the target domain.
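The interplay between source-domain weights and inferred target labels can be illustrated with a toy sketch. This is not the MDA-HS optimization described in the abstract (which relies on a cutting-plane algorithm and multiple kernel learning); it is a deliberately simplified stand-in that alternates between pseudo-labeling target samples and re-weighting sources by how often they agree with those pseudo-labels. The function name `combine_sources`, the agreement-based update rule, and the demo values are all assumptions made for illustration.

```python
import numpy as np

def combine_sources(decisions, n_iters=10):
    """Toy alternating scheme (an assumption, not the paper's algorithm).

    decisions: array of shape (n_sources, n_target) holding real-valued
    outputs of one pre-trained classifier per source domain, evaluated
    on the unlabeled target samples.
    """
    decisions = np.asarray(decisions, dtype=float)
    n_sources = decisions.shape[0]
    # Start from uniform source weights on the simplex.
    weights = np.full(n_sources, 1.0 / n_sources)
    labels = np.ones(decisions.shape[1], dtype=int)
    for _ in range(n_iters):
        # Step 1: infer target pseudo-labels from the weighted vote.
        scores = weights @ decisions
        labels = np.where(scores >= 0, 1, -1)
        # Step 2: score each source by its agreement with the pseudo-labels.
        agreement = (np.sign(decisions) == labels).mean(axis=1)
        if agreement.sum() == 0:
            break  # degenerate input (e.g., all-zero decisions); keep weights
        # Step 3: renormalize so the weights sum to one.
        weights = agreement / agreement.sum()
    return weights, labels

# Hypothetical decision values: two consistent sources and one noisy source.
demo = [
    [2.0, 1.0, -1.0, -2.0],
    [1.5, 0.5, -0.5, -1.5],
    [-0.1, 0.1, 0.1, -0.1],
]
weights, labels = combine_sources(demo)
print(weights, labels)
```

In this toy run the noisy third source ends up with a smaller weight than the two consistent ones, which mirrors the intuition that the target domain is more relevant to some source domains than to others.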