基于 R*CNN 的上下文动作识别

May, 2015

Contextual Action Recognition with R*CNN

Georgia Gkioxari, Ross Girshick, Jitendra Malik

TL;DR通过利用场景和时下流行的深度学习模型RCNN的多区域分类法，作者提出了一种新颖的基于动作的行为识别系统R*CNN，它不仅可以在PASAL VOC Action数据集上实现90.2%平均精确率，超过了同领域其他方法，而且还能在Berkeley Attributes of People数据集上实现最新最好的人物属性分类效果。

Abstract

There are multiple cues in an image which reveal what action a person is performing. For example, a jogger has a pose which is characteristic for the action, but the scene (e.g. road, trail) and the presence of other joggers can be an additional source of information. In this work, we exploit the simple observation that actions are accompanied by