Edited nearest neighbour for selecting keyframe summaries of egocentric videos
Research output: Contribution to journal › Article › peer-review
In: Journal of Visual Communication and Image Representation, Vol. 52, 04.2018, p. 118-130.
RIS
TY - JOUR
T1 - Edited nearest neighbour for selecting keyframe summaries of egocentric videos
AU - Kuncheva, Ludmila
AU - Yousefi, Paria
AU - Almeida, Jurandy
PY - 2018/4
Y1 - 2018/4
N2 - A keyframe summary of a video must be concise, comprehensive and diverse. Current video summarisation methods may not be able to enforce diversity of the summary if the events have highly similar visual content, as is the case in egocentric videos. We cast the problem of selecting a keyframe summary as a problem of prototype (instance) selection for the nearest neighbour classifier (1-nn). Assuming that the video is already segmented into events of interest (classes) and represented as a dataset in some feature space, we propose a Greedy Tabu Selector algorithm (GTS) which picks one frame to represent each class. An experiment with the UT (Egocentric) video database and seven feature representations illustrates the proposed keyframe summarisation method. GTS leads to an improved match to the user ground truth compared with the closest-to-centroid baseline summarisation method. The best results were obtained with feature spaces derived from a convolutional neural network (CNN).
U2 - 10.1016/j.jvcir.2018.02.010
DO - 10.1016/j.jvcir.2018.02.010
M3 - Article
VL - 52
SP - 118
EP - 130
JO - Journal of Visual Communication and Image Representation
JF - Journal of Visual Communication and Image Representation
SN - 1047-3203
ER -
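
The abstract frames keyframe selection as choosing one prototype frame per event (class) for a 1-nn classifier and compares against a closest-to-centroid baseline. The following is a minimal sketch of that baseline and of how a one-frame-per-class summary could be scored by 1-nn error. It is not the GTS algorithm, whose steps the abstract does not give. The function names, the Euclidean distance, and the toy data are assumptions made for illustration only.

```python
import numpy as np


def closest_to_centroid(X, y):
    """For each class label, return the index of the frame nearest the class mean."""
    selected = {}
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)      # frames belonging to this event
        centroid = X[idx].mean(axis=0)
        dists = np.linalg.norm(X[idx] - centroid, axis=1)
        selected[int(label)] = int(idx[np.argmin(dists)])
    return selected


def one_nn_error(X, y, prototype_idx):
    """Fraction of frames mislabelled by 1-nn using only the chosen prototypes."""
    protos = X[prototype_idx]
    proto_labels = y[prototype_idx]
    # Euclidean distance from every frame to every prototype (an assumption).
    d = np.linalg.norm(X[:, None, :] - protos[None, :, :], axis=2)
    predicted = proto_labels[np.argmin(d, axis=1)]
    return float(np.mean(predicted != y))


# Toy data (made up): three well-separated "events" in a 2-D feature space.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
y = np.repeat(np.arange(3), 20)
X = centres[y] + rng.normal(scale=0.5, size=(60, 2))

summary = closest_to_centroid(X, y)           # {class label: frame index}
error = one_nn_error(X, y, np.array(list(summary.values())))
print(summary, error)
```

A selector in the spirit of the paper would presumably search for the set of frames that lowers this 1-nn error, rather than fixing each frame at its class centroid; that search is not reproduced here.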