With video capture devices becoming widely popular, the amount of video data generated per day has increased rapidly over the past few years. Browsing through hours of video to retrieve useful information is a tedious task, and video summarization plays a crucial role in addressing this issue.
Video summarization is a well-researched topic in the multimedia community. However, the focus so far has been limited to creating summaries of short videos (only a few minutes long). This workshop calls on researchers with relevant backgrounds to develop novel solutions for user-centric narrative summarization of long videos. Specifically, the goal is to provide users with meaningful information and insights from long input videos, potentially captured from multiple cameras. Since the output of any video summarization task will ultimately be consumed by humans, it is also important to incorporate an element of storytelling, presenting the resulting summary in the form of a narrative that humans can easily understand. These aspects have not been adequately addressed in the existing literature.
This workshop will also discuss other important aspects of current video summarization research. For example, what counts as 'important' in a video, and how to evaluate the goodness of a created summary, remain subjective questions. Many works rely on human-annotated training data in which the relevance of each video frame is labeled (supervised methods), while others consider a summary to be good if the original video can be well reconstructed from it (unsupervised methods). However, most current works do not explicitly take into account the scene semantics (e.g., scenes, objects, people, actions, and relations) present in the video, which are significant indicators in deciding what is 'important'.
This workshop aims to bring together researchers from academia and industry to discuss topics related to the summarization of long videos, its applications, and other open problems.
Ioannis (Yiannis) Patras, Queen Mary University of London
Manmohan Chandraker, University of California San Diego
| Time | Session | Details |
|---|---|---|
| 09:00 - 09:05 | Opening remarks | |
| 09:05 - 09:45 | Invited talk | Learning, Understanding and Interaction in Videos. Manmohan Chandraker (University of California San Diego) |
| 09:45 - 10:00 | Invited presentation | Compute to Tell the Tale: Goal-Driven Narrative Generation (Brave New Idea paper, ACM MM 2022) |
| 10:00 - 10:45 | Paper presentations | |
| 10:00 - 10:15 | Paper | Narrative Dataset: Towards Goal-Driven Narrative Generation |
| 10:15 - 10:30 | Paper | Soccer Game Summarization using Audio Commentary, Metadata, and Captions |
| 10:30 - 10:45 | Paper | Contrastive Representation Learning for Expression Recognition from Masked Face Images |
| 10:45 - 11:00 | Break | |
| 11:00 - 11:40 | Invited talk | Video Summarization in the Deep Learning Era: Current Landscape and Future Directions. Ioannis Patras (Queen Mary University of London) |
| 11:40 - 12:30 | Panel discussion | Emerging Topics on Video Summarization |
| 12:30 - 12:35 | Closing | |