Multi-Script Video Caption Localization Based on Visual Rhythms

Souza, Marcos Roberto e and Maia, Helena de Almeida and Santos, Anderson Carlos Souza e and Bernardes Vieira, Marcelo and Pedrini, Helio (2022) Multi-Script Video Caption Localization Based on Visual Rhythms. Applied Artificial Intelligence, 36 (1). ISSN 0883-9514

Abstract

Video caption localization plays an important role in information retrieval for multimedia applications. In this work, we present and evaluate a novel method for localizing video captions using visual rhythms, which enable the representation and analysis of a specific feature over time. We build visual rhythms from the text location maps produced by general text localization methods, which are far more common in the literature than caption-oriented ones. We then process these maps to keep only the captions, generating caption localization masks. To meet the need for a large, standardized dataset, we constructed a new one in which captions in thirteen different scripts are added to the video frames, yielding a total of 221 videos with ground truth. Experiments demonstrate that our method achieves competitive results when compared with other approaches in the literature.
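The abstract does not spell out how the visual rhythms are constructed or filtered, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' pipeline. It assumes each frame already has a binary text location map from some general text detector, collapses every map into one column of row-wise text coverage, stacks those columns over time, and keeps only rows where text persists across several consecutive frames. The function names and the thresholds `min_coverage` and `min_run` are hypothetical.

```python
from typing import List

# Assumed input: one binary text-location map per frame, frame[y][x] in {0, 1}.
Frame = List[List[int]]


def vertical_rhythm(maps: List[Frame]) -> List[List[float]]:
    """Collapse each H x W map into H row-wise coverage values and stack
    them over time. The result is indexed as rhythm[y][t]."""
    height = len(maps[0])
    rhythm = [[0.0] * len(maps) for _ in range(height)]
    for t, frame in enumerate(maps):
        for y, row in enumerate(frame):
            rhythm[y][t] = sum(row) / len(row)
    return rhythm


def caption_rows(rhythm: List[List[float]],
                 min_coverage: float = 0.2,
                 min_run: int = 3) -> List[bool]:
    """Mark rows whose text coverage stays at or above min_coverage for at
    least min_run consecutive frames (hypothetical persistence heuristic)."""
    keep = []
    for row in rhythm:
        run = best = 0
        for value in row:
            run = run + 1 if value >= min_coverage else 0
            best = max(best, run)
        keep.append(best >= min_run)
    return keep


# Toy input: 5 frames of size 4 x 5. Row 3 carries text in every frame
# (caption-like); row 0 carries text in a single frame (transient scene text).
maps: List[Frame] = []
for t in range(5):
    frame = [[0] * 5 for _ in range(4)]
    frame[3] = [0, 1, 1, 1, 0]
    if t == 2:
        frame[0] = [1, 1, 1, 0, 0]
    maps.append(frame)

rhythm = vertical_rhythm(maps)
mask = caption_rows(rhythm)
```

In this toy input only the bottom row survives the persistence filter, which mirrors the intuition that captions stay in a fixed region for several frames while incidental scene text comes and goes. A real system would work on detector outputs for actual video and would need to handle both spatial axes.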

Item Type: Article
Subjects: Apsci Archives > Computer Science
Depositing User: Unnamed user with email support@apsciarchives.com
Date Deposited: 16 Jun 2023 04:39
Last Modified: 06 Dec 2023 04:28
URI: http://eprints.go2submission.com/id/eprint/1299
