Souza, Marcos Roberto e and Maia, Helena de Almeida and Santos, Anderson Carlos Souza e and Bernardes Vieira, Marcelo and Pedrini, Helio (2022) Multi-Script Video Caption Localization Based on Visual Rhythms. Applied Artificial Intelligence, 36 (1). ISSN 0883-9514
Abstract
Localizing video captions plays an important role in information retrieval for multimedia applications. In this work, we present and evaluate a novel method for localizing video captions using visual rhythms, which enable the representation and analysis of a specific feature over time. We build visual rhythms from the text location maps produced by general text localization methods, which are far more common in the literature than caption-oriented ones. We then process the maps to keep only the captions, generating caption localization masks. To meet the need for a large, standardized dataset, we constructed a new one in which captions in thirteen different scripts are added to the video frames, for a total of 221 videos with ground truth. Experiments demonstrate that our method achieves competitive results compared with other approaches in the literature.
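The idea of a visual rhythm built from text location maps can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how frames are sampled or how the maps are processed, so the row-wise maximum projection, the `threshold` and `min_frames` parameters, and the function names `visual_rhythm` and `persistent_rows` are all assumptions made for illustration. The sketch stacks one column per frame into a 2D image and then keeps only text responses that persist over several frames, on the assumption that captions stay on screen longer than incidental scene text.

```python
import numpy as np


def visual_rhythm(text_maps: np.ndarray) -> np.ndarray:
    """Build a (H, T) visual rhythm from (T, H, W) per-frame text maps.

    Assumption: each frame is summarized by the row-wise maximum of its
    text score map, so column t of the result describes frame t.
    """
    return text_maps.max(axis=2).T


def persistent_rows(rhythm: np.ndarray, threshold: float = 0.5,
                    min_frames: int = 3) -> np.ndarray:
    """Keep only temporal runs of at least `min_frames` active frames.

    Assumption: captions persist over time, so short-lived responses
    are discarded. Returns a boolean (H, T) mask.
    """
    active = rhythm >= threshold
    keep = np.zeros_like(active)
    n_rows, n_frames = active.shape
    for r in range(n_rows):
        start = None
        # Iterate one step past the end so a run touching the last frame closes.
        for t in range(n_frames + 1):
            on = t < n_frames and active[r, t]
            if on and start is None:
                start = t
            elif not on and start is not None:
                if t - start >= min_frames:
                    keep[r, start:t] = True
                start = None
    return keep


# Toy input: 6 frames of 4x5 text maps (hypothetical data).
maps = np.zeros((6, 4, 5))
maps[0:5, 1, 1:4] = 1.0  # caption-like text: row 1, frames 0 to 4
maps[2, 3, 2] = 1.0      # transient text: row 3, frame 2 only

rhythm = visual_rhythm(maps)
mask = persistent_rows(rhythm, threshold=0.5, min_frames=3)
```

In this toy input, the row that stays active for five consecutive frames survives in `mask`, while the single-frame response is dropped. A real pipeline would obtain the text maps from a text localization model and would likely also use the horizontal extent of the text, which this projection discards.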
Item Type: | Article |
---|---|
Subjects: | Apsci Archives > Computer Science |
Depositing User: | Unnamed user with email support@apsciarchives.com |
Date Deposited: | 16 Jun 2023 04:39 |
Last Modified: | 06 Dec 2023 04:28 |
URI: | http://eprints.go2submission.com/id/eprint/1299 |