Aligning and linking metadata

In TOSCA-MP, many different metadata types have been extracted from the material, and much of this metadata is time-coded. Automatic speech recognition, for example, generates a transcript of a multimedia object together with a detailed description of the temporal position of each individual word.
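As an illustration, a word-level ASR transcript can be thought of as a sequence of words, each annotated with its start and end time. This is a minimal sketch of such a structure (the field names and sample values are assumptions for illustration, not the actual TOSCA-MP format):

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    start: float  # seconds from the beginning of the media object
    end: float

# Hypothetical fragment of a time-coded transcript
transcript = [
    Word("breaking", 12.40, 12.82),
    Word("news", 12.82, 13.10),
    Word("from", 13.10, 13.31),
    Word("brussels", 13.31, 13.95),
]

# The plain transcript text is simply the words joined in order;
# the timing information is retained alongside it.
text = " ".join(w.text for w in transcript)
```

Keeping the timing on each word, rather than on the transcript as a whole, is what later makes it possible to seek directly to a matched word.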

The visual metadata extraction tools researched during the project likewise produce time-positioned metadata: they recognise objects and persons and record the exact timestamp at which each recognition occurred.

Incorporating this time-based metadata into the search index has a significant impact on the efficiency of multimedia retrieval. The metadata not only allows finding occurrences of the topic the user searches for, it also allows jumping to the exact position in the multimedia object at which the match occurs. This spares the user from manually browsing through the search results, making the retrieval process significantly more efficient.
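The idea of a time-aware index can be sketched as an inverted index whose postings carry a media identifier and a timestamp, so that each hit can seek the player directly to the right position. This is a simplified illustration under assumed names, not the actual TOSCA-MP index implementation:

```python
from collections import defaultdict

# term -> list of (media_id, timestamp) postings
index = defaultdict(list)

def add_word(media_id: str, word: str, start: float) -> None:
    """Register one recognised word at its temporal position."""
    index[word.lower()].append((media_id, start))

def search(term: str):
    """Return every (media_id, timestamp) at which the term occurs."""
    return index.get(term.lower(), [])

# Hypothetical sample content
add_word("news_2012_05_01", "election", 83.2)
add_word("news_2012_05_01", "election", 412.7)
add_word("interview_07", "election", 15.0)

hits = search("Election")
```

Each hit pairs the media object with the exact offset of the match, so the user interface can start playback there instead of at the beginning of the item.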

The automatic content clustering researched during the TOSCA-MP project adds an extra layer on top of this time-based metadata: metadata extracted by the different automatic feature extraction tools is linked together, and occurrences of the same object at different times in the media are automatically connected.
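Such cross-tool linking can be sketched as grouping detections from different extractors by the entity they refer to, so that every appearance of, say, the same person across a programme is connected. The entity labels, source names, and timestamps below are invented for illustration:

```python
from collections import defaultdict

# Hypothetical detections produced by different extraction tools
detections = [
    {"entity": "person:jane_doe", "source": "face_recognition", "time": 34.1},
    {"entity": "person:jane_doe", "source": "speaker_id",       "time": 35.0},
    {"entity": "person:jane_doe", "source": "face_recognition", "time": 512.6},
    {"entity": "object:eu_flag",  "source": "object_detection", "time": 40.2},
]

# Group all detections of the same entity, regardless of which tool
# produced them or when in the media they occur.
linked = defaultdict(list)
for d in detections:
    linked[d["entity"]].append((d["source"], d["time"]))
```

The grouped view gives, per entity, the complete timeline of its appearances, which is the kind of link the clustering layer establishes across the time-based metadata.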

© 2017 TOSCA-MP - Task-Oriented Search and Content Annotation for Media Production
The research leading to the presented results has received funding from the European Union's
Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 287532.