Recording of Content through Automatic Indexing

As the flood of information in the digital era keeps growing, users find it increasingly difficult to orient themselves or to locate specific information in ever larger volumes of heterogeneous data (big data).

The fast pace of the modern world also increases the time pressure on high-quality, intellectual research. At the same time, media companies (such as publishers, news agencies, online portals, public broadcasters and private television companies) face growing technical requirements for finding high-quality content that stands out from the mass of audio, video and text data.



Categorization

The dio:semantic Categorizer indexes documents in any language and against any taxonomy. It combines statistical methods with core linguistic principles to deliver very high cataloging quality.

Details: Categorization

Entity Recognition

Entity recognition identifies entities (for example persons, organizations and geographic terms) in documents. Together with clustering, this considerably speeds up research.

Details: Entity Recognition

Language Identification

Automatic language identification detects the most important European languages in a given document. If the document contains passages in different languages, all of them can be recognized.

Details: Language Identification

Related Topics and Duplicates

The semantic system computes similarity values between one document and another by analyzing thematic overlaps or commonalities in wording.

Details: Related Topics
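
To illustrate the two kinds of comparison described above, here is a minimal Python sketch. It is not the dio:semantic implementation, which is not documented on this page. It assumes two common stand-in techniques: cosine similarity over word counts for thematic overlap, and Jaccard overlap of word sequences (shingles) for commonalities in wording. The function names `cosine_similarity` and `shingle_overlap` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(a, b):
    """Thematic overlap: cosine of the angle between word-count vectors."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def shingle_overlap(a, b, n=3):
    """Wording overlap: Jaccard index over sequences of n consecutive words."""
    def shingles(text):
        toks = tokenize(text)
        return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

    sa, sb = shingles(a), shingles(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


if __name__ == "__main__":
    doc1 = "The agency published the report on Monday morning."
    doc2 = "The agency published the report on Tuesday evening."
    print(round(cosine_similarity(doc1, doc2), 2))
    print(round(shingle_overlap(doc1, doc2), 2))
```

In such a setup, a high cosine value with a low shingle overlap would suggest a related topic, while high values on both would suggest a duplicate or near-duplicate.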


Clustering

All documents can be grouped into clusters fully automatically. This thematic structuring makes it possible to collect and screen large volumes of digital content.

Details: Clustering

Automatic Summary

The dio:semantic Summarizer fully automatically creates text summaries that reduce a document to its relevant key statements within a predefined amount of text.

Details: Automatic Summary
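
As a rough illustration of reducing a document to key statements within a predefined amount of text, the following Python sketch uses a simple frequency-based extractive approach. This is an assumed stand-in technique, not the method used by the dio:semantic Summarizer; the function name `summarize` and the `max_chars` parameter are hypothetical.

```python
import re
from collections import Counter


def summarize(text, max_chars=200):
    """Pick the highest-scoring sentences that fit into max_chars.

    A sentence's score is the average document-wide frequency of its words.
    Selected sentences are returned in their original order.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        words = re.findall(r"\w+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices from most to least representative.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)

    chosen, used = [], 0
    for i in ranked:
        extra = len(sentences[i]) + (1 if chosen else 0)  # +1 for the joining space
        if used + extra <= max_chars:
            chosen.append(i)
            used += extra
    return " ".join(sentences[i] for i in sorted(chosen))


if __name__ == "__main__":
    doc = "Cats chase mice. Cats sleep a lot. The weather was mild."
    print(summarize(doc, max_chars=40))
```

The character budget plays the role of the predefined amount of text; a production system would more likely weigh sentence position, entities and linguistic features as well.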

Keywords and Tag Cloud

dio:semantic determines the central keywords of a document and thereby provides the basis for modern tag cloud visualization and automatic indexing.

Details: Automatic Keywords
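
The link between keywords and a tag cloud can be sketched in a few lines of Python. This is only an illustration under simple assumptions (plain term frequency and a small hand-written stop word list), not how dio:semantic determines keywords. The names `top_keywords`, `tag_cloud_sizes` and `STOPWORDS` are hypothetical.

```python
import re
from collections import Counter

# Assumption: a tiny English stop word list, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with", "that"}


def top_keywords(text, k=5):
    """Return the k most frequent non-stop-word terms with their counts."""
    tokens = [
        t for t in re.findall(r"[a-zäöüß]+", text.lower())
        if t not in STOPWORDS and len(t) > 2
    ]
    return Counter(tokens).most_common(k)


def tag_cloud_sizes(keywords, min_pt=10, max_pt=32):
    """Map keyword counts linearly onto font sizes between min_pt and max_pt."""
    if not keywords:
        return {}
    counts = [count for _, count in keywords]
    lowest, highest = min(counts), max(counts)
    span = (highest - lowest) or 1  # avoid division by zero when all counts match
    return {
        word: min_pt + (count - lowest) * (max_pt - min_pt) // span
        for word, count in keywords
    }


if __name__ == "__main__":
    text = ("Semantic search finds semantic metadata. "
            "Search uses metadata and semantic indexing.")
    keywords = top_keywords(text, k=3)
    print(keywords)
    print(tag_cloud_sizes(keywords))
```

The more often a term occurs, the larger it is drawn; the same ranked list could also serve as index terms for the document.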

Analysis in Video & Audio Data

Through its partnership with EML, experts in the field of Speech-2-Text (audio transcription), dio:semantic can also offer its customers the semantic repertoire, with additional information, for video and audio data.

Details: Speech-2-Text

Context-sensitive Advertising

Conventional advertising systems are highly dependent on existing metadata and its quality. Our solution captures the plain text of any media content directly and is therefore completely independent of existing metadata.

Details: Context-sensitive Advertising

Semantic Search

dio:semantic completes the semantic repertoire with a search that accurately finds all previously analyzed and indexed data that has been enhanced with metadata.

Details: Semantic Search

Application Examples:

Presse-Monitor GmbH
Burda Information Services GmbH
Deutsche Presse-Agentur GmbH

These pages show what performance and which individual adaptations are possible with the semantic system. If you would like further details on these current, real-world implementations, please send us a message.

Training Client

The dio:semantic Training Client completes the semantic portfolio in the area of data maintenance. With this desktop application, the knowledge base can be conveniently managed, maintained and optimized.

Details: Training Client

Do you have any questions? Contact us – we would be happy to help:

picturesafe media/data/bank GmbH

Wendenstraße 21
20097 Hamburg / Germany
Tel.: +49 40 - 37 41 27 - 700
Fax: +49 40 - 37 41 27 - 999
