Marie Skłodowska-Curie Actions

Computer Vision Centre - The Intelligent Reading Systems Group

    15/07/2018 17:00 - Europe/Brussels
    H2020 / Marie Skłodowska-Curie Actions
    Spain, Cerdanyola del Vallès
    Computer Vision Centre
    Research Projects Office

The Intelligent Reading Systems (IRS) group at the Computer Vision Centre (CVC) has more than 20 years of research and technology transfer experience in computer vision systems for extracting and interpreting written (textual or symbolic) information in images.

The group is an international leader in scene text understanding, document image analysis, graphics recognition, handwriting recognition, musical score understanding, and human-document interaction. It has been a recognised consolidated group of the Catalan research system since 2005.

With more than 30 members, the group is one of the largest worldwide in its field. Its members have produced more than 300 scientific publications and have participated in a large number of European and national research projects, as well as numerous technology transfer activities.

Senior members of the group hold, or have held, leadership positions in Technical Committees 10 (Graphics Recognition) and 11 (Reading Systems) of the International Association for Pattern Recognition, and have played several roles in the organisation and scientific committees of the top international conferences in the field (ICPR, ICDAR, DAS, ICFHR, GREC, IWRR, CBDAR).


Project Description: Text in images provides important high-level semantic information. Text appears in about 50% of the images in typical large-scale real-scene image datasets, and the percentage rises sharply in urban environments. Unlike most visual cues, the semantics of text can only be accessed through explicit recognition. Recognising scene text in real-life imagery has driven significant research efforts in recent years. Nevertheless, contextual information is generally not taken into account, and scene text is not properly incorporated into visual scene interpretation methods.

We expect to receive proposals from candidates interested in learning multi-modal models for image interpretation, and in particular in the joint modelling of visual and textual information (scene text or scene-related textual information), with potential application areas in scene text understanding, image classification, captioning, and visual question answering. Multi-disciplinary research proposals integrating people in the loop (interaction, eye-tracking, serious games, citizen science, active learning) will be particularly welcome.
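As a rough illustration of the kind of joint visual-textual modelling the call refers to, the sketch below shows late fusion for image classification: a global visual feature is concatenated with a pooled embedding of the scene-text words recognised in the image, and a linear softmax classifier scores the classes. All dimensions, names, and the fusion scheme are illustrative assumptions, not part of the call or any specific group method.

```python
import math
import random

# Illustrative late-fusion sketch (all dimensions and weights are toy
# assumptions): combine a visual feature with pooled scene-text embeddings.

random.seed(0)

VISUAL_DIM, TEXT_DIM, NUM_CLASSES = 8, 4, 3

def mean_pool(vectors, dim):
    """Average a list of equal-length word embeddings; zeros if no text was found."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def fuse_and_classify(visual_feat, text_embs, weights, bias):
    """Late fusion by concatenation, followed by a linear layer and softmax."""
    joint = visual_feat + mean_pool(text_embs, TEXT_DIM)   # concatenation
    logits = [sum(w * x for w, x in zip(row, joint)) + b
              for row, b in zip(weights, bias)]
    m = max(logits)                                        # stabilise the softmax
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy inputs: one visual feature and embeddings of three detected words.
visual = [random.gauss(0, 1) for _ in range(VISUAL_DIM)]
words = [[random.gauss(0, 1) for _ in range(TEXT_DIM)] for _ in range(3)]
weights = [[random.gauss(0, 0.1) for _ in range(VISUAL_DIM + TEXT_DIM)]
           for _ in range(NUM_CLASSES)]
bias = [0.0] * NUM_CLASSES

probs = fuse_and_classify(visual, words, weights, bias)
```

The same classifier still applies when no text is detected (the text branch pools to zeros), which is one simple way such models degrade gracefully on text-free images.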