CIIRC CTU participates in the implementation of a European project dealing with extreme data

The Czech Institute of Informatics, Robotics and Cybernetics (CIIRC) at CTU has become one of the nine partners in the consortium implementing the EXA4MIND project, which received almost EUR 5 million in funding under the Horizon Europe program.

The name of the project is an acronym for EXtreme Analytics for MINing Data Spaces. Its goal is to build, by 2026, a platform for extremely large data that connects data stores with powerful computing infrastructure (supercomputers) by introducing new methods for automated data management and for efficient data transfer and storage.

At our institute, the Applied Algebra and Geometry group of the Department of Robotics and Machine Perception participates in the project, which runs from the beginning of 2023 to the end of 2025. The group develops machine learning methods for processing large amounts of multimodal data from the cameras, LiDAR sensors, and possibly also radars that autonomous vehicles use to capture the space around them.

To develop these methods, the CIIRC researchers are working closely with the company Valeo, in particular with its research group valeo.ai, which specializes in research and development of artificial intelligence and machine learning for the automotive industry, with the aim of improving the efficiency, safety and comfort of automotive applications.

The main task of our research team is research and development in the field of machine learning with limited access to annotated data, i.e. data labeled so that a machine can learn what each example contains and what it means. This situation is typical of data collected in cars: a large number of recordings is available, but the recordings lack labels (annotations).

Annotated data are very important for training machine learning models, but manual annotation is very demanding in terms of both time and money: a person has to assign a meaningful label to each data example by hand so that the machine learning model can learn from it. The methods developed in this project are therefore designed to extract useful information from unannotated data and to produce a system that subsequently needs less annotated data to perform the final task.

“In practice, this means that we first train the model on unannotated data using the method we developed. Subsequently, we use this model as the initialization of the model for the target task, such as the detection of pedestrians or moving cars. A model initialized in this way needs much less annotated data for the target task, which brings great savings in manual annotation,” explains Antonín Vobecký from our institute, who participates in the research.
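
The two-stage workflow described in the quote can be illustrated with a deliberately tiny sketch. It is not the method developed in the project: the label-free stage here is a plain principal-direction estimate and the target task is a toy two-class problem solved by logistic regression, both chosen only as stand-ins. The synthetic 2-D data and all function names (`make_data`, `pretrain`, `fine_tune`, `accuracy`) are assumptions made for this illustration.

```python
import math
import random


def make_data(rng, n):
    """Hypothetical 2-D points from two clusters; a stand-in for sensor features."""
    points, labels = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label == 1 else -2.0
        points.append((center + rng.gauss(0, 0.5), center + rng.gauss(0, 0.5)))
        labels.append(label)
    return points, labels


def pretrain(points, iterations=50):
    """Stage 1 (no labels): direction of largest variance via power iteration."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    w = [1.0, 0.0]
    for _ in range(iterations):
        # Multiply by the 2x2 covariance matrix, then renormalize.
        w = [cxx * w[0] + cxy * w[1], cxy * w[0] + cyy * w[1]]
        norm = math.hypot(w[0], w[1])
        w = [w[0] / norm, w[1] / norm]
    return w


def fine_tune(w_init, points, labels, steps=100, lr=0.1):
    """Stage 2 (few labels): logistic regression started from pretrained weights."""
    w, b = list(w_init), 0.0
    for _ in range(steps):
        for (x, y), target in zip(points, labels):
            prob = 1.0 / (1.0 + math.exp(-(w[0] * x + w[1] * y + b)))
            grad = prob - target
            w[0] -= lr * grad * x
            w[1] -= lr * grad * y
            b -= lr * grad
    return w, b


def accuracy(w, b, points, labels):
    """Fraction of points whose predicted class matches the label."""
    hits = sum((w[0] * x + w[1] * y + b > 0) == (t == 1)
               for (x, y), t in zip(points, labels))
    return hits / len(points)


rng = random.Random(0)
unlabeled, _ = make_data(rng, 500)        # labels are discarded on purpose
few_x, few_y = make_data(rng, 6)          # only six annotated examples
test_x, test_y = make_data(rng, 200)

w0 = pretrain(unlabeled)                  # learned without any annotation
w, b = fine_tune(w0, few_x, few_y)        # initialized from the pretrained weights
acc = accuracy(w, b, test_x, test_y)
print("test accuracy:", acc)
```

The point of the sketch is only the division of labor: the large unannotated set shapes the starting weights, and the handful of annotated examples then adapts them to the target task.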

The term extreme data refers to volumes of data so large that they exceed the capacity of common tools for managing and analyzing them. The amount of data also keeps growing, so storing, processing and interpreting it becomes more and more challenging. Handling such data requires special technologies and tools such as distributed databases, cloud storage, parallel data processing tools, machine learning and others.

The management and analysis of extreme data is important for many areas, such as biomedicine, finance, marketing or production. By analyzing extreme data, organizations can uncover hidden relationships and patterns in it, which can lead to improved business processes, improved products and services, and better understanding of customer behavior.

Representatives of major universities, research institutions, small and medium-sized enterprises and the industrial sector participate in the implementation of the European EXA4MIND project. Four application areas characterized by large volumes of data were chosen as model cases:
– molecular dynamics
– autonomous driving
– smart agriculture/viticulture
– healthcare/society.

The output of the project should be the design of a unique database and new methods of data transmission, storage and analysis, which will run on supercomputers and make use of artificial intelligence and machine learning.

The EXA4MIND project is coordinated by IT4Innovations, the national supercomputing center at VSB – Technical University Ostrava, which currently operates three supercomputers: Karolina, Barbora and NVIDIA DGX-2. The most powerful of them, Karolina, has a performance of 15.7 PFlop/s and has been in operation since 2021.

The EXA4MIND project consortium consists of a total of nine partners:

IT4Innovations, VSB – Technical University Ostrava (Czech Republic)
Leibniz Supercomputing Center of the Bavarian Academy of Sciences and Humanities (Germany)
Middle East Technical University (Turkey)
AUSTRALO Marketing Agency (Spain)
Consulting organization EURAXENT (France)
CIIRC, CTU in Prague (Czech Republic)
Valeo Autoklimatizace k.s. (Czech Republic)
IT consulting company ALTRNATIV (France)
Terraview Consulting (Switzerland)

Additional information and resources:

Read the press release issued by VSB-Technical University Ostrava HERE.

More information about the project can be found on the European Commission's CORDIS website.

Learn more about machine learning with limited access to annotated data HERE.
