While the adoption of solutions based on artificial intelligence is progressing throughout the world, the question of data, its collection, its quality and its sovereignty is at the center of many debates. We spoke with Antoine Couret, founder of ALEIA and president of the France AI Hub, about these issues, the difficulties companies face and the opportunities available to them.
1) What are the main risks and difficulties for a company in today’s artificial intelligence projects?
I’m going to separate the difficulties on one side, and the risks on the other, because they are two very different issues. The difficulties obviously depend on the size of the company. The three main ones that we have identified are the following:
- The first difficulty is acquiring native, quality data and having enough data to train the algorithms. As we can see, in some sectors, and even in very large groups, there is a lack of data, which poses real problems for developing the most relevant algorithms. And when the data is there, it is sometimes not of good enough quality to be used.
- The second difficulty, in my opinion, is governance, data management and security, both in terms of guaranteeing compliance with regulations but also in terms of ensuring that data is protected, as its misuse can obviously be catastrophic.
- Finally, we find a third difficulty in most companies: the transition to production, which allows us to move from an initial test to an operational application, with an adapted infrastructure. This industrialization stage is essential but often not mastered, because it is necessary to be able to scale up, which must be planned from the start, otherwise it will be a source of failure. This is why many projects do not go into production: they have generated costs (to produce them) but no benefits.
I would obviously add a final, very cross-cutting difficulty: the extreme scarcity of skills, especially those needed to take AI projects through to production. We don’t just need data scientists, we also need developers, DevOps engineers, AI project managers, business profiles trained in the use of AI, etc. These skills are not easy to gather, and it is “normal” that companies face difficulties in developing and/or deploying AI-based applications.
As for the risks, they are obviously linked to data security, but also to compliance: it is becoming increasingly difficult to be sure that data complies at all times with European regulations, which change regularly, or that there is no risk to sovereignty or of leaks of sensitive data. This is obviously a major challenge for certain industries.
2) What is ALEIA’s value proposition to address these challenges?
ALEIA is a software company that offers a SaaS AI platform. Built on these observations and on numerous exchanges with industry players and manufacturers, the platform offers its customers a new way to industrialize their AI projects thanks to 3 main components: openness with a marketplace, collaboration with sharing between companies, and “sovereign” industrialization in a few hours on French or European environments. These components are either absent from current AI offerings or available only in fragmented form.
ALEIA is already recognized in this market as one of the first winners of the BPI on the transformation of industries, chosen by collaborative ecosystems such as IA Cargo in logistics, Renovaite in energy renovation in France and Germany, as well as the Cyber Campus in cyber security.
First of all, with ALEIA, we respond to the lack of data with what we call the promise of openness. This translates into several elements. At the enterprise level, we standardize datasets, which allows us to go out and find other datasets. Then, outside the enterprise, we also provide the ability to bring in other datasets, which will allow us to enrich those of the enterprises, thanks to a real “marketplace” developed with our partner DAWEX.
All this is complemented by specialized algorithms, thanks to the partnerships that ALEIA has with other software vendors. This allows us to provide what we call best of breed, i.e. partners available directly on the platform who can handle, for example, language processing, image analysis, etc. With this solution, a customer never starts from scratch but already has access to powerful algorithms and data sets. They can therefore process all their data sources within our one-stop-shop solution, without having to switch from a tool for text to a tool for images, etc., as is the case today.
Regarding governance and security, the platform offers a fully managed pipeline, from data ingestion to production, including the processing part. This pipeline is entirely secure and is based on a sovereign cloud infrastructure that can be hosted by OVH or Scaleway, or even on-premise. The entire ingestion, processing and production process takes place in a fully secure environment, with the marketplace fully integrated into it, which reduces the number of data transfers into and out of the infrastructure and thus provides greater security.
We also have a strong promise that focuses on collaboration. This translates into permissions management, to enable users from different entities to collaborate on projects, but also into data traceability, to know the data’s life cycle and to control rights and access. This makes it possible to track all the processing carried out and to certify the rights on the data, rights that are distributed to the various participants according to their tasks.
These different elements allow us to respond effectively to the scarcity of skills among our customers, thanks to a fully managed pipeline, which pools the resources needed for AI projects to be easily deployed in production. This managed pipeline offers a high level of service and reduces the need for skills, as users can focus on their core business without needing resources to set up the necessary IT infrastructure.
3) Which companies can use your services? (size, sector…)
We believe in AI for everyone! In concrete terms, our clients come from businesses where the value of data and its processing was understood very quickly, but which are not technically ready. Most of the time, a data lake exists, but it lacks the necessary AI infrastructure for computation, security and production. Other customers already have this technological maturity, but a platform like ALEIA allows them to manage the complexity of the subjects and to accelerate the move to production.
In fact, ALEIA addresses 3 main types of customers: companies, ecosystems or industrial sectors and communities of data scientists or AI Ops.
All organizations need to accelerate their AI projects, whether they are large corporations, mid-sized companies (ETIs) or SMEs, but also local authorities or ministries, for example. ALEIA allows them to quickly put developed algorithms into production or to integrate AI into their industrial applications, as for example at DIAC for credit scoring or with the Loire Valley for the management of tourist flows.
ALEIA also provides a very concrete service for the industry, bringing together several companies willing to share data and develop algorithms on it. We can mention the AgDataHub for agriculture, Campus Cyber for cybersecurity or IACargo for logistics. These data ecosystems will benefit the most from the collaboration, governance, and rights management aspects, in order to have confidence in the use of the data. They will also use the marketplace, if they develop algorithms, to be able to exchange them with each other. And we are the only ones to offer the set of services they need with a level of collaboration that the American platforms do not offer.
Finally, we are already receiving requests from beta testers, freelancers such as data scientists and Data/AI Ops who are interested in AI to carry out their projects, or the projects of their clients.
4) The notion of sovereignty is more and more present, why is it becoming so important for client companies?
Sovereignty has progressively become a real market demand, and even more so after the events of the last few years. There is obviously a need to control data, and to avoid being subject to extraterritorial laws. Digital sovereignty implies having access to sovereign environments in terms of hosting on the one hand, but also in terms of processing: at ALEIA, all the companies and data sets that will be present will be validated, whether French or European.
The second issue is more industrial. When you develop algorithms or datasets, the more complex the subjects you work on, the more you are confronted with difficulties related to infrastructure and processing. If you are accompanied by non-European partners, the sharing of intellectual property and know-how can be more diffuse and can take place outside your own country. This industrial know-how is very important, and you acquire more of it when you work with European partners because you grow together, with proximity helping to co-design the expected solutions.
As for the third issue, we saw an example of it in the recent fight between Facebook and the European Union over the location of personal data. It is a question of developing a certain European and French technological independence, insofar as we have all the necessary skills in Europe. This is a more global issue that concerns the awareness and the will of France and Europe to make technology and AI a European competence and a European technological industry recognized worldwide, without depending on the choices of giants like Microsoft, Amazon or Google. Europe is an important provider of AI skills, with significant investments in R&D and innovation, but it benefits little from these investments because they are then exploited mostly on the US side. This awareness started two or three years ago and is even stronger now. It is now a question of creating a real industry, especially in AI, to have a real digital industrial sector covering the deeptech part, that is to say processors, semiconductors, cloud and hosting, but also data processing, algorithms, learning, standardization and the creation of operating systems designed for AI, whether cloud AI or embedded AI.
5) Does choosing sovereignty still allow us to benefit from the latest innovations, or does it risk cutting us off from those of the major American vendors?
More than anything else! And in full confidence! The capacity for creativity and innovation is very strong in France and in Europe. This does not mean cutting ourselves off from the latest innovations, but rather benefiting from the quality of European innovation and investment in R&D. We have all the cards in hand to do this. Today, we have industrial leaders on one side, and AI R&D on the other. In Europe, the bridge between the two is missing, and it is the large American companies that ultimately benefit from European innovations. So we need to operationalize this R&D so that industrialists can finally benefit from it without this bridge going through the United States alone, as is currently the case, which is regrettable. Today, we have all the means and the first players to ensure that these bridges are built directly in Europe.
6) ALEIA has just raised a very nice amount of funds, what are the next steps?
Many projects! Three mainly.
First of all, the funds will allow us to continue developing the product, because we need to be technically among the best to compete with our North American and Asian competitors. Our main challenges are to have increasingly rich data sets and pre-trained algorithms. Having a large ecosystem of partners available in the marketplace is a priority for the first half of 2022. And then, obviously, we will further intensify investments in quality and production management in the second half of 2022.
The second major step is the opening of the AI ecosystem. AI is for everyone! We want to broaden access to AI along three main axes: access for SMEs and ETIs, growing the community at the beginning of the school year and, finally, preparing open source bricks in 2023. With all the work we have done, we believe that there are a number of interesting bricks for open source. This is obviously very much linked to our desire to build a community and to the fact that, for this digital ecosystem to be built and deployed in Europe, we need to be very close to what is done in the United States. Leading open source communities is a very important pillar for us for 2022-2023. This participation in the growth of the ecosystem is in line with strengthening our sovereignty.
Finally, the third step is to prepare for the international market, particularly in Europe. There are rich ecosystems in all European countries, but they face the same problems as in France. What’s more, they are not very well connected between countries, whereas large companies need solutions that don’t worry about borders.
All this will only be possible if we continue to grow. We are going to recruit more than 50 people in the next 6 months, particularly for positions as developers, product managers, scrum masters, data scientists, etc. We offer a pleasant work environment, full remote work possibilities and, above all, complex technological challenges to be met in computer science and data science to support the transformation of industries through AI.
7) On a personal note, what are the latest innovations (research project, product, startup, concept, last use case) in AI that have impressed you the most?
If we stay in the field of ALEIA’s activity, I am surprised by the maturity and the desire of more and more companies to tackle data and AI issues in their industrial field. These are still complex subjects, not really mainstream, but there is a real will and a lot of interest.
At the same time, I am following very closely the GPT / OpenAI subject, on which I think that Europe and France have roles to play. In France, we have the capacity to go as far as unsupervised models with billions of parameters, with companies such as LightOn and GENCI’s computing capacities. Thanks to our computing resources, we have a strong capacity to make progress on the most advanced algorithms in all areas of language processing. With the diversity of languages in Europe, there is obviously a huge potential and a huge need in this area. In Germany, things are also moving forward, and this brings us back to the question of sovereignty. Indeed, when we approach the subject of languages, it is obviously important for models to learn with a European bias, in the sense that each language has its own linguistic, but also cultural, emotional and intentional, specificities.
Another scientific subject on which France also has a role to play is that of synthetic data, particularly in the medical field and the creation of synthetic patients. These make it possible to augment data sets in a very interesting way, especially when we have little data. They can have a huge industrial impact. For example, in clinical research, the cost of collecting a cohort is very high; being able to augment the cohort, creating thousands of synthetic patients from a certain number of real ones, therefore greatly reduces the costs of clinical trials. This necessarily has an impact on scientific research. This is a central area, but synthetic data is also interesting in other areas such as autonomous driving for vehicles, trains or even airplanes. The environment to be explored is enormous and there are many possibilities, so creating synthetic data environments is very relevant.
Finally, it is important to mention the European awareness to move towards a real technological independence and to create a real industry around it. Europe is moving forward, but there is also a lot of work to be done on the regulatory side to free up this capacity, to free up the data. For example, to stay with the issue of synthetic data: at the moment, this cannot be put into production because of regulatory issues, particularly at the level of the CNIL. There are hearings on the subject at the moment.
Faced with this desire to move forward, we obviously need security, rights management, and qualified data, but we are at a point where we still have many obstacles. We were very much behind the “shield” two years ago and now we are in the process of taking out the “sword”. This is positive, but we need to strengthen ourselves further so that Europe can move forward, get rid of its old demons and better appreciate the balance between the freedom given to data exploitation and the preservation of privacy.
Translated from Adoption de l’IA, données et souveraineté : Entretien avec Antoine Couret (ALEIA)