Large language models are generally built by tech giants: GPT-3 comes from OpenAI, Gopher from DeepMind. We recently dedicated an article to BLOOM, the largest open-source multilingual language model trained to date, a project involving a thousand researchers. LightOn, a French start-up that has developed a language model for European companies, has just announced that its LightOn Muse application is now available in French.
Founded in 2016 by Igor Carron, Laurent Daudet, Florent Krzakala and Sylvain Gigan, LightOn employs 20 people who count among the best European ML engineers and researchers. After launching its first-ever photonic co-processor in 2020, the start-up tackled a new challenge in the emerging field of AI: creating the first large language models for European languages, in particular French.
The genesis of the VLM-4 project and the Muse API
Initially, training large language models was one of the possible applications for LightOn’s photonic processor. The team quickly became passionate about this new generation of AI, still little known in France, and developed and trained its own models. In 2020, it made its first French model, PAGnol, freely available through a simple text generation interface. In 2022, after a year and a half of intensive work, it completed VLM-4, a suite of large language models in 5 European languages: English, German, Spanish, French and Italian.
Language models and European sovereignty
Because of their complexity and cost, most language models are the preserve of large companies such as GAFAM and are available only in English, Chinese and Korean. Using these large models in other languages requires a translation tool, which lowers quality and raises costs. Models such as BLOOM or LightOn Muse are changing this situation.
The key question raised by the relatively low level of development of these technologies in Europe is that of sovereignty. These technologies are essential to the digital transformation of companies, and those that master them gain a decisive advantage over their competitors. European companies find themselves powerless as their data is freely collected and used to improve non-European products and services.
LightOn is the first company to train large language models directly in four European languages other than English.
Democratizing access to the VLM-4 large language models in 5 languages and to customization features (skills)
The Muse API is aimed at all European players, whatever their size or sector of activity (marketing, media, entertainment industry, technology companies, even public administrations), who need to address their audience in their own language. It gives them access to the VLM-4 large language models in five languages (French, English, German, Spanish, Italian), as well as to customization features (skills) that let them “specialize” the model for particular tasks.
The goal is to allow all these actors to easily build:
- Business products and services around text-related tasks: a shopkeeper can get help from AI to create daily content for the store’s Instagram account, and a student can better classify and synthesize the countless articles needed for a dissertation. Content editors, meanwhile, can rely on the model’s suggestions to write faster;
- Products and services for their internal use: managers can be supported by an AI assistant in project management, and emails or internal documents can be summarized automatically so that employees only have to read the essentials, with those that call for an immediate response flagged.
A multilingual API
Muse opens the models on a large scale to a wide range of languages: French, English, Italian, Spanish and German today, with many others planned (40 languages by the end of 2023).
Ease of use
The Muse API is designed to be flexible, easy to integrate into any system, and usable by anyone, anywhere. Simply give instructions and examples in natural language, as if you were interacting with a human.
Save time and increase efficiency
Time-consuming tasks that are essential to the operation of a business, such as responding to emails and customer reviews or writing posts that rank well in online searches, can be handled by the Muse API.
Ultra-powerful language models
The Muse API uses the VLM-4 models, which are among the most powerful language models on the market. LightOn engineers are constantly innovating to increase the size of their models and the quality of the data they are trained on (two critical factors in text generation). VLM-4 models can respond in context and learn to perform a task from only a few examples given in the prompt (few-shot learning) or even without any examples (zero-shot learning).
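To make the difference between zero-shot and few-shot use concrete, here is a minimal sketch in Python of how such a prompt could be assembled. It only builds the text that would be sent to a model; it does not call the Muse API, whose endpoints and parameters are not described in this article. The function name `build_prompt`, the "Review:/Sentiment:" layout and the sample reviews are all assumptions made for illustration.

```python
def build_prompt(instruction, examples, query):
    """Assemble a plain-text prompt.

    With an empty `examples` list the prompt is zero-shot;
    with a few (text, label) pairs it becomes few-shot.
    The layout used here is hypothetical, not a Muse requirement.
    """
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final entry is left open for the model to complete.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


instruction = "Classify the sentiment of each customer review as positive or negative."
examples = [
    ("Livraison rapide, produit conforme.", "positive"),
    ("Le colis est arrivé cassé.", "negative"),
]
query = "Service client très réactif."

zero_shot = build_prompt(instruction, [], query)
few_shot = build_prompt(instruction, examples, query)
print(few_shot)
```

In the zero-shot case the model sees only the instruction and the open question; in the few-shot case the two worked examples show it the expected format and labels before it completes the last line.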
Many customization possibilities
By making the models very efficient at executing specific tasks (skills), LightOn engineers can adapt them to the needs and particularities of each company.
For e-marketing, for example, the Muse API provides:
- High-performance SEO, automatically generating text around popular keywords to improve the company’s visibility;
- Email campaigns and advertisements that hit the bull’s-eye because they target each customer individually;
- Much higher quality content creation, with much less effort.
Improved customer experience
Customer satisfaction increases because customers benefit from real-time assistance via more efficient chatbots. At the same time, personalized search engines make it easier to analyze all types of data.
Customer feedback analysis
The “Sentiment Analysis” feature provides a reliable summary of customer feedback to simplify decision-making. The various opinions and evaluations are analyzed and classified. The customer database is structured, which allows for more efficient management.
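As an illustration of the last step, the sketch below (in Python) shows how already classified feedback could be rolled up into a short summary. It assumes each review has previously been given a label such as "positive" or "negative" by a language model; the labels, the sample data and the function name `summarize_feedback` are hypothetical and do not describe how the Muse feature is actually implemented.

```python
from collections import Counter


def summarize_feedback(labeled_reviews):
    """Return the percentage share of each sentiment label.

    `labeled_reviews` is a list of (review_text, label) pairs whose
    labels are assumed to come from a prior classification step.
    """
    counts = Counter(label for _, label in labeled_reviews)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical, hand-labeled sample data for illustration only.
labeled_reviews = [
    ("Livraison rapide, produit conforme.", "positive"),
    ("Le colis est arrivé cassé.", "negative"),
    ("Service client très réactif.", "positive"),
    ("Très bon rapport qualité-prix.", "positive"),
]

summary = summarize_feedback(labeled_reviews)
print(summary)
```

A summary of this kind is what turns a pile of individual opinions into a figure a decision-maker can act on.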
Data management
Muse can summarize documents and emails, extracting the essential information and saving time. In addition, a custom search or classification tool makes it possible to process large amounts of data of all types efficiently.
Translated from Machine learning : LightOn annonce la disponibilité de son API MUSE en français