Massive collective intelligence: behind the scenes of the partnership between Bluenove and Inria


Since 2014, bluenove has helped private and public companies and institutions transform their organizations through large-scale internal or external consultations. To do so, it operates its own deliberative platform, Assembl. Two years ago, in 2019, the technology and consulting company, which specializes in massive collective intelligence, partnered with the French National Institute for Research in Digital Science and Technology (Inria) to develop new algorithms that automate debate analysis. In 2021, bluenove announced the renewal of this partnership as part of its R&D strategy.

For the occasion, Actu IA spoke with Frank Escoubès, co-founder of bluenove, and Éric de la Clergerie, a researcher specializing in natural language processing at Inria, who has just been appointed R&D director for natural language processing at bluenove. This article looks at the massive collective intelligence sector and at what is at stake in the partnership with Inria, which is meant to help the company take a step forward.

Massive collective intelligence

Massive collective intelligence is the ability to mobilize very large communities of people (of the order of several thousand, tens of thousands, hundreds of thousands, or even more) in order to collectively solve a complex problem. In concrete terms, this means co-developing a strategic plan, a technological roadmap, public policies, etc. These participatory processes are very often carried out internally (with all the employees of an organization), even if the trend is to open up the debates to external stakeholders (customers, industrial partners, etc.).

Through the Assembl digital platform for consultation and debate, it is thus possible to argue and counter-argue on a given subject. To give an order of magnitude, clients such as Engie, EDF or Décathlon consult around 100,000 people simultaneously. And the Great National Debate of 2019 saw nearly two million people express themselves. At the beginning of 2021, the Assembl massive collective intelligence platform was updated with the ambition to accelerate and partly automate the analysis of free expression in order to make it “opposable” (verifiable by the participant).

Strengthening the practice of debate

In the field of massive collective intelligence, there are two cases:

  • Consultation: participants are asked open-ended questions and respond individually, without knowing what others are answering (as in a survey). The task of the consultation experts (and the underlying algorithms) is to apply a clustering method, i.e. to group together expressions conveying a similar meaning. This clustering is published in an open manner: the result of the semantic grouping by cluster and thus by key idea, as well as the weighting of these clusters, are visible to all. This contributes to the confidence of the stakeholders. There is no opacity in the process. For Frank Escoubès, it is more effective to publish the result of the clustering than the code of the algorithm that made it possible to carry it out, because everyone can judge the quality of a synthesis whereas the people able to audit an artificial intelligence algorithm are rare.
  • Deliberation: a debate is a co-construction in which citizens respond to the arguments of others. The confrontation of arguments and points of view gives rise to “refined” ideas. From a technical point of view, the analysis does not rely on counting frequencies or recurrences (because ideas are rarely repeated in the strict sense in a debate), but on identifying “taxonomies”, i.e. logical units of meaning. They are based on a universal categorization: problems, solutions and arguments. On the basis of this repository describing the intellectual structure of the discourse, a “mind map” type of cartography can be constructed. The underlying technology is based on natural language processing (NLP), and this is usually done in a multilingual version.
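
To make the consultation case more concrete, here is a minimal sketch of grouping open-ended responses by similar wording and publishing each cluster with its weight. It is an illustration only, not bluenove's algorithm: the function names, the stopword list, the Jaccard word-overlap measure and the 0.3 threshold are all assumptions chosen for the example, and a production system would more likely rely on language-model embeddings.

```python
import re

# Assumed, tiny stopword list for the example only.
STOPWORDS = {"the", "a", "an", "to", "of", "and", "on", "in", "for"}

def tokens(text):
    """Lowercase the text and return its set of non-stopword words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Share of words two responses have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(responses, threshold=0.3):
    """Greedy single pass: a response joins the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of responses; the first item is its seed
    for response in responses:
        for members in clusters:
            if jaccard(tokens(response), tokens(members[0])) >= threshold:
                members.append(response)
                break
        else:
            clusters.append([response])  # no similar cluster found: start a new one
    return clusters

responses = [
    "Install solar panels on the office roof",
    "Put solar panels on every roof",
    "Offer remote work two days a week",
    "Allow remote work two days",
]
clusters = cluster(responses)
# Weight of each cluster = share of all responses it contains.
weights = [(members[0], len(members) / len(responses)) for members in clusters]
for seed, weight in weights:
    print(f"{weight:.0%}  {seed}")
```

Publishing the `weights` list rather than the code mirrors the point made above: readers can judge whether the groupings make sense without auditing the algorithm.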

Both of bluenove’s approaches aim to ensure that as many people as possible can express themselves and that this expression can be collected and reported back in a synthetic way so that it can be submitted to the decision-maker for arbitration. It must be remembered that collective intelligence does not mean collective decision. It is not a question of direct democracy. When employees or citizens are consulted or invited to deliberate, this means that they are considered to be fully-fledged adults, capable of having a point of view, and that this point of view may influence the decision of a third-party decision-maker (such as the general management of a group, a minister, the president of a local authority, etc.). And if this point of view does not convince the decision-maker, the latter should ideally give reasons for refusing to take it into account.

bluenove and Assembl therefore support a decision-making process that uses the collective to characterize the field of possibilities.

The arrival of Éric de la Clergerie as R&D director for natural language processing at bluenove

Éric de la Clergerie’s collaboration with bluenove began more than two years ago with a post-doctoral student who worked on two aspects: debate analysis and expression clustering. The company wanted to continue this collaboration in a more ambitious form:

“Between what was done two years ago and what could be done today in 2021, a real evolution has taken place thanks to the arrival of language models. It is part of the roadmap to understand to what extent these language models can be used for clustering and to detect formulations related to a problem, a solution or an argument. On the debate part, we need to look at the structural, semantic, syntactic, discursive aspects, which are key to conducting the analysis.”

Frank Escoubès states:

“An algorithm had already been designed even before the partnership with Inria. It uses machine learning to recognize the syntactic form of the expression of a “solution” based on a series of rules (about 300). For example, the use of the conditional, as in “it would have to be”, or of certain infinitives are markers that help the algorithm detect nuggets. Chaining these rules together makes it possible to recognize the formal expression of a solution. The challenge now is to maximize both precision and recall, which means neither selecting noise nor missing some of the expressions that fall under the “solutions” category.”
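
The kind of syntactic marker described in this quote can be sketched with a few regular expressions. This is a hedged illustration: the three English patterns below are hypothetical stand-ins for the roughly 300 rules mentioned, which are not public, and the sketch is purely rule-based, with no machine learning component.

```python
import re

# Hypothetical English stand-ins for the roughly 300 rules mentioned in the quote.
SOLUTION_RULES = [
    # Conditional phrasing such as "it would have to be"
    re.compile(r"\b(?:it|we|they) (?:would|should|could) (?:have to|need to|be)\b", re.I),
    # First-person proposals such as "we should ..." or "we must ..."
    re.compile(r"\bwe (?:should|could|must|need to)\b", re.I),
    # Sentence opening with an infinitive, e.g. "To reduce waste, ..."
    re.compile(r"^\s*to \w+", re.I),
]

def looks_like_solution(sentence):
    """Return True if at least one rule matches the sentence."""
    return any(rule.search(sentence) for rule in SOLUTION_RULES)

sentences = [
    "It would have to be cheaper to repair than to replace.",
    "Energy bills doubled last year.",
    "We should open the data to researchers.",
    "To reduce waste, reuse packaging.",
    "Nobody listens to us.",
]
detected = [s for s in sentences if looks_like_solution(s)]
for s in detected:
    print("solution candidate:", s)
```

Rules like these are easy to read and audit, but each one trades noise against coverage, which is why the quote frames the challenge in terms of precision and recall.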

bluenove’s challenges in developing new tools for debate

As part of its partnership with Inria, the company is focusing on three issues in its research to develop new algorithms for collective intelligence:

  • Increase the performance (precision and recall) of its algorithms so that they can efficiently locate the greatest number of solutions. Currently, the recognition rate is between 65 and 78% depending on the corpus, which is a respectable result, but still insufficient in the company’s eyes.
  • Complement the recognition of “solutions” with the recognition of “problems” upstream. This means focusing research on issues in order to identify them more easily.
  • Detect singularities, nuggets and rare ideas that can change how a problem is solved, and which are unfortunately drowned in the mass even though they could disrupt decision-making.
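
For readers unfamiliar with the two metrics in the first point, the sketch below shows how precision and recall are usually computed for a "solution" detector. The sentence ids and labels are invented for the example, and the article does not say whether the 65 to 78% recognition rate refers to precision, recall or a combination of the two.

```python
def precision_recall(predicted, gold):
    """Compare detected items with human-labeled ones.

    precision = share of detected items that are real solutions (low noise)
    recall    = share of real solutions that were detected (few misses)
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Invented data: ids of sentences labeled as solutions by a human annotator,
# and ids flagged by a hypothetical detector.
gold = {1, 2, 3, 4}
predicted = {2, 3, 4, 7, 9}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three of the five flagged sentences are real solutions, and three of the four real solutions were found, so the detector both lets some noise through and misses one solution.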

For Éric de la Clergerie, this last point is the real challenge. According to him, it is entirely possible to find solutions that can meet the first two challenges by refining the performance of algorithms. However, the singular suggestion or “nugget” is a particularly complex aspect, as rarity markers are an almost untouched research area in language processing. How can we highlight rare but particularly relevant suggestions without giving more credit to one contributor than another, which would be tantamount to recreating a hierarchy between people, and thus ignoring the meritocracy of ideas?

Frank Escoubès reveals one of the markers of singular ideas identified by the company: ideas expressed in the form of rare co-occurrences or with terms that are usually not compatible (such as the association of a concept with a brand, for example “to become the Meetic of self-generation of electricity”). But well beyond these imaginative ideas, it is above all the “refined” ideas that represent the future of understanding the dynamics of a debate: how can we trace the progressive enrichment of an idea by multiple contributions? Finally, Frank Escoubès reminds us that the objective is not to automate 100% of the processing, but to reduce it to volumes of data that can be processed on a human scale:

“If an algorithm allows us to go from 100,000 verbatims to 300 singularities that consultants can discover, we will have made a superb advance in the field of human knowledge.”
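
The rare co-occurrence marker mentioned above can be illustrated with a simple count. This is a sketch under stated assumptions, not the company's method: the toy corpus, the stopword list, the `min_word_count` parameter and the rule "both words are common, but they appear together only once" are all invented for the example.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed, tiny stopword list for the example only.
STOPWORDS = {"are", "the", "of", "like", "become"}

def content_words(text):
    """Sorted, de-duplicated non-stopword words of one contribution."""
    return sorted({w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS})

def rare_pairs(corpus, min_word_count=2):
    """Pairs of individually common words that co-occur in exactly one contribution."""
    word_counts, pair_counts = Counter(), Counter()
    for text in corpus:
        words = content_words(text)
        word_counts.update(words)                   # contributions containing each word
        pair_counts.update(combinations(words, 2))  # contributions containing each pair
    return sorted(
        pair for pair, count in pair_counts.items()
        if count == 1 and all(word_counts[w] >= min_word_count for w in pair)
    )

corpus = [
    "Electricity prices are rising",
    "Electricity prices hurt families",
    "Dating apps like Meetic are popular",
    "Meetic changed dating",
    "Become the Meetic of electricity",
]
singular = rare_pairs(corpus)
print(singular)
```

In this toy corpus, "electricity" and "Meetic" are each common but meet in a single contribution, which is the kind of unusual association a consultant might want to read first. On a real corpus of 100,000 verbatims, a statistical association measure would probably be needed to keep the shortlist at a human scale.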

Translated from Intelligence collective massive : les coulisses du partenariat entre Bluenove et Inria