The software development platform GitHub announced at the end of June the launch of its latest tool, GitHub Copilot. It was designed in collaboration with OpenAI and is based on Codex, an engine presented as more powerful than GPT-3 for code generation. This artificial intelligence, which comes in the form of an extension for Visual Studio Code, is capable of generating entire blocks of code.
The latest artificial intelligence model for code generation: GitHub Copilot
The principle of GitHub Copilot is relatively simple: as a developer writes code, for example to build a website, the tool offers to autocomplete entire portions of it. Nat Friedman, GitHub’s CEO, described how GitHub Copilot works:
“It helps you quickly discover alternative ways to solve problems, write tests, and explore new APIs without having to tediously customize a web search for answers. As you type, it adapts to the way you write code, to help you finish your work faster.”
The tool recognizes a wide range of frameworks and languages, but it remains most effective on Go, Ruby, TypeScript, JavaScript and Python, the five languages it was designed for. The developer using the tool remains, of course, “in control” of what they write at all times: the system offers suggestions that can be accepted or rejected.
Several features are available to coders, including code generation from comments written in natural language and the writing of unit tests corresponding to the user’s code, an activity that is rarely perceived as exciting but is a determining factor in producing reliable code.
Finally, Copilot can also fill in repetitive code automatically and propose alternative solutions, which helps developers discover or implement new approaches.
The tool adapts to the behavior and habits of the developer and takes their past choices into account when proposing new suggestions, so that, for example, it no longer makes suggestions similar to those previously made.
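To make the comment-to-code and unit-test features described above more concrete, here is a minimal hand-written sketch. It is not output from GitHub Copilot: the function name `is_leap_year`, its body and the test class are assumptions chosen for illustration. The natural-language comment plays the role of the prompt, and the function body and tests stand in for what a completion tool might propose.

```python
import unittest


# Return True if the given year is a leap year in the Gregorian calendar.
def is_leap_year(year: int) -> bool:
    # Hypothetical completion: written by hand for this illustration.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class IsLeapYearTest(unittest.TestCase):
    """The kind of unit test a developer would otherwise write by hand."""

    def test_year_divisible_by_four(self):
        self.assertTrue(is_leap_year(2024))

    def test_century_is_not_leap(self):
        self.assertFalse(is_leap_year(1900))

    def test_year_divisible_by_400(self):
        self.assertTrue(is_leap_year(2000))
```

Saved in a file, the tests can be run with `python -m unittest`. Whatever tool proposes such code, the developer still has to check that it matches the intent expressed in the comment.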
Autocompletion tools: a new step in the evolution of programming
Autocompletion has been at the heart of the evolution of IDEs for the last twenty years. From the display of expected function parameters to the new features of GitHub Copilot, the goal is to help developers become more productive in an increasingly complex activity.
This is one of the selling points of TabNine, an autocompletion tool able to make suggestions to developers in twenty-two different programming languages, including Python, Java, JavaScript, C, C#, PHP and Ruby. This AI was trained on about two million files of open source code from GitHub and is based on GPT-2, the predecessor of GPT-3, which uses a Transformer neural network.
In 2019, Codota, an Israeli startup, acquired TabNine to “boost” its AI-based code prediction. At the time, Dror Weiss, co-founder and chief executive officer of Codota, said of autocomplete:
“Using AI to create code is already resulting in huge throughput gains for development teams, and that number is only expected to grow as Codota’s user base expands and its product line and technology are upgraded.”
The two companies have since merged, and last month the TabNine brand was retained as the company’s main name. The tool now works with more than 30 languages, including TypeScript, Go and Rust, all three of which are also supported by GitHub Copilot.
What should we make of these tools?
The ability of developers to rely on existing bricks to build new ones spares them from reinventing the wheel every day and lets them work on increasingly complex systems by removing the difficulty of “low-level” tasks. This is one of the fundamental principles of programming, and the sharing of portions of code and entire open source libraries is a big part of it.
Despite this, programming remains a very repetitive activity. Even in innovative software, 70% to 90% of the lines of code are very often standard: opening a file, connecting to a database, checking form fields…
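As an illustration of the kind of standard code mentioned above, here is a short sketch covering the three examples (opening a file, connecting to a database, checking form fields). The function names, the `users` table and the validation rules are assumptions made for this example; only Python's standard library is used.

```python
import sqlite3


def load_lines(path: str) -> list[str]:
    """Open a text file and return its non-empty lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def connect(database: str = ":memory:") -> sqlite3.Connection:
    """Connect to a SQLite database and make sure the table exists."""
    connection = sqlite3.connect(database)
    connection.execute(
        "CREATE TABLE IF NOT EXISTS users (name TEXT NOT NULL, email TEXT NOT NULL)"
    )
    return connection


def validate_form(form: dict[str, str]) -> list[str]:
    """Check form fields and return a list of error messages."""
    errors = []
    if not form.get("name", "").strip():
        errors.append("name is required")
    if "@" not in form.get("email", ""):
        errors.append("email looks invalid")
    return errors
```

None of these functions is specific to a given project, which is why such code lends itself to automatic completion.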
The main editors have long supported customizable code snippets and templates. However, code generation tools based on natural language processing (NLP) offer greater flexibility and adaptability: this is the difference between using fixed patterns and using a generative grammar.
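The “fixed pattern” side of this comparison can be sketched with Python's standard `string.Template`. The snippet text and the names `GETTER_SNIPPET` and `expand_snippet` are hypothetical; real editors use their own snippet formats.

```python
from string import Template

# A fixed snippet: the structure never changes, only the placeholder does.
GETTER_SNIPPET = Template(
    "def get_${field}(self):\n"
    "    return self._${field}\n"
)


def expand_snippet(field: str) -> str:
    """Fill the placeholder of the fixed snippet with a field name."""
    return GETTER_SNIPPET.substitute(field=field)


print(expand_snippet("email"))
```

A snippet like this can only ever produce a getter. A generative tool, by contrast, is not tied to a predefined shape and can propose different code depending on the surrounding context.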
On the other hand, programming languages leave no room for approximation. Even a subtle difference between two lines of code can have a radical impact on the way a program works. Natural language processing, although it is constantly making progress, still has some trouble with subtleties. The question is not so much whether the generated code will be syntactically valid as whether it does exactly what the developer wants. Extreme vigilance will still be required, so this remains code-writing assistance.
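A small sketch shows how little it takes. The two functions below are hypothetical examples that differ by a single character, are both syntactically valid, and return different results.

```python
def last_three(items: list) -> list:
    """Return the last three elements of the list."""
    return items[-3:]


def all_but_first_three(items: list) -> list:
    """Return everything except the first three elements."""
    return items[3:]


numbers = [1, 2, 3, 4, 5]
print(last_three(numbers))           # [3, 4, 5]
print(all_but_first_three(numbers))  # [4, 5]
```

A suggestion containing either line would look plausible, so only the developer can tell which one matches the intent.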
Code generation and no-code in vogue
No-code is nevertheless well and truly one of the major trends. Last May, Microsoft announced the release of its programming tool Power Apps Ideas, which lets anyone describe processing logic in natural language (which GitHub Copilot does for developers through its comment-to-code feature). The model is built on GPT-3 and its 175 billion parameters, so it can process natural language text and translate it into computer code. Code generated this way remains more limited than natively developed code in terms of performance or functionality. It is above all “scripting”, but it has the advantage of making white-collar workers more autonomous, which is a major argument in the growth of the no-code market.
In any case, GitHub Copilot and TabNine on one side and Power Apps Ideas on the other are three pioneering tools that point to a major change in the way software is developed over the decade 2021-2030.
Translated from Copilot, TabNine : le défi des outils d’autocomplétion pour aider les développeurs à écrire leurs codes