FSF Calls for Papers on 'Unacceptable and Unjust' GitHub Copilot
The recent debut of GitHub Copilot, an AI "pair programmer," has stirred existential angst among developers and raised legal and ethical concerns in the development community, with the Free Software Foundation (FSF) recently calling it "unacceptable and unjust."
The nonprofit FSF takes a hard line on issues concerning the free and open source software (FOSS) space, describing its mission this way:
The Free Software Foundation is working to secure freedom for computer users by promoting the development and use of free (as in freedom) software and documentation -- particularly the GNU operating system -- and by campaigning against threats to computer user freedom like Digital Restrictions Management (DRM) and software patents.
In a July 28 post, the group's Donald Robertson issued a call for white papers addressing philosophical and legal questions around GitHub Copilot, unveiled a month ago as an AI-driven "pair programmer" that goes beyond common code-completion and other helpers like IntelliSense and IntelliCode. Copilot is powered by a new AI system developed by OpenAI, a partner of Microsoft, which owns GitHub. The system uses machine learning to pick up coding practices by examining billions of lines of public code hosted on GitHub. If the ongoing technical preview proves successful, plans call for Copilot to become publicly available for Visual Studio Code and Visual Studio as a for-pay product.
In calling for white papers on the implications of GitHub Copilot, Robertson said it was already a settled question that the project is "unacceptable and unjust" because it requires running software that does not fit the FSF's definition of "free/libre." That's because it works with Microsoft's proprietary Visual Studio IDE and VS Code, and the FSF has issues with the latter's licensing even though it's based on open source code. Also, Robertson said, GitHub Copilot is Service as a Software Substitute ("another way to give someone else power over your computing").
With those questions already settled, the FSF is seeking input on other questions about the new service, some with legal implications.
"The Free Software Foundation has received numerous inquiries about our position on these questions," said Robertson, FSF's licensing and compliance manager. "We can see that Copilot's use of freely licensed software has many implications for an incredibly large portion of the free software community. Developers want to know whether training a neural network on their software can really be considered fair use. Others who may be interested in using Copilot wonder if the code snippets and other elements copied from GitHub-hosted repositories could result in copyright infringement. And even if everything might be legally copacetic, activists wonder if there isn't something fundamentally unfair about a proprietary software company building a service off their work.
"With all these questions, many of them with legal implications that at first glance may have not been previously tested in a court of law, there aren't many simple answers. To get the answers the community needs, and to identify the best opportunities for defending user freedom in this space, the FSF is announcing a funded call for white papers to address Copilot, copyright, machine learning, and free software."
The funding means that authors of published papers that the FSF thinks "help elucidate the problem" will be awarded $500. Questions of particular interest include:
Is Copilot's training on public repositories infringing copyright? Is it fair use?
How likely is the output of Copilot to generate actionable claims of violations on GPL-licensed works?
How can developers ensure that any code to which they hold the copyright is protected against violations generated by Copilot?
Is there a way for developers using Copilot to comply with free software licenses like the GPL?
If Copilot learns from AGPL-covered code, is Copilot infringing the AGPL?
If Copilot generates code which does give rise to a violation of a free software licensed work, how can this violation be discovered by the copyright holder on the underlying work?
Is a trained artificial intelligence (AI) / machine learning (ML) model resulting from machine learning a compiled version of the training data, or is it something else, like source code that users can modify by doing further training?
Is the Copilot trained AI/ML model copyrighted? If so, who holds that copyright?
Papers, which may be no longer than 3,000 words, must be submitted by Monday, Aug. 23, 2021, according to the post.
David Ramel is an editor and writer for Converge360.