Open Source Codeium Challenges GitHub Copilot, Strips Out Non-Permissive GPL Code -- Visual Studio Magazine

By David Ramel
04/24/2023

Free and open source Codeium has launched an assault on the front-running, for-pay GitHub Copilot tool in the coding assistant space.

Along with being free of OpenAI hegemony, a key selling point in that assault is that Codeium, while providing similar code-completion capabilities, does not emit code with non-permissive licensing such as GPL (General Public License). Even though the GPL license guarantees end users the four freedoms to run, study, share and modify software, it's described as a non-permissive license.

All that is explained in last Thursday's (April 20) blog post titled "GitHub Copilot Emits GPL. Codeium Does Not."

Basically, Codeium says permissive licenses (for example MIT, BSD and Apache) let people use code for commerce or any other reason, but non-permissive licenses such as GPL prohibit such usage without consent. Codeium, developed by the deep learning specialist company Exafunction, uses the MIT license. Exafunction's GitHub repos include code for using Codeium in Vim and Neovim, the Chrome browser, Emacs and more.

Last week's post discusses the legal ramifications of violating GPL licenses, regardless of intent, which is an area of software licensing that the Codeium team said has become muddled in the wake of startling new advancements in generative AI and large language models (LLMs). Those LLMs are the "secret sauce" behind the machine learning tech that powers generative AI constructs like ChatGPT and GPT-4 from Microsoft partner OpenAI, the clear leader in advanced AI.

The post states:

Clearly a developer copy-pasting GPL code without consent is bad and grounds for legal action, but what about a generative code model? Is it wrong for such a model to "learn" from this data? The argument to do so is clear -- GPL-licensed OSS is some of the highest quality code that is publicly available, and just like any machine learning model, better quality training data almost always means better quality LLMs. The argument to not do so is perhaps less clear -- researchers say LLMs rarely spit out training data verbatim unless interacted with adversarially, but theoretically, they could. In which case, who is responsible for this clear legal infringement? The developer of the LLM or the user who unknowingly ends up accepting the LLM's suggestions and committing the code to their team's codebase? Honestly, there is no clear answer, but that's the scary part -- no user or company should be subject to legal action, even potentially, just for using an AI code assistant tool.

While GitHub Copilot is trained on GPL-licensed code, GitHub uses filters to screen out potentially problematic non-permissive code. Codeium claims those filters don't work, noting that "we at Codeium have removed GPL licensed code from our training data, guaranteeing peace of mind to our users."
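As a rough illustration of what removing GPL-licensed code from a training set could involve, the Python sketch below filters repository records by license identifier. The `RepoRecord` layout, the `PERMISSIVE_LICENSES` allowlist and the `keep_permissive` function are hypothetical names chosen for this example; the article does not describe Codeium's actual pipeline.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical allowlist of SPDX license identifiers treated as permissive.
PERMISSIVE_LICENSES = {"MIT", "BSD-2-Clause", "BSD-3-Clause", "Apache-2.0"}


@dataclass(frozen=True)
class RepoRecord:
    """Illustrative record for one repository in a training corpus."""
    name: str
    spdx_license: Optional[str]  # None when no license metadata was found


def keep_permissive(records: List[RepoRecord]) -> List[RepoRecord]:
    """Return only the records whose license is on the allowlist.

    Records with a missing or unlisted license are dropped, which is the
    conservative choice assumed for this sketch.
    """
    return [r for r in records if r.spdx_license in PERMISSIVE_LICENSES]


if __name__ == "__main__":
    corpus = [
        RepoRecord("example/permissive-lib", "MIT"),
        RepoRecord("example/copyleft-tool", "GPL-3.0-only"),
        RepoRecord("example/unlabeled", None),
    ]
    for record in keep_permissive(corpus):
        print(record.name)
```

An allowlist is used here rather than a blocklist so that code with unknown or missing license metadata is excluded by default.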

With the licensing angle fleshed out, a comparison of GitHub Copilot and Codeium turns to features and functionality. Here, Codeium rounded up salient points for its comparison and boiled them down into the graphic below.

Figure: GitHub Copilot vs. Codeium *(source: Codeium).*

As can be seen, besides being free, Codeium reportedly works in more IDEs and with more programming languages, while sporting similar code-generation functionality. The relative quality of that generated code, though, is measured subjectively. A comparison conducted by Codeium awarded both a 9/10 score, saying, "it appears that Github Copilot and Codeium had roughly similar consistency in addressing the goals across the tasks, with similar rates of manual intervention necessary."

That latter observation comes in a comparison among Codeium and three similar tools: GitHub Copilot, Replit and Tabnine. Unsurprisingly, Codeium comes out on top, with the team providing the following graphic:

Figure: Computed Cumulative Comparison Scores *(source: Codeium).*

In addition to code completion and related capabilities to explain, refactor and translate code, Codeium comes with search and chat functionality. Chat is the newest capability and is only available on the Codeium extension for Visual Studio Code.

Figure: VS Code Extension *(source: Codeium).*

With more than 66,000 installs, the tool promises:

- Unlimited single and multi-line code completions forever
- IDE-integrated chat: no need to leave VSCode to ChatGPT, and use convenient suggestions such as Refactor and Explain
- Support for 70+ programming languages: Javascript, Python, Typescript, PHP, Go, Java, C, C++, Rust, Ruby, and more.
- Support through our Discord Community

Codeium also comes in an enterprise offering, which is fully self-hosted and comes with additional features including local personalization on private repositories, with the team noting that enterprises often have higher requirements on data handling and security than do individual developers. However, the enterprise offering only includes code completion, not the newer search and chat functionality. The enterprise offering is priced per-seat, with exact pricing dependent on the size of an organization and any custom needs.

"We are committed to keep improving our data sanitization and filtering processes as well as maintaining a fresh training dataset (with up-to-date license metadata)," Codeium said last week. "We're also going to be taking this approach to remove potentially insecure code practices from our training data. This is possible because we are one of the very few companies that are building AI applications in a fully integrated manner independent of OpenAI -- the training, the models, the serving, the integrations, and the product."

About the Author

David Ramel is an editor and writer at Converge 360.
