News

GitHub Research Claims Copilot Code Quality Gains in Addition to Productivity

GitHub says new research proves its Copilot AI tool can improve code quality, following earlier reports that said it boosts developer productivity.

"Our findings overall show that code authored with GitHub Copilot has increased functionality and improved readability, is of better quality, and receives higher approval rates," said Microsoft-owned GitHub in a blog post this week.

It's the latest of several research-based reports from GitHub that address the effectiveness of the original "AI pair programmer" that unleashed GenAI on software development, fundamentally changing the space and spearheading a wave of Copilots throughout Microsoft's products and services.

While GitHub's reports have been positive, a few others haven't. For example, a recent study from Uplevel Data Labs said, "Developers with Copilot access saw a significantly higher bug rate while their issue throughput remained consistent."

And earlier this year a "Coding on Copilot" whitepaper from GitClear said, "We find disconcerting trends for maintainability. Code churn -- the percentage of lines that are reverted or updated less than two weeks after being authored -- is projected to double in 2024 compared to its 2021, pre-AI baseline. We further find that the percentage of 'added code' and 'copy/pasted code' is increasing in proportion to 'updated,' 'deleted,' and 'moved' code. In this regard, AI-generated code resembles an itinerant contributor, prone to violate the DRY-ness [don't repeat yourself] of the repos visited."
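GitClear's churn metric, as defined in the quote above, can be illustrated with a minimal sketch. The data model here is hypothetical (GitClear derives these events from actual git history); the function simply counts lines that were changed again within two weeks of being authored:

```python
from datetime import datetime, timedelta

TWO_WEEKS = timedelta(days=14)

def churn_rate(line_events):
    """line_events: list of (authored_at, modified_at) pairs, one per line,
    where modified_at is None if the line was never changed again.
    Returns the fraction of lines reverted or updated within two weeks."""
    if not line_events:
        return 0.0
    churned = sum(
        1 for authored_at, modified_at in line_events
        if modified_at is not None and modified_at - authored_at < TWO_WEEKS
    )
    return churned / len(line_events)

events = [
    (datetime(2024, 1, 1), datetime(2024, 1, 5)),   # updated after 4 days  -> churn
    (datetime(2024, 1, 1), datetime(2024, 3, 1)),   # updated after 2 months -> kept
    (datetime(2024, 1, 1), None),                   # never touched again    -> kept
    (datetime(2024, 1, 2), datetime(2024, 1, 10)),  # updated after 8 days  -> churn
]
print(churn_rate(events))  # 0.5
```

Under this definition, a doubling of churn means twice the share of freshly written lines being rewritten or reverted almost immediately, which is why GitClear reads it as a maintainability warning sign.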

GitHub has an answer for those contrarian reports: "We hypothesize that other studies might not have found an improvement in code quality with GitHub Copilot, not because of the tool itself, but because developers may have lacked the opportunity or incentive to focus on quality." The company characterized its new research as the first controlled study to examine GitHub Copilot's impact on code quality.

Of course, the subject of GitHub Copilot's effectiveness is of intense interest, and reports vary in their conclusions, with other brand-new research published on Springer Nature Link saying, "The findings of the study suggest that GitHub Copilot can be a valuable asset to development processes, resulting in enhancements in satisfaction, performance, efficiency, and monetization dimensions. However, areas for improvement include communication features, unit testing, and addressing potential security concerns. This study demonstrates Copilot's potential as an effective tool for enhancing software development productivity and quality, providing valuable insights for future research and industry adoption."

And in August, GitHub topped research firm Gartner's inaugural Magic Quadrant report on vendors of AI code assistants, leading in both completeness of vision and ability to execute.

[Figure: Magic Quadrant for AI Code Assistants (source: Gartner).]

"GitHub has an extensive developer community and GitHub Copilot has high user engagement, which enables it to gather feedback quickly and continuously innovate," Gartner said. "GitHub's high customer retention rates and annual contract value retention underscore its ability to maintain and grow its customer base."

As for this week's report, GitHub listed the key findings as:

  • Increased functionality: developers with GitHub Copilot access had a 56% greater likelihood of passing all 10 unit tests in the study, indicating that GitHub Copilot helps developers write more functional code by a wide margin.
  • Improved readability: in blind reviews, code written with GitHub Copilot had significantly fewer code readability errors, allowing developers to write 13.6% more lines of code, on average, without encountering readability problems.
  • Overall better quality code: readability improved by 3.62%, reliability by 2.94%, maintainability by 2.47%, and conciseness by 4.16%. All numbers were statistically significant. These quality improvements were consistent with those found in the 2024 DORA Report.
  • Higher approval rates: developers were 5% more likely to approve code written with GitHub Copilot, meaning that such code is ready to be merged sooner, speeding up the time to fix bugs or deploy new features.

That DORA report (2024 Accelerate State of DevOps) only mentions Copilot once in passing, but does address AI-assisted coding in general, with this chart addressing developers' trust in the quality of AI-generated code:

[Figure: Trust in Quality of AI-Generated Code (source: DORA).]

Here's GitHub's bottom line on the new research:

So, what do these findings say about how GitHub Copilot improves code quality? While the number of commits and lines of code changed was significantly higher for the GitHub Copilot group, the average commit size was slightly smaller. This suggests that GitHub Copilot enabled developers to iterate on the code to improve its quality. Our hypothesis is that because developers spent less time making their code functional, they were able to focus more on refining its quality. This aligns with our previous findings that developers felt more confident using GitHub Copilot. It also demonstrates that with the greater confidence GitHub Copilot gave them, they were likely empowered to iterate without the fear of causing errors in the code.

That "GitHub Copilot group" refers to the developers, drawn from a pool of 243 participants with at least five years of Python experience, who were randomly assigned to use the tool in the first phase of the study. "In the second phase, developers were randomly assigned submissions to review using a provided rubric. They were blind to whether the code was authored with GitHub Copilot," GitHub said.

About the Author

David Ramel is an editor and writer at Converge 360.

