Another Report Weighs In on GitHub Copilot Dev Productivity: 👎
Is the original "AI pair programmer" helping developers be more productive? Several reports seeking to measure the impact of GitHub Copilot -- including one from GitHub itself -- have weighed in on the subject, typically answering the question with "yes." A new one says "no."
In fact, quite the contrary.
"Developers with Copilot access saw a significantly higher bug rate while their issue throughput remained consistent," is one of the takeaways from the report titled "Can Generative AI Improve Developer Productivity?" from Uplevel Data Labs, which culled data from Uplevel's engineering intelligence platform.
"This suggests that Copilot may negatively impact code quality," the company said. "Engineering leaders may wish to dig deeper to find the PRs with bugs and put guardrails in place for the responsible use of generative AI."
That takeaway, or key insight if you will, was one of three listed by the company, which said it conducted the research to answer three questions:
- Does Copilot access help developers ship code faster?
- Does it help developers ship higher quality code?
- Does it mitigate the impact of developer burnout?
The other two takeaways as presented by the company include:
- Copilot access provided no significant change in efficiency metrics: "When comparing PR cycle time, throughput, and complexity along with PRs with tests, Copilot neither helped nor hurt the developers in the sample, and also did not increase coding speed. While some of these metrics were statistically significant, the actual change was inconsequential to engineering outcomes, e.g. cycle time decreased by 1.7 minutes."
- Copilot access did not help in mitigating the risk of burnout: "Uplevel's 'Sustained Always On' metric (extended working time outside of standard hours and a leading indicator of burnout), decreased for both groups. But it decreased by 17% for those with Copilot access and by almost 28% for those without."
That data comes from a sample of nearly 800 developers in the company's customer base, with its Data Labs analyzing the difference in how teams with and without Copilot access performed according to objective metrics including cycle time, PR throughput, bug rate and extended working hours (the latter accounting for the "Always On" metric reference).
Uplevel had advice for engineering leaders to prepare for further advancements in the tool:
- Set specific goals: What specifically are the outcomes that you are wanting to achieve by including Copilot in your team's workflow?
- Offer training to your teams: Onboarding can be a good way to lay out where Copilot should and shouldn't be used and what safeguards are in place as an organization.
- Continue to experiment with generative AI: Seek out the specific use cases in which Copilot can be helpful and the prompts that yield the best results. Share these findings across your organization so that success can be replicated.
- Monitor the engineering effectiveness metrics that Copilot might impact: Start A/B testing on your own to gain objective, quantitative insight into whether AI is actually improving developer productivity and/or helping you reach your operational goals.
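For teams that want to try the A/B testing suggested in that last item, here is a minimal sketch of one way to compare a metric between two groups. It uses a permutation test on the difference in mean PR cycle time. The function name, the sample numbers and the choice of test are all assumptions made for illustration; this is not Uplevel's methodology, and the data is made up.

```python
import random


def permutation_test(group_a, group_b, n_resamples=5000, seed=0):
    """Two-sided permutation test on the difference in means.

    Returns (observed_difference, p_value). Hypothetical helper for
    illustration only.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)

    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_resamples):
        # Randomly reassign the group labels and recompute the difference.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Made-up PR cycle times in hours (NOT data from any report cited here).
with_copilot = [20.5, 18.0, 25.1, 22.3, 19.8, 24.0, 21.7, 23.4]
without_copilot = [21.0, 19.2, 24.8, 22.9, 20.1, 23.5, 22.0, 23.9]

diff, p = permutation_test(with_copilot, without_copilot)
print(f"Difference in mean cycle time: {diff:+.2f} hours (p = {p:.3f})")
```

As the Uplevel report points out, a statistically significant result can still be too small to matter in practice, so the size of the difference deserves as much attention as the p-value.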
As noted, other studies reached different conclusions. The aforementioned GitHub report, for example, was covered in the article, "A Year In, GitHub Measures AI-Based Copilot's Productivity Boost."
Rather than using a survey, GitHub recruited a group of developers, divided them and timed how long devs in each segment took to write an HTTP server in JavaScript, finding:
- The group that used GitHub Copilot had a higher rate of completing the task (78%, compared to 70% in the group without Copilot).
- The striking difference was that developers who used GitHub Copilot completed the task significantly faster -- 55% faster than the developers who didn't use GitHub Copilot. Specifically, the developers using GitHub Copilot took on average 1 hour and 11 minutes to complete the task, while the developers who didn't use GitHub Copilot took on average 2 hours and 41 minutes. These results are statistically significant (P=.0017) and the 95% confidence interval for the percentage speed gain is [21%, 89%].
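As a quick sanity check on those figures, the two reported averages can be converted into a relative time reduction. The sketch below is only arithmetic on the numbers quoted above, with illustrative variable names; it does not reproduce GitHub's statistical analysis or its confidence interval.

```python
# Average completion times reported in the GitHub study, in minutes.
with_copilot = 1 * 60 + 11      # 1 hour and 11 minutes
without_copilot = 2 * 60 + 41   # 2 hours and 41 minutes

# Relative reduction in completion time for the Copilot group.
time_saved = without_copilot - with_copilot
reduction = time_saved / without_copilot

print(f"Time saved: {time_saved} minutes")
print(f"Relative reduction: {reduction:.1%}")
```

The result, roughly 56%, is in line with the reported "55% faster" figure.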
As the graphic below shows, the GitHub study sought to measure benefits beyond just speed and productivity.
The conclusion? "GitHub Copilot supports faster completion times, conserves developers' mental energy, helps them focus on more satisfying work, and ultimately find more fun in the coding they do."
As for other research, the company Harness in September 2023 published a report titled "The Impact of Github Copilot on Developer Productivity: A Case Study," which listed these key takeaways:
- GitHub Copilot has the potential to significantly improve developer productivity.
- The tool's ability to generate code snippets and suggest contextually relevant code can help developers to save time and focus on more creative and strategic tasks.
- GitHub Copilot's ability to provide coding insights can help developers to improve the quality and efficiency of their code.
- The study findings underscore the importance of AI-powered tools in the future of software development.
A couple of data points summarizing "The GitHub Copilot Impact" were:
- Found a significant 10.6% increase in PRs with Copilot integration.
- Demonstrated a 3.5 hours reduction in cycle time, enhancing development efficiency.
A more recent study was detailed in a February post from Communications of the ACM, titled "Measuring GitHub Copilot's Impact on Productivity," wherein "A case study asks Copilot users about the tool's impact on their productivity, and seeks to find their perceptions mirrored in user data."
Key insights there include:
- AI pair-programming tools such as GitHub Copilot have a big impact on developer productivity. This holds for developers of all skill levels, with junior developers seeing the largest gains.
- The reported benefits of receiving AI suggestions while coding span the full range of typically investigated aspects of productivity, such as task time, product quality, cognitive load, enjoyment, and learning.
- Perceived productivity gains are reflected in objective measurements of developer activity.
- While suggestion correctness is important, the driving factor for these improvements appears to be not correctness as such, but whether the suggestions are useful as a starting point for further development.
Seeing as GitHub Copilot only became widely available to individual developers in June 2022, it's still relatively early days for such research, so stay tuned for more updates.
About the Author
David Ramel is an editor and writer at Converge 360.