New VS Code Tool: StarCoderEx (AI Code Generator)

StarCoder, a new open-access large language model (LLM) for code generation from ServiceNow and Hugging Face, is now available for Visual Studio Code, positioned as an alternative to GitHub Copilot.

StarCoder is a transformer-based LLM capable of generating code from natural language descriptions, a perfect example of the "generative AI" craze popularized by ChatGPT, the sentient-sounding, AI-supercharged chatbot from Microsoft partner OpenAI, whose Codex model powered early versions of GitHub Copilot.

Available as a VS Code extension called StarCoderEx, it can be used to generate code from natural language descriptions in the editor or in the command palette.

StarCoderEx (source: Lisoveliy).

It stems from an open scientific collaboration between Hugging Face (machine learning specialist) and ServiceNow (digital workflow company) called BigCode.

While not strictly open source, it's parked in a GitHub repo, which describes it thusly:

"StarCoder is a language model (LM) trained on source code and natural language text. Its training data incorporates more than 80 different programming languages as well as text extracted from GitHub issues and commits and from notebooks."

"The StarCoder model is designed to level the playing field so developers from organizations of all sizes can harness the power of generative AI and maximize the business impact of automation with the proper governance, safety, and compliance protocols," said a May 4 news release from ServiceNow. "This new LLM marks the next major milestone in the BigCode Project, an ambitious initiative to develop state-of-the-art AI systems for code in an open and responsible manner with the support of the open-scientific AI research community."

On the same day, Hugging Face published a blog post about the project, which involves both the StarCoder and StarCoderBase LLMs. The company trained StarCoderBase, a roughly 15 billion-parameter model, on 1 trillion tokens, then fine-tuned it on 35 billion Python tokens, which resulted in a new model called StarCoder.

"We found that StarCoderBase outperforms existing open Code LLMs on popular programming benchmarks and matches or surpasses closed models such as code-cushman-001 from OpenAI (the original Codex model that powered early versions of GitHub Copilot). With a context length of over 8,000 tokens, the StarCoder models can process more input than any other open LLM, enabling a wide range of interesting applications. For example, by prompting the StarCoder models with a series of dialogues, we enabled them to act as a technical assistant. In addition, the models can be used to autocomplete code, make modifications to code via instructions, and explain a code snippet in natural language. We take several important steps towards a safe open model release, including an improved PII redaction pipeline, a novel attribution tracing tool, and make StarCoder publicly available under an improved version of the OpenRAIL license. The updated license simplifies the process for companies to integrate the model into their products. We believe that with its strong performance, the StarCoder models will serve as a solid foundation for the community to use and adapt it to their use-cases and products."

Hugging Face set up a StarCoder - Code Completion Playground that lets users try out the model by entering a natural language description and seeing the generated code, along with a HuggingChat site that lets users chat with a prompted version of the model, for demonstration purposes only.

When asked about StarCoder, the HuggingChat site responded with: "Starcoder is a natural language processing tool built specifically for developers. Its core capabilities include generating code snippets, providing documentation links, suggesting variable names etc., while keeping track of user interactions over time."

Tech Assistant Chat Examples (source: Hugging Face).

The Hugging Face team also conducted an experiment to see if StarCoder could act as a tech assistant in addition to generating code. They built a Tech Assistant Prompt that enabled the model to act as a tech assistant and answer programming-related requests, as shown in the graphic above.

"The model was trained on GitHub code," Hugging Face said. "As such it is not an instruction model and commands like 'Write a function that computes the square root.' do not work well. However, by using the Tech Assistant prompt you can turn it into a capable technical assistant."
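To illustrate the general idea of turning a completion model into an assistant by prompting it with example dialogues, here is a minimal Python sketch. It is not the actual Tech Assistant Prompt: the preamble text, the "Human:"/"Assistant:" labels, the separator, and the names `Turn` and `build_prompt` are all assumptions made for this example. The sketch only assembles the prompt text and does not call any model.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One example exchange (hypothetical structure for this sketch)."""
    human: str
    assistant: str


# Hypothetical preamble and separator; the real Tech Assistant Prompt differs.
PREAMBLE = (
    "Below is a series of dialogues between a human and a technical assistant. "
    "The assistant is helpful and answers programming questions precisely."
)
SEPARATOR = "-----"


def build_prompt(examples: list[Turn], request: str) -> str:
    """Assemble a few-shot dialogue prompt that ends where the model should continue."""
    parts = [PREAMBLE, ""]
    for turn in examples:
        parts += [
            f"Human: {turn.human}",
            "",
            f"Assistant: {turn.assistant}",
            "",
            SEPARATOR,
            "",
        ]
    # Leave the final assistant turn open so a code-completion model fills it in.
    parts += [f"Human: {request}", "", "Assistant:"]
    return "\n".join(parts)


if __name__ == "__main__":
    examples = [
        Turn(
            human="How do I reverse a list in Python?",
            assistant="Use my_list[::-1] for a reversed copy, "
                      "or my_list.reverse() to reverse in place.",
        ),
    ]
    print(build_prompt(examples, "Write a function that computes the square root."))
```

Because the model was trained to continue text rather than follow instructions, the prompt ends on an open "Assistant:" line, so the most natural continuation is an answer in the same style as the examples.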

The model is licensed under the BigCode OpenRAIL-M v1 license agreement.

As of this writing, the VS Code extension -- with the tagline: "Extension for using alternative GitHub Copilot (StarCoder API) in VSCode" -- has been downloaded 1,890 times since its debut last Friday, May 5. It has earned an average 3.0 rating (scale 0-5) from four reviewers.

About the Author

David Ramel is an editor and writer at Converge 360.
