News

New Microsoft Sandbox Uses Natural Language LLMs for SQL Queries

Microsoft has found a new use for the natural language processing capabilities of large language models (LLMs): generating SQL queries.

The company has set up a sandbox where developers and data pros can use its Semantic Kernel SDK to test how well LLMs -- GPT-4 in this case -- generate SQL queries from natural language expressions.

Called NL2SQL, the sandbox project is housed in the Natural Language to SQL Console GitHub repo.

The company emphasized the experimental nature of the project, noting there are alternatives for such functionality (like WikiSQL and Spider) and cautioning that it won't necessarily lead to a viable production product.

"While other approaches exist in this space, this sample serves to showcase the capability (and limitations) of LLM using Semantic Kernel for dotnet," the GitHub repo says. "Whether or not this approach provides an adequate or cost-effective solution for any particular use-case depends on its specific context and associated expectations."

An Aug. 4 announcement expounded on that notion, stating that the project's focus is to zoom in on the natural abilities and limitations of GPT-4 -- an advanced LLM from partner OpenAI -- in producing relevant SQL queries. It also promised to share the approach, learnings and some best practices.

A Query (with Typo) Translated to SQL and Execution Results (source: Microsoft).

In fact, some standard best practices are already being shared:

  • Least privilege -- Restrict to read-only access on relevant tables or views and utilize column and row-level security as appropriate.
  • Credential management -- Do not expose secrets and connection strings.
  • Injection prevention -- Never directly inject user-input into SQL statements.
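To make the least-privilege and injection-prevention points concrete, here is a minimal Python sketch. It is not taken from Microsoft's sample, which targets .NET; it assumes a SQLite database, and the helper names (open_read_only, is_single_select, run_generated_query, demo) are hypothetical. The check on generated SQL is deliberately coarse and is no substitute for database-level permissions.

```python
import os
import sqlite3
import tempfile


def open_read_only(path: str) -> sqlite3.Connection:
    """Open a SQLite database file in read-only mode (least privilege)."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def is_single_select(sql: str) -> bool:
    """Coarse check: accept only one statement that starts with SELECT.

    A semicolon inside a string literal is also rejected, which is
    overly strict but errs on the safe side.
    """
    body = sql.strip().rstrip(";").strip()
    return body.lower().startswith("select") and ";" not in body


def run_generated_query(conn: sqlite3.Connection, sql: str, params=()) -> list:
    """Run model-generated SQL only if it passes the coarse check."""
    if not is_single_select(sql):
        raise ValueError("only a single SELECT statement is allowed")
    # Values are bound as parameters, never concatenated into the SQL text.
    return conn.execute(sql, params).fetchall()


def demo() -> list:
    """Create a throwaway database, then query it through the guarded path."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    setup = sqlite3.connect(path)
    setup.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    setup.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
    setup.commit()
    setup.close()

    conn = open_read_only(path)
    try:
        return run_generated_query(
            conn, "SELECT name FROM users WHERE id = ?", (2,)
        )
    finally:
        conn.close()
```

In this sketch the read-only connection is the real safeguard; the statement check only catches obvious mistakes before they reach the database.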

"Avoid inadvertent disclosure by capturing/describing database schema at design-time to allow for review/refinement," Microsoft said. "This approach aligns with least privilege as describing schema requires higher elevation than those needed to query data."
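One way to picture design-time schema capture is a small script that is run once with elevated access and produces a description that a person reviews before it is ever shown to a model. The Python sketch below assumes SQLite and uses hypothetical helpers (describe_schema, redact); it illustrates the idea and is not the mechanism used in Microsoft's sample.

```python
import json
import sqlite3


def describe_schema(conn: sqlite3.Connection) -> dict:
    """Return {table: [{"name": ..., "type": ...}, ...]} for user tables."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    schema = {}
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        quoted = '"' + table.replace('"', '""') + '"'
        cols = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        schema[table] = [{"name": c[1], "type": c[2]} for c in cols]
    return schema


def redact(schema: dict, hidden: dict) -> dict:
    """Drop columns a reviewer chose not to expose to the model."""
    return {
        table: [c for c in cols if c["name"] not in hidden.get(table, set())]
        for table, cols in schema.items()
    }


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT, ssn TEXT)")
    reviewed = redact(describe_schema(conn), {"users": {"ssn"}})
    # The reviewed description is what would be stored and used at run time.
    print(json.dumps(reviewed, indent=2))
```

Because the description is stored ahead of time, the account that later runs queries would not need permission to read database metadata.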

Yet more advice: "Restrict access only to the desired data. Do not rely on schema definition or query criteria to control access."

Furthermore, the company's approach follows some basic principles:

  • Synchronizing an existing database to vector-storage is a non-starter as there is no desire to introduce consistency considerations or any type of data-movement.
  • Injecting data into the prompt-frame is also a non-starter (due to the token limit).
  • Prompts cannot be hardcoded to a specific database schema or platform.
  • Must discriminate across multiple schemas (in order to support multiple data-sources or to decompose a large schema).
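The last three principles can be sketched together in a few lines of Python: the prompt takes the schema description and SQL dialect as parameters, no table rows are included, and one of several schemas is chosen per question. This article does not describe how the sample itself picks a schema, so the keyword-overlap scoring here is only a stand-in, and the schema texts and function names are hypothetical.

```python
import re

# Hypothetical schema descriptions, as might be captured at design time.
SCHEMAS = {
    "sales": (
        "TABLE orders (id, customer_id, total, placed_on)\n"
        "TABLE customers (id, name, city)"
    ),
    "hr": (
        "TABLE employees (id, name, salary, dept_id)\n"
        "TABLE departments (id, title)"
    ),
}


def tokens(text: str) -> set:
    """Lowercase alphabetic words; underscores split identifiers apart."""
    return set(re.findall(r"[a-z]+", text.lower()))


def pick_schema(question: str, schemas: dict) -> str:
    """Choose the schema whose description shares the most words with the question."""
    q = tokens(question)
    return max(schemas, key=lambda name: len(q & tokens(schemas[name])))


def build_prompt(question: str, schemas: dict, dialect: str) -> str:
    """Fill a schema-agnostic template; only structure, never data, is included."""
    name = pick_schema(question, schemas)
    return (
        f"You write {dialect} queries.\n"
        f"Schema ({name}):\n{schemas[name]}\n"
        f"Question: {question}\n"
        "Answer with a single SELECT statement."
    )


if __name__ == "__main__":
    print(build_prompt("What is the total of all orders?", SCHEMAS, "SQLite"))
```

Because the template names no tables itself, the same function works for any schema or platform passed in; a production system would likely need a more robust way to match questions to schemas.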

Microsoft said it had heard from many in the community who want to use the Semantic Kernel SDK to query their relational databases using natural language expressions.

The open source SDK helps developers easily combine AI services with conventional programming languages in their applications.

The sandbox project's GitHub repo includes a ready-made Visual Studio solution that developers can use to put the tech through its paces.

The new experimental sandbox isn't Microsoft's first use of AI for working with SQL Server. The company earlier announced an AI-powered "Copilot" for the latest release of SQL Server Developer Tools (SSDT) in Visual Studio, as detailed in the June Visual Studio Magazine article, "Even SQL Server Developer Tools Gets an AI Copilot."

About the Author

David Ramel is an editor and writer for Converge360.

