Practical .NET

Converting JSON Objects to Relational Tables: Flattening the Object

Here's how you can use SQL Server's OpenJson function to dismantle JSON structures into relational tables targeting either transactional systems or data warehouses.

JSON is great for moving data around but not so hot when you need to convert it into data stored in a relational database -- it's a problem complex enough that you might consider using a document-oriented or other NoSQL database rather than using a relational database. If, however, you want to get your data into some sort of SQL database (either to support your transactional systems or as the first step to get your data into a data warehouse) and you're not interested in a full-fledged Extract, Transform and Load solution, SQL Server's OpenJson function might be all you need.

I looked at OpenJson in an earlier post, but I'm going to return to the topic with a more practical focus: to address a specific real-world case with a typical JSON object as my example. Specifically, this JSON sample with both simple and complex properties (and a complex property with repeating values):

Declare @json NVarChar(MAX) = N'
{
 {
  "customers" : 
  [
   {"id" : "A123",
    "createdOn" : "2021-05-19T15:10:48.852073Z",
    "salesToDate" : {
          "amount" : "2040.15",
          "currency" : "USD"
    },
    "addresses" : [
        {
            "type" : "Billing",
            "street" : "500 Ridout St."
        },
        {
            "type" : "Shipping",
            "street" : "300 Strathcona"
        }
    ]       
 
  },
  …more customer objects…
]}';

One solution to moving this structure to relational tables is just to flatten the JSON into a single table with columns like id and createdOn. In this scenario, the properties in salesToDate could, for example, be converted into columns called salesToDateAmount and salesToDateCurrency. That may make sense if you're targeting a data warehouse where you may not want to take your data to third or fourth normal form -- simply eliminating repeated fields might be sufficient for reporting purposes.

On the other hand, you might prefer to move salesToDate into a separate table. The collection of addresses probably gives you less leeway -- those addresses probably do belong in a related Address table to support the variety of addresses a customer might have (not only "shipping" and "billing" but, potentially, "contact" and "alternate shipping").

In this post, I'm going to look at flattening salesToDate into a row with simple properties in a Customers table. In an upcoming post, I'll look at setting up both salesToDate and addresses as separate tables from that Customers table .

Creating a Flattened Table
My first step is to strip the JSON collection of customer objects out of the customers property it's nested inside of and put it in a string variable to use in subsequent steps. OpenJson prefers JSON to be held in NVarChar(Max) variables so that's how I declare my string variable:

Declare @CustomerHold nvarchar(MAX);

By default, OpenJson will refer to the value of the array inside the customers property as "value" so this code declares a string variable called CustomerHold and then loads it with that property from my JSON object:

Select @CustomerHold = value from OpenJson(@json);

Next I need to deal with the simple properties on the JSON document by loading them into a Table variable. My first step, therefore, is to define a Table variable to hold those properties:

Declare @Customers Table (id nvarchar(100), createdOn Date);

In a production system, rather than use a Table variable, I could truncate an existing table on disk and load the rows from my JSON object into it; alternatively, I could skip truncating the existing table and just append the rows from my JSON object. In my examples, I'll use Insert Into which supports both scenarios.

The OpenJson's With clause allows me to select property names within my JSON document and return them as if they were columns in a table just by giving my columns names that match properties in my JSON object. Also in the With clause, I define the data types for those columns which, among other benefits, allows me to convert the string-value date in the object into a genuine SQL Server date.

This SQL statement extracts the id and createdOn properties from my JSON object (and creates a row with columns called id and createdOn) for every customer in the JSON array:

Insert into @Customers
   Select * From
       OpenJson(@CustomerHold) With (id nvarchar(100), 
                                     createdOn Date);

Another nice feature of the With clause is that if a property is missing in a particular object (something that's perfectly OK in JSON), OpenJson just returns NULL for that property rather than giving up completely.

Flattening Complex Properties
Flattening complex properties into my main row is also pretty straightforward. First, I have to add columns to hold the data to my Table variable:

Declare @Customers Table (id nvarchar(100),
                          createdOn Date, 
                          salesToDateAmount decimal(7,2), 
                          salesToDateCurrency nchar(3));

The next step is to extend my With clause by specifying the new properties I want to extract from the salesToDate. So far, I've been counting on OpenJson's ability to match the column names in my table to the properties in my JSON object. With these nested properties, however, I'll use a different approach.

Where I can't use column names to match property names (or where I want my columns to have different names than my property names), I can use JSON path expressions in OpenJson's With clause to tie a column to a JSON property. These path expressions begin with a $ (indicating the root of the object) followed by the names of the nested properties, separated by periods. To specify, for example, the amount property nested inside the salesToDate property, I use '$.salesToDate.amount' (the path must be enclosed in single quotes).

This statement grabs the amount and currency properties inside the salesToDate property so that my Select statement can return them in columns called salesToDateAmount and salesToDateCurrency:

Insert into @Customers
   Select * From
       OpenJson(@CustomerHold) With (id nvarchar(100), 
                                     createdOn Date, 
                                     salesToDateAmount decimal(7,2) '$.salesToDate.amount', 
                                     salesToDateCurrency nchar(3) '$.salesToDate.currency' );

That's the easy stuff out of the way. With my next post, I'll create related tables both for the salesToDate property and the array in the addresses property.

About the Author

Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter tweets about his VSM columns with the hashtag #vogelarticles. His blog posts on user experience design can be found at http://blog.learningtree.com/tag/ui/.

comments powered by Disqus

Featured

  • VS Code Copilot Previews New GPT-4o AI Code Completion Model

    The 4o upgrade includes additional training on more than 275,000 high-quality public repositories in over 30 popular programming languages, said Microsoft-owned GitHub, which created the original "AI pair programmer" years ago.

  • Microsoft's Rust Embrace Continues with Azure SDK Beta

    "Rust's strong type system and ownership model help prevent common programming errors such as null pointer dereferencing and buffer overflows, leading to more secure and stable code."

  • Xcode IDE from Microsoft Archrival Apple Gets Copilot AI

    Just after expanding the reach of its Copilot AI coding assistant to the open-source Eclipse IDE, Microsoft showcased how it's going even further, providing details about a preview version for the Xcode IDE from archrival Apple.

  • Introduction to .NET Aspire

    Two Microsoft experts will present on the cloud-native application stack designed to simplify the development of distributed systems in .NET at the Visual Studio Live! developer conference coming to Las Vegas next month.

  • Microsoft Previews Copilot AI for Open-Source Eclipse IDE

    Catering to Java jockeys, Microsoft is yet again expanding the sprawling reach of its Copilot-branded AI assistants, previewing a coding tool for the open-source Eclipse IDE.

Subscribe on YouTube

Upcoming Training Events