SQL Server 2019 Preview Bakes In Big Data

Microsoft introduced a preview of the latest edition of its flagship RDBMS, SQL Server 2019, highlighting new Big Data capabilities.

The company said v2019 creates a unified data platform by packaging Apache Spark and the Hadoop Distributed File System (HDFS) with the SQL Server database engine, helping data developers seamlessly ingest, store and analyze vast amounts of data.

This integration is crucial to evolving the product in the age of Big Data, Microsoft said, because a single instance of SQL Server was never designed or built to handle analytics on the scale of petabytes or exabytes that are common in Big Data analytics implementations.

Also, Microsoft said while announcing the preview at its Ignite conference, this new Big Data integration takes SQL Server even further beyond its roots as a traditional database. Asad Khan, principal PM manager, SQL Server, expanded on that and other details in a blog post. "And as with every release, SQL Server 2019 continues to push the boundaries of security, availability, and performance for every workload with Intelligent Query Processing, data compliance tools and support for persistent memory," Khan said. "With SQL Server 2019, you can take on any data project, from traditional SQL Server workloads like OLTP, Data Warehousing and BI, to AI and advanced analytics over Big Data."

SQL Server 2019 (source: Microsoft).

It was the baked-in Spark and HDFS functionality that highlighted the preview announcement, however. Microsoft calls this new integrated architecture a "Big Data cluster," and the company's Travis Wright, principal program manager, SQL Server, provided more information in a blog post yesterday (Sept. 25). "The SQL Server 2019 relational database engine in a Big Data cluster leverages an elastically scalable storage layer that integrates SQL Server and HDFS to scale to petabytes of data storage," Wright said. "The Spark engine that is now part of SQL Server enables data engineers and data scientists to harness the power of open source data preparation and query programming libraries to process and analyze high-volume data in a scalable, distributed, in-memory compute layer."

Wright provided more details on Big Data clusters in an interview with our sister publication, RedmondMag.com, where our resident SQL Server expert, Joey D'Antoni, and Wright discussed the new breakthrough. D'Antoni focused on the clusters providing a scale-out, data virtualization platform built on top of the Kubernetes (K8s) container platform. He noted the clusters amount to a lot of change in the platform in one fell swoop and asked Wright whether he foresees any barriers to adoption, and what Microsoft is doing to mitigate them.

"Probably the most obvious adoption hindrance will be the K8s/container adoption for database workloads," Wright replied. "Companies are getting on the bandwagon, similar to virtualization. [Another hindrance is] container questions. When people see how easy it is to deploy a SQL Server Availability Group into K8s, it makes it a no-brainer."

Big Data Clusters Provide 'A Complete AI Platform' (source: Microsoft).

Also, artificial intelligence (AI) functionality in Big Data clusters was highlighted in the new preview, echoing the emphasis that Microsoft put on AI capabilities across many new products and features announced at the ongoing Ignite conference in Orlando.

"SQL Server 2019 Big Data clusters provide a complete AI platform," Microsoft said. "Data can be easily ingested via Spark Streaming or traditional SQL inserts and stored in HDFS, relational tables, graph, or JSON/XML. Data can be prepared by using either Spark jobs or Transact-SQL (T-SQL) queries and fed into machine learning model training routines in either Spark or the SQL Server master instance using a variety of programming languages, including Java, Python, R, and Scala. The resulting models can then be operationalized in batch scoring jobs in Spark, in T-SQL stored procedures for real-time scoring, or encapsulated in REST API containers hosted in the Big Data cluster."

D'Antoni also covered the many other new features in SQL Server 2019 in a separate wrap-up, where he detailed security, database performance enhancements, availability and more.

"SQL Server 2019 is still in the early preview stage," said D'Antoni, an architect and SQL Server MVP with more than a decade of experience. "There are still many more things to come between now and the time SQL Server 2019 becomes generally available. However, it is clear that Microsoft continues to make big investments in the data platform and is working hard to keep it available and consistent across server and data platforms, expanding the broader audience for data services."

About the Author

David Ramel is an editor and writer for Converge360.
