Q&A
My App Is Fully Concurrent and Async, but What About the Database?
You've carefully crafted a concurrent application taking advantage of all the latest and greatest language features. It scales magnificently and it's the jewel of the cloud. But what about the database? What happens when simultaneous queries are executed against the same data?
Davide Mauri knows the answers to those questions, as the principal product manager working for Microsoft's Azure SQL Database team has earned Microsoft MVP status (Data Platform) for 12 consecutive years.
What's more, he's going to share his hard-won knowledge in a session at the March Visual Studio Live! developer conference in Las Vegas in a presentation titled "My Application is Fully Concurrent and Async. What About the Database?"
"Using Azure SQL Database or SQL Server, we'll see what happens behind the scenes, how it guarantees concurrency and consistency and what options you have to performance and scalability," said Mauri, who serves as the developer's voice inside the product group.
He promises that attendees will learn:
- What locks are, why they are needed and how they work
- How to create applications with databases that can scale as much as needed
- About isolation levels and database internals
We recently caught up with Mauri for a short Q&A to learn more about his upcoming, intermediate-to-advanced-level, 75-minute presentation.
VisualStudioMagazine: What inspired you do present a session on concurrent/async databases?
Mauri: Asynchronous programming is amazing for scalability and once you master it, you really feel like you have a superpower. Unfortunately, it is also quite complex to fully grasp and manage. When you add the database to the mix, everything becomes even more complex as now you have data and consistency to preserve, and concurrency challenges are brought to a much bigger scale. Databases like Azure SQL and SQL Server do a lot under the cover to make sure that such complexity stays manageable.
"Knowing what happens behind the scenes is not only very interesting and geeky but also helps developers to have the "ah-ah!" moment that everyone loves to have from time to time, as once you have that, everything is much clearer and easy to understand."
Davide Mauri, Principal Product Manager, Microsoft
Knowing what happens behind the scenes is not only very interesting and geeky but also helps developers to have the "ah-ah!" moment that everyone loves to have from time to time, as once you have that, everything is much clearer and easy to understand. This session is all about explaining what happens without you even knowing it, learning where and how a developer may want to change the default behavior and what the consequences of such changes are.
Your session focuses on addressing database challenges in concurrent and asynchronous applications. Can you explain the primary challenges that arise when databases interact with such applications?
The biggest challenge is to make sure that whenever a change is made to the data, other concurrent requests can access such a change as soon as possible, but not sooner. From this simple requirement, all sorts of complications arise. Given that concurrency happens both at the database and at the application level, understanding how the two integrate and interact with each other can avoid a lot of headaches and race conditions.
You mention that attendees will learn about locks in databases. Could you give an overview of what locks are, their significance in maintaining database integrity, and how they function in a concurrent environment?
Locks are one of the foundational techniques to regulate access to data. For example, in the most basic scenario, if a process is changing some data, it will get an exclusive lock to make sure only that process can access that data. Until the process is done, other processes will not be able to access the same data. The interesting challenge starts here: what should they be given access to if they can't access the data being changed? Should they just wait or access some other version of the data? Both are valid and both are very interesting and different approaches to solving consistency and concurrency problems.
How can developers ensure that their database-connected applications are scalable? Are there specific techniques or best practices that you will discuss in the session?
The beauty of relational databases is that they support declarative programming, so developers will just have to ask for what they want, and the database will figure out how to give it to them. By defining transaction boundaries and specifying the right transaction isolation level, developers usually can get what they need without breaking a sweat. When that is not possible, as there are always edge cases, they can help the database engine by specifically defining what lock levels to use. Luckily, this is something that is rarely needed, but it is good knowledge to have when you want to get the highest possible concurrency without sacrificing consistency.
In the session, we'll look at transactions, transaction levels, how they are related to locks and versions, and how long a lock is held in place by the query engine.
Can you provide some insights into how Azure SQL Database and SQL Server handle concurrency and consistency? How do these systems compare to other database solutions in terms of handling simultaneous queries?
Azure SQL and SQL Server offer a range of options to handle concurrency and consistency. They use a pessimistic approach based on locks and a more scalable optimistic approach based on versioning. Both approaches have their pros and cons. The engine is constantly improving, and a new mechanism called optimized locking has been recently released. This mechanism introduces advanced concepts like Transaction ID Locking and Lock-After-Qualification. The goal of these efforts is to provide the highest consistency without compromising on scalability and concurrency, which is a complex problem to solve.
Your presentation will cover isolation levels and database internals. Could you briefly describe what isolation levels are and their importance in database design, particularly in concurrent and asynchronous applications?
A developer can tell a database engine what kind of balance they would like to have between consistency and concurrency by setting the correct isolation level. This can be done at the database level or at the transaction level, and once set, the database will take care of using the right amount of locking and versioning to achieve the desired result. Understanding isolation levels is absolutely key. For example, in a concurrent environment, there will always be cases in which two or more processes want to access the same data, and that must be governed, otherwise, it will be chaos, with unpredictable results and no guarantee of reproducibility or correctness. As the name implies, isolation levels define how much two processes, or more correctly for a database, two transactions, are isolated from each other. The more they are isolated, the stronger the consistency guarantees are, but of course, the concurrency capability is likely to be impacted. Understanding and using the right isolation level means having much simpler code to manage on the application side and having a much more performant application without sacrificing correctness.
Note: Those wishing to attend the conference can save hundreds of dollars by registering early, according to the event's pricing page. "Register for VSLive! Las Vegas by the Super Early Bird Deadline (Jan. 16) to save up to $400 and secure your seat for intensive developer training in exciting Las Vegas!" said the organizer of the developer conference, which is presented by the parent company of Visual Studio Magazine.
About the Author
David Ramel is an editor and writer at Converge 360.