Q&A
Indexing Internals for Developers and DBAs
Developers working with SQL Server often treat indexing as a DBA's domain--something happening "under the hood" that doesn't require much attention unless performance takes a hit. But ignoring the internals of indexing can be a costly oversight, leading to sluggish queries, bloated databases, and finger-pointing when apps don't run as expected.
In the upcoming Live! 360 Orlando session "Indexing Internals for Developers & DBAs," longtime data professional Bradley Ball is on a mission to bridge the knowledge gap between developers and DBAs. Drawing from decades of experience--including high-stakes work as Lead DBA for the Office of the President--Ball brings deep insight into how SQL Server indexing really works, and why it matters at every layer of application development.
"I shared my vocabulary; any fix we found was a shared fix. If a developer is armed with a powerful vocabulary, it can disarm even the most difficult DBA."
Bradly Ball, Sr. Azure Engineer for the Azure FastTrack CXP PG, Microsoft
Whether you're optimizing queries, diagnosing performance bottlenecks, or simply trying to build more maintainable systems, understanding indexing internals isn't just "nice to have"--it's essential. In his session, Ball will go beyond surface-level tips and dive into the architecture and mechanics behind indexes: how they're used (or not used), what common design patterns can hurt more than help, and how developers can make informed indexing decisions without relying solely on DBAs.
We caught up with Ball ahead of the conference to talk about how indexing knowledge can empower developers, dispel common myths, and prevent performance disasters before they happen.
VisualStudioMagazine: What inspired you to present a session on this topic?
Ball: In my career I've worn many hats. I started out as a Developer building internal systems for YKK AP America Inc, a Japanese company that build's manufacturing store fronts. Every system we built I was the Analyst gather requirements, Developer building the system, the DBA building the database, and then I would travel to our branch and manufacturing offices to train people on how to use the system. I was fresh out of college and an army of one. I didn't realize that the code I was writing in the database was suboptimal. I didn't fully understand databases. I was a Developer next for the US Army and became an accidental DBA. It was with the Army I got my database certifications. Reading those Certification manuals was where I got my first fundamental understanding of databases. I understood right away that a database was just an application. The developers of that application had built it to work certain ways, and I was using the application without a fundamental understanding of how it worked and how to get the best performance from it.
Why is it important for developers--not just DBAs--to understand indexing internals?
While I eventually transitioned from being a Developer to a DBA, that is not everyone's path. I had witnessed many DBA's and Developers having an adversarial relationship. But the issue was a fundamental misunderstanding, a lack of common vocabulary. The goal was always the same. We want this to perform, we want it to be fast, we want a good user experience. Often it came out as, "why is your database so slow!" or it would be returned as "why doesn't your application work better." As a DBA I didn't have this issue. I would ask how the application was programmed, I would ask what the goal was, I would look at the performance with the Developers I worked with. I shared my vocabulary; any fix we found was a shared fix. If a developer is armed with a powerful vocabulary it can disarm even the most difficult DBA.
What's one myth about indexes that you regularly have to correct?
There's no magic to them, the way they work has rules. Application rules that govern how they perform, when they are used, and when they are not used. Adding more doesn't automatically fix query performance, and it may cause performance degradation over time. We want to be intentional about how and when we use them; as the Developers of the database application intended.
What kinds of problems can poor indexing cause that developers might miss?
Poor indexing can cause performance issues, but so can system design. When I was the Lead DBA for the Office of the President of the United States back in 2009 we had an application that had used a row guid for the primary key clustered index. As a rule you want a clustered index to be small, narrow, static, and ever increasing. This breaks several of those rules. The performance issues were so bad we had to take the system offline and noon to rebuild the indexes, there was no online index rebuilds back then! We fixed this by changing the page free space, so we could deal with the fragmentation until the end of the workday. Not ideal system design, but sometimes you have the system you have and you need to learn how to co-exist. In this case understanding the maintenance of the index was just as important as the quality of the index.
What's a tip for identifying when an index is actually hurting performance?
I love this question. We have a couple types, reverse indexes, unused indexes, indexes with more scans than seeks, but let's go with duplicate indexes. I once worked with a customer, as a consultant, where they had hundreds of indexes on each of their tables. Each index must be in a transactionally consistent state when insert, update, or delete operations complete. This adds overhead if a row is changed, where each corresponding index must also be updated. Imagine an insert of one row on a table. Pretty quick right? Now imagine an insert of one row on a table, but before it can commit you must find the 120 non-clustered indexes and update those as well. Even if you just count to 120 it takes a lot longer. Now imagine that on a system with hundreds or thousands of transactions per second! We found and dropped thousands of indexes while I was there. We shrunk the database by over 50 GB! Getting rid of that overhead helped the database to speed up quite a bit.
How do you explain things like page splits or fragmentation in a way that clicks with newcomers?
These concepts are evergreen concepts. You want to showcase this but give concrete examples. I can show you how Page Splits that occur because of suboptimal index keys slow performance, more importantly I share real world scenarios. That application we discussed earlier that was fixed by maintenance, the fix was there because of key that cause fragmentation. Fragmentation to the point the application stopped working. DBA or Developer we all understand it is bad when the application stops working! I've also prevented customers from moving into production with the same issues when I was barley on a project. Fortunately or unfortunately if you stay in this field long enough you will see many issues that repeat. This is a fundamental concept and knowledge of it is very helpful.
How can attendees learn more about this topic and prepare for your session?
Microsoft has a wonderful and detailed learn article on Indexing in SQL Server, and the key to understand is this applies to AWS RDS, Azure SQL DB, Azure SQL Managed Instance, or any other edition of SQL Server. I also have some videos on our YouTube channel Tales from the Field on this subject, SQL Server Internals: HEAPS VS. CLUSTERED INDEXES! and SQL Server Internals: SQL Internals Heaps & Indexes Part 2!!.
Note: Those wishing to attend the session can save money by registering early, according to the event's pricing page. "Save $500 when you register by Friday, Aug. 22," said the organizer of the event, which is presented by the parent company of Visual Studio Magazine.
About the Author
David Ramel is an editor and writer at Converge 360.