Deleting All Data from a SQL Server Database
There are times as a developer that you will want to be able to delete all the data from a database as efficiently as possible. Unfortunately, doing so may not be as easy as it seems.
No, I am not trying to entice you to the dark side and have your name appear as a headline on the local evening news. When used for peaceful purposes, the ability to efficiently clear the data from a database can be very helpful for the development and testing of large or highly confidential databases.
I needed this capability on a recent consulting project to build a dashboard management reporting system for a very large medical database on a server with a slow DSL connection.
I needed to copy the SQL Server database, with its most recent schema changes, down to my laptop for a client demonstration at a location where Internet connectivity might not be available. Even compressed, the database backup would take most of the evening to copy from the remote server. Should the copy fail, I would not have time to try again before the demonstration. I did not need actual data; an empty copy of the latest database would suffice. I could easily add a few test records for the demonstration after installing the database on my laptop.
I created a copy of the database on the remote server and generated a SQL script to TRUNCATE all the tables in the copy, as shown in Listing 1. After downloading a quick backup of the newly emptied database to my laptop, I would be home on time by 6 p.m.
SELECT 'TRUNCATE TABLE [' + TABLE_CATALOG + '].[' + TABLE_SCHEMA + '].[' + TABLE_NAME + ']'
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
ORDER BY TABLE_NAME
Listing 1. SQL Server 2005 or later script to generate TRUNCATE statements for all tables.
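Note that the query in Listing 1 does not truncate anything itself; it only emits one TRUNCATE statement per base table as a result set. The output looks something like the following (the database and table names here are hypothetical, for illustration only), and you run it by copying the result set into a new query window and executing it:

```sql
-- Sample rows produced by the Listing 1 query; execute these
-- separately to actually clear the tables.
TRUNCATE TABLE [MyDb].[dbo].[Customers]
TRUNCATE TABLE [MyDb].[dbo].[Orders]
```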
Anyone who has tried this on anything but the most simplistic database has found, as I did, that this won't work. Any table that is referenced by a foreign key constraint, participates in an indexed view, or is published using transactional or merge replication cannot be truncated. A DELETE FROM statement will work in those cases, but it requires far more system resources, since each row is deleted individually and written to the transaction log. Another potential disadvantage is that IDENTITY columns are not reset to their original seed value by DELETE FROM. A TRUNCATE statement is preferred where possible, since it simply deallocates the data pages associated with the table, does not fire delete triggers, and resets IDENTITY columns.
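When TRUNCATE is blocked, the fallback is a DELETE followed by an explicit identity reseed with DBCC CHECKIDENT. A minimal sketch of that pattern, against a hypothetical dbo.Orders table (the table name is illustrative, not from this project):

```sql
-- Fallback when TRUNCATE TABLE is not allowed (for example, the table
-- is referenced by a foreign key): DELETE removes and logs each row.
DELETE FROM dbo.Orders;

-- DELETE does not reset the IDENTITY value, so reseed it explicitly.
-- Reseeding to 0 on a table that has held rows makes the next insert
-- use 1, matching a typical IDENTITY(1,1) definition.
DBCC CHECKIDENT ('dbo.Orders', RESEED, 0);
```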
In the case above, I resorted to DELETE FROM statements for each table and suffered the performance penalty. What I would have preferred was the ability to run a SQL script that deletes all the data from the database using the most efficient method possible on a table-by-table basis, in the proper order.
In Part 2 of this article, I present the source code for a Visual Basic program to generate a SQL script to efficiently delete all data from a database according to the current configuration and available permissions of the database connection. Foreign key relationships are dropped and recreated, if possible, to allow more tables to participate in the TRUNCATE statements. Identity columns are explicitly reset if the table cannot be truncated. Child tables are cleared before parent tables.
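To make the approach concrete, a generated script of the kind described might take the following shape for a single parent/child pair; the table, column and constraint names here are hypothetical placeholders, and a real run would need the foreign key recreated exactly as it was originally defined:

```sql
-- 1. Drop the foreign key so both tables become eligible for TRUNCATE.
ALTER TABLE dbo.OrderItems DROP CONSTRAINT FK_OrderItems_Orders;

-- 2. Clear the child table before the parent. TRUNCATE deallocates the
--    data pages and resets any IDENTITY columns automatically.
TRUNCATE TABLE dbo.OrderItems;
TRUNCATE TABLE dbo.Orders;

-- 3. Recreate the foreign key with its original definition.
ALTER TABLE dbo.OrderItems
    ADD CONSTRAINT FK_OrderItems_Orders
    FOREIGN KEY (OrderID) REFERENCES dbo.Orders (OrderID);
```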
As a safety precaution, the program does not directly perform any DELETE FROM or TRUNCATE actions, instead requiring sufficient knowledge of the database tools to connect and execute a SQL script. For clarity, the script does not use any loops; each action affecting a table is individually listed in order to present a clear picture of exactly what actions will be performed, for review and approval before running the script.
I invite you to share your experiences when working with large databases in the online comments area of this article. Have you needed to clear a database as I describe here? Have you found a clear way to accomplish it? What other challenges have you faced and perhaps solved when working with large databases?
Joe Kunk is a Microsoft MVP in Visual Basic, three-time president of the Greater Lansing User Group for .NET, and a developer for Dart Container Corporation of Mason, Michigan. He's been developing software for over 30 years and has worked in the education, government, financial and manufacturing industries. Kunk co-authored the book "Professional DevExpress ASP.NET Controls" (Wrox Programmer to Programmer, 2009). He can be reached via email at [email protected].