Code Focused

Deleting all Data from a SQL Server Database (Part 2)

In part 2 of a series, On VB columnist Joe Kunk provides a Visual Basic program to generate a SQL script that clears all the data from a SQL Server database.

In part one of this article, I described a scenario in which I needed to empty all the data from a copy of a large database on a remote server, in order to download a minimal backup and populate it with test data for a demonstration on my laptop in a location without a reliable Internet connection. I could have scripted the existing database to create a new one, but that required trust that the new database is an identical copy of the existing database. Creating a copy of the database and then deleting all the data gives me a much higher confidence that I have an empty database structurally identical to the original.

As promised in part one, I developed a Visual Basic program to generate a SQL script that clears all the data from a SQL Server database (See Code Download for full source code and executable). The program only generates a T-SQL script; it does not make any changes to the database structure or data directly when run. Print statements are included throughout the script to display a log of actions taken by the script when run. The script has been tested on SQL Server 2005 and SQL Server 2008 R2. The program is compiled against the x86 CPU and the .NET Framework 3.5 Client Profile so it will run on a standard Windows 7 computer without the need to install the .NET Framework 4. No other assemblies are needed.

The basic approach is to TRUNCATE each table, dropping and re-creating foreign key constraints and associated extended properties if needed. If not truncated, the table is cleared with a DELETE FROM statement and the identity seed field is explicitly reset to the original seed value. The order in which the tables are processed is determined based on the foreign key dependencies. The application generates the SQL script quickly since it is simply reading schema information to generate text for a script.

The choice of how to clear the table is made in part on the security permissions of the user's database connection. This utility is meant to be used by a developer or database administrator with access to SQL Server Management Studio, not an application user. The script assumes that the user has at least the permission to read and write all tables. See Listing 1 for the generated script to clear the Microsoft sample AdventureWorks database with the ALTER permission on the tables. See Listing 2 for the generated script to clear the same database but without the ALTER permission, as would be the case with only the DataReader and DataWriter database roles.

The T-SQL script to determine if the user has ALTER permissions must be called for each table. ‘mytablename' is the name of the specific table being checked.

SELECT permission_name FROM fn_my_permissions('mytablename', 'OBJECT') 
WHERE permission_name = 'ALTER' and subentity_name = ''

Running the Application
To run the application, copy the three files in the bin\Debug\ folder to the location of your choice, modify the connDatabase connection string value to specify the desired database, and run ClearDatabaseScriptUtiity.exe. The .config file has a sample connection string for both integrated security and SQL Login modes. For the AdventureWorks database with integrated security, the connection string would be similar to:

Data Source=(local);Initial Catalog=AdventureWorks;Integrated Security=True 

Figure 1 shows the main application form. The drop down in the upper left corner allows you to specify which of the available databases on the server that you would like to use. The first four buttons display informational grids that can be used to validate the generated script. Click the "Copy to Clipboard" button to place the grid contents on the clipboard to be pasted into other applications such as Excel.


[Click on image for larger view.]
Figure 1. The main application form.

The Table Counts button generates an alphabetical list of all tables by schema and table name, along with the current count of records in each table (see Figure 2). This grid provides a way to validate the record counts before and after the ClearDatabase script has been run. The list of tables is generated from the "connSQL.GetSchema(SqlClient.SqlClientMetaDataCollectionNames.Tables)" statement where connSQL is the SqlClient.SqlConnection object.


[Click on image for larger view.]
Figure 2.

Foreign keys affecting a table must be dropped in order to TRUNCATE the table. The Foreign Key button generates an alphabetical listing of foreign key information sorted by the foreign key name and field index, as shown in Figure 3. This grid pulls together all the information needed to generate the scripts to both drop the foreign key constraints and re-create them after the table is truncated, including any extended properties attached to the foreign key constraints. These must be re-created as well if the foreign key constraint is dropped. This is the most complex query in the application. See Listing 3.


[Click on image for larger view.]
Figure 3.

Indexed views must be schema bound to the referenced tables using the WITH SCHEMABINDING attribute. T-SQL User defined functions may optionally be schema bound to improve performance. Tables referenced in these situations cannot be altered to drop the foreign key constraints without dropping and re-creating the schema bound view or function. To avoid that, any schema bound tables use the DELETE FROM option. The Schema Bound button generates an alphabetical listing of tables affected by schema binding. See Figure 4 below.


[Click on image for larger view.]
Figure 4.

The script to determine which tables are affected by schema binding is shown here:

SELECT DISTINCT db_name() as DatabaseName, SC.name as SchemaName, SY.Name as TableName 
FROM sys.sql_expression_dependencies SD 
    INNER JOIN Sys.objects SY on SD.referenced_id = SY.Object_id 
    INNER JOIN Sys.schemas SC on SY.schema_id = SC.schema_id 
    INNER JOIN syscomments CM on CM.id = SD.referencing_id 
WHERE is_schema_bound_reference=1 AND CM.text like '%schemabinding%' 
ORDER BY SC.Name, SY.Name

Truncating a table is much faster than DELETE FROM when clearing a table, primarily because log transactions are not generated for each row. Truncating a table automatically resets any identity seed field to its original value. If the DELETE FROM is used, the script must explicitly reset the identity seed value to reflect the same behavior. The Identity Seed button generates an alphabetical listing of tables with identity seed values and the T-SQL command needed to reset it, as shown in Figure 5.


[Click on image for larger view.]
Figure 5.

The script to determine the identity seed values is:

SELECT
	'[' + TABLE_SCHEMA + '].[' + TABLE_NAME + ']' as SchemaTable,  
	IDENT_SEED('['+TABLE_SCHEMA + '].[' + TABLE_NAME+']') AS Seed, 
	IDENT_CURRENT('['+TABLE_SCHEMA + '].[' + TABLE_NAME+']') AS [Current], 
	'DBCC CHECKIDENT (''[' + TABLE_SCHEMA + '].[' +  
		TABLE_NAME + ']'',RESEED,' +  
		CAST(IDENT_SEED('['+TABLE_SCHEMA + '].[' + TABLE_NAME+']') as VarChar(8)) + ')'  
	as ResetCommand  
FROM INFORMATION_SCHEMA.TABLES  
WHERE IDENT_SEED('['+TABLE_SCHEMA + '].[' + TABLE_NAME+']') IS NOT NULL  
ORDER BY TABLE_SCHEMA,TABLE_NAME

The Show Script button generates the actual T-SQL script to be used in SQL Server Management Studio to clear the database. See Figure 6. This button can be the first one clicked after selecting the database in the drop down. The four buttons to the left are informational and do not need to be clicked in order to generate the final script. Click the "Copy to Clipboard" button and then paste the text into a SQL Server Management Studio query window.


[Click on image for larger view.]
Figure 6.

The Application Code
The Visual Basic solution to generate the script in Listings 1 and 2 is shown in Figure 7. The Data Access Layer (DAL) project interacts with the database and contains most of the application logic. The LibADO.vb file provides two general functions to easily return dynamic query results based on just the connection string, the query string, and an empty DataTable or integer. The ClearDatabaseScriptUtility project serves as the graphical user interface and has minimal code.


[Click on image for larger view.]
Figure 7.

SQLServer.CreateClearDatabaseScript is the main method in the DAL project. It uses a list of ClearTableAction objects to represent each table and the pre-processing and post-processing script needed to clear the tables. CreateClearDatabaseScript follows the logic:

  • Creates a StringBuilder initialized with 10K of string space to hold the final script but can grow larger.
  • Gets the table names for the selected database and initializes the List(Of ClearTableAction).
  • Sets each ClearTableAction to a TRUNCATE command, but then changes to a DELETE FROM command for any table that is schema bound.
  • If the user has ALTER permission, adds script to disable any triggers prior to the table clear action and enables them after it is completed. See Listing 4 for the script to list the triggers on all tables.
  • Gets the list of foreign key constraints.
    • Multiple field foreign key constraints are combined into a single row.
    • Any extended properties such as comments are combined and kept to be created with the constraint.
    • Sort them by dependency so that all tables using a particular key are cleared before the key itself.
    • If the user has ALTER permission, script to drop the foreign key constraint, truncate the table, and then recreate the foreign key constraint and any foreign key constraint extended properties.
  • Loops through the List(Of ClearTableAction) and scripts the pre-processing and clear action for each table.
  • Loops through the List(Of ClearTableAction) and scripts the post-processing for each table.

I used several coding techniques in the DAL that I would like to point out.

For an object that was returning the result of a SQL query, if I needed a field that could be derived from the existing query data, I used a custom property to expose that field rather than further complicate the query. The code below shows a public property that provides a result based on the other data in the class if it was never explicitly set, but returns the explicit value instead if one was provided.

Private _referencedtable As String

Public Property ReferencedTable() As String
    Get
        Dim result = _referencedtable
        If String.IsNullOrEmpty(_referencedtable) Then
            If String.IsNullOrEmpty(References) = False Then _
                result = References.Substring(0, References.IndexOf("(")).Trim
            End If
        Return result
    End Get
    Set(ByVal value As String)
        _referencedtable = value
    End Set
End Property

For each custom class, be sure to override the ToString() method to provide a useful display when used in any grid or list. Simply return the field or calculation of your choice based on its class. See the code below for the ToString() method for the DatabaseObject class that provides the Schema.ObjectName value as its display value.

Public Overrides Function ToString() As String
    Dim result As String = String.Empty
    If Schema.Trim.Length > 0 Then result += "[" & Schema & "].["
    result += Name & "]"
    Return result
End Function

Finding the index of a particular entry in a List (Of Items) is very easy with the FindIndex extension method if you use a lambda expression. Below is the code for the IdentitySeedRowIndex function that uses a lambda expression to find the index for the particular Schema.TableName value passed into the function.

Public Shared Function IdentitySeedRowIndex( _
	IdentitySeedTables As List(Of IdentitySeedFields), _
	SchemaTable As String) As Integer

        Dim result = -1
        Dim index = IdentitySeedTables.FindIndex(Function(x) x.SchemaTable = SchemaTable)
        If index > -1 Then result = index
        Return result

End Function

Enjoy, but Use with Caution
Be sure to carefully review the generated SQL script before running it, removing any tables from the script that you wish to retain data. Be sure that you have the right server connection and the right database selected in SQL Management Studio before running it. Finally, be sure to back up the database before running the script, just in case.

I have needed a utility to efficiently "empty out" a SQL Server database several times over the last few years and also have talked to other developers that agreed such a utility would be helpful. I present such an application here in this article for your use. Since it generates explicit actions in a script, it is very easy to see exactly what will happen and have a good comfort level with those actions before running it. I hope this utility enhances your professional toolbox.

Please let me know your experiences using this application, including any enhancements that you think would be useful. Would you like the code placed somewhere that you and other readers could contribute to enhancing it? Enjoy!

comments powered by Disqus

Featured

  • AI for GitHub Collaboration? Maybe Not So Much

    No doubt GitHub Copilot has been a boon for developers, but AI might not be the best tool for collaboration, according to developers weighing in on a recent social media post from the GitHub team.

  • Visual Studio 2022 Getting VS Code 'Command Palette' Equivalent

    As any Visual Studio Code user knows, the editor's command palette is a powerful tool for getting things done quickly, without having to navigate through menus and dialogs. Now, we learn how an equivalent is coming for Microsoft's flagship Visual Studio IDE, invoked by the same familiar Ctrl+Shift+P keyboard shortcut.

  • .NET 9 Preview 3: 'I've Been Waiting 9 Years for This API!'

    Microsoft's third preview of .NET 9 sees a lot of minor tweaks and fixes with no earth-shaking new functionality, but little things can be important to individual developers.

  • Data Anomaly Detection Using a Neural Autoencoder with C#

    Dr. James McCaffrey of Microsoft Research tackles the process of examining a set of source data to find data items that are different in some way from the majority of the source items.

  • What's New for Python, Java in Visual Studio Code

    Microsoft announced March 2024 updates to its Python and Java extensions for Visual Studio Code, the open source-based, cross-platform code editor that has repeatedly been named the No. 1 tool in major development surveys.

Subscribe on YouTube