Creating the Right Database Initializer for Entity Framework
If you want to treat your database design as an 'implementation detail' that just falls out of getting your object model right, then Entity Framework gives you four choices. Picking the right one, however, may mean creating your own.
When you commit to working with Entity Framework in a code-first manner, you have four options that control when your database schema is created. In my opinion, none of them is perfect.
Why None of the Existing Initializers Are Right
For example, the default option is to create a database initializer class that inherits from CreateDatabaseIfNotExists. With this option the first line of code that accesses the database triggers the initializer. The initializer checks to see if the database exists and, if it doesn't, the initializer creates the database and populates it with tables that correspond to your object model. If your initializer includes a Seed method that adds data to your table, it's run also.
This option is the most benign of the four (that is, the least likely to do harm) and, in my opinion, the most useless: It's just not clear to me when this option makes sense. About the best you can say for CreateDatabaseIfNotExists is that it will throw an exception if, as you enhance your application, your object model and database schema are different. It won't do anything about that difference, though.
In my experience, if I'm making changes to an application, then I'm usually making changes to the objects the application uses and, as a result, I want changes made to my database schema. The very last thing I need is an exception telling me that I'm doing what I'm trying to do.
Of the provided initializers, my favorite option is DropCreateDatabaseAlways. This option means that when I press F5 to debug my application, the database is always deleted and its schema recreated. What I like about this option is that data from my last test run is blown away and (assuming I put some code in the initializer's Seed method) a fresh set of my data loaded in. In an automated testing environment, ensuring that my data is what I think it is matters to me so this is my preferred option ... of the ones provided by Entity Framework.
Which leads to my objection to the remaining two options: DropCreateDatabaseWhenModelChanges and MigrateDatabaseToLatestVersion. The DropCreateDatabaseWhenModelChanges option drops and creates my database ... but only if my database schema and object model are different. This does ensure my database schema matches my object model and that I have fresh data ... but only when my object model and database differ. And, quite frankly, most of my changes don't involve changes to my object model. As far as I'm concerned, MigrateDatabaseToLatestVersion just makes the problem worse: not only is it only triggered when I have a change to my object model, it doesn't run the Seed method when the database is first created (or whenever there are pending migrations, for that matter).
These last two options get me into debugging faster but, without fresh data, I feel that I'm just doing the wrong thing quicker.
So I created my own initializer: It only drops and recreates my schema when my object model changes. However, it always deletes any existing data and reruns my Seed method to populate the schema with fresh data.
Creating a Custom Initializer
The first step is creating a custom initializer that implements the IDatabaseInitializer interface. You'll want to be able to pass any DbContext object to your initializer, so the initializer must be set up as a generic class constrained to accept only DbContext classes (or anything that inherits from DbContext). That class begins like this:
class LoadDataDropCreateWhenChange<T> : IDatabaseInitializer<T>
where T : DbContext
The interface requires a single method called InitializeDatabase that accepts a DbContext object and returns void. This is the method that Entity Framework calls when that first line of code in your application accesses the database.
What I'm going to do in this method is access the database associated with the DbContext object. Once I have the database, I can check to see if it (a) exists and (b) matches my object model. If either of those condition are true I delete the database and recreate it from my object model using methods on the Database object. After that, I call the Seed method that I'll (eventually) build into my initializer.
Here's that code:
public void InitializeDatabase(T context)
Database db = context.Database;
One caveat here: for CompatibleWithModel to work, my database must contain the Migration tables that hold information about my database changes. As a result, your database must be created using the Create method on the Database object, as I do here.
Clearing Out the Data
When I don't want to drop and recreate the database, I still want to clear the data out of my tables and run my Seed method. To speed up that process, I first create a transaction:
DbContextTransaction tran = db.BeginTransaction();
Once the transaction is started, I retrieve the PropertyInfo objects for the DbSet properties that I added to my DbContext object. However, the DbContext object has some properties in addition to my DbSet properties so I have to filter for the DbSet properties. This code does just that:
IEnumerable<PropertyInfo> dbSetProperties =
.Where(p => p.PropertyType.Name.StartsWith("DbSet"));
Now that I have the PropertyInfo object for my DbSet properties, I can just loop through them, pulling the name from each property and using that in an SQL Delete command:
foreach (PropertyInfo dbProp in dbSetProperties)
string tableName = dbProp.Name;
if (tableName != null)
context.Database.ExecuteSqlCommand("Delete from " + tableName);
This reveals an important limitation in my initializer: If you're using the TableName attribute to override your property name or counting on pluralization to match a property called Customers to a table called customer, this simple version of the code won't deal with that.
Finally, I commit the transaction and, as before, call my Seed method:
Finally, of course, I need to provide a Seed method that developers using my initializer can override if they want to load their tables with some sample data. That method looks like this:
protected virtual void Seed(T context)
Obviously, I'm making a trade-off here: I'm betting that deleting the rows in each individual table is faster than dropping and recreating the whole database. That bet, I believe, usually pays off for me because, typically, I have many tables but, for unit testing purposes, relatively few rows in each table. My experience is that dropping and recreating all the tables every time is more expensive than deleting, from all the tables, what few rows I have.
Besides, if it appears that I'm losing my bet, I can just switch to my initializer to inherit from some other initializer and see which pain I prefer.
Using the Initializer
To use this initializer, all a developer needs to do is create a class that inherits from my LoadDataDropCreateWhenChange class, specifying the data context it is to be used with. The developer can then choose to override the Seed method to load the schema with some test data.
Here's an example of my initializer being used with a DbContext object called CustomerModel, along with a Seed method that loads the Customers table with a single row:
class MyInitializer: LoadDataDropCreateWhenChange<CustomerModel>
protected override void Seed(CustomerModel context)
Customer cust = new Customer
Id = 1,
FirstName = "Peter",
// ... additional properties ...
To have the Initializer invoked, you need to pass it to the DbContext's SetInitializer method along with a reference to your DbContext object. The most obvious place to do that is in the DbContext object's constructor. This example, using a DbContext object called CustomerModel, leverages my initializer:
class CustomerModel: DbContext
And now you have an initializer that works the way you want it to.
Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter tweets about his VSM columns with the hashtag #vogelarticles. His blog posts on user experience design can be found at http://blog.learningtree.com/tag/ui/.