In-Depth

Making Do with Absent Foreign Keys

The absence of foreign key properties in classes adds some challenges to working in disconnected apps.

This month I'm writing about an issue I've found myself helping people with frequently of late: problems with related classes defined in Code First and then, in most cases, used in the Model-View-Controller (MVC) framework. The problems developers have been experiencing aren't specific to Code First. They're the result of underlying Entity Framework (EF) behavior and, in fact, common to most object-relational mappers (ORMs). But it seems the problem is surfacing because developers are coming to Code First with certain expectations. MVC is causing them pain because of its highly disconnected nature.

Rather than only showing you the proper code, I'll use this column to help you understand the EF behavior so you can apply this knowledge to the many scenarios you may encounter when designing your classes or writing your data access code with EF APIs.

Code First convention is able to detect and correctly infer various relationships with varying combinations of properties in your classes. In this example, which I'm writing as the leaves are turning spectacular colors on trees near my home in Vermont, I'll use Tree and Leaf as my related classes. For a one-to-many relationship, the simplest way you could describe that in your classes and have Code First recognize your intent is to have a navigation property in the Tree class that represents some type of collection of Leaf types. The Leaf class needs no properties pointing back to Tree. Listing 1 shows the Tree and Leaf classes.
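The listing itself isn't reproduced in this text, so here's a minimal sketch of what the two classes might look like. The TreeId and LeafId keys, the Leaves collection, and the FellFromTreeDate and FellFromTreeColor properties all appear in the code and SQL later in this column; the Type property on Tree and the constructor that initializes the collection are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Tree
{
  public Tree()
  {
    // Assumption: initialize the collection so leaves can be added
    // to a Tree without a null check.
    Leaves = new List<Leaf>();
  }

  public int TreeId { get; set; }
  public string Type { get; set; }  // hypothetical descriptive property

  // This collection navigation property is all Code First needs
  // to infer the one-to-many relationship.
  public ICollection<Leaf> Leaves { get; set; }
}

public class Leaf
{
  public int LeafId { get; set; }
  public DateTime FellFromTreeDate { get; set; }
  public string FellFromTreeColor { get; set; }
  // Deliberately no TreeId foreign key property and
  // no Tree navigation property.
}
```

With this shape, the only way to express "this leaf belongs to that tree" in memory is to add the leaf to the tree's Leaves collection.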

By convention, Code First will know that a foreign key is required in the database in the Leaf table. It will presume a foreign key field name to be "Tree_TreeId," and with this information provided in the metadata created by Code First at run time, EF will understand how to work out queries and updates using that foreign key. EF leverages this behavior by relying on the same process it uses with "independent associations" -- the only type of association we could use prior to the Microsoft .NET Framework 4 -- which don't require a foreign key property in the dependent class.
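To make the convention concrete, the following sketch uses the Fluent API to spell out roughly what Code First infers on its own. It assumes pared-down versions of the Tree and Leaf classes and a hypothetical context class named ForestContext; you wouldn't need this mapping unless you wanted a different column name.

```csharp
using System.Collections.Generic;
using System.Data.Entity;

public class Tree
{
  public int TreeId { get; set; }
  public ICollection<Leaf> Leaves { get; set; }
}

public class Leaf
{
  public int LeafId { get; set; }
}

// ForestContext is a hypothetical name used only for this sketch.
public class ForestContext : DbContext
{
  public DbSet<Tree> Trees { get; set; }
  public DbSet<Leaf> Leaves { get; set; }

  protected override void OnModelCreating(DbModelBuilder modelBuilder)
  {
    // One Tree has many Leaves. Leaf has no navigation property back
    // to Tree (WithOptional takes no argument) and no foreign key
    // property, so the key exists only as the database column
    // named by MapKey.
    modelBuilder.Entity<Tree>()
      .HasMany(t => t.Leaves)
      .WithOptional()
      .Map(m => m.MapKey("Tree_TreeId"));
  }
}
```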

This is a nice, clean way to define the classes when you're confident that you have no need to ever navigate from a Leaf back to its Tree in your application. However, without direct access to the foreign key, you'll need to be extra diligent when coding.

Creating New Dependent Types Without Foreign Key or Navigation Properties
Although you can easily use these classes to display a tree and its leaves in your ASP.NET MVC application and edit leaves, developers often encounter issues creating new leaves in a typically architected MVC app. I used the template from the MvcScaffolding NuGet package to let Visual Studio automatically build my controllers, views and simple repositories by selecting "MvcScaffolding: Controller with read/write action and views, using repositories." Note that because there's no foreign key property in the Leaf class, the scaffolding templates won't recognize the one-to-many relationship. I made some minor changes to the views and controllers to allow a user to navigate from a tree to its leaves, which you can see in the sample download.

The Create postback action for Leaf takes the Leaf returned from the Create view and tells the repository to add it and then save it, as shown in Listing 2.

The repository takes the leaf, checks to see if it's new and if so, adds it to the context instance that was created as a result of the postback:

public void InsertOrUpdate(Leaf leaf) {
  if (leaf.LeafId == default(int)) {
    // New entity
    context.Leaves.Add(leaf);
  } else {
    // Existing entity
    context.Entry(leaf).State = EntityState.Modified;
  }
}

When Save is called, EF creates an Insert command, which adds the new leaf to the database:

exec sp_executesql N'insert [dbo].[Leaves]([FellFromTreeDate], [FellFromTreeColor], 
[Tree_TreeId]) values (@0, @1, null)
select [LeafId]
from [dbo].[Leaves]
where @@ROWCOUNT > 0 and [LeafId] = scope_identity()',
N'@0 datetime2(7),@1 nvarchar(max) ',
@0='2011-10-11 00:00:00',@1=N'Pale Yellow'

Notice the values passed in on the second line of the command: @0 (for the date); @1 (for the modified color); and null. The null value is destined for the Tree_TreeId field. Remember that the nice, clean Leaf class has no foreign key property to represent the TreeId, so there's no way to pass that value in when creating a standalone leaf.

When the dependent type (in this case, Leaf) has no knowledge of its principal type (Tree), there's only one way to do an insert: The Leaf instance and the Tree instance must be added to the context together as part of the same graph. This will provide EF with all the information it needs to work out the correct value to insert into the database foreign key (for example, Tree_TreeId). But in this case, where you're working only with the Leaf, there's no information in memory for EF to determine the value of the Tree's key property.

If you had a foreign key property in the Leaf class, life would be so much simpler. It's not too difficult to keep a single value at hand when moving between controllers and views. In fact, if you look at the Create action in Listing 2, you can see that the method has access to the value of the TreeId for which the Leaf is being created.

There are a number of ways to pass data around in MVC applications. I chose the simplest for this demo: stuffing the TreeId into the MVC ViewBag and leveraging Html.Hidden fields where necessary. This makes the value available as one of the view's Request.Form items.

Because I have access to the TreeId, I'm able to build the Tree/Leaf graph that will provide the TreeId for the Insert command. A quick modification to the repository class lets the InsertOrUpdate method accept that TreeId variable from the view and retrieves the Tree instance from the database using the DbSet.Find method. Here's the affected part of the method:

public void InsertOrUpdate(Leaf leaf, int treeId)
{
  if (leaf.LeafId == default(int)) {
    // New leaf: fetch its tree so the context tracks both as one graph
    var tree = context.Trees.Find(treeId);
    tree.Leaves.Add(leaf);
  }
...

The context is now tracking the tree and is aware that I'm adding the leaf to the tree. This time, when context.SaveChanges is called, EF is able to navigate from the Leaf to the Tree to discover the key value and use it in the Insert command.

Listing 3 shows the modified controller code using the new version of InsertOrUpdate.
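The listing isn't reproduced in this text. As a rough sketch of what such a Create postback could look like, the following assumes a hidden form field named TreeId, a hypothetical ILeafRepository interface and a hypothetical LeavesController class; only InsertOrUpdate(leaf, treeId) and the idea of reading the value from Request.Form come from the column itself.

```csharp
using System;
using System.Web.Mvc;

public class Leaf
{
  public int LeafId { get; set; }
  public DateTime FellFromTreeDate { get; set; }
  public string FellFromTreeColor { get; set; }
}

// Hypothetical abstraction over the repository described in the text.
public interface ILeafRepository
{
  void InsertOrUpdate(Leaf leaf, int treeId);
  void Save();
}

public class LeavesController : Controller
{
  private readonly ILeafRepository leafRepository;

  public LeavesController(ILeafRepository leafRepository)
  {
    this.leafRepository = leafRepository;
  }

  [HttpPost]
  public ActionResult Create(Leaf leaf)
  {
    // Read the TreeId that the view round-tripped in a hidden field.
    // The field name "TreeId" is an assumption.
    int treeId;
    if (!int.TryParse(Request.Form["TreeId"], out treeId))
    {
      ModelState.AddModelError("TreeId", "A tree must be specified.");
    }

    if (!ModelState.IsValid)
    {
      return View(leaf);
    }

    // Hand the key to the repository so it can build the Tree/Leaf graph.
    leafRepository.InsertOrUpdate(leaf, treeId);
    leafRepository.Save();
    return RedirectToAction("Index", new { treeId });
  }
}
```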

With these changes, the insert method finally has the value for the foreign key, which you can see in the parameter called "@2":

exec sp_executesql N'insert [dbo].[Leaves]([FellFromTreeDate], [FellFromTreeColor],
[Tree_TreeId])
values (@0, @1, @2)
select [LeafId]
from [dbo].[Leaves]
where @@ROWCOUNT > 0 and [LeafId] = scope_identity()',
N'@0 datetime2(7),@1 nvarchar(max) ,
@2 int',@0='2011-10-12 00:00:00',@1=N'Orange-Red',@2=1

In the end, this workaround forces me to make another trip to the database. This is the price I'll choose to pay in this scenario where I don't want the foreign key property in my dependent class.

Problems with Updates When There's No Foreign Key
There are other ways you can paint yourself into a corner when you're bound and determined not to have foreign key properties in your classes. Here's another example.

I'll add a new domain class named TreePhoto. Because I don't want to navigate from this class back to Tree, there's no navigation property, and again, I'm following the pattern where I don't use a foreign key property:

[Table("TreePhotos")]
public class TreePhoto
{
  public int Id { get; set; }
  public Byte[] Photo { get; set; }
  public string Caption { get; set; }
}

The Tree class provides the only connection between the two classes, and I specify that every Tree must have a Photo. Here's the new property that I added to the Tree class:

[Required] public TreePhoto Photo { get; set; }

This does leave the possibility of orphaned photos, but I use this example because I've seen it a number of times -- along with pleas for help -- so I wanted to address it.

Once again, Code First convention determined that a foreign key column would be needed in the database and created one, Photo_Id, on my behalf. Notice that it's non-nullable. That's because the Tree.Photo property is required (see Figure 1).

Figure 1. Using Code First convention, Tree gets a non-nullable foreign key to TreePhotos.

Your app might let you create trees before the photos have been taken, but the tree still needs that Photo property to be populated. I'll add logic into the Tree repository's InsertOrUpdate method to create a default, empty Photo for new Trees when one isn't supplied:

public void InsertOrUpdate(Tree tree)
{
  if (tree.TreeId == default(int)) {
    if (tree.Photo == null)
    {
      tree.Photo = new TreePhoto { Photo = new Byte[] { 0 },
                                   Caption = "No Photo Yet" };
    }
    context.Trees.Add(tree);
  }
  ...

The bigger problem I want to focus on here is how this issue affects updates. Imagine you have a Tree and its required Photo already stored in the database. You want to be able to edit a Tree and have no need to interact with the Photo. You'll retrieve the Tree, perhaps with code such as "context.Trees.Find(someId)." When it's time to save, you'll get a validation error because Tree requires a Photo. But Tree has a photo! It's in the database! What's going on?

Here's the problem: When you first execute a query to retrieve the Tree, ignoring the related Photo, only the scalar values of the Tree will be returned from the database and Photo will be null (see Figure 2).

Figure 2. A Tree instance retrieved from the database without its photo.

Both the MVC Model Binder and EF have the ability to validate the Required annotation. When it's time to save the edited Tree, its Photo will still be null. If you're letting MVC perform its ModelState.IsValid check in the controller code, it will recognize that Photo is missing. IsValid will be false and the controller won't even bother calling the repository. In my app, I've removed the Model Binder validation so I can let my repository code be responsible for any server-side validation. When the repository calls SaveChanges, EF validation will detect the missing Photo and throw an exception. But in the repository, we have an opportunity to handle the problem.
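To see exactly what EF validation is complaining about while you decide how to handle it, a sketch like the following can help. It catches the DbEntityValidationException thrown by SaveChanges and collects the property names and messages. The ForestContext and TreeRepository names, and the choice to rethrow as an InvalidOperationException, are assumptions for illustration and not the sample's code.

```csharp
using System;
using System.ComponentModel.DataAnnotations;
using System.Data.Entity;
using System.Data.Entity.Validation;
using System.Linq;

public class TreePhoto
{
  public int Id { get; set; }
  public Byte[] Photo { get; set; }
  public string Caption { get; set; }
}

public class Tree
{
  public int TreeId { get; set; }
  [Required] public TreePhoto Photo { get; set; }
}

// Hypothetical context name used only for this sketch.
public class ForestContext : DbContext
{
  public DbSet<Tree> Trees { get; set; }
}

public class TreeRepository
{
  private readonly ForestContext context = new ForestContext();

  public void Save()
  {
    try
    {
      context.SaveChanges();
    }
    catch (DbEntityValidationException ex)
    {
      // Each result describes one invalid entity; each error names
      // the offending property (for example, Photo) and a message.
      var messages = ex.EntityValidationErrors
        .SelectMany(result => result.ValidationErrors)
        .Select(error => error.PropertyName + ": " + error.ErrorMessage);

      throw new InvalidOperationException(
        string.Join("; ", messages), ex);
    }
  }
}
```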

If the Tree class had a foreign key property -- for example, int PhotoId -- that was required (allowing you to remove the requirement on the Photo navigation property), the foreign key value from the database would've been used to populate the PhotoId property of the Tree instance. The tree would be valid, and SaveChanges would be able to send the Update command to the database. In other words, if there were a foreign key property, the Tree would have been valid even without the Photo instance.

But without the foreign key, you'll again need some mechanism for providing the Photo before saving changes. If you have your Code First classes and context set up to perform lazy loading, any mention of Photo in your code will cause EF to load the instance from the database. I'm still somewhat old-fashioned when it comes to lazy loading, so my personal choice would probably be to perform an explicit load from the database. The new line of code (the last line in the following example, where I'm calling Load) uses the DbContext method for loading related data:

public void InsertOrUpdate(Tree tree)
{
  if (tree.TreeId == default(int)) {
  ...
  } else {
    context.Entry(tree).State = EntityState.Modified;
    context.Entry(tree).Reference(t => t.Photo).Load();
  }
}

This makes EF happy. Tree will validate because Photo is there, and EF will send an Update to the database for the modified Tree. The key here is that you need to ensure the Photo isn't null; I've shown you one way to satisfy that constraint.
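For the lazy loading alternative mentioned earlier, here's a minimal sketch of the class shape it would need. It assumes the usual EF requirements as I understand them: the navigation property is marked virtual and the instance is a proxy created by the context (for example, one returned from a query). A Tree built by the MVC model binder on postback is a plain object, so touching its Photo property wouldn't trigger a load, which is one more reason the explicit Load is the safer choice in a disconnected scenario.

```csharp
using System;
using System.ComponentModel.DataAnnotations;

public class TreePhoto
{
  public int Id { get; set; }
  public Byte[] Photo { get; set; }
  public string Caption { get; set; }
}

public class Tree
{
  public int TreeId { get; set; }

  // virtual lets an EF runtime proxy override the getter and
  // fetch the photo from the database on first access.
  [Required]
  public virtual TreePhoto Photo { get; set; }
}
```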

A Point of Comparison
If the Tree class simply had a PhotoId property, none of this would be necessary. A direct effect of the PhotoId int property is that the Photo property no longer needs the Required annotation. Because int is a value type, PhotoId must always have a value, which satisfies the requirement that a Tree must have a Photo even if the photo isn't represented as an instance. As long as there's a value in PhotoId, the requirement will be satisfied, so the following code works:

public class Tree
{
  // ... Other properties
  public int PhotoId { get; set; }
  public TreePhoto Photo { get; set; }
}

When the controller's Edit method retrieves a Tree from the database, the PhotoId scalar property will be filled. As long as you force MVC (or whatever application framework you're using) to round-trip that value, when it's time to update the Tree, EF will be unconcerned about the null Photo property.

Easier, but Not Magic
Although the EF team has provided more API logic to help with disconnected scenarios, it's still your job to understand how EF works and what its expectations are when you're moving data around. Yes, coding is much simpler if you include foreign keys in your classes, but they're your classes and you're the best judge of what should and shouldn't be in them. Nevertheless, if your code were my responsibility, I would surely force you to convince me that your reasons for excluding foreign key properties outweighed the benefits of including them. EF will do some of the work for you if the foreign keys are there. But if they're absent, as long as you understand what EF expects and how to satisfy those expectations, you should be able to get your disconnected applications to behave the way you want.

About the Author

Julie Lerman is a Microsoft MVP, .NET mentor and consultant who lives in the hills of Vermont. You can find her presenting on data access and other Microsoft .NET topics at user groups and conferences around the world. She blogs at thedatafarm.com/blog and is the author of "Programming Entity Framework" (2010) and "Programming Entity Framework: Code First" (2011), both from O'Reilly Media. Follow her on Twitter at twitter.com/julielerman.
