Restricting Columns Retrieved in Entity Framework

A couple of months ago, I wrote a column on how to avoid retrieving columns from a table that has hundreds of them, or columns holding large objects (or, at least, how to retrieve those columns only when you want them). But that solution only makes sense when getting the columns you want is something that you'll be doing frequently.

If, on the other hand, you have exactly one place in your application where all you want to get is, for example, the Customer's first and last names, then there's a simpler solution: Just define a class that has only the columns you want.

Two caveats: First, you can't do updates through the objects you've retrieved using this technique. Second, don't expect to get a huge performance gain from this unless you're avoiding retrieving many other columns or the columns you're avoiding are blob columns.

As an example, to get the Customer's first and last name columns I'd begin by defining a class, outside of my Entity Framework model, to hold just those columns:

Public Class CustFirstLastName
   Public Property FirstName As String
   Public Property LastName As String
End Class

Now, I write a LINQ query to retrieve just those two columns by instantiating the class in my LINQ statement's Select clause and setting its properties with values retrieved through Entity Framework. This code assumes that my DbContext object (in the db variable in this example) has a collection called Customers:

Dim lastFirstNames = From c In db.Customers
                     Select New CustFirstLastName With {
                         .FirstName = c.FirstName,
                         .LastName = c.LastName
                     }
The SQL generated by Entity Framework to get the data from the database will just grab the FirstName and LastName columns because that's all that's been used in the Select statement.

If you're new to Entity Framework, you probably consider this obvious -- that's what EF should do. But, in the early days of EF, this wasn't the behavior you got: EF always retrieved all the columns defined on the entity class (in this case, whatever class makes up that Customers collection). EF's gotten smarter since then, and you can take advantage of it.

But, as I said, my CustFirstLastName class is not part of my entity model. If I make changes to the CustFirstLastName object's properties and call my DbContext object's SaveChanges method those changes will not be transferred back to the database. To update the database, I need to make my changes to whatever objects are in the Customers collection in my sample code.
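
To make that distinction concrete, here's a minimal sketch (the CustomerId key and the targetId variable are assumptions for illustration): an update has to be made through the entity object the context is tracking, not through the projection.

```vbnet
'Changes to CustFirstLastName objects are ignored by SaveChanges.
'To update, retrieve and modify the full Customer entity:
Dim cust = (From c In db.Customers
            Where c.CustomerId = targetId
            Select c).First()
cust.FirstName = "Jan"
db.SaveChanges()
```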

Posted by Peter Vogel on 12/02/2014 at 1:51 PM

Retrieving Multiple RecordSets in a Single Trip to the Database

I know that I keep going on about this, but: The best way to speed up your application is to retrieve all the data you need on each trip to the database and make as few trips to your database as you can. One way to do that when retrieving rows is to retrieve multiple sets of rows on each trip.

This means you can reduce trips to the database by returning multiple sets of rows from a single stored procedure with a single call. If you're using ADO.NET, you can also combine multiple Select statements in your Command object's CommandText property (just make sure you put a semicolon between the statements):

Dim cmd As New SqlCommand
cmd.CommandText = "Select * from Customers; Select * from Countries;"

When you call ExecuteReader to get your DataReader, the DataReader will be processing the first set of records returned from your stored procedure or from the first Select statement in your CommandText:

Dim rdr As SqlDataReader
'Processing customers
rdr = cmd.ExecuteReader()

You can process that result set or not -- your choice. When you're ready to process the next set of rows, just call the DataReader's NextResult method. This call moves the DataReader on to the Countries rows that I retrieved:

rdr.NextResult()
'Processing countries

Because the way ADO.NET talks to your backend database varies from one database engine to another (and depends on how much data you're retrieving), I can't guarantee that each NextResult call won't trigger another trip to the database (ideally, all of the data will come down to the client in one trip).

But you're guaranteed that you'll only make one trip to the database when you make the initial request and that's a good thing.
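
Putting the pieces together, the processing pattern looks something like this (the column names here are assumptions for illustration):

```vbnet
While rdr.Read()
    'First result set: rows from Customers
    Dim firstName = rdr("FirstName").ToString()
End While

If rdr.NextResult() Then
    While rdr.Read()
        'Second result set: rows from Countries
        Dim country = rdr("CountryName").ToString()
    End While
End If
rdr.Close()
```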

And, as I mentioned in another tip, if you want to mix some update commands in with your Select statements, you can do that, too -- saving you even more trips. I wouldn't suggest that combining these tips eliminates the need for stored procedures; I would, however, suggest that you only use stored procedures when you need some control logic mixed in with your SQL statements.

Posted by Peter Vogel on 11/20/2014 at 1:51 PM

Passing Exception Information

In the bad old days, when an application threw an exception, we frequently extracted the system-generated message and put it on the screen for the user to read. Often it included information that we'd prefer not to share with the outside world (table names and details of the connection string, for instance).

A better practice is to generate an application-specific message that reveals just what you want. And, unlike most system messages that describe what's wrong, your message could tell the user something useful: what to do to solve the problem. A unique message will also help you identify where things have gone wrong in your application. The right answer is to create your own Exception object with a unique message:

Catch Ex As Exception
  Throw New Exception("Something has gone horribly wrong")
End Try

However, when you're debugging, the information you need to prevent the exception from happening again is in the original exception object.

As some readers pointed out to me in comments to an earlier tip, the right answer is to pass the original exception as the second parameter to the Exception object's constructor. Enhancing my earlier code, the result looks like this:

Catch Ex As Exception
  Throw New Exception("Something has gone horribly wrong", Ex)
End Try

The Exception object you pass as the second parameter will show up in the InnerException property of the Exception object you're creating.
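
When you do catch that wrapped exception later (in a global error handler, for example), the original cause is still available. Here's a sketch, assuming a Logger class of your own:

```vbnet
Catch Ex As Exception
    Logger.Write(Ex.Message)  'Your application-specific message
    If Ex.InnerException IsNot Nothing Then
        Logger.Write(Ex.InnerException.Message)  'The original system message
    End If
End Try
```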

Posted by Peter Vogel on 11/13/2014 at 1:51 PM

Best Tip. Ever: Change Your Visual Studio Development Default

When you installed Visual Studio, you picked a group of default settings, primarily driven by what language you intended to use (C#, Visual Basic, etc.).

The odds are that you'll never need to change those settings ... but it does happen. If, for example, your company changes from Visual Basic to C#, you'll find that all of your C# project templates are buried under Other Project Types.

Here are the steps to change your settings:

  1. From the Visual Studio Tools menu, select Import and Export Settings to open the Import and Export Settings wizard.
  2. In the first panel of the wizard, select the Import selected environment settings option. Click the Next button.
  3. Select the No, just import new settings option and click the Next button.
  4. On the Choose a Collection of Settings to Import page of the wizard, select the settings you want. Click the Next button.
  5. On the final page of the wizard, click the Finish button, the OK button on any warning dialogs, and the Close button on the final page of the wizard.

It's as easy as that.

Posted by Peter Vogel on 11/06/2014 at 1:51 PM

Simplify Your Code with TransactionScope

In an earlier column, I referenced using TransactionScope instead of the ADO.NET Transaction object. The problem with an ADO.NET Transaction object is that it must be associated with each ADO.NET Command object that's involved in the transaction. In addition, all of those Command objects must use the same Connection object. If those Command objects are spread over multiple methods, then you end up having to pass the Transaction object to each method. And, unless you've declared your Connection object globally (not a great idea), you'll also have to pass the Connection to those methods.

For example, you end up writing code like this:

  Dim cn As New SqlConnection("…")
  Dim trans As SqlTransaction = Nothing
  Dim cmdNewClaim As SqlCommand = New SqlCommand("…")

  cmdNewClaim.CommandType = CommandType.StoredProcedure
  cmdNewClaim.Connection = cn
  Try
    cn.Open()
    trans = cn.BeginTransaction()
    cmdNewClaim.Transaction = trans
    cmdNewClaim.ExecuteNonQuery()
    GetRecordOwner(processableInvoices.First.RecordOwner, cn, trans)
    trans.Commit()
  Catch Ex As Exception
    If trans IsNot Nothing Then trans.Rollback()
  End Try

Using TransactionScope simplifies this code tremendously by enclosing every ADO.NET call within a transaction -- even when the call is inside another method. You need to add a reference to System.Transactions and include an Imports/Using statement for System.Transactions. Wrapping the previous code in a TransactionScope object declared in a Using block eliminates some lines of code and simplifies the call to the enclosed method. You don't need to include a rollback call: If the code reaches the end of the Using block without a call to the TransactionScope object's Complete method, your transaction is automatically rolled back:

Using trns As TransactionScope = New TransactionScope
    Dim cn As New SqlConnection("…")
    Dim cmdNewClaim As SqlCommand = New SqlCommand("…")

    cmdNewClaim.CommandType = CommandType.StoredProcedure
    cmdNewClaim.Connection = cn
    cn.Open()
    cmdNewClaim.ExecuteNonQuery()
    GetRecordOwner(processableInvoices.First.RecordOwner, cn)

    trns.Complete()
End Using

As an extra benefit, if the enclosed method uses a different connection, the transaction is automatically promoted to a distributed transaction.

Posted by Peter Vogel on 11/03/2014 at 1:51 PM

Speed Up Apps by Doubling Up on Database Access

The slowest thing you can do in your application is read or write to your hard disk.

The second slowest thing you can do is issue a request to another computer. This means, of course, that whenever you access your database you're doing the two slowest things you can manage. Which means that one of the simplest things you can do to speed up your application is to reduce the number of trips you make to your database, even if you don't make any changes to the amount of data you update or retrieve.

Imagine, for instance, you have this code that first adds a record and then retrieves the number of records present after the insert:

cmd.CommandText = "Insert Into ... ;"
cmd.ExecuteNonQuery()

cmd2.CommandText = "Select Count(*) from ... ;"
Dim res As Integer
res = cmd2.ExecuteScalar()

As it's written, this is going to involve two trips to the database. There's no reason, however, that the two SQL commands can't be combined into a single request, executed by calling ExecuteScalar:

cmd.CommandText = "Insert Into  ...;" & "Select Count(*) from  ...;"
Dim res As Integer
res = cmd.ExecuteScalar()

Part of the problem is that the ADO.NET method names (ExecuteReader, ExecuteScalar and ExecuteNonQuery) suggest there's only one kind of SQL statement you can use with any method. But, in fact, the three method names really reflect what's returned: ExecuteReader returns a DataReader that will let you work through the rows returned by a Select statement, ExecuteScalar returns the first column of the first row returned by a Select, and ExecuteNonQuery returns the number of rows updated.

You're free to pass any kind of SQL statement (or combination of SQL statements) to any of these methods and it will probably work out for you. If you need, for example, to issue some updates and then retrieve the results, then combine your Update/Insert/Delete statements with a Select statement, execute the commands with a call to ExecuteReader and then use the resulting DataReader to process the rows you get back.
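
As a sketch of that last pattern (the table name and Where clauses here are placeholders), combining an Update with a Select and processing the result through ExecuteReader looks like this:

```vbnet
cmd.CommandText = "Update Customers Set Status = 1 Where ... ;" &
                  "Select * From Customers Where ... ;"
Using rdr As SqlDataReader = cmd.ExecuteReader()
    While rdr.Read()
        'Process the rows returned by the Select
    End While
End Using
```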

Posted by Peter Vogel on 10/10/2014 at 1:51 PM

Files and Classes: Organize by Use

The default organization method Visual Studio uses when adding classes is to put each class in a separate file. That doesn't mean you should do the same thing. This is especially true in Visual Studio 2012, which combines Class View with the Solution Explorer standard File View (and adds a search capability on top of that).

Even in earlier versions of Visual Studio, you can always get to a class's code just by clicking on the class name in your code and pressing F12. With those tools in place, it's hard to see much advantage in being able to scroll to the class file in the Solution Explorer file view. Keeping each class in a separate file is an option, not a requirement.

Off the top of my head, I can think of at least two occasions where it makes sense to put multiple classes in the same file. First, when you have a class that's used only by one other class, it's probably easier for everybody if you keep those two classes in a single file. Second, EventArgs classes that are generated in one class and returned from that class's events might be best kept in the same file with the class that generates them.
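
That second case might look like this (the Order and OrderUpdatedEventArgs names are invented for illustration) -- both classes live in a single Order.vb file:

```vbnet
Public Class Order
    'Raised whenever the order changes; the EventArgs class lives below
    Public Event Updated As EventHandler(Of OrderUpdatedEventArgs)

    Public Property Id As Integer

    Protected Sub OnUpdated()
        RaiseEvent Updated(Me, New OrderUpdatedEventArgs With {.OrderId = Me.Id})
    End Sub
End Class

Public Class OrderUpdatedEventArgs
    Inherits EventArgs
    Public Property OrderId As Integer
End Class
```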

I bet some of you can think of other occasions where it makes more sense to put two classes in the same file than it does to put them in separate files. That's OK.

Posted by Peter Vogel on 10/07/2014 at 1:51 PM

NimbleText: An Editor, Only Better

I've admitted it before: Regular expressions defeat me.

NimbleText gives me an editor and an "English-like" way of writing templates that will convert a list of data values into something more useful. Under the hood, it uses regular expressions to identify the data to change but I'm insulated from that. NimbleText isn't a Visual Studio add-in so you have to leave Visual Studio to use it, but even with that limitation NimbleText lets you do wonderful things.

NimbleText is relatively user-friendly: It's well-documented and has menus for selecting and inserting NimbleText keywords into your templates (though many of my templates don't need them). More importantly, NimbleText has a library of snippets to help get you started.

This snippet, for instance, integrates NimbleText keywords and JavaScript to generate C# properties from a list of datatypes and property names:

private $0 <% $1.toCamelCase() %>;

public $0 <% $1.toPascalCase() %> {
  get { return <% $1.toCamelCase() %>; }
  set { <% $1.toCamelCase() %> = value; }
}
I've also used NimbleText to process text files of data (eliminating duplicate rows, for instance). I can't tell you that the learning curve is zero, but it's pretty darn flat.

Posted by Peter Vogel on 10/02/2014 at 1:51 PM

Return Types for Methods that Return Collections

In a tip from a couple of months ago, I suggested that if you have a method or property that returns a collection, then your method should return only one of three interfaces: IList, IQueryable or IEnumerable. (Returning an interface allows you to change the type of the collection used inside your method without breaking the clients that use your method.)

Not surprisingly, I got some interesting feedback from readers. One reader pointed out that if you're going to allow the client to update your collection, Microsoft recommends your method return ICollection. ICollection allows the client to add to the end of your collection, to clear your collection, and to remove specific objects from the collection; ICollection doesn't allow the client to insert or delete items at specific positions in your collection. In other words, ICollection allows the client to update the collection you return only by adding to it or by removing specific objects. I can see how that could be a better choice than IList, which allows the client to arbitrarily insert and remove items by position.

Side note: Interestingly, neither ICollection nor IList supports one of my favorite methods, AddRange, which allows you to add a collection of items to the end of another collection (AddRange is only available on the List class itself). I'm fond enough of AddRange that I've created my own extension method that adds the method to any class that implements IList. I'll switch that extension method's definition over to using ICollection.
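
That extension method is straightforward to write -- here's a sketch of the ICollection version:

```vbnet
Imports System.Runtime.CompilerServices

Public Module CollectionExtensions
    'Adds every item in source to the end of target
    <Extension()>
    Public Sub AddRange(Of T)(target As ICollection(Of T),
                              source As IEnumerable(Of T))
        For Each item In source
            target.Add(item)
        Next
    End Sub
End Module
```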

No one objected to my recommendation for using IQueryable with methods that return Entity Framework results. However, I realized I was being narrow-minded. You should use IQueryable to return the result of any LINQ query where you want to give the client the ability to extend the query by using the result in another LINQ query (this gives the compiler more options in optimizing the final query). That's going to be an Entity Framework result 90 percent of the time, but I should be as inclusive as possible.

However, another reader did suggest that, if you're only going to let the client read your collection, IReadOnlyList is a better choice than IEnumerable. Like IEnumerable, IReadOnlyList lets the client build For…Each loops to process the list; unlike IEnumerable, it also lets the client retrieve items at specific positions. Declaring your method as returning IReadOnlyList also has the advantage of being more obvious than IEnumerable in communicating that your collection is read-only. Unfortunately, IReadOnlyList is only available in the Microsoft .NET Framework 4.5 and newer.

So here's the latest and greatest version of my advice: If your method or property is returning a collection and you want to allow the client to add or remove objects in a controlled way (add only at the end, remove only by object reference), return ICollection; if you want to give the client the ability to work with the items in your collection by position, return IList. If you're returning the results of a LINQ query and want to give the client the ability to use the result in another LINQ query in an efficient way, use IQueryable. For everything else (which just leaves read-only collections) use IReadOnlyList; if IReadOnlyList isn't available, use IEnumerable.
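
Expressed as method signatures, that advice looks something like this (the Customer type and method names are placeholders):

```vbnet
'Client may add objects or remove specific objects:
Public Function GetActiveCustomers() As ICollection(Of Customer)

'Client works with items by position (insert/remove at an index):
Public Function GetCustomersByCity(city As String) As IList(Of Customer)

'Result can be composed into further LINQ queries (e.g. Entity Framework):
Public Function QueryCustomers() As IQueryable(Of Customer)

'Read-only (.NET Framework 4.5 and newer):
Public Function GetCustomerSnapshot() As IReadOnlyList(Of Customer)
```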

Posted by Peter Vogel on 09/23/2014 at 1:51 PM

Set a Breakpoint and Start Debugging in Visual Studio

I didn't know I could do this until a few weeks ago: While still in Edit mode, you can right-click on a line of code and select Run to Cursor. Visual Studio will compile your application (if necessary), start your application in Debug mode and stop on the line you've selected. If your cursor is already on the line where you want to stop, you don't need to touch your mouse -- just press Ctrl+F10 to get the same result. Once Visual Studio stops on your line, you can set more permanent breakpoints by pressing F9.

In retrospect, this was obvious: I've used Run to Cursor while in Debug mode for years. And, to add insult to injury, every time I've right-clicked on a line of code in Edit mode, Run to Cursor has been right there on the shortcut menu. I just never thought to use it until recently.

Posted by Peter Vogel on 09/09/2014 at 1:51 PM

To Sync or Not to Sync

One reader's comments on an article I wrote about Entity Framework's async programming turned into an interesting discussion on the role of asynchronous programming in the modern world (interesting for me, at any rate). Back in the day, I used to tell developers that the surest way to create a bug they'd never be able to track down was to write what we used to call "multi-threaded applications." I gave seminars on designing multi-threaded applications where, perversely, I spent the first five minutes explaining why you shouldn't do multi-threading unless you absolutely had to. And then, of course, I'd go home and write multi-threaded applications for my clients: do as I say, not as I do.

Obviously, multi-core processors and the new async tools in the .NET Framework have changed the environment tremendously. But I discovered during the discussion that I still think of asynchronous programming as something you do when you have specific issues in your application that only asynchronous programming will solve. In my mental model, I would write my code synchronously, see where I had responsiveness issues, and then rewrite those parts of the application to use asynchronous code (which, on occasion, could trigger some architectural changes to the application). Only if I could see the responsiveness problems in advance would I start by writing asynchronous code. But I was always doing it as an alternative to my default mode: writing synchronous code.

The commenters challenged that, effectively saying that (in many cases) asynchronous programming should be the developer's default choice. As one reader pointed out, a blocked thread can take up to a megabyte of memory while it's idling. Integrating async programming can eliminate that loss.

Of course, there is a question of whether you care about that megabyte: for about $10,000, I can get a server with 256 gigabytes of RAM -- that's over a quarter of a million of those megabytes that we were worrying about saving. The fully loaded chargeback for a computer programmer would swallow that ten grand in a couple of days; so if the programmer is spending much "extra" time to save the odd megabyte, it doesn't make fiscal sense.

But here's the point: If the cost of writing the code asynchronously from the start is "trivial" (to quote another reader), shouldn't I be writing asynchronously from the beginning? You wouldn't need to buy the server, and while the incremental cost in developer time of writing async from the start might not be zero, it could easily be negligible.

It's a powerful argument. I don't think I'm there yet (I still see the async tools as making it easier to do asynchronous programming when I need to do it), but I may be coming around. I still worry about those bugs you'll never track down, though. The exception EF raises when executing multiple asynchronous operations on the same context seems to me to be the kind of problem that wouldn't occur during development but would raise its ugly head in production, for example. But I may be worrying unnecessarily.
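
That particular failure mode is easy to sketch: EF allows only one asynchronous operation per context at a time, so each call has to be awaited before the next one starts (ToListAsync and CountAsync are the EF6 method names):

```vbnet
'Safe: each operation completes before the next begins
Dim customers = Await db.Customers.ToListAsync()
Dim total = Await db.Customers.CountAsync()

'Risky: two operations in flight on the same context --
'works under light load, can throw in production:
'Dim t1 = db.Customers.ToListAsync()
'Dim t2 = db.Customers.CountAsync()
```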

Posted by Peter Vogel on 07/08/2014 at 7:29 AM

Return an Interface from Your Methods for Maximum Flexibility

If I'm writing a method that returns a collection, I can, of course, declare my method's return type using a class name, like this:

Public Function GetCustomersByCity(City As String) As List(Of Customer)

But by declaring my return type as a class in this way, I restrict my method to only returning that class (in this example, a List). I might eventually want to rewrite the method to return some other class, but my overly specific return type will prevent me from doing that. A much better practice is to specify an interface name when returning a collection. That allows me to return any class I want, provided I pick a class that implements the interface I choose.

You want to choose an interface that applies to the maximum number of classes (giving you maximum flexibility in deciding what class to use), while also exposing all the functionality that someone using your method will want to use (giving your clients exactly as much flexibility as you want). There's going to be some conflict here because, presumably, the most common interface is going to be the one with the least functionality. Microsoft gives you at least three choices: IQueryable, IList and IEnumerable.

From the point of view of supplying functionality, if you just want to give your users the ability to read the entries (i.e., loop through the collection with a For…Each loop or apply LINQ queries to it), any of these interfaces will do. If you want to give the application that's calling your method the ability to add or remove items from the collection, you'll want to return the IList interface (that does restrict your method to returning classes that support adding and removing items, which means, for example, that you won't be able to return an array from your method).

From the point of view of giving yourself maximum flexibility, IEnumerable is your best choice (both IList and IQueryable inherit from IEnumerable). A quick, non-exhaustive survey suggests to me that IQueryable is your most limiting choice (you can't return a List from a method with a return type of IQueryable). But performance matters also: IQueryable is the right choice for LINQ queries running against Entity Framework, because IEnumerable doesn't support server-side processing or deferred execution the way IQueryable does.

Summing up, my current best advice is: Use IList if your clients need to change the collection; IQueryable if your method is returning an Entity Framework result; IEnumerable for everything else.

I bet I'm going to get comments about this advice …

Posted by Peter Vogel on 06/28/2014 at 4:36 AM
