Entity Framework: Use with Caution

I've talked about the Entity Framework before and I'm certainly an advocate for it when using it makes sense, but for me it tends to be unsuitable more often than not.  When does it make sense to me?

  1. A large project with a single developer that will be responsible for everything
  2. A small project with 1 - 3 developers
  3. A large project with a team of developers that will develop the application in horizontal slices (i.e. there will be at least one developer dedicated to the data layer)

I think #3 is really the Entity Framework's sweet spot, but outside of that I'd argue that you're better off using something like Massive or Dapper.  Again, this all comes down to using the right tool for the problem at hand and how your development team operates, but that's getting a little off topic.

I recently used the Entity Framework for a project that matches the description of #2 listed above.  I thought it would be a good idea to capture all of the issues that came up using the Entity Framework:

  1. SQL Server 2005 support is fragile unless the EDMX file is generated from a SQL Server 2005 database instance from the start, and for all subsequent updates.  If you don't, you'll find yourself manually updating the EDMX file's XML a lot.
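  2. Explicit use of TransactionScope escalates to a distributed transaction on SQL Server 2005.  Upgrading to Entity Framework 6.0 allowed a workaround using an explicit transaction object (EF6 added Database.BeginTransaction() and Database.UseTransaction() for this kind of scenario).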
  3. UDFs are treated as query language extensions rather than as ordinary callable methods.  To use one in a C# query statement, you have to hand-write a stub in your C# code that maps to the function.
  4. The database needs to have proper Primary and Foreign Keys.  This doesn't sound like an issue, but it is if you need it to work with legacy systems that you don't control.
  5. A true understanding of what's happening (i.e. when queries are run and what those queries are) is not possible without advanced knowledge or tracing.  This means that unless developers really know what's going on, they could be triggering lots of database calls as they dot into related objects (lazy loading).
  6. Views with no defined Primary Key return the same row over and over again.  The correct query gets run, unique records are transferred over the network, but the framework is trying to be "smart" about what data it hands your code... which ends up being the wrong data.
  7. Maintaining a customized EDMX file is a huge pain.  Merging is a nightmare when multiple developers are making concurrent changes.
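To make item 6 a little more concrete, here's a rough sketch of how an identity map keyed on the wrong columns can hand back the same object over and over.  This is a simplified model written in Python just to keep it self-contained; it is not the Entity Framework's actual code, and the Row type, the column names, and the materialize function are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Row:
    order_id: int   # hypothetical column that is NOT unique in the view
    product: str
    quantity: int


def materialize(rows, key_of):
    """Simplified identity map: one object per key, first one seen wins."""
    cache = {}
    result = []
    for row in rows:
        key = key_of(row)
        # A row whose key was already seen is assumed to be the same entity,
        # so the cached object is handed back instead of the new data.
        entity = cache.setdefault(key, row)
        result.append(entity)
    return result


# Three distinct rows come back over the wire from the (hypothetical) view.
rows_from_view = [
    Row(1, "widget", 2),
    Row(1, "gadget", 5),
    Row(1, "gizmo", 1),
]

# Key guessed from a non-unique column: every row collapses to the first one.
wrong = materialize(rows_from_view, key_of=lambda r: r.order_id)

# Key that really is unique for the view: every row survives.
right = materialize(rows_from_view, key_of=lambda r: (r.order_id, r.product))

print([r.product for r in wrong])
print([r.product for r in right])
```

In this toy model the fix is to give the mapper a key that is actually unique for the view, which is roughly what you end up doing by hand in the EDMX file.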

For the average developer the Entity Framework is a black box that magically gets you your data.  This leads to two problems:

  1. When and where database interactions happen is not clear to the developer, which can lead to lots of odd performance problems
  2. If there is a problem with how the "magic" works, tracking it down takes a lot of troubleshooting, and you may discover that there's no good way to fix it, leaving you to work around it
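The first problem is easier to see with a query counter.  The sketch below is a toy model of lazy loading written in Python, not the Entity Framework itself; FakeDatabase, Customer, and the SQL strings are hypothetical stand-ins that exist only to count round trips.

```python
class Customer:
    def __init__(self, db, customer_id, orders=None):
        self._db = db
        self.id = customer_id
        self._orders = orders  # None means "not loaded yet"

    @property
    def orders(self):
        # Lazy loading: the first access quietly triggers a database call.
        if self._orders is None:
            self._orders = self._db.orders_for(self.id)
        return self._orders


class FakeDatabase:
    """Stand-in for a real database that records every query it receives."""

    def __init__(self):
        self.queries = []
        self._orders = {1: ["widget", "gadget"], 2: ["gizmo"], 3: []}

    def customers(self):
        self.queries.append("SELECT * FROM Customers")
        return [Customer(self, cid) for cid in self._orders]

    def orders_for(self, customer_id):
        self.queries.append(
            f"SELECT * FROM Orders WHERE CustomerId = {customer_id}"
        )
        return list(self._orders[customer_id])

    def customers_with_orders(self):
        # Eager loading: one joined query brings the orders along up front.
        self.queries.append("SELECT ... FROM Customers LEFT JOIN Orders ...")
        return [Customer(self, cid, list(o)) for cid, o in self._orders.items()]


# Innocent-looking code: just "dotting into" each customer's orders.
db = FakeDatabase()
total = sum(len(c.orders) for c in db.customers())
lazy_query_count = len(db.queries)  # one query for the list, one per customer

# Same result with the related data loaded up front.
eager_db = FakeDatabase()
eager_total = sum(len(c.orders) for c in eager_db.customers_with_orders())
eager_query_count = len(eager_db.queries)

print(lazy_query_count, eager_query_count)
```

Nothing in the first loop looks like a database call, which is exactly the issue: the number of round trips grows with the number of rows, and you only find out by tracing.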

When working on a team of full-stack developers who are also the database experts, the hoops that it forces them to jump through in order to do something that should be simple just don't make sense.  I tend to favor simple tools that are very clear about how they operate; this results in far fewer surprises midway through a project.
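As a rough illustration of what I mean by "clear about how they operate", here's a minimal sketch of the micro-ORM style: you write the SQL, and a small helper maps the rows onto plain objects.  It uses Python's built-in sqlite3 module so that it runs on its own; it is only an analogy for the approach and is not Dapper's or Massive's API.  The products table, the Product class, and the query helper are assumptions made for the example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Product:
    id: int
    name: str
    price: float


def query(conn, cls, sql, params=()):
    """Run one explicit SQL statement and map each row onto cls by column name."""
    cursor = conn.execute(sql, params)
    columns = [d[0] for d in cursor.description]
    return [cls(**dict(zip(columns, row))) for row in cursor.fetchall()]


# In-memory database with a hypothetical table, just for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO products (name, price) VALUES (?, ?)",
    [("widget", 2.5), ("gadget", 10.0), ("gizmo", 7.25)],
)

# The database call is right here on the page: one statement, one round trip.
cheap = query(
    conn,
    Product,
    "SELECT id, name, price FROM products WHERE price < ? ORDER BY id",
    (8.0,),
)

print([p.name for p in cheap])
```

The trade-off is that you give up change tracking and generated queries, but when and where the database gets hit is never a mystery.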

Your Data Layer Matters!

Of course it matters.  What you use for your data layer is a fundamental decision that will affect a good chunk of your application and the development workflow that your team will need to follow.  This is one of those foundational elements that I don't think developers give enough thought to.  I've come across too many developers who seem to have a "favorite" data layer and subsequently force it into every project that they work on.  I'm looking at you, Entity Framework lovers!

Okay, that's unfair.  It's not just the Entity Framework, and frankly it gets a bad rap, but that's because it gets misused a lot.  However, since I brought it up I'll continue to pick on it, but I could be picking on any data layer or technology in general.

The Entity Framework has some very specific use-cases that it's very good at.  One of those is getting an application up and running from scratch with very little effort.  Since it's so easy to get up and running at the start of development, and it works really well for small projects (i.e. most developers' proof-of-concept projects), I see a lot of developers commit to this technology without fully considering the greater implications of doing so.

The Entity Framework (or a similar heavyweight ORM) can actually become quite a burden as a project grows and multiple developers are making changes to the data layer in parallel.  It's great if you develop your application in horizontal slices (i.e. one developer is responsible for the entire data layer), but if you develop your application in vertical slices (i.e. one developer is responsible for a feature end-to-end) then working with a heavyweight ORM just becomes a tax on your developers' productivity.  The developers either need to identify overlap and coordinate all of their changes upfront, or deal with the nightmare that is merging EDMX files after the fact.

If you're about to start a project where you have multiple developers working concurrently and each developer is responsible for a full feature (vertical slice of your application), then a heavyweight ORM is likely going to be more trouble than it's worth.  I would urge you to consider a lightweight or dynamic data layer like Massive or Dapper.

I don't think anything I'm saying here is revolutionary or anything you haven't heard before.  This is really about choosing the right tool to solve the problem at hand.  I just wanted to remind you that it's not just the problem that needs to be considered; you also need to consider what the division of labor will be among your development team.  Unfortunately, that means you can't blindly use your favorite technology on every project; pick the tool that fits both the problem and the way your team works.