LINQ to SQL vs. DBA’s
Why is it that in Oracle world, it’s much more custom to have a database administrator (DBA) on your project than when you’re working with SQL-Server? I believe it brings great value to a project to have a dedicated experienced DBA working in the team. And I think part of the problem is the fact that Oracle is much more complex to manage, whereas SQL-Server figures out a lot by itself. That’s of course a lot of power coming from SQL-Server, but it has its negatives that most developers don’t think they require a DBA that often.
I’m writing this because since LINQ and specifically LINQ to SQL has been released, I’ve had some discussions on wether the query engine of LINQ to SQL is smart enough to create solid and performing queries. There are a lot of things to say here, because when is a query fast enough? This also depends on the requirements of your architecture. Otherwise it’s just a gut feeling of when it’s not fast enough anymore.
But someone stated recently that
a LINQ to SQL query can never be as fast as a stored procedure written by a DBA
LINQ to SQL could never optimize your query because it doesn’t know enough about the database, whereas a DBA does. The conclusion was that you should stick to regular ADO.NET. Of course this is nonsense. For multiple reasons
- Dynamic (aka ad-hoc) queries are as fast as stored procedures
There’s been a great debate in the past about this, check out the this article by Rob Howard, its comments and the response from Frans Bouma. I won’t go into details there anymore about what is fast and/or more secure, but the result isn’t too friendly towards Stored Procedures. It’s all about execution path, which is cached for both types of queries.
Personally I don’t really like Stored Procedures because a lot of people put way too much logic in these. Never in my life have I seen business logic or conditional statements in dynamic (aka ad-hoc) queries. And I also hate editing them, but that’s a personal thing. - LINQ to SQL must be optimized, as would a DBA do with his queries
It’s really obvious, but some seem to forget this. Although LINQ to SQL seems very smart to me, you should know what’s happening where and when and how to optimize this. Charlie Calvert posted a nice article on deferred execution in LINQ to SQL. When you know how this works, a lot of the examples are obvious. Until he reaches a point where he gives an example on displaying the row-count three times in a row. When you don’t pay attention, the query is executed three times! If you’re more likely to make this mistake, you’re probably better of using your own T-SQL queries. - LINQ to SQL supports Stored Procedures
During a presentation I gave, I once got the question if we could still use the power of stored procedures with LINQ to SQL. I responded that we could indeed use stored procedures, but would rather leave ‘the power of’ out of the sentence. You can use Stored Procedures, forget about the query engine and only use the feature of mapping relational data onto objects in LINQ to SQL.Why I’d rather leave out ‘the power of’ is first because of non existing performance differences, as explained in bullet 1 in this article, and because of the following reason. Imagine you have a database called “Northwind” with a table filled with Customers who all live in a certain city. Image we’d have a Stored Procedure called CustomersByCity that wants a city as parameter and returns all columns for all customers living in that city. We can execute this in LINQ to SQL like this:
var query = db.CustomersByCity(“London”);
Creating a smaller view would look like this:
var query = from c in db.CustomersByCity(“London”)
select new { c.ContactName, c.City };
And adding another where clause to filter out some more customers:
var query = from c in db.CustomersByCity(“London”)
where c.ContactName.Contains(‘A’)
select new { c.ContactName, c.City };
The truth is that our ‘powerful’ Stored Procedure is returning the same results in every single example. In the second code example, the new view is created from a result where still all columns are returned. And in the third example a new LINQ to Objects query is executed over the results coming from the Stored Procedure. The above examples are very likely to happen when you only use Stored Procedures and most developers won’t know what really happened. In a normal LINQ to SQL query, this would’ve been optimized into a smaller query.
- Not many projects have (dedicated) DBA’s
I’ve done my share of projects, for both large and small companies, customers and projects. At the ones that I was lucky enough to have a DBA available, we wouldn’t likely have database performance issues. But unfortunately most projects I’ve been on did not have the luxury of a DBA. On some of those, I’ve seen developers create chaos with T-SQL. For those projects, simple LINQ to SQL queries are probably much better than some of the stuff that developers can create.
I hope I’ve explained why it’s not logical to easily ditch LINQ to SQL in favor of the regular ADO.NET and Stored Procedures, simply because of performance or other issues. I’m not saying you should use LINQ to SQL everywhere and I’m not saying it’s the silver bullet. I’m also not saying you can never write more performing queries in Stored Procedures than LINQ to SQL can. When you’re working with and querying large sets of data, you’re probably better of using Stored Procedures instead of retrieving thousands of rows into your application. But not using it at all because of the wrong reasons and not giving it a try without some investigation or proof of concept, might be even worse. It can save you a serious amount of time in your development.
I don’t buy the whole ad-hoc queries are just as fast as stored procs, especially complex ones. From my understanding is is all about caching/compiling execution plans, which you again from my understanding you can’t do with ad hoc queries. As far as being against logic in your data tier, that is you choice. I think in heavy data centric environments you almost have to put some logic into stored procs for various reasons. One being the database might be more available for changes then the code base. Two with really complex normalized data models good luck trying to put all that logic in code rather then in a stored proc. Often times you will end up making six or seven trips to a database so you can accomplish what could be done with one trip in a stored proc.
My two cents. I don’t claim to be either a code nor SQL guru. I think as far as a development choice…more LINQ and less Stored Procs, or vice versa, is really a matter of choice for most developers. Let’s face it, most of us aren’t pushing the limits of acceptable performance for most projects anyway. Its always a best practice to code like we are, but sometimes cost/time/comfort level/knowledge plays into the equation as well.
@infocyde : about compiling and caching execution plans, read SQL Books Online and you’ll know. It’s there somewhere, I don’t feel like looking it up 🙂
Your DBMS is better in working with large sets of data. If that’s what you need to do, keep it in the database. Don’t get many MB’s of data into code to enumerate over the results.
I put 99.9% of all logic in my code, simply because I can. Only for large dataset optimizations I turn to the database.
Of course its still a choice. But since O/RM and/or Linq-to-SQL, I don’t see any business logic anymore in my database. Now even if inline queries were slower (which they aren’t) I’d sacrafice that to having all business logic in code.
One other element you aren’t factoring into your test is the client (meaning web server) processing hit to handle business logic that might be done faster in a stored proc. I see that missed a lot in test. I also see a lot of speed test run with really simple data manipulations, I wish I could see more complex test posted. Plus for some operations if you are doing things in code you will have to make multiple trips to the db, which could degrade performance signifigantly.
I know some execution plans might be faster if generated on the fly rather then cached depending upon various parameters. For complex operations I would bet cached execution plans would almost always be faster, but I guess I could be wrong.
Anyway, if your apps do what they should do and the users of your apps are happy with them (and it sounds like they are), you are doing a great job and no one has the real right to question your design choices, so I will shut up. Success is my measurement of what works best and what doesn’t.
#4 should be first, and it should be described as, IF your shop doesn’t have a DBA, then continue. otherwise, setup a meeting.
I am a DBA + .Net developer. I got certified in SQL 2005 and .Net 1.1. I can tell you that the stored procedure offers much more functions than LINQ
1. Use of Lock Hint. A common example will be inserting an unique email into the login table. Most people do the following:
IF NOT EXISTS(SELECT * FROM Login WHERE EMAIL = @email)
INSERT INTO LOGIN(email) SELECT @email
the correct way to do is
BEGIN TRAN
SELECT * FROM Login WITH (UPDLOCK) WHERE EMAIL = @email
IF @@ROWCOUNT=0
INSERT INTO LOGIN(email) SELECT @email
COMMIT
The reason is to stop duplicate email to put into Login table. There are many other scenarios where LOCK hint is the ONLY solution.
2. Prevent of deadlock in stored procedure. Writing deadlock free query requires the queries to be runnning in “specific order”
3. Use of recursive SQL with CTE
4. Use of SQL 2008 new data type
5. You can assign a query plan to a specific query or specify query hint. It can solve the Parameter sniffing issue
6 You want to do a SELECT sql using ORDER BY COLLATE the column by different collation in sql. This important especially working with 2 different db with 2 different collation. Especially in different language, the sorting order is different
uggg..SQL code in a .NET program? So everything I want to change the logic in the SQL, I need to redeploy the app, meaning that I need to schedule downtime? uggggg. Maybe LINQ can be useful in small projects, but then maybe its just best to use Access.
@Bob : You’re Bob and you’re a DBA’r? That’s too good to be true! 😉
But anyway, I’ve never seen a system that could change logic inside an application by changing its T-SQL (either SProc, Views or whatever). If your application does, it’s a bad practice and I can guarantee that it’s WAY less maintainable than my applications are.
I’m with Bob on this one.
Stored procs expose an interface to the application, and as long as the dba maintains that interface, you can change anything you like behind it without having to redeploy the app. That’s been immensely useful in my experience.
It’s been a while since I wrote this, but I suddenly remember a quote.
“we want the database to be an essentially reliable data store and keep as much stuff as possible in logic so that we have more control over it. The same reason why we don’t use stored procedures: anything that happens in the database is invisible to developers”
By Dan Pritchett on architecture at eBay.