Newest Viewed Downloaded

Common mistakes developers make in SQL Server“Amateurs work until they get it right. Professionals work until they can't get it wrong.”

Common mistakes developers make in SQL Server

“Amateurs work until they get it right. Professionals work until they can't get it wrong.” The samples in this presentation should not be taken explicitly. Use them as a guideline. Actual numbers and figures may vary between different machines. The quote above is coined by GMC or NFL ad/speaker. ‹#›

Who am I?

Self-employed 6 years, Visual Basic 3.0 to 6.0, Access 1.0 to 95 Long time consultant 14 years, Visual Basic 6.0, SQL Server 4.21 to 2008. Employed 2 years, SQL Server 2000 to 2008R2. MVP – Microsoft Valuable Professional SQL Server since 2009. Active on several forums www.sqlteam.com www.sqlservercentral.com msdn.microsoft.com

Agenda

Faster hardware. Row By Agonizing Row (pronounced Ree-bar) Bad indexing. Index misuse. Triggers. Indeterministic functions. All examples in this presentation is taken from former and current clients of mine. We have now 44 minutes to finish. ‹#›

Faster hardware

The golden bullet You can shoot yourself in the foot. Therefore the ”Bullet” idiom. ‹#›

Faster hardware

You can throw faster hardware at a problem if you want but don't be surprised if it doesn't actually help much. True performance can only be found in the code and the design of the database. Let's stop and think about faster hardware to do the same thing Let's say the CPU speed and disk throughput is doubled. That would mean a 2 hour query might drop to 1 hour. Rewriting the old code could drop the query time to 3 seconds. How much does it cost to buy new hardware, install it and run? Compared to put 2-3 hour to tune the query? ‹#›

Faster hardware

More cores mean more queries can request massive numbers of locks simultaneously and the badly written queries need many locks that may cause performance to deteriorate under faster hardware. SQL Server spends more time waiting and less time working. Would you get any better performance from an inexperienced driver if you gave them the fastest car in the world to drive? Buying better hardware will only help those that actually know how to drive it over the long run. ‹#›

Faster hardware

Inexperienced driver/programmer often think ”First working solution is ok”. But this assumption is based on a small number of records in the development environment and when the code is ported to production server, with maybe 100 or 1,000 times the recordcount, the solution drop through the floor. ‹#›

Faster hardware

What happens when you give the inexperienced driver access to your brand new server? Inexperienced driver may drive ok for the first hundred yards until they hit the same tree. Again. Or same curve. Take your pick. ‹#›

Faster hardware

This is a scenario found at a client in early 2009. Source table has about 20,000 records and Template table has about 15,000 records. As you can see there is a simple one-column join. No index is present, as originally seen. Both queries produce 109,984 records on a single core processor (to the left) and dual core processor (to the right). The query to the left without parallellism (single core) utilize 1,805,000 reads during 217 seconds (216 CPU-seconds). The query to the right with parallellism (dual core) utilize 1,845,000 reads during 113 second (221 CPU-seconds). As expected, the parallell query takes only half the time since workload is spread out between processors. CPU is a little more because the overhead to control parallellism. And of course the about same amount of reads, because it’s still the same amount of records to be read. Due to parallellism, some pages are read twice or more. Let audience vote (raise hand) if this was a good solution or not. ‹#›

Faster hardware

Same query as before, but now with a proper index in place. The query to the left without parallellism (single core) utilize 59,000 reads during 4 seconds (3 CPU-seconds). The query to the right with parallellism (dual core) utilize 59,000 reads during 3 second (4 CPU-seconds). Much, much better! Buying new hardware only brought down the query to 113 seconds. Adding a proper index gave a result of 4 seconds (28 times faster then dual processor)! Now, let the audience vote (raise hand) if is a better solution than buying a second processor. ‹#›

Faster hardware

Rewritten query. I could use the old server as development machine. The rewritten query to left is without index and the query to the right is with proper index. Both queries now only need 120 reads and both runs for 1 second! The query without index uses 0.1 CPU-second and the query with index uses 0.05 CPU-second. Now, let the audience vote (raise hand) if is a better solution than buying a second processor, or adding an index. ‹#›

Faster hardware

What does this tell us? Besided the red bar to the right is after adding a second CPU? At the time, I was really surprised by the very simple 15k table join to 20k table. They added a second CPU and of course the time was cut in half. I was there to fix another problem and they used their old machine as development machine. This example really mark the statement “Performance is where the code is”. Indexing can be used to hide bad queries but neither adding a proper index nor buying faster hardware will ever replace a proper written query. ‹#›

Faster hardware

Query Index Parallellism Reads CPU Duration Old query No No 1,805,000 216.00 217.00 Old query No Yes 1,845,000 221.00 113.00 Old query Yes No 59,000 3.00 4.00 Old query Yes Yes 59,000 4.00 3.00 New query No No 120 0.10 1.00 New query Yes No 120 0.05 1.00 Overview for old query and rewritten query. Parallellism indicates that this is the upgraded machine with two CPU’s. There are no data for parallellism with the rewritten query because parallellism was not needed. Adding an index made the query use 30 times less reads. Rewriting the query made it use 15,000 times less reads. The difference between proer index and rewritten query is 500 times less reads. What does it mean if you see the number of reads like a rent you pay the network administrator to use his/hers disks? What if you had to pay $0.10 for every 1,000 reads you make. In this case the original query made you pay $180.50 every time you run the query! The rewritten query make you pay $0.01 every time you ran the query. You still pay for each CPU-second on mainframes. Let’s say you pay $0.10 per second. Old query fines you $21.60 and new query $0.01. Every time! Adding an index made the query use 70 times less CPU. Rewriting the query made it use 2,000 times less CPU. ‹#›

Row By Agonizing Row

Loops, cursors and triangular joins This is the second hallmark of SQL: it is a declarative language, you tell the computer WHAT you want to do, not how to do it. No need for loops, cursors, IF or WHILE tests. The WHERE clause in each of these statements limits the action to only those rows that meet that condition. ‹#›

Row By Agonizing Row

Phrased by Jeff Moden Good friend and fellow SQL Server MVP from SQLServerCentral. http://www.sqlservercentral.com/articles/T-SQL/61539/ Learn to recognize Triangular Joins They appear in the SELECT list as correlated subqueries usually with a stand-alone inequality of some sort They can also appear in WHERE clauses with the same type of standalone inequality. Not all Triangular Joins are bad With some restraint and the right criteria, Triangular Joins can be used for some pretty remarkable things Make finite schedules Do high speed dupe checks But, you've really got to be careful. Improperly written Triangular Joins are worse than even Cursors or While Loops and can bring a CPU and Disk System right to it's knees. Watch out for "Hidden RBAR". ‹#›

Row By Agonizing Row

Many folks think that set-based programming means one of two things It simply doesn't have a Cursor or a While Loop. It's all done in a single query. Both of those are very bad misconceptions as to what set based programming is Touch each row of data only once if you can and as few times as possible if you can't. A loop Works row-by-agonizing-row. Overrides the natural abilities of the optimizer. Only operates on a single row instead of a set of rows. Most explicit loops are not set based programming and they just crush performance. How much does it cost to buy new hardware, install it and run? As seen in previous chapter, new hardware doesn’t always give better performance compared to rewrite. How much does it cost to put 2-3 hours for tuning the query? ‹#›

Row By Agonizing Row

Cursor-based solution Set-based solution Set based solutions are often 100 – 1,000 times faster. Sometimes even up to 10,000 times faster! You have a cup of coffee (or tea), and you want to put sugar in it. You can use tweezers, or a spoon. Cursors are tweezers. ‹#›

Row By Agonizing Row

SELECT a.RowNum AS RowNum_A, b.RowNum AS RowNum_B FROM TableA AS a, TableB AS b WHERE a.RowNum <= b.RowNum What is set based programming? Obviously, it does involve "avoiding the loop", but it's not that simple. The above example is a Triangular Join. Also known as non-equi-join. Always use Nested Loop Join. ‹#›

Row By Agonizing Row

Good examples from the Phil Factor Speed Phreak competitions http://ask.sqlservercentral.com/questions/92/the-subscription-list-sql-problem Cursor 780 seconds (13 minutes). Set-based 0.3 seconds - 2,500 times faster! http://ask.sqlservercentral.com/questions/826/the-fifo-stock-inventory-sql-problem Cursor 2,400 seconds (40 minutes). Set-based 1.3 seconds – 1,800 times faster! http://ask.sqlservercentral.com/questions/6529/the-ssn-matching-sql-problem Cursor 4,000 – 4,500 seconds (70 - 75 minutes). Set-based 0.5 seconds – 9,000 times faster! SQLCLR 0.4 seconds – 11,000 times faster! All timings are on same hardware for each competition. Goto http://ask.sqlservercentral.com and search for ”Speed-Phreak”. ‹#›

Row By Agonizing Row

Kathi Kellenberger, former SQL Server MVP, provides in-depth descriptions and analyses of the different solutions here http://www.simple-talk.com/sql/performance/writing-efficient-sql-set-based-speed-phreakery/ http://www.simple-talk.com/sql/performance/set-based-speed-phreakery-the-fifo-stock-inventory-sql-problem/ http://www.simple-talk.com/sql/performance/ssn-matching-speed-phreakery/ Kathi’s author profile on Simple Talk http://www.simple-talk.com/author/kathi-kellenberger/ ‹#›

Showing 1 - 20 of 40 items Details

Name: 
Common mistakes
Author: 
Peter Larsson
Company: 
Developer Workshop
Description: 
Common mistakes developers make in SQL Server“Amateurs work until they get it right. Professionals work until they can't get it wrong.”
Tags: 
query | 000 | the | and | row | faster | sql | hardware
Created: 
8/31/2010 6:31:34 PM
Slides: 
40
Views: 
1
Downloads: 
0
Rating: 
0


> Comment



Share this presentation
|

Comments

Share this presentation:

|
Sitemap