
Performance Tip: Find Your Most Expensive Queries

September 13, 2012 12:33 PM



One thing I'm asked on a regular basis when working with SQL Server Consulting clients that
don't have much experience with performance tuning is how to find what their most expensive
queries are. The problem is that the answer to that question can be a bit more complex than
you might think. Happily, though, I've got a pretty decent answer to that question. But first, some
background.
See also, "Estimating Query Costs" and "Understanding Query Plans."
SQL Server Uses a Cost-Based Optimizer
While I don't have time to cover it in any detail here, one of the things that makes SQL Server
special is that it boasts a cost-based optimizer. That means that if you feed it a complex query, it'll
iterate over a number of different potential ways to execute your query against the storage engine
in order to find a plan with the least expensive cost. And to figure out cost, SQL
Server ends up having to know a decent amount about your data, such as how many rows are in
each table queried, the distribution of unique values within those rows (or the columns
being filtered against), and how likely it is that hits will be found for
the joins or filters being specified. (Or in other words, if you fire off a SELECT * FROM
dbo.SomeTable WHERE blah = 'yadayada'; statement, SQL Server has (or will have) a pretty
good idea of not only how many rows are in dbo.SomeTable, but it'll also have a rough idea of
how many of them have a blah column with a value equal to 'yadayada'. For more information, I
highly recommend taking a peek at this fantastic MCM presentation on Statistics from
Kimberly L. Tripp of SQLskills.)
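As a rough illustration of the statistics the optimizer leans on, the sketch below lists the statistics objects for user tables in the current database, along with row counts and the number of modifications since each was last updated. It is not part of the original article, and it assumes a build recent enough to expose sys.dm_db_stats_properties (older builds can fall back to DBCC SHOW_STATISTICS).

```sql
-- Sketch: what SQL Server currently "knows" about the data in user tables.
-- Assumes sys.dm_db_stats_properties is available on this build.
SELECT
    OBJECT_SCHEMA_NAME(s.object_id) AS SchemaName,
    OBJECT_NAME(s.object_id)        AS TableName,
    s.name                          AS StatisticsName,
    sp.last_updated,                -- when the statistics were last refreshed
    sp.rows,                        -- rows in the table at that time
    sp.rows_sampled,                -- rows actually sampled for the histogram
    sp.modification_counter         -- changes to the leading column since then
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
ORDER BY sp.rows DESC;
```

A large modification_counter relative to rows is a hint that the optimizer may be estimating costs from stale information.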
Long story short, though: as powerful and amazing as SQL Server's cost-based optimizer is (and,
make no mistake, it's part of SQL Server's secret recipe), one of the great things about SQL
Server is that we can actually view the costs associated with particular operations. To do so,
either highlight the query in question within SQL Server Management Studio and press
CTRL+L to have SQL Server generate (or fetch from the cache) and then Display
[the] Estimated Execution Plan for any query or operation, or execute the query with the
Include Actual Execution Plan option toggled as shown in the following screen capture:
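The same plans can also be requested from a query window with SET options rather than the toolbar. A minimal sketch, using a query against sys.objects purely as a stand-in for your own query (GO is the batch separator understood by Management Studio and sqlcmd):

```sql
-- Estimated plan: the statement is compiled but NOT executed.
-- SET SHOWPLAN_XML must be the only statement in its batch.
SET SHOWPLAN_XML ON;
GO
SELECT name, create_date FROM sys.objects WHERE type = 'U';
GO
SET SHOWPLAN_XML OFF;
GO

-- Actual plan: the statement runs, and the plan comes back
-- as an additional XML result set after the query results.
SET STATISTICS XML ON;
GO
SELECT name, create_date FROM sys.objects WHERE type = 'U';
GO
SET STATISTICS XML OFF;
GO
```

In both cases the StatementSubTreeCost attribute in the returned XML carries the overall estimated cost for the statement.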

A Very Rough Overview of Costs
Then, once you're able to view an execution plan, one of the great things about it is that you're
able to see the cost of not only the entire execution plan, but of each individual operation that makes
up the plan, simply by mousing over it as shown below:

And, again, a key thing to call out here is that these costs (estimated or otherwise) are based on
SQL Server's knowledge of the size of your tables as well as the cardinality and distribution of
your data. Or, in other words, these costs are based upon statistics about your data. They're not,
therefore, something tangible like the number of milliseconds associated with an operation. As
such, the best way to think of them is that lower numbers are better, unless you want to try and
get into some of the nitty-gritty details about how these costs are calculated (which, again, is
proprietary information, or part of SQL Server's secret sauce).
With that said, there's still a way to frame these costs to provide an idea of what they roughly
mean in the real world.
.003. Costs of .003 are about as optimized as you're going to get when interacting with the
storage engine (executing some functions or operations can/will come in at cheaper costs, but I'm
talking here about full-blown data-retrieval operations).
.03. Obviously, costs of .03 are a full order of magnitude greater than something with a cost of
.003, but even these queries are typically going to be VERY efficient and quick, executing in
less than a second in the vast majority of cases.
1. Queries with a cost of 1 aren't exactly ugly or painful (necessarily) and will typically take a
second or less to execute. They're not burning up lots of resources, but they're also typically not
as optimized as they could be (or they are optimized, but they're pulling back huge amounts of
data or filtering against very large tables).
5. Queries with a cost greater than 5 can, by default, be executed with a parallel plan, meaning
that SQL Server sees these queries as being large enough to throw multiple
processors/cores/threads-of-execution at in order to speed up execution. And if you've got a
web site that's firing off a query with a cost of 5 or more on every page load, for example, you'll
probably notice that the page feels a bit sluggish loading, maybe by a second or two, as
compared to a page that would spring up if it was running a query with a cost of, say, .2 or
lower. So, in other words, queries up in this range start having a noticeable or appreciable cost.
20. Queries in this range are TYPICALLY going to be something you can notice taking a second
or so. (Though, on decent hardware, they can still end up being instantaneous as well, so even at
this point, things still depend on a lot of factors.)
200. Queries with this kind of cost should really only be for larger reports and infrequently
executed operations. Or, they might be serious candidates for additional tuning and
tweaking (in terms of code and/or indexes).
1000. Queries up in this range are what DBAs start to lovingly call 'queries from hell', though
it's possible to bump into queries with costs in the tens of thousands or even more, depending
upon the operations being executed and the amount of data being pored over.
And, in case it's not obvious from some of the descriptions above, the thresholds I've outlined
REALLY need to be taken with a grain of salt, meaning that they're just rough
approximations to try and give these numbers a bit of context (for those who aren't very
experienced with performance tuning in general).
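The cutoff of 5 in the list above corresponds to the server-level 'cost threshold for parallelism' setting, which defaults to 5 and can be changed per server. A quick sketch for checking what a given server is actually using:

```sql
-- Sketch: show the parallelism-related settings in effect on this instance.
SELECT
    name,
    value,          -- configured value
    value_in_use,   -- value currently in effect
    description
FROM sys.configurations
WHERE name IN (N'cost threshold for parallelism',
               N'max degree of parallelism');
```

If value_in_use for the threshold is something other than 5, then the point at which queries become candidates for a parallel plan shifts accordingly.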
The True Cost of Operations
However, while taking a single query and comparing its cost in isolation is a great way to tune
that operation to get better performance out of it (i.e., by adding/tuning indexes and/or tuning the
code to make it better; your GOAL is to decrease costs, since LOWER costs are BETTER costs),
it isn't a viable way to know what your most expensive query on a given server or within a given
database is.
For example, which query is truly an uglier query from a performance standpoint? That
big/horrible/ugly report that management likes to run once a day at 7 PM with a total cost of
1820.5? Or a single operation with a cost of .23 that gets called over 800,000 times in the same
day? Typically a query with a cost of .23 won't really be scrutinized that much because it's
'optimized enough'. But if it's called at highly repetitive rates, then that cost is incurred over and
over and over again, typically during periods of peak load on your server. And, in fact, if you
multiply .23 * 800K, you end up with a total, aggregate cost of 184,000, something that makes
that daily report look like child's play.
As such, finding your most expensive queries is really a question of juggling execution costs
against execution counts, because it's only when you consider both concerns that you start
getting a sense for the TRUE costs of your most expensive operations.
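The arithmetic behind that comparison can be sketched directly in T-SQL. The two rows below are just the illustrative figures from the example above, not measurements from a real server:

```sql
-- Sketch: gross cost = cost per execution * number of executions.
SELECT
    v.Label,
    v.Cost,
    v.Executions,
    v.Cost * v.Executions AS GrossCost
FROM (VALUES
        (N'Daily management report',        1820.5, 1),
        (N'Small, frequently called query', 0.23,   800000)
     ) AS v (Label, Cost, Executions)
ORDER BY GrossCost DESC;
```

The small query sorts to the top with a gross cost of 184,000, roughly a hundred times the 1,820.5 of the report.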
Querying SQL Server For Your Most Expensive Queries
Accordingly, a while back I wrote a query that takes advantage of a couple of things to be able to go
in and actually ask SQL Server for a list of 'Top Worst Performing' queries. To do this, I took
advantage of the fact that SQL Server's query optimizer KEEPS execution plans in the cache
once it generates a good plan for a query or operation that's been sent in to the server. I also took
advantage of the fact that SQL Server keeps track of how many TIMES that execution plan gets
used (or re-used) by subsequent executions as a means of determining execution counts. Then, I
also took advantage of the fact that SQL Server exposes these execution plans to DBAs as XML
documents that can be parsed and reviewed, to the point where you can actually use
XPath to interrogate an execution plan and extract the actual cost of a given operation.
And with that, I was able to come up with a query that will scour SQL Server's plan cache, grab
execution costs from each plan in the cache, and then multiply that number by the total
number of times that the plan has been executed to generate a value that I call a 'Gross Cost', or
the total cost of each operation being fired over and over again on the server. And with that
information, it's then possible to easily rank operations by their 'true' cost (as in execution
cost * number of executions) to find some of the worst queries on your server.
The code itself isn't too hard to follow, and it's patterned in many ways on some similar-ish
queries that Jonathan Kehayias has made available, where he too uses XPath/XQuery to zip
in and aggressively query full-blown execution plans within the plan cache:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

WITH XMLNAMESPACES (DEFAULT
    'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
core AS (
    SELECT
        eqp.query_plan AS [QueryPlan],
        ecp.plan_handle AS [PlanHandle],
        q.[text] AS [Statement],
        -- Optimization level and estimated subtree cost of each statement
        n.value('(@StatementOptmLevel)[1]', 'VARCHAR(25)')
            AS OptimizationLevel,
        ISNULL(CAST(n.value('(@StatementSubTreeCost)[1]',
            'VARCHAR(128)') AS float), 0) AS SubTreeCost,
        -- Number of times the cached plan has been used
        ecp.usecounts AS [UseCounts],
        ecp.size_in_bytes AS [SizeInBytes]
    FROM
        sys.dm_exec_cached_plans AS ecp
        CROSS APPLY sys.dm_exec_query_plan(ecp.plan_handle)
            AS eqp
        CROSS APPLY sys.dm_exec_sql_text(ecp.plan_handle)
            AS q
        -- One row per simple statement found in each cached plan
        CROSS APPLY eqp.query_plan.nodes
            ('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS qn
            ( n )
)

SELECT TOP 100
    QueryPlan,
    PlanHandle,
    [Statement],
    OptimizationLevel,
    SubTreeCost,
    UseCounts,
    SubTreeCost * UseCounts AS [GrossCost],
    SizeInBytes
FROM
    core
ORDER BY
    GrossCost DESC
    --SubTreeCost DESC
Limitations of this Approach
Of course, there ARE some limitations with the query I've pasted above.
First, it's an AGGRESSIVE query, typically weighing in with one of the WORST costs on
many servers (i.e., it commonly shows up as a top 10 offender on servers that haven't been
running for very long or which don't have lots and lots of really huge performance problems).
And that, in turn, is because while SQL Server can do XML operations, they typically end up
being VERY expensive. And, in this case, this query performs expensive XQuery iterations
over each and every plan in the cache, meaning that it typically takes a LONG time to run.
However, despite how LONG this query will typically take to run (remember that it has to
grab the raw cost of every plan in the cache before it can calculate a gross cost based on total
number of executions, meaning that there's no way to filter out any particular plans), it won't
block or cause problems while it runs.
Another limitation of this approach is that it can only calculate gross costs against accurate
execution counts, meaning that if you have expensive queries (or lots of tiny little queries called
over and over and over again) that get kicked out of the cache, then the execution counts aren't
going to be as high as they really/truly are, and you'll therefore see lower gross costs
and may, therefore, miss some of your worst performing queries.
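One rough way to judge how much history the cache is holding (and therefore how far to trust the execution counts) is to compare the instance start time with the age of the oldest cached statement. This is a sketch rather than part of the original query, and it uses sys.dm_exec_query_stats instead of the plan XML, so it is cheap to run:

```sql
-- Sketch: how far back does the plan cache reach on this instance?
SELECT
    (SELECT sqlserver_start_time FROM sys.dm_os_sys_info) AS ServerStartTime,
    MIN(qs.creation_time) AS OldestCachedPlanCompiled,
    COUNT(*)              AS CachedStatementCount
FROM sys.dm_exec_query_stats AS qs;
```

If the oldest cached plan is much younger than the instance itself, plans are being evicted or recompiled, and the gross costs above will understate the real totals.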

But otherwise, the query listed above provides a fantastic way to quickly and easily
query a SQL Server for a list of your worst performing queries. Then, once you have
them, you're free to analyze the execution plans in question (by simply clicking on the QueryPlan
column in the results pane, which should kick you out to a Graphical ShowPlan; if it doesn't, I'd
recommend this post by Aaron Bertrand for a fix for a stupid bug that Microsoft refuses to
address), and then, armed with information about how FREQUENTLY a particular operation is
being called, you can spend whatever energy and effort is necessary to tune that
operation as needed in order to drive its total, overall (or gross) cost down.


--END OF CURRENT ARTICLE

SQL SERVER Find Most Expensive Queries Using DMV
May 14, 2010 by pinaldave
The title says most of what there is to say in this quick blog post. During a recent
query tuning consultation project, I was asked if I could share the script I use to figure out which are the
most expensive queries running on SQL Server. This script is very basic and very simple, and there
are many different versions available online. This basic script does the job I expect it to
do: find the most expensive queries on a SQL Server box.
SELECT TOP 10
    -- Offsets are byte positions in the Unicode batch text, hence the /2
    SUBSTRING(qt.text, (qs.statement_start_offset/2)+1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(qt.text)
            ELSE qs.statement_end_offset
          END - qs.statement_start_offset)/2)+1) AS statement_text,
    qs.execution_count,
    qs.total_logical_reads, qs.last_logical_reads,
    qs.total_logical_writes, qs.last_logical_writes,
    qs.total_worker_time,
    qs.last_worker_time,
    -- Elapsed times are reported in microseconds
    qs.total_elapsed_time/1000000 total_elapsed_time_in_S,
    qs.last_elapsed_time/1000000 last_elapsed_time_in_S,
    qs.last_execution_time,
    qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
ORDER BY qs.total_logical_reads DESC -- logical reads
-- ORDER BY qs.total_logical_writes DESC -- logical writes
-- ORDER BY qs.total_worker_time DESC -- CPU time
You can change the ORDER BY clause to order the results by different metrics. I invite my
readers to share their scripts.
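As one example of such a variation (a sketch, not the author's script), the ordering can be switched from totals to per-execution averages, which surfaces queries that are individually heavy even if they run rarely. The column aliases here are illustrative only:

```sql
-- Sketch: rank cached statements by average CPU per execution.
-- total_worker_time is reported in microseconds.
SELECT TOP 10
    qs.execution_count,
    qs.total_worker_time,
    qs.total_worker_time   / NULLIF(qs.execution_count, 0) AS avg_worker_time,
    qs.total_logical_reads / NULLIF(qs.execution_count, 0) AS avg_logical_reads,
    qt.text AS batch_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
ORDER BY avg_worker_time DESC;
```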




Script to Identify worst Performing Queries
Posted by Mahesh Gupta on September 3, 2011
To optimize performance, a DBA needs to know:
- What are the most frequently executed queries on the system?
- Which queries or statements keep the system really busy?
- What are the worst-performing queries?
- How much I/O is caused by a particular query?
- How much CPU time does it take to execute a particular query?
- How frequently are these worst-performing queries executed?
We can find most of this information in the DMV sys.dm_exec_query_stats, where we can rate SQL
statements by their costs.
These costs can be
- AvgCPUTimeMiS = Average CPU execution time
- AvgLogicalIO = Average logical operations
or the total values of these measures.
/*-------------------------------------------------------------------------------
-- Description  : Lists the most expensive cached queries, ranked by average
--                logical I/O and average CPU time per execution
-- Copyright 2011 - DBATAG
--
-- Author       : DBATAG
-- Created on   : 09/01/2011
-- Modified on  : 09/01/2011
-- Version      : 1.0
-- Dependencies :
--    Table            Procedure        Permissions
--    No Dependencies  No Dependencies  View Server State Permissions Required
-------------------------------------------------------------------------------*/
-- List expensive queries
DECLARE @MinExecutions int;
SET @MinExecutions = 5;

SELECT EQS.total_worker_time AS TotalWorkerTime
      ,EQS.total_logical_reads + EQS.total_logical_writes AS TotalLogicalIO
      ,EQS.execution_count AS ExeCnt
      ,EQS.last_execution_time AS LastUsage
      ,EQS.total_worker_time / EQS.execution_count AS AvgCPUTimeMiS
      ,(EQS.total_logical_reads + EQS.total_logical_writes) / EQS.execution_count
       AS AvgLogicalIO
      ,DB.name AS DatabaseName
      -- Offsets are byte positions in the Unicode batch text, hence the / 2
      ,SUBSTRING(EST.text
                ,1 + EQS.statement_start_offset / 2
                ,(CASE WHEN EQS.statement_end_offset = -1
                       THEN LEN(convert(nvarchar(max), EST.text)) * 2
                       ELSE EQS.statement_end_offset END
                  - EQS.statement_start_offset) / 2
                ) AS SqlStatement
      -- Optional with Query plan; remove comment to show, but then the query takes
      -- !!much longer time!!
      --,EQP.[query_plan] AS [QueryPlan]
FROM sys.dm_exec_query_stats AS EQS
     CROSS APPLY sys.dm_exec_sql_text(EQS.sql_handle) AS EST
     CROSS APPLY sys.dm_exec_query_plan(EQS.plan_handle) AS EQP
     LEFT JOIN sys.databases AS DB
         ON EST.dbid = DB.database_id
WHERE EQS.execution_count > @MinExecutions
      -- Only statements executed within the last month (DATEADD, not DATEDIFF,
      -- is needed to get a date one month back)
      AND EQS.last_execution_time > DATEADD(MONTH, -1, GETDATE())
ORDER BY AvgLogicalIO DESC
        ,AvgCPUTimeMiS DESC


OUTPUT Screenshot










Finding Most expensive queries in SQL server
SQL Server provides a number of DMVs which can be used to find the resources
consumed by different queries. This is a very useful feature, especially when you would like
to find the queries which need to be tuned. It can be quite useful for DBAs who
are proactive in finding performance-related issues.

I will use the
sys.dm_exec_query_stats, sys.dm_exec_query_plan, and sys.dm_exec_sql_text DMVs to
find the queries which are performing badly on your systems.

Usage of query:

Based on this you can find the top 5, 10, or 20 resource-consuming queries, such as the top 20
queries by logical reads, CPU time, or physical reads.

1. If your system is CPU starved, try the ranking based on CPU time.
2. If you are more concerned with elapsed time, try the ranking based on elapsed
time.
3. If your system has I/O-related issues, try the ranking based on physical and
logical reads.

Now, how do you decide which one to use, total or average? It depends on various
factors. Most of the time you should use the total values, as these will give you a clear
picture of which queries are either performing badly or being executed a huge
number of times. In both cases this metric is better than the average one.

However, there might be queries which use WITH RECOMPILE, and these queries
won't have cumulative data, so you would not get a very clear picture of the
queries eating up your resources if you use the total metric. But these might appear
in the average metric, and thus you should have a look at the queries based on average as
well.

Note that this won't give you the kind of cost which SQL Server provides while estimating
plans. Thus we cannot say that a query which takes 10 CPU seconds is more
expensive than a query clocking 100,000 logical I/Os. You have to find the
expensive queries based on each metric separately: CPU time, elapsed time, logical reads, and so on.

Also, there are other things to take into account.

1. Queries which use hash joins and sorts need a memory grant, which is not part
of these DMVs but is in a separate DMV, sys.dm_exec_query_memory_grants. This is quite
useful when you have memory-related issues. You might see low logical and
physical reads, and the query might not appear in your top 5, 10, or 20 list, yet it is one of the
main resource-consuming queries. So when looking for memory-related issues, please
see the details in the memory grants DMV. However, the CPU time will include the work
done for sorts and so on, so in these cases using CPU time is a better choice.
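A minimal sketch of looking at the memory grants DMV mentioned above. Note that it only shows requests that are currently executing with a grant or waiting for one, so it needs to be run while the workload in question is active:

```sql
-- Sketch: current memory grant requests, largest first.
SELECT
    mg.session_id,
    mg.request_time,
    mg.grant_time,            -- NULL while the request is still waiting
    mg.requested_memory_kb,
    mg.granted_memory_kb,
    mg.used_memory_kb,
    mg.wait_time_ms,
    st.text AS batch_text
FROM sys.dm_exec_query_memory_grants AS mg
OUTER APPLY sys.dm_exec_sql_text(mg.sql_handle) AS st  -- text may be unavailable
ORDER BY mg.requested_memory_kb DESC;
```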

with PerformanceMetrics
as
(
select
    -- Offsets are zero-based byte positions in the Unicode batch text,
    -- hence the /2 and +1 when converting to character positions.
    substring
    (
        dest.text,
        statement_start_offset/2 + 1,
        (case when statement_end_offset = -1 then datalength(dest.text)
              else statement_end_offset
         end - statement_start_offset)/2 + 1
    ) as 'Text of the SQL',
    deqs.plan_generation_num 'Number of times the plan was generated for this SQL',
    execution_count 'Total Number of Times the SQL was executed',
    -- Elapsed and worker times are reported in microseconds; /1000 converts to ms.
    total_elapsed_time/1000 'Total Elapsed Time in ms consumed by this SQL',
    max_elapsed_time/1000 'Maximum Elapsed Time in ms consumed by this SQL',
    min_elapsed_time/1000 'Minimum Elapsed Time in ms consumed by this SQL',
    total_elapsed_time/1000/nullif(execution_count,0) 'Average Elapsed Time in ms consumed by this SQL',
    total_worker_time/1000 'Total CPU Time in ms consumed by this SQL',
    max_worker_time/1000 'Maximum CPU Time in ms consumed by this SQL',
    min_worker_time/1000 'Minimum CPU Time in ms consumed by this SQL',
    total_worker_time/1000/nullif(execution_count,0) 'Average CPU Time in ms consumed by this SQL',
    total_logical_reads 'Total Logical Reads Clocked by this SQL',
    max_logical_reads 'Maximum Logical Reads Clocked by this SQL',
    min_logical_reads 'Minimum Logical Reads Clocked by this SQL',
    total_logical_reads/nullif(execution_count,0) 'Average Logical Reads Clocked by this SQL',
    total_physical_reads 'Total Physical Reads Clocked by this SQL',
    max_physical_reads 'Maximum Physical Reads Clocked by this SQL',
    min_physical_reads 'Minimum Physical Reads Clocked by this SQL',
    total_physical_reads/nullif(execution_count,0) 'Average Physical Reads Clocked by this SQL',
    total_logical_writes 'Total Logical Writes Clocked by this SQL',
    max_logical_writes 'Maximum Logical Writes Clocked by this SQL',
    min_logical_writes 'Minimum Logical Writes Clocked by this SQL',
    total_logical_writes/nullif(execution_count,0) 'Average Logical Writes Clocked by this SQL',
    deqp.query_plan 'Plan of Query',
    DENSE_RANK() over(order by total_elapsed_time desc) 'Rank of the SQL by Total Elapsed Time',
    DENSE_RANK() over(order by total_elapsed_time/nullif(execution_count,0) desc) 'Rank of the SQL by Average Elapsed Time',
    DENSE_RANK() over(order by total_worker_time desc) 'Rank of the SQL by Total CPU Time',
    DENSE_RANK() over(order by total_worker_time/nullif(execution_count,0) desc) 'Rank of the SQL by Average CPU Time',
    DENSE_RANK() over(order by total_logical_reads desc) 'Rank of the SQL by Total Logical reads',
    DENSE_RANK() over(order by total_logical_reads/nullif(execution_count,0) desc) 'Rank of the SQL by Average Logical reads',
    DENSE_RANK() over(order by total_physical_reads desc) 'Rank of the SQL by Total Physical Reads',
    DENSE_RANK() over(order by total_physical_reads/nullif(execution_count,0) desc) 'Rank of the SQL by Average Physical Reads',
    DENSE_RANK() over(order by total_logical_writes desc) 'Rank of the SQL by Total Logical Writes',
    DENSE_RANK() over(order by total_logical_writes/nullif(execution_count,0) desc) 'Rank of the SQL by Average Logical Writes',
    DENSE_RANK() over(order by execution_count desc) 'Rank of the SQL by Total number of Executions'
    --similarly you can add the ranks for maximum values as well. That is quite useful in finding some of
    --the perf issues.
from
    sys.dm_exec_query_stats deqs
    /*F0C6560A-9AD1-448B-9521-05258EF7E3FA*/ --use a newid so that we can exclude this query
    --from the performance metrics output
    outer apply sys.dm_exec_query_plan(deqs.plan_handle) deqp --sometimes the plan might not be in
    --the cache any longer, so use outer apply
    outer apply sys.dm_exec_sql_text(deqs.sql_handle) dest --sometimes the text is not returned by the
    --DMV, so use outer apply
where
    --keep rows whose text is missing; a bare NOT LIKE would silently drop them
    dest.text is null
    or dest.text not like '%F0C6560A-9AD1-448B-9521-05258EF7E3FA%'
)

select
    *
from
    PerformanceMetrics
where
    1=1
    --apply any of these where clauses in any combination or one by one..
    and [Rank of the SQL by Average CPU Time] <= 20
    and [Rank of the SQL by Average Elapsed Time] <= 20
    and [Rank of the SQL by Average Logical reads] <= 20
    and [Rank of the SQL by Average Physical Reads] <= 20
    and [Rank of the SQL by Total CPU Time] <= 20
    and [Rank of the SQL by Total Elapsed Time] <= 20
    and [Rank of the SQL by Total Logical reads] <= 20
    and [Rank of the SQL by Total Physical Reads] <= 20
    and [Rank of the SQL by Total number of Executions] <= 20
    and [Rank of the SQL by Average Logical Writes] <= 20
    and [Rank of the SQL by Total Logical Writes] <= 20