Channel: Performance – Kendra Little's Blog

Estimated vs. Actual Number of Rows in Nested Loop Operators


This is one of those little details that confused me a ton when I was first working with execution plans.

One problem with learning to work with plans is that there’s just SO MUCH to look at. And it’s a bit spread out.

So, even when looking at a single tooltip, things can be confusing.

Let’s talk about the nested loop operator, who can be particularly weird to understand.

Meet our nested loop

Here’s a nested loop from a SQL Server Execution plan:

Nested Loop

For every row that comes in from the top right index seek, SQL Server goes and does the bottom right index seek. Like this:

Nested Loop-inner and outer

I think the best way to explain this was tweeted by Andy Mallon:

But when you hover over that bottom index seek (the inner input), things may at first look like they're wrong with our nested cheese and crackers.

We’re trained early to compare estimated vs actual rows in plans

One of the first things we often learn when we’re looking at plans is that SQL Server uses estimates. And sometimes, those estimates are wrong. At first glance, this looks really wrong– it estimated 11.5 rows, and actually got 20,825 rows!

Estimated Vs Actual Rows

The highlighted numbers look waaaay off

Similarly, we see these same wrong-looking numbers if we hover over the line between the nested loop operator and the “inner” seek:

Another view- hovering over the line

Read “estimated number of rows” as the estimate per execution

With a nested loop, you have to remember to also look at the number of executions, and do a little math. The number of executions is on the tooltip of the seek itself, but I often have to do a double take to find it, because it’s so crowded. Here it is:

Estimated Vs Actual Rows With Estimated Executions

The estimate here is 11.5692 rows per execution * 2,055.56 executions = 23,782.22598 estimated rows.

And that’s not terribly far off from the 29,969 rows that it ended up reading.

When you see what looks like a bad estimate, take a second look!

Check the estimated number of executions and do a little math. SQL Server may have known exactly what it was doing, after all.


Index Tuning Decision Tree for SQL Server


I recently mapped out my thought process for how I approach a new instance of SQL Server when it comes to index tuning. It now looks like this:

Index Tuning Decision Tree SQL Server-Kendra-Little

Highlight: Can I use Query Store?

One of the first things I think about is whether the new SQL Server 2016 Query Store feature is available to collect query runtime statistics and execution plans. Information on query duration, reads, CPU use, and execution plans is so critical to index tuning that I care a ton about this new feature.

And it is a new feature. It’s even had its first big bugfix — if you’re running it on something other than Enterprise or Developer Edition, make sure you’ve tested and installed CU1, which contains this fix for query store cleanup.

I’m a big fan of SQL Server’s plan cache and index management dynamic management views — but I love that Query Store takes away the mystery of wondering what might be missing from the cache, or which missing index requests might have been cleared by an index rebuild.
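If you want to try it out, turning Query Store on is a quick change per database. Here's a minimal sketch, using WideWorldImporters as an example database name; the defaults are a reasonable starting point:

ALTER DATABASE [WideWorldImporters] SET QUERY_STORE = ON;
GO
ALTER DATABASE [WideWorldImporters] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO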

Observation: Tuning indexes is most effective when you analyze the top execution plans to design your indexes — not the missing index DMVs

When I first began tuning indexes in SQL Server, I largely reviewed and followed missing index suggestions in the missing index DMVs. I learned to combine those suggestions with one another and with the indexes already on the tables.

My tuning style has evolved from this, though. SQL Server’s index recommendations are useful, but they’re very rough – sometimes they suggest columns for the includes which you don’t absolutely need. Sometimes they suggest a column as an include that would be better in the key. Sometimes they overestimate the benefit the index would provide. And sometimes you just don’t get a suggestion at all.

It’s not that the Missing Index feature doesn’t work, it’s simply that the missing index feature is not designed to fine-tune an index workload. And that’s totally fair – those index requests are generated during query optimization, and that’s definitely something that we want to be fast!

What I very much prefer these days is to look at the top running queries during the periods I want to tune. I like to examine the execution plans and CPU, reads, and duration for the statements along with the plans.
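For example, here's a rough sketch of a Query Store query that ranks statements by total CPU. The exact columns you aggregate are up to you, and in practice you'd also filter the runtime stats to the time window you're tuning:

SELECT TOP (10)
    qt.query_sql_text,
    SUM(rs.count_executions) AS total_executions,
    SUM(rs.avg_cpu_time * rs.count_executions) AS total_cpu_time,
    SUM(rs.avg_duration * rs.count_executions) AS total_duration,
    SUM(rs.avg_logical_io_reads * rs.count_executions) AS total_logical_reads
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs ON p.plan_id = rs.plan_id
GROUP BY qt.query_sql_text
ORDER BY total_cpu_time DESC;
GO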

I do still like to look at missing index suggestions, I just prefer to do it in the context of the plan asking for it.

The reason that I love the whole concept of Query Store is that it’s a built in way to make this a whole lot easier!

Measuring Query Duration: SSMS vs SQL Sentry Plan Explorer


Every query tuner wants to explain exactly how much faster we made a query.

But sometimes SQL Server Management Studio adds noticeable overhead to the query duration. For relatively fast queries that return more than a few rows, just the overhead of displaying the results can skew your duration metric.

Spoiler, for those short on time: executing your queries from the SQL Sentry Plan Explorer client can be handy because it always discards results, shows you duration, CPU, reads, and writes, and will display a message if you mess up your TSQL and get a parsing error. It’s free, and it’s a client tool with no need to install anything on the SQL Server itself.

SQL Server Management Studio bloats duration a bit

I frequently use the command SET STATISTICS TIME ON to tell SQL Server to print information about how long my query took to the ‘Messages’ tab in SSMS. This is an easy, convenient tool to use interactively as you rewrite a query or tweak indexes in a test environment.
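Here's a minimal example of how I use it; the SELECT is just a stand-in for whatever query you're actually tuning:

SET STATISTICS TIME ON;
GO
SELECT COUNT(*)
FROM Sales.Invoices;
GO
SET STATISTICS TIME OFF;
GO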

I typically look at the ‘CPU time’ metric when tuning instead of ‘elapsed time’ (duration). This can work well for tuning because you’re measuring how much more efficient you made the  query in terms of CPU cycles.

But ‘CPU time’ isn’t perfect, and it can get a little weird for reporting results to users, because:

  • If the query uses parallelism, CPU time can be higher than the duration — which may make the query seem “slower” than it actually is to anyone reading a report
  • ‘elapsed time’ includes all the time that it takes to display the results in Management Studio, which is probably a different duration than it would take to return the results to an application server. If you’re just returning a few rows, this may be negligible– but once it gets into the thousands of rows, it can be very noticeable.

What about ‘discard results after execution’ in SSMS?

SSMS has a setting in the “Query” menu’s “Query Options” panel that tells it to throw away the query results.

discard-results

Unfortunately, this also discards everything that displays on the ‘Messages’ tab, too.

In other words, you won’t see your SET STATISTICS TIME information.

And worse, if you make a change to your TSQL so that it doesn’t even parse, you won’t see any errors. It’ll just look like it suddenly became very very fast. If you’re like me, that will fool you for a second, and you’ll be like, “I AM THE SMARTEST PERSON IN THE WORLD!”

Until you realize that your query is just silently failing.

What about ‘Include Client Statistics’?

There’s a little button on the SSMS toolbar that tells it to start tracking some metrics for your session:

include-client-statistics

You can combine this with ‘discard results after execution’. Your client statistics won’t get discarded, and it will even track your average metrics across multiple runs of your query, like this:

client-statistics-metrics

I like the concept of this, but…

  • I don’t love having to discard results after execution (because I fat-finger the TSQL a lot and would really like to see those parsing errors, thanks)
  • I don’t have the option to see logical reads (which I could get with SET STATISTICS IO), and it doesn’t know about CPU time. It’s just measuring client statistics, like the name says.

And so I never find myself using Client Statistics much. If it was Server Statistics for the query, that’d be awesome! Oh well.

SQL Sentry Plan Explorer makes this a bit easier

I’ve been opening query plans in SQL Sentry’s ‘Plan Explorer’ tool for years, but I never ran a query from it until this week. I’d just never really thought about it before.

To run a query from Plan Explorer, click the ‘Get Actual Plan’ button. A pop up lets you know it’s going to discard the results, but it will really be running that query (so don’t go insert/update/delete/truncating if you don’t really mean it).

running-from-plan-explorer

When the results come back, the first thing you’ll probably notice is a picture of the execution plan. But if you look up at the top of the window in the results pane, you can find Duration and CPU time there:

duration-and-cpu-in-plan-explorer

What if your query fails?

And if you’re like me, and you occasionally get a little too creative in your TSQL, it doesn’t just silently fail and tell you the query was fast. It’ll clue you into your error at the bottom of Plan Explorer:

syntax-error-plan-explorer

I’m not breaking up with Management Studio

SQL Server Management Studio will still remain my primary client tool. It’s really great for general purpose use – that hasn’t changed. I’ll still use SET STATISTICS TIME for a lot of general purpose tuning: it’s quick, it’s easy, it’s built in.

For fine tuning queries where performance is really important, and where I want a more accurate view of duration while I’m looking at execution plans, Plan Explorer is super helpful.

Thanks for making it completely free, SQL Sentry!

Setting up Free Blocking Alerts and Deadlock Monitoring (Dear SQL DBA Episode 17)


What tools in SQL Server will notify you about blocking and help track the queries behind your toughest blocking and deadlocking problems?

Watch the 21 minute video, subscribe to the podcast, or read the episode notes and links below.

Dear SQL DBA,

What is the best way to set up blocking and deadlock alerts on the server? I want to be notified automatically without any impact on the prod server.

I have tried alerts with SQL server performance condition alerts with SQL server agent. They do not show the queries or tables involved etc?

Thanks,

All Blocked Up


Woo hoo, I love this question!

So first off, I’m going to answer this discussing the free, built-in tools with SQL Server. If you have a budget for custom monitoring tools, you can buy fancy tools that have customized notifications for blocking and which capture the queries and plans involved. If that’s the case, set up free trials against a test system.

But not everyone has budget for every single SQL Server instance. So it’s extremely useful to know what SQL Server offers to help you with this.

And by the way, if you’re going to be at the SQLPASS Summit in Seattle in just under a month, I’m giving a session that shows different blocking scenarios. Come see my session, The Great Performance Robbery: Locking Problems and Solutions on Thursday, Oct 27th at 10:45AM in room 6C.

Free, simple blocking notifications

I like to set up blocking notifications with a simple SQL Server Agent alert on the “SQLServer: General Statistics: Processes Blocked” performance counter.

This will not give you the queries involved in the blocking — don’t worry, we’ll cover tracking that in the next step.

This alert is low impact and it will let you know when you need to look at the SQL Server.

To get this alert to work, you’ll need to:

  • Configure the SQL Server Agent service to auto start
  • Set up database mail, enable it on the  SQL Server Agent, and then restart the SQL Server Agent
  • Configure an Operator in the SQL Server Agent
  • Create a new ‘performance’ style alert, base it on the “Processes Blocked” counter, and tell it who to notify (see the T-SQL sketch after this list)
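If you'd rather script that last step than click through the GUI, here's a hedged T-SQL sketch. The alert and operator names are my own placeholders (the operator has to already exist), the 900-second delay is just an example, and the performance condition string assumes a default instance (named instances use MSSQL$InstanceName in place of SQLServer):

EXEC msdb.dbo.sp_add_alert
    @name = N'Blocking detected',
    @performance_condition = N'SQLServer:General Statistics|Processes blocked||>|0',
    @delay_between_responses = 900;   /* seconds between notifications */
GO
EXEC msdb.dbo.sp_add_notification
    @alert_name = N'Blocking detected',
    @operator_name = N'DBA Team',      /* placeholder: must match an existing operator */
    @notification_method = 1;          /* 1 = email */
GO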

A few things to note:

  1. The SQL Agent doesn’t poll counters constantly – and we want this to be lightweight, so that’s a good thing. It will only poll every 15-30 seconds, and there’s no published guaranteed SLA on that polling frequency.
  2. If you really need something more sensitive and reliable, you need a monitoring system fully independent / outside of the SQL Server to be polling in and checking it for availability and performance.
  3. You can configure the alert to only fire every X minutes. I highly recommend that, so you don’t get an inbox of alerts every 20 seconds

Create some blocking in a test database or in tempdb and make sure the alert works.

I have example code to create blocking and deadlocks for your dev environments in my post, Deadlock Code for the WorldWideImporters Sample Database.

For production databases, you can create a temp table and write similar code to create blocking in those.

lookout-query-comin-through-wide

Finding the queries involved with the Blocked Process Report

OK, we’ve got notifications. We need SQL Server to give us more information on who is involved in the blocking.

I like to use the built-in Blocked Process Report for this. This has been in SQL Server for a long time, and it’s extremely useful.

The Blocked Process Report shows you the “input buffer” of the commands involved – it may be partial information and not the full text of the query. It will also show you the login name for who is running what, and the type of lock requests involved.

The Blocked Process Report is pretty lightweight, because SQL Server already has to wake up regularly and look for blocking situations that can’t resolve themselves: by default, the deadlock monitor wakes up every 5 seconds and looks around to see if there is a deadlock it needs to break. You can enable the ‘Blocked Process Report’ feature to tell SQL Server to additionally issue a report on any blocking it finds.

To get this to work, you need to:

  • Enable the sp_configure option for the blocked process threshold. This defaults to ‘0’, which is off. Set it to the number of seconds of blocking you want to allow before a report is issued. This should be a value of 5 or higher, because making the deadlock monitor run constantly could tank your performance. A good ‘starting’ value is 30 seconds.
  • You also need to set up a trace to collect an event called the ‘blocked process report’. Setting the threshold causes the event to be output, but SQL Server won’t collect it for you unless you start a SQL Trace or an Extended Events trace that collects that event. (A sketch of both steps follows this list.)
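Here's a hedged sketch of both steps, assuming SQL Server 2012 or higher and an Extended Events file target. The session name and file path are placeholders you'd change for your own server:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'blocked process threshold (s)', 30;   /* seconds */
RECONFIGURE;
GO

CREATE EVENT SESSION [blocked_process_report] ON SERVER
ADD EVENT sqlserver.blocked_process_report
ADD TARGET package0.event_file
    (SET filename = N'S:\XEvents\blocked_process_report.xel')   /* placeholder path */
WITH (STARTUP_STATE = ON);
GO

ALTER EVENT SESSION [blocked_process_report] ON SERVER STATE = START;
GO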

Once you have the trace file, you can copy it off of the production server to interpret it.

Michael J Swart has written a great free tool called the Blocked Process Report Viewer to help interpret the blocking chains. It’s free at https://sqlblockedprocesses.codeplex.com.

The viewer makes it easier to see the root of the blocking chain and who was blocking whom.

This trace is pretty lightweight, but with any trace you want to make sure that you don’t add a bunch of events that you don’t need, and that you periodically clean up the files and don’t let it impact drive space.

When I talk about running traces, I don’t mean running Profiler

We’re entering dangerous territory here. Whenever you talk about tracing in SQL Server these days, someone gets offended.

Here’s what you need to know. There’s two main ways to run a trace:

  1. SQL Trace. This is the old school option. You can run this using…
    1. The Profiler client (I don’t like this option)
    2. A Server Side trace scripted out from Profiler (much better!). You can get up to speed on Server Side Traces by reading this article on generating a server side trace by Jen McCown. (Note that she wrote this article back in 2009. That’s fine, SQL Trace hasn’t changed since then.)
  2. Extended Events. This is much easier to use on SQL Server 2012 and higher than in previous versions because a GUI was introduced in Management Studio for it under Object Explorer.

I do not like leaving the Profiler application running because I’ve seen it do everything from slowing down performance to filling up drives over the years. And creating Server Side traces isn’t very hard if you do want to use SQL Trace.

I personally only like to have a trace running if I know I need it and am going to look at it. So I only enable this when I have a blocking problem. Whenever you choose to leave a trace running, you need to periodically check in on the files it’s created and clean up after it.

Detecting and handling deadlocks

What about locking problems where SQL Server has to step in and kill one of the queries?

You have a few built in options about how to get info on this. There are some trace flags that you can turn on which cause some information about who is involved in the deadlock to be printed to the SQL Server Error Log. This isn’t my preferred option because the information is very hard to parse through and read.

I find it more helpful to get a ‘deadlock graph’, which is a picture of how the locking fight went down.

On older versions of SQL Server, you can capture the deadlock graph with a server side trace.

On newer versions of SQL Server, you can capture this with an Extended Events trace.
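Handy to know: on SQL Server 2008 and higher, the built-in system_health Extended Events session already captures the xml_deadlock_report event, so recent deadlock graphs may already be waiting for you. If you want a dedicated session, here's a minimal sketch (the session name and file path are placeholders):

CREATE EVENT SESSION [capture_deadlocks] ON SERVER
ADD EVENT sqlserver.xml_deadlock_report
ADD TARGET package0.event_file
    (SET filename = N'S:\XEvents\deadlocks.xel')   /* placeholder path */
WITH (STARTUP_STATE = ON);
GO

ALTER EVENT SESSION [capture_deadlocks] ON SERVER STATE = START;
GO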

A great resource for deciding how to capture deadlock information is Jonathan Kehayias’ excellent Simple Talk article, Handling Deadlocks in SQL Server.  He covers how to collect the graphs, shows examples of how they look, and gets you started tackling them.

If you get through this point and need to get really fancy with deadlocks, Michael J Swart recently wrote about using Event Notifications to collect execution plans related to deadlocks in his post, “Build Your Own Tools“. Just don’t try to run before you walk: this is pretty advanced stuff and you need to be comfortable using Service Broker, which is part of Event Notifications behind the scenes.

Quick recap – getting more info on blocking

A fast rundown of the free, built-in tools we covered:

  • I like to use a simple, light, performance alert in the SQL Server agent for notification about blocking
  • I like the Blocked Process report to find out who’s blocking whom – collected by a server side SQL Trace or Extended events
  • I find collecting deadlock graphs with either a server side SQL Trace or Extended Events to be the most helpful way to figure out who’s involved in the nastiest blocking tangles.

Want to submit a Dear SQL DBA Question?

Want clarification on some of the basics? Got a question that jumped into your mind reading or listening to this? I’d love to hear it– asking is always free and easy!

Can I Use Statistics to Design Indexes? (Dear SQL DBA Episode 18)


Should you look at automatically created statistics on your tables in SQL Server to help you design better indexes? Learn why in this 20 minute video, or subscribe to the Dear SQL DBA podcast.

No time to watch? Scroll on down, everything is written in article form below the video.

Here’s this week’s question:

Dear SQL DBA,

I’ve noticed that many indexes in my data warehouse aren’t used frequently. Is there a way to use the automatically generated statistics to make useful indexes?

… (insert witty pun about indexes)

I’ve been asked this question several times, and I even remember once wondering this myself.

There’s no good way to analyze the column based statistics that SQL Server automatically generates for the purpose of index design. Let’s talk about why.

First: Why do we have ‘statistics’, anyway?

Let’s say we have a table named dbo.Grades. It has columns for GradeId, ClassId, StudentId, Year, Term, and Grade.

We run a query looking for all StudentIds where Year=1990.

The table has a Clustered Primary Key on GradeId — so the whole table is stored sorted by GradeId. That doesn’t help our query at all, we want the StudentIds where Year = 1990.

The table has a nonclustered index on Year and GradeId. That nonclustered index does NOT contain StudentId, though.
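If it helps to picture the setup, here's a rough sketch of the table and index described above. The column data types are my own assumptions; they don't change the point:

CREATE TABLE dbo.Grades (
    GradeId INT IDENTITY NOT NULL,
    ClassId INT NOT NULL,
    StudentId INT NOT NULL,
    [Year] SMALLINT NOT NULL,
    Term TINYINT NOT NULL,
    Grade CHAR(2) NOT NULL,
    CONSTRAINT pk_Grades PRIMARY KEY CLUSTERED (GradeId)
);
GO

CREATE INDEX ix_Grades_Year_GradeId ON dbo.Grades ([Year], GradeId);
GO

/* The query in question: StudentId isn't in the nonclustered index */
SELECT StudentId
FROM dbo.Grades
WHERE [Year] = 1990;
GO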

So SQL Server has two main options:

  1. Is it better for SQL Server to go look at the nonclustered index and then loop back into the base table and look up the StudentId using the GradeId?
  2. Or it might be easier to just scan the whole dbo.Grades table and check the StudentId on each row. Maybe the school was only open for the year 1990, and all the rows have that value!

How does SQL Server know which is better?

SQL Server uses statistics to guess which way is the most efficient to execute your query.

Statistics vs Indexes (and their relationship)

Statistics are little, tiny pieces of metadata that describe the data in the table — things like an approximation of the number of rows equal to 1990. Statistics don’t take up a lot of space in the database.

When you create an index, a statistic is automatically created for the key columns in the index. This can be a multi-column statistic if you have multiple key columns, but the statistic describes the first column in the index in the most detail.

If you use a predicate like Year = 1990 against a column that isn’t the leading column in an index, SQL Server will create a single-column statistic. (Fine print: automatic stats creation can be disabled using a database setting.)

On the other hand, indexes are COPIES of the data itself for the columns defined in the index. The index on Year and GradeId on the dbo.Grades table takes up extra space on disk and on memory and has copies of all the rows for Year and GradeId.

SQL Server uses statistics to optimize an execution plan.

SQL Server uses indexes within the execution plan to find data.

What does the existence of a system created column statistic tell us?

We’ve talked a lot so far about how much statistics and indexes are related. This is why it seems like statistics might be useful for designing indexes!

But here’s the thing — SQL Server doesn’t track and report on how many times a statistic was used during optimization.

I didn’t write the optimizer, but I’m not sad about this at all, I think it’s fine, because:

  • Optimization needs to be very fast. SQL Server wants to start your query as soon as it can. Everything that has to be written out during that process costs CPU and time.
  • Just considering a column statistic in optimization doesn’t necessarily mean that a single column index on that column would be useful. There might be a multi-column index that would be useful. Or it might actually be better for it to be scanning the table! We wouldn’t know just by the fact that the statistic had been examined by the optimizer.

To have useful information about which single-column statistics might be needed in an index, SQL Server would have to do a lot of work– and it’s already got a feature in optimization for this at a higher level.

Whenever there’s more than one way to run a query, SQL Server thinks about whether an index would help the query run faster. If it thinks an index would help, it records this using the “Missing Indexes” feature.

I wouldn’t make too many assumptions on a column that lacks statistics, either

If a column does NOT have a statistic on it, that isn’t proof that the column is unused.

Statistics are automatically created on columns where you use joins and ‘where’ predicates.

Some columns may just be returned to the user without being filtered. They wouldn’t need a statistic generated for that. But the columns are still in use– and in some cases, using them as “included columns” in indexes might be useful.

So I wouldn’t use the existence OR non-existence of column statistics to make decisions about indexes.

And anyway, we have other options!

What’s the best way to design indexes?

There’s two main approaches you can take. After you’ve been doing this a while, you’ll probably mix the approaches.

  1. Identify the most expensive queries that run in each database, and tune indexes for those. You can find the “most expensive” queries using any of the following tools: SQL Server Query Store, the SQL Server Execution Plan cache, or a monitoring tool that watches execution and persists data from the plan cache and memory for longer analysis (and doesn’t get cleared out by memory pressure or restarts).
  2. Use SQL Server’s “Missing index requests” to see where SQL Server is asking for an index. This isn’t perfect — it has to make those requests in a hurry, and the requests get cleared out when you take the database offline (or rebuild an index on the table). A sample query against the missing index DMVs follows this list.
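If you do want to peek at the raw missing index requests, a basic query against the DMVs looks something like this (keeping in mind all the caveats above about how rough these suggestions are):

SELECT
    mid.statement AS table_name,
    mid.equality_columns,
    mid.inequality_columns,
    mid.included_columns,
    migs.user_seeks,
    migs.avg_total_user_cost,
    migs.avg_user_impact
FROM sys.dm_db_missing_index_details AS mid
JOIN sys.dm_db_missing_index_groups AS mig ON mid.index_handle = mig.index_handle
JOIN sys.dm_db_missing_index_group_stats AS migs ON mig.index_group_handle = migs.group_handle
ORDER BY migs.user_seeks DESC;
GO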

Want to learn more?

Check out my article “Updating Statistics in SQL Server: Maintenance Questions and Answers,” or my article and podcast episode “Teach Yourself SQL Server Performance Tuning“.

What’s Adaptive Query Processing? (Dear SQL DBA Episode 21)


I’m mixing things up a bit in this episode. I want to talk about a question that keynotes and sessions at the SQL PASS Summit got me thinking about last week. Let’s talk about Adaptive Query Processing.

Watch the 24 minute video, scroll down to read the transcript, or subscribe to the podcast.

This post includes a lot of speculation

I’m headed to the Microsoft MVP Summit next week. The cool thing about the MVP Summit is that you get to learn some things that aren’t public yet. The downside is that once you get some secret info, that closes off your ability to speculate a bit… because you don’t want to speculate too close to something that is “secret”.

Everything I’m talking about today was either revealed publicly last week at the Summit, or is speculation on my part (and is pure speculation, I have no privileged insights on this stuff).

I’ll do my best to be completely clear about what’s speculation and what isn’t here.

Keynote focus: predicting the future

Perhaps speculation feels like the right topic today because Microsoft folks talked a lot about the importance of prediction in the keynotes at the PASS Summit last week.

SQL Server 2016 features R Services. This brings the ability to learn patterns and make predictions into the database engine.

Using this new feature came up a lot in the keynote. And not just for performing predictions for a user application, either: there were quite a few references about using SQL Server’s predictive powers to make SQL Server itself smarter.

So what might that mean?

We’re used to SQL Server optimizing a query before it runs

When you execute a query, the SQL Server optimizer has to quickly analyze what all its options are for executing the query with the data structures it has at hand. It uses ‘statistics’ to help it estimate how much data it will get back from each structure.

It has a lot to figure out: what types of joins should it choose? Should it use a single core or multiple cores? How much memory should it allocate for operators that need to do things like sorting and creating hash tables in memory?

It has to figure it out fast. Every microsecond spent optimizing a query is a microsecond the user is waiting.

Once a query starts executing, SQL Server doesn’t (currently) do “re-optimization”

Once the optimizer chooses a plan, the query is off to the races. SQL Server doesn’t have the option to turn back.

Some of us have wondered for a while if we might get a feature where SQL Server can change a query plan after it starts running if it looks like estimates from statistics weren’t accurate.

Oracle has a feature called “Adaptive Query Optimization” which stretches the optimization process out into the query execution phase. Oracle can start a query with a “default plan.”

I’m no Oracle expert, but here’s how their docs describe Adaptive Query Optimization:

  • Once the query is running, if it looks like estimates were way off, it can change portions of the plan based on what it’s finding.
  • It can change joins, parallelism, and even create “dynamic statistics” to get more detailed information where things looked fishy.
  • Oracle can also use what it learns about rowcounts after a query executes to help it optimize future executions of that query.

I’m not going through this to suggest that SQL Server will implement the same features. But it can be useful to think about what competitors are doing in terms of optimization to open up our view a little when we’re thinking about what’s possible. And of course, SQL Server can go beyond this.

Things have been changing in Azure with automatic index tuning in the SQL Database Advisor

This isn’t your old Database Tuning Advisor. You have a newer option called (similarly) SQL Database Advisor when you use Azure SQL Database.

The SQL Database Advisor in hosted databases can recommend indexes to create and drop, and it’ll note when queries aren’t parameterized or are getting a lot of recompiles to end up with the same plan.

You have the option to tell the SQL Database Advisor to automatically manage indexes. In this case, it’ll not only apply the index changes but watch performance after it makes the change. If things get slower, it’ll revert the change.

How well does this work in practice?

Honestly, I have no idea 🙂

But I’m starting to get really curious after the Summit this year, so I’m planning to start exploring this more.

Announced last week: Adaptive Query Processing

I attended a session called “What’s New in Azure SQL Database?” at PASS last week. This was presented by Lindsey Allen and other program managers on the SQL Server Engineering team.

There was a lot of cool stuff discussed in the session, but two bullet points in particular jumped out at me:

  • Performance insight and auto-tuning
  • Adaptive query processing

Adaptive query processing is basically a subset of what’s being called “performance intelligence”. We saw a very cool demo video that explained that Adaptive Query Processing is focusing on three things:

  1. Row estimates for “problematic subtrees”
  2. Adjusting memory grants
  3. Fixing join types when needed

How is Adaptive Query Processing going to work?

I have no idea. This is a totally new area, and it was a fast moving session that quickly moved on to other new features.

I got two somewhat conflicting ideas about how this might work, and I’m looking forward to sorting it out in the future.

Count this all as pure speculation, because I may have a very skewed understanding of what I heard at this point.

  1. This might be based on collecting information by observing a workload of queries — say, queries collected in Query Store– and using R Services to find queries where optimization needs to be improved, then giving feedback for future runs of the query.
    • Simple example I can think of when it comes to memory grants: if SQL Server always requests a lot more memory than it actually uses for a frequent query, this could be learned and the grant could be reduced. This could help avoid low workspace memory situations on very busy systems (aka RESOURCE_SEMAPHORE waits)
  2. This might also involve some dynamic optimization at runtime. One slide I saw was talking about joins, and used the phrase “Defer the join choice until after the first join input has been scanned.”
    • That sounds a lot like optimization may be stretching out into the execution of the query, right?
    • I also saw the sentence “Materialize estimates for problematic subtrees“, which sounds like getting extra statistics for parts of the plan where estimated rows and actual rows differ. But no idea yet if this could happen on first execution of the query or would be observed across a workload after a bunch of things have run.

Speculation: to optimize a “herd”/ workload of queries, wouldn’t Query Store need wait stats?

If I did understand correctly that Adaptive Query Optimization at least in part requires using data collected from a workload of queries and analyzing it in R, then the 2016 Query Store feature seems like it’d be a big part of the picture. Query Store collects runtime statistics and execution plans.

But to do this well, wouldn’t the analysis also need to know why a query was slow? Perhaps it just couldn’t get started because it was waiting on a lock. That doesn’t necessarily mean it needs to have different joins or its memory grant changed.

This is pure speculation, but if Adaptive Query Processing uses Query Store data, this makes me think we might see Query Store collecting Wait Statistics sometime soon.

Will Adaptive Query Processing be Cloud-Only, or part of “boxed” SQL Server?

The session I was attending was specifically on Azure SQL Database.

I didn’t hear an announcement about whether this feature might be available outside of the cloud. But I also didn’t hear anything that sounded like it would prevent this feature from working in the “install and manage it yourself” boxed version of SQL Server.

A lot of times we don’t get a clear answer on this until they start to ship previews of new major versions of SQL Server — so treat anything you hear as speculation unless it’s directly from a Microsoft Program Manager.

You can sign up for the preview of Adaptive Query Processing

Check it out yourself! https://aka.ms/AdaptiveQPPreview

Got your own speculations? Or even (gasp) some facts?

Tell me all about it in the comments!

Unless you can’t tell me because of a non disclosure agreement. Then keep it to yourself 🙂

 

 

Filtered Indexes: Rowstore vs Nonclustered Columnstore


SQL Server has two types of filtered indexes:

  1. The “classic” filtered nonclustered rowstore index, introduced in SQL Server 2008, available in all editions
  2. The newfangled filtered nonclustered columnstore index, introduced in SQL Server 2016, available in Enterprise Edition

These two filtered indexes are very different – and the SQL Server optimizer can use them very differently!

While classic filtered nonclustered rowstore indexes must reliably “cover” parts of the query to be used by the optimizer, filtered nonclustered columnstore indexes may be combined with other indexes to produce a plan returning a larger range of data.

This sounds a little weird. I’ll show you what I mean using the WideWorldImporters database.

Filtered nonclustered rowstore indexes (“Filtered Indexes”)

A filtered index is a nonclustered rowstore index with a “where” clause. This index contains only rows from Sales.Invoices which were last edited before February 1, 2013:

CREATE INDEX ix_nc_filter_LT
on Sales.Invoices (LastEditedWhen)
    INCLUDE (CustomerID)
    WHERE (LastEditedWhen < '2013-02-01');
GO

SQL Server may use this index if I run a query that also specifies LastEditedWhen < ‘2013-02-01’ (although possibly not if my query is parameterized and might be reused for a date outside this range).

What if I am querying all CustomerIds, and I force SQL Server to use this index with a hint?

SELECT CustomerID
FROM Sales.Invoices WITH (INDEX (ix_nc_filter_LT));
GO

SQL Server could potentially pick up some of the rows from the filtered index, then find the rest of the rows in the base table and combine them. It’d be expensive, but it’s theoretically possible.

However, SQL Server can’t. Instead, I get this error:

Msg 8622, Level 16, State 1, Line 21
Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.

My query wants a larger range of data than is in my filtered rowstore index, and SQL Server won’t use the filtered rowstore index and then go find the rest of the data in another index. The optimizer just isn’t written to do this.

We’ve been used to this for years with filtered indexes. But filtered nonclustered columnstore indexes behave differently!

Filtered nonclustered columnstore indexes (“Filtered NCCI”)

Let’s create a filtered nonclustered columnstore index on the same table:

CREATE NONCLUSTERED COLUMNSTORE INDEX ix_ncci_filter_LT
on Sales.Invoices (LastEditedWhen, CustomerID)
    WHERE (LastEditedWhen < '2013-02-01');
GO

I’m not saying this little demo table needs a nonclustered columnstore, I’m just reusing it for simplicity.

Now, I force SQL Server to use this index to get all the CustomerIDs:

SELECT CustomerID
FROM Sales.Invoices WITH (INDEX (ix_ncci_filter_LT));
GO

This time, the query doesn’t fail to get a plan! I get a plan, and I get all the 70,510 rows back. The execution plan looks like this:

filterednonclusteredcolumnstore-combined

This isn’t an awesome plan. SQL Server scanned the columnstore index, then scanned the clustered index of the table to find the rows that weren’t in the columnstore index, then combined them. It did this because I forced it to use the columnstore hint.

But SQL Server can make this plan. SQL was able to do this because it understands the filter in the nonclustered columnstore index. Hover over the clustered index scan in the plan, and you can see it figured out how to find the rest of the data in the clustered index:

filterednonclusteredcolumnstore-combined-predicate

Why are filtered nonclustered columnstore indexes smarter?

Columnstore indexes shine when it comes to scanning and aggregating lots of data. While nonclustered columnstore indexes are updatable in SQL Server 2016, it’s expensive to maintain them.

SQL Server is smarter about optimizing plans with filtered nonclustered columnstore indexes, so you can design your filter to keep “cold” data that is unlikely to be modified in the columnstore index. This makes it cheaper to maintain. The optimizer has the ability to use the filtered NCCI and combine it with other indexes behind the scenes.

You do want to be careful with your filter and make sure that it doesn’t have to do a clustered index scan every time it’s going to do this trick, of course!

Read more about this feature on the SQL Server database engine blog in Sunil Agarwal’s post, “Real-Time Operational Analytics: Filtered nonclustered columnstore index (NCCI).”

Should I change the ‘locks’ configuration in SQL Server?


I recently got a fantastic question from a reader regarding lock usage in SQL Server. Here’s the question:

One of my production databases has a total lock count around 25,000 (select count(*) from sys.dm_tran_locks). The configuration setting for locks is set to the default of zero. This lock count is due to multiple procedures which frequently run and use the same 2-3 tables, repeatedly taking out and releasing locks. Do I need to change the configuration for locks or look into the SP’s so they can finish more quickly, rather than creating locks?

Our friend is correct to leave the ‘locks’ setting in sp_configure alone

The default setting of zero lets SQL Server manage the memory for locks dynamically. When you use this dynamic setting, SQL Server will:

  • Allocate memory for 2,500 locks on startup
  • Acquire more memory for locks when it needs to do so (unless you’re in a memory pressure situation and it would cause paging)
  • Not allocate more than 60% of the memory allocated to the instance for locks
  • Trigger lock escalation when lock memory hits 40% of the memory allocated to the instance

If you change the setting for ‘locks’, you’re giving SQL Server a maximum number of locks that it can use.

Microsoft recommends leaving this at zero.

You could change ‘locks’ to raise the number of locks allowed. But is it a good idea to use more than 60% of your memory for locks? Even if you’re using that as a temporary band-aid, leaving less than 40% of your memory for the buffer pool (data cache), execution plans, and everything else is going to be a world of hurt for most instances.

You could change ‘locks’ to lower the number of locks allowed. But refusing to allocate more locks means that the queries asking for locks are going to throw errors and fail. That’s not attractive, either.

So if you’re concerned about the number of locks you have, changing the ‘locks’ configuration setting isn’t likely to help you out.

The ‘locks’ configuration is also marked as slated to be removed in a future version. Microsoft doesn’t want you to be dependent on it.

What’s the memory overhead of those locks?

Each lock uses 96 bytes of memory. On the instance in question, 25,000 locks = 2,400,000 bytes.

That’s only 2.3 MB of memory devoted to locks. Even though 25K  sounds like a lot, the memory footprint for that is pretty darn small.

I checked back with our questioner, and their instance has 32GB of memory. That’s a pretty small amount in the grand scheme of things (as of SQL Server 2014, Standard Edition can use up to 128GB of memory for the Buffer Pool), but 2.3 MB isn’t anything to worry about, percentage wise.
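If you're curious how much memory the lock manager is actually using on your own instance, you can check its memory clerk. A minimal sketch, assuming SQL Server 2012 or higher (where the pages_kb column exists):

SELECT type, name, SUM(pages_kb) AS pages_kb
FROM sys.dm_os_memory_clerks
WHERE type = 'OBJECTSTORE_LOCK_MANAGER'
GROUP BY type, name;
GO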

Do you have a high number of locks because you need better indexes?

Good indexes can dramatically reduce your lock overhead. Here’s a simple example using the SQLIndexWorkbook sample database.

For this sample query, run under the default read committed isolation level:

SELECT COUNT(*) FROM agg.FirstNameByYear WHERE FirstNameId=1;
GO

When this needs to do a clustered index scan, it requires 5,061 page locks.
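The nonclustered index in question is a simple single-column index. The index name here is my own invention:

CREATE INDEX ix_FirstNameByYear_FirstNameId
    ON agg.FirstNameByYear (FirstNameId);
GO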

After creating a nonclustered index on FirstNameId, the query requires only one page lock.

Indexes that help SQL Server find rows more quickly can dramatically reduce the number of locks that are taken out.

Are you waiting for locks because of blocking?

SQL Server is fast at acquiring locks — unless there’s a conflict, and you have to wait to get the lock because someone else is already using it.

In this case, the first step is to figure out when there is blocking, and who is blocking whom. I like to use alerts and the Blocked Process Report to help figure this out.

Do you have a high number of locks because of estimate problems?

One reason you might get a high number of locks is an inefficient execution plan based on poor estimates. If SQL Server thinks it’s only going to get a small number of rows, it may design a plan based on “lookups”. If it turns out that it’s got a lot more rows than it thought, it might have to execute this loop over and over and over– slowly looping and acquiring lots of little locks.

In this case, the stored procedures using this database are making heavy use of table variables. Table variables often lead to incorrect estimates in complex procedures, and could result in inefficient plans.

In this case, I wasn’t too worried about the 25,000 locks, but I thought it was possible that the performance of the procedures might improve with better estimates. I recommended:

  1. Testing out the procedures in a dev environment with temporary tables instead of table variables
  2. Evaluating how the procedures use indexes after the change — they likely will need different indexes for the new execution plans

If you have heavy use of table variables and can’t test out temporary tables, you can test out Trace Flag 2453 on SQL Server 2012 SP2 and higher. This trace flag doesn’t give table variables the full statistics support which temporary tables have, but does try to make SQL Server smarter about the number of rows in the table variable.

Disclaimer: changing from table variables to temporary tables doesn’t always make things faster. I wasn’t doing live troubleshooting here and I didn’t have actual execution plans– it’s possible that the rowsets are small and the table variables were doing well. You never know until you test!

Sometimes you should just take out a tablock

I don’t think this is the case for the person asking this question, but there are some cases when you just want to go ahead and take out an exclusive lock on a table. Not only can it simplify the number of locks for the table, but it can help make data loading more efficient.
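Here's a minimal sketch of what that looks like for a data load. The table and column names are hypothetical; depending on your version and recovery model, the TABLOCK hint can also enable optimizations like minimal logging for statements like this:

/* Hypothetical table names, shown only to illustrate the hint */
INSERT dbo.SalesStaging WITH (TABLOCK)
        (InvoiceID, CustomerID, InvoiceDate)
SELECT InvoiceID, CustomerID, InvoiceDate
FROM dbo.SalesSource;
GO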


Why is My Query Faster the Second Time it Runs? (Dear SQL DBA Episode 23)


Today’s question is about why a query might be slow at first, then fast the next time you run it.

Watch the 26 minute video or scroll on down and read the written version of the episode instead. If you enjoy the podcast, I’d appreciate your review on iTunes! Info on how to ask questions and subscribe is here.

Dear SQL DBA,

Whenever I create a complex view, the first time I use it in a query, the query is slow. The same thing happens after I alter the view. What is the reason behind this?

This is a great question — because when you ask this question, you’re about to discover a lot of interesting, useful things about how SQL Server runs queries.

There are two ways that SQL Server can use memory to make a query faster the second time you run it. Let’s talk about how the magic happens.

1) It takes CPU time to figure out how to run a query. SQL Server uses memory to cache execution plans to save time the next time you run the query.

The first time you run a query using the view, SQL Server has to ‘compile’ an execution plan to figure out the best way to run the query.

SQL doesn’t compile the plan when you create the view– it compiles a plan when you first use the view in a query. After all, you could use the view in a query in different ways: you might select only some columns, you could use different ‘where’ clauses, you could use joins.

Secretly, it doesn’t matter too much that you’re using a view. When you run a query that references it, behind the scenes SQL Server will expand the TSQL in the view just like any other query, and it will look for ways to simplify it.

So SQL Server waits to compile a plan for the exact query you run.

  • Depending on how many joins are in the view and how many ways SQL Server could run the query, it may take it a while to compile the query execution plan.
  • SQL Server tries to come up with a decent plan quickly. But I have seen some cases where query compile time took 5+ seconds, and query execution time was much smaller.

SQL Server is designed to store the execution plan for a query in memory in the execution plan cache, in case you run it again. It would be very costly for the CPUs to generate a plan for every query, and people tend to re-run many queries.

If you re-run a query and there is already an execution plan in the plan cache, SQL Server can reuse it and save all that compile time.

When you alter a view, this will force SQL Server to generate a new execution plan the first time a query uses the view afterward. Something has changed, so SQL Server can’t use any plans that were in cache before.

Restarting the SQL Server, taking the database offline and online, memory pressure, and many server level settings changes will also clear execution plans out of cache, so you have to wait for a compile.

How much time are you spending compiling?

There are a few ways to see this:

  • If you are testing and want to see how much time is spent on compiling, you can run in your session: SET STATISTICS TIME ON; After that, SQL will show you the ‘parse and compile time’ for each query in your ‘Messages’ window for that session.
  • If you’re looking at execution plans, ‘compile time’ is stored in the XML. You can see it in the properties on the upper left-most operator. It’s reported in milliseconds and is the same as the ‘elapsed time’ that appears under parse and compile time from SET STATISTICS TIME.

compile-time-in-execution-plan

  • Query Store collects compilation time statistics, including compile duration. You can see some of the details in this post I wrote on Query Store and recompile hints.

Views aren’t really a problem. Sometimes, lots of joins are a problem, but SQL Server still has tricks to compile a plan quickly.

When people use complex views, particularly nested views, they often end up with a LOT of joins in each query.

When SQL Server has a lot of joins, optimization gets harder. There are a ton of ways the query could be executed.

The SQL Server optimizer doesn’t want you to wait a long time while it looks at every single thing it could possibly do. It takes some shortcuts. It wants to get to a good plan fast.

Generally, SQL Server tries to keep optimization times short, even when you have a lot of joins.

But there are cases where sometimes compilation takes longer than normal.

What if you see multiple seconds of parse and compile time?

Usually compilation time is a number of milliseconds, but I have seen some cases where it’s seconds. This could be for a few reasons:

  1. SQL Server had to wait for CPU when it was trying to compile the query. It’s not the query’s fault, there just weren’t processors available. I would look at SQL Server’s wait statistics to identify this. If there were notable SOS_SCHEDULER_YIELD waits in that time period, the issue is more server load than the specific query.
  2. You’ve hit an identified bug for compile time. Some cumulative updates for SQL Server fix long compilation times. It’s worth looking at the version of SQL Server you’re on, setting up a fully patched test instance, and checking if applying updates might improve your compilation times.
  3. You’ve hit an unidentified bug for compile time. SQL Server works pretty hard to compile plans quickly, so multi-second compile time usually looks like a bug to me, if it’s not due to server load. For views and complex queries, I would:
    • Look to see if I could simplify the query where-ever possible, to make things easier for the optimizer.
    • Check if indexes might simplify the plan. Good indexes can make choosing a good plan easier and faster.
    • Try query hints as a last resort. The problem is that hints are really directives, and force a behavior — what’s beneficial to force today may not be so great after the next upgrade, or even  if my data sizes change over time.

2) It takes time to read data from disk. SQL Server uses memory to cache data in the Buffer Pool so it doesn’t have to go to disk the next time you use that data.

There are more reasons that the second run of a query might be faster.

The first time you run the query it may be using data that is on disk. It will bring that into memory (this memory is called the “buffer pool”). If you run the query again immediately afterward and the data is already in memory, it may be much faster — it depends how much memory it had to go read from disk, and how slow your storage system is.

When you are testing, you can see if your query is reading from disk (physical reads and read ahead reads) by running: SET STATISTICS IO ON;

One difference with this type of memory is that your buffer pool memory is not impacted by ALTERING the view. SQL Server does not dump data from the buffer pool when you change a view or procedure. Instead, it keeps track of how frequently different pages of data are used, and ages out the least useful pages from memory over time.

So this might be part of the answer in your case, but it wouldn’t necessarily explain why the query would be slower on the first run after an ALTER — unless the data pages that you’re using just hadn’t been queried in a while and were no longer in the buffer pool cache by chance.

Takeaways for testing queries

I usually tune queries with a “warm cache”

Most queries that reference commonly used tables have a good chance of the data they need being in cache.

To test against a warm cache, I run the query once, and don’t pay a ton of attention to that first run.

I run the query again and measure the duration with the execution plan cached and the data pages in the buffer pool.

You can tune with a “cold cache”, but be careful

If I have a reason to believe that the data won’t be in cache when the query is run, then I will test it against a “cold cache”. I might need to do this if it’s a nightly query that runs, and the data it uses isn’t relevant at all to the daytime workload– so the pages are likely to not be in the buffer pool when it’s time for the job to run that night.

To test against a cold cache, you have to do some things that are NOT friendly for a production server — so make sure you only use this approach against test instances:

  • Run DBCC DROPCLEANBUFFERS to flush unmodified pages from the buffer pool. (This will slow down your whole instance because EVERYBODY then has to start reading from disk)
  • If modifications have been happening as part of the test, run a manual CHECKPOINT to flush dirty pages to disk from the buffer pool
  • Use RECOMPILE hints or a variant of DBCC FREEPROCCACHE to make sure you measure compilation time each time (see the sketch after this list)
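Putting that together, a cold-cache test run on a test instance might look like this (again, never on production):

/* TEST INSTANCES ONLY */
CHECKPOINT;              /* flush dirty pages so they can be evicted */
DBCC DROPCLEANBUFFERS;   /* empty the buffer pool */
DBCC FREEPROCCACHE;      /* clear cached execution plans */
GO

SET STATISTICS TIME ON;
SET STATISTICS IO ON;
GO
/* ...run the query you're testing here... */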

How to Tune Indexes for a Stored Procedure


You’ve got an important stored procedure that you think needs index help – but it runs in an environment with lots of other queries. How do you focus in and discover exactly what indexes need tuning for that procedure?

The best way to tune indexes in a stored procedure

The best way is to run the stored procedure yourself to generate and save an “actual” execution plan, which contains the estimates SQL Server used when it generated the plan as well as actual rowcounts, actual memory granted, etc. It will also contain a green tooltip with a “missing index request” if SQL Server thinks an index would help.

Tips on how to tune procedures with actual execution plans:

  1. If you execute the stored procedure from a free tool like SQL Sentry Plan Explorer, you can even save off the plans (which will have green missing index requests where SQL Server thinks they would help), along with runtime stats about CPU usage, reads, and duration.
  2. Runtime stats are really important and can save you tuning time. When looking at execution plans, remember that all references to “costs” are estimates. A query could have a high estimated cost, but only take 2 milliseconds of duration. Usually that’s not worth your time to tune!
  3. Most procedures are parameterized. You want to run the procedure with real world values for those parameters. You also want to test what happens when you run it with a different set of parameters and re-use the execution plan generated by the first set (see the sketch after this list).
  4. Always look at those missing index requests as suggestions. You may be able to improve upon them, and they may not always show up when an index change is needed! Look for expensive operations in queries with a long duration and ask yourself if an index might help.
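For example, with a hypothetical procedure and parameter values, a simple plan-reuse test looks like this:

/* Hypothetical procedure and parameter values */
EXEC dbo.GetInvoicesByCustomer @CustomerID = 1050;   /* compiles and caches the plan */
GO
EXEC dbo.GetInvoicesByCustomer @CustomerID = 7;      /* reuses the cached plan */
GO

/* Start fresh if you want to see how the second value compiles on its own */
EXEC sp_recompile N'dbo.GetInvoicesByCustomer';
GO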

The biggest problem is that you can’t always run a procedure against production. Sometimes the procedure modifies data, sometimes it just impacts performance too much. You can work around this by restoring a recent backup to a different environment where it’s safe to test.

If you can’t get actual execution plans with runtime stats, there are a couple of alternatives

Nearly as good: enable SQL Server 2016’s Query Store feature. It will collect runtime stats (cpu usage, duration, reads, writes) and execution plans from production. They aren’t actual plans (no actual rowcounts, etc), but it’s still a lot of useful information. However, you have to be on SQL Server 2016, and Query Store is per-database — so you need to enable it in every database the procedure touches.

One way that sometimes works is to get the cached execution plan from memory in production along with the query execution stats (cpu usage, duration, reads, writes). You can do this using a free script like sp_BlitzCache from Brent Ozar Unlimited, or Glenn Berry’s free scripts from SQL Skills. However, the plan isn’t always in memory when you look, and these aren’t actual plans, either.

Methods I would avoid

I wouldn’t spend much time looking at the “Missing Index DMVs” for this particular problem. In theory, you can record what indexes SQL Server has asked for, let things run, and then see what requests are new. But in practice, I’ve always found this information to be too incomplete and frustrating, because it doesn’t tell you how long things took, SQL Server doesn’t always ask for an index, and the requests aren’t always perfect. I find it much easier to have the execution plans and queries for reference — ideally actual plans, but at least estimated plans with runtime stats.

I would avoid running a trace to collect execution plans in production. Whether with Profiler, Server Side Trace, or Extended Events, collecting execution plans unfortunately slows down your SQL Server– so you’re impacting performance for the server, and also skewing your measurement. (Vote up this Connect bug on the issue if you like traces and wish this was better. The bug is closed, but sometimes closed bugs are reopened and fixed.)

When Did SQL Server Last Update That Statistic? How Much Has Been Modified Since? And What Columns are in the Stat?


Whether I’m working as a DBA, a consultant, a teacher, or just answering questions in my inbox, I always end up needing a script to inspect statistics one way or another.

Here are some freshly written scripts for a classic DBA question: what’s going on in my stats?

How to get statistics details on SQL Server 2008 R2 and higher

For most modern versions of SQL Server, I like to join to sys.dm_db_stats_properties() — you can get a LOT of detail in a single query! (This works with SQL Server 2008 R2 SP2+ / SQL Server 2012 SP1+ / All higher versions)

Here’s the query, looking at a sample table in the WideWorldImporters database:

SELECT 
	stat.auto_created,
	stat.name as stats_name,
	STUFF((SELECT ', ' + cols.name
		FROM sys.stats_columns AS statcols
		JOIN sys.columns AS cols ON
			statcols.column_id=cols.column_id
			AND statcols.object_id=cols.object_id
		WHERE statcols.stats_id = stat.stats_id and
			statcols.object_id=stat.object_id
		ORDER BY statcols.stats_column_id
		FOR XML PATH(''), TYPE
	).value('.', 'NVARCHAR(MAX)'), 1, 2, '')  as stat_cols,
	stat.filter_definition,
	stat.is_temporary,
	stat.no_recompute,
	sp.last_updated,
	sp.modification_counter,
	sp.rows,
	sp.rows_sampled
FROM sys.stats as stat
CROSS APPLY sys.dm_db_stats_properties (stat.object_id, stat.stats_id) AS sp
JOIN sys.objects as so on 
	stat.object_id=so.object_id
JOIN sys.schemas as sc on
	so.schema_id=sc.schema_id
WHERE 
	sc.name= 'Warehouse'
	and so.name='StockItemTransactions'
ORDER BY 1, 2;
GO

The output looks like this:

Click to see a larger image


Can you guess why the top row has NULL values for last_updated, modification_counter, rows, and rows_sampled? (Once you have your guess, the answer is here.)

What if my SQL Server is a dinosaur? Here’s a script for SQL Server 2005 and 2008.

If you're using SQL Server 2008 or prior, you don't have the luxury of sys.dm_db_stats_properties().

For these instances, it's not a big deal to get the date statistics were last updated – we can call the STATS_DATE() function. But when it comes to how many rows have been modified, we're really guessing: we have to use a column called rowmodctr in sys.sysindexes, which is only an estimate of how many rows have changed. It was also wildly inaccurate in some versions of SQL Server 2005.

But a rough estimate is better than no information, as long as you know it's a guess!

Here’s the query:

SELECT 
    stat.auto_created,
    stat.name as stats_name,
    STUFF((SELECT ', ' + cols.name
        FROM sys.stats_columns AS statcols
        JOIN sys.columns AS cols ON
            statcols.column_id=cols.column_id
            AND statcols.object_id=cols.object_id
        WHERE statcols.stats_id = stat.stats_id and
            statcols.object_id=stat.object_id
        ORDER BY statcols.stats_column_id
        FOR XML PATH(''), TYPE
    ).value('.', 'NVARCHAR(MAX)'), 1, 2, '')  as stat_cols,
    stat.filter_definition,
    stat.is_temporary,
    stat.no_recompute,
    STATS_DATE(stat.object_id, stat.stats_id) as last_updated,
    ISNULL(
    /* Index stats */
    (SELECT rowmodctr
        FROM sys.sysindexes as sysind
        JOIN sys.indexes as ind on 
            sysind.id=ind.object_id
            and sysind.indid=ind.index_id
        where sysind.id=stat.object_id
        and ind.name=stat.name 
    ),
    /* Column stats */
    (SELECT rowmodctr
        FROM sys.sysindexes as sysind
        where sysind.id=stat.object_id
        and sysind.indid in (0,1)
    )) 
        AS estimated_modification_counter
FROM sys.stats as stat
JOIN sys.objects as so on 
    stat.object_id=so.object_id
JOIN sys.schemas as sc on
    so.schema_id=sc.schema_id
WHERE 
    sc.name= 'Warehouse'
    and so.name='StockItemTransactions'
ORDER BY 1, 2;
GO

And here’s how the output looks:

Click to see a larger image


See how the estimated modification counter is totally off in the top row, and for the column statistic? That’s because this table has a clustered columnstore index — and ye olde rowmodctr column totally doesn’t understand what’s going on with that!

This script is going to be wrong about things like Columnstore indexes. But if you have Columnstore indexes, you’re on a version of SQL Server that lets you use the non-Dinosaur script above, which is more accurate.

TLDR;

Make sure you use the script for the correct version of SQL Server that you’re running. The “dinosaur” script is less accurate, but better than nothing when you need this.

Does Truncate Table Reset Statistics?


Short answer: the SQL Server optimizer will know that the table was truncated, but statistics might not update when you expect.

For the long answer, let's walk through an example using the WideWorldImporters sample database. I'll be using Trace Flags 3604 and 2363 to get SQL Server to print information about how it optimized my query out to the messages tab. (Thanks to Paul White for blogging about this trace flag.)

First, a fresh restore of WideWorldImporters

USE master;
GO

IF DB_ID('WideWorldImporters') IS NOT NULL
ALTER DATABASE WideWorldImporters SET OFFLINE WITH ROLLBACK IMMEDIATE

RESTORE DATABASE WideWorldImporters FROM DISK=
    'S:\MSSQL\Backup\WideWorldImporters-Full.bak'
    WITH REPLACE
GO

USE WideWorldImporters;
GO

Before we do anything, what do the statistics look like on Sales.OrderLines?

Here’s the query that I’m using to inspect the statistics:

SELECT 
    sp.last_updated,
    stat.name as stats_name,
    STUFF((SELECT ', ' + cols.name
        FROM sys.stats_columns AS statcols
        JOIN sys.columns AS cols ON
            statcols.column_id=cols.column_id
            AND statcols.object_id=cols.object_id
        WHERE statcols.stats_id = stat.stats_id and
            statcols.object_id=stat.object_id
        ORDER BY statcols.stats_column_id
        FOR XML PATH(''), TYPE
    ).value('.', 'NVARCHAR(MAX)'), 1, 2, '')  as stat_cols,
    sp.modification_counter,
    sp.rows,
    sp.rows_sampled
FROM sys.stats as stat
CROSS APPLY sys.dm_db_stats_properties (stat.object_id, stat.stats_id) AS sp
JOIN sys.objects as so on 
    stat.object_id=so.object_id
JOIN sys.schemas as sc on
    so.schema_id=sc.schema_id
WHERE 
    sc.name= 'Sales'
    and so.name='OrderLines'
ORDER BY 1 DESC
GO

Statistics were last updated on June 2, 2016. We’ll be mostly looking at the statistic on Quantity throughout the example, so I’ve highlighted it:

(Image: statistics on Sales.OrderLines before any changes)

Let’s run a query that loads the statistic on Quantity

Before we truncate the table, let’s take a peek into how SQL Server optimizes a query that cares about rows in Sales.OrderLines with Quantity > 10. I’m using trace flags 3604 and 2363 to make SQL Server print information about how it used statistics to optimize this to my messages tab.

SELECT *
FROM Sales.OrderLines
WHERE Quantity > 10
    OPTION
(
    QUERYTRACEON 3604,
    QUERYTRACEON 2363,
    RECOMPILE
)
GO

Here’s the info on the messages tab:

Begin selectivity computation

Input tree:

  LogOp_Select

      CStCollBaseTable(ID=1, CARD=231412 TBL: Sales.OrderLines)

      ScaOp_Comp x_cmpGt

          ScaOp_Identifier QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity

          ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=10)

Plan for computation:

  CSelCalcColumnInInterval

      Column: QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity

Loaded histogram for column QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity from stats with id 7

Selectivity: 0.44231

Stats collection generated: 

  CStCollFilter(ID=2, CARD=102356)

      CStCollBaseTable(ID=1, CARD=231412 TBL: Sales.OrderLines)

End selectivity computation

Estimating distinct count in utility function

Input stats collection:

    CStCollFilter(ID=2, CARD=102356)

        CStCollBaseTable(ID=1, CARD=231412 TBL: Sales.OrderLines)

Columns to distinct on:QCOL: [WideWorldImporters].[Sales].[OrderLines].OrderLineID


Plan for computation:

  CDVCPlanUniqueKey

Result of computation: 102356


(102035 row(s) affected)

Highlights: one of the first things SQL thinks about is the number of rows in the table

Right at the beginning, we see: “CStCollBaseTable(ID=1, CARD=231412 TBL: Sales.OrderLines)”

That ‘CARD’ number is the optimizer thinking about how many rows are in this table. If you glance back up at the table statistics, the most recent statistic to be updated was on the ‘LastEditedWhen’ column. When that statistic was updated, there were 231,412 rows in the table.

SQL Server decides that it wants detail on the Quantity column to figure out how to run this query, so we see that it loads that statistic up to use: “Loaded histogram for column QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity from stats with id 7”

Alright, let’s truncate this table

I wipe out all the rows with this command:

TRUNCATE TABLE Sales.OrderLines;
GO

Now, I wouldn’t expect truncating the table to automatically update the statistics.

SQL Server updates statistics when they’re used to optimize a query — so if nobody queries this table for six months, I wouldn’t expect the stats to update for six months.

Let’s re-run our query, trace flags and all:

SELECT *
FROM Sales.OrderLines
WHERE Quantity > 10
    OPTION
(
    QUERYTRACEON 3604,
    QUERYTRACEON 2363,
    RECOMPILE
)
GO

The messages tab has less info this time – it's much more concise!

Begin selectivity computation

Input tree:

  LogOp_Select

      CStCollBaseTable(ID=1, CARD=1 TBL: Sales.OrderLines)

      ScaOp_Comp x_cmpGt

          ScaOp_Identifier QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity

          ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=10)

Plan for computation:

  CSelCalcFixedFilter (0.3)

Selectivity: 0.3

Stats collection generated: 

  CStCollFilter(ID=2, CARD=1)

      CStCollBaseTable(ID=1, CARD=1 TBL: Sales.OrderLines)

End selectivity computation


(0 row(s) affected)

SQL Server knows that we blew away all those rows

This time we see “CARD=1 TBL: Sales.OrderLines”

SQL Server doesn’t like to estimate 0 for empty tables. It likes to estimate 1. It knows this table is empty.

With this information, it chooses a different plan for computation. The plan doesn’t require looking at the quantity column this time– we don’t have any lines about that at all.

But the statistics don’t look any different

You might expect to see that the statistic on Quantity had updated. I expected it, before I ran through this demo.

But SQL Server never actually had to load up the statistic on Quantity for the query above. So it didn’t bother to update the statistic. It didn’t need to, because it knows that the table is empty, and this doesn’t show up in our column or index specific statistics.

To verify, I just rerun my metadata query above, and things look the same:

(Image: statistics after the truncate and the query)

What if the table has exactly one row?

Let’s insert one and find out:

INSERT INTO [Sales].[OrderLines] (OrderLineID, OrderID, StockItemID, Description, PackageTypeID, Quantity, UnitPrice, TaxRate, PickedQuantity, PickingCompletedWhen, LastEditedBy, LastEditedWhen)
     VALUES (1, 45, 164, '32 mm Double sided bubble wrap 50m', 7, 50, 112.00, 15.000, 50, '2013-01-02 11:00:00.0000000', 4, '2013-01-02 11:00:00.0000000')
GO

Now we run our familiar query, with all its merry trace flags:

SELECT *
FROM Sales.OrderLines
WHERE Quantity > 10
    OPTION
(
    QUERYTRACEON 3604,
    QUERYTRACEON 2363,
    RECOMPILE
)
GO

And here’s what SQL Server has to say about optimizing that…

Begin selectivity computation

Input tree:

  LogOp_Select

      CStCollBaseTable(ID=1, CARD=1 TBL: Sales.OrderLines)

      ScaOp_Comp x_cmpGt

          ScaOp_Identifier QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity

          ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=10)

Plan for computation:

  CSelCalcColumnInInterval

      Column: QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity

Loaded histogram for column QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity from stats with id 7

Selectivity: 1

Stats collection generated: 

  CStCollFilter(ID=2, CARD=1)

      CStCollBaseTable(ID=1, CARD=1 TBL: Sales.OrderLines)

End selectivity computation


(1 row(s) affected)

One row is enough to use our column statistic

Looking at the beginning, CARD=1 for Sales.OrderLines, just like it did after we truncated the table. But SQL Server does something different this time, indicating that it now knows that the table isn’t really empty.

It goes back to the CSelCalcColumnInInterval plan to optimize. And it loads up the column stat for the Quantity column.

Since this statistic was loaded into memory, it should have auto-updated based on my database settings. Sure enough, it did:

(Image: statistics after the truncate, the query, and the insert)

SQL Server knows when you've truncated a table

And the fact that the table has been truncated may mean that it doesn’t need to use statistics on the table when optimizing queries. After all, it’s an empty table, so it can take shortcuts!

So don’t get too confused if statistics look way out of date for a truncated table. Instead, ask yourself, “why am I querying a truncated table?” (Related disclaimer: I only tested this on SQL Server 2016.)
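
If out-of-date-looking statistics on a truncated table bother you (or a query really will need them soon), you can always refresh them yourself instead of waiting for auto-update to kick in. A minimal sketch:

UPDATE STATISTICS Sales.OrderLines;
GO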

Want to learn more about statistics in SQL Server? Start here.

 

 

 

Actual Time Statistics in Execution Plans: Elapsed CPU Time and more


One of the coolest things to come to SQL Server Management Studio in a long time might be hard to see at first: it's tucked away in the Properties Window.

But once you see it, it might just be something that you use all the time.

SQL Server now shows Actual Elapsed CPU Time and Actual Elapsed Time (duration) for each operator in an Actual Execution Plan

For SQL Server 2016 and 2014 SP2 and higher, actual execution plans contain a bunch of new information on each operator, including how much CPU they burn, how long it takes, and how much IO is done by that operator. This was a little hard to use for a while because the information was only visible in the XML of the execution plan.

But now you can see this information with just a few clicks in the properties window of a plan. Here’s the announcement from the SSMS team.
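
If you prefer a script to the SSMS toolbar button, you can also capture an actual plan, with these runtime stats embedded in it, like this (a minimal sketch; the SELECT is just a stand-in for your own query):

SET STATISTICS XML ON;
GO

SELECT COUNT(*)
FROM Sales.Invoices;
GO

SET STATISTICS XML OFF;
GO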

Watch an example: Finding Actual Time Statistics in an Execution Plan

Here’s an animated gif of running a query with ‘Actual Execution Plans’ turned on, then right-clicking to see the Properties window and looking at Elapsed CPU Time (ms) and Elapsed Time (ms) for specific operators:

(Image: animated demo of finding actual elapsed time in the properties window)

Click to see a larger image

Some things to note:

  • The query took just over 8 seconds to execute. (It’s shorter in the .gif because I sped up the video)
  • The total CPU time on the index scan shows as 20+ seconds. That's because it's adding up the CPU time for all the threads – this query is running at MAXDOP 4. The threads were pretty evenly distributed on the index scan, and the longest one took about 5.2 seconds
  • Notice how Thread 0 doesn't accrue time? It's the "watcher thread" that just waits for the worker threads to complete. It's supposed to be lazy like that.
  • The threads on the Stream Aggregate operator took some time as well – 8.4 seconds on the longest thread

Wait, Kendra, those don’t add up!

No, they don’t. That’s OK! As data flows through this execution plan, multiple operators can be busy at the same time. SQL Server doesn’t have to wait for the index scan to finish before it works on the Compute Scalar and the Stream Aggregate operator – in this case, these operators are non-blocking. (Craig Freedman writes about blocking vs. non-blocking operators in this post.)

Let’s watch the same query with Live Query Statistics turned on

Here’s the same query running with Live Query Stats. You can see that the data is flowing through and that multiple operators are working at the same time.

We get a lot of cool information from this view, but we don’t get the same detail that we get for Actual Elapsed CPU Time and Actual Elapsed Time in the properties window of an actual execution plan. These two tools complement each other nicely.

(Image: animated demo of the query running with Live Query Statistics)

Click to see a larger image

You can also see Actual IO Statistics

 

It's not just about the CPUs: for each operator, you can also see how many logical reads, read-ahead reads, and physical reads were done by that operator. And it's all right there in the properties window now. Enjoy!

Collecting the Blocked Process Report (XEvents and Server Side Trace)


I'm a big fan of the built-in Blocked Process Report in SQL Server. It's come in handy for troubleshooting blocking situations for me many times.

I wanted a friendly way to share code to configure and manage the Blocked Process Report, so I’ve created a gist on GitHub sharing TSQL that:

  • Enables the Blocked Process Report (BPR)
  • Collects the BPR with an Extended Events trace
  • Collects the BPR using a Server Side SQL Trace (in case you don't care for XEvents or are running an older version of SQL Server)
  • Lists out the Extended Events and SQL Traces you have running, and gives you code to stop and delete traces if you wish

View or download the code from GitHub, or get it below.

Extra: If you find this code useful, you’ll probably also enjoy Michael J. Swart’s Blocked Process Report Viewer on Codeplex.

View this code snippet on GitHub.
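
If the embedded gist doesn't render for you, the core pieces look roughly like this. This is just a sketch: the file path is a placeholder, and the gist linked above has the full, tested version with more options:

/* 1. Ask SQL Server to generate a blocked process report after 5 seconds of blocking */
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
EXEC sp_configure 'blocked process threshold (s)', 5;
RECONFIGURE;
GO

/* 2. Collect the reports with an Extended Events session (adjust the file path) */
CREATE EVENT SESSION [Blocked Process Report] ON SERVER
    ADD EVENT sqlserver.blocked_process_report
    ADD TARGET package0.event_file (SET filename = N'S:\XEvents\blocked_process_report.xel');
GO

ALTER EVENT SESSION [Blocked Process Report] ON SERVER STATE = START;
GO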

 

 

Columnstore Indexes and Computed Columns in SQL Server 2016


You can’t do everything with a columnstore index — but SQL Server’s optimizer can get pretty creative so it can use a columnstore index in ways you might not expect.

You can’t put a computed column in a columnstore index

If you try to create a nonclustered columnstore index on a computed column, you’ll get error message 35307:

Msg 35307, Level 16, State 1, Line 270

The statement failed because column 'BirthYear' on table 'FirstNameByBirthDate_1976_2015' is a computed column. Columnstore index cannot include a computed column implicitly or explicitly.
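
For context, BirthYear in my test table is a computed column along these lines. This is a hypothetical reconstruction; the important part is simply that it's computed from FakeBirthDateStamp:

/* Hypothetical definition -- your computed column may be defined differently */
ALTER TABLE dbo.FirstNameByBirthDate_1976_2015
    ADD BirthYear AS YEAR(FakeBirthDateStamp);
GO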

But SQL Server may still decide to use a Columnstore index for a query specifying a computed column!

I went ahead and created a nonclustered columnstore index on the other columns in my table, like this:

CREATE NONCLUSTERED COLUMNSTORE INDEX col_dbo_FirstNameByBirthDate_1976_2015
    on dbo.FirstNameByBirthDate_1976_2015 
    ( FakeBirthDateStamp, FirstNameByBirthDateId, FirstNameId, Gender);
GO

Then I ran this query against the table, which groups rows by the computed column, BirthYear:

SELECT TOP 3
    BirthYear,
    COUNT(*) as NameCount
FROM dbo.FirstNameByBirthDate_1976_2015
WHERE BirthYear BETWEEN 2001 and 2015
GROUP BY 
    BirthYear
ORDER BY COUNT(*) DESC;
GO

Looking at the execution plan, SQL Server decided to scan the non-clustered columnstore index, even though it doesn’t contain the computed column BirthYear! This surprised me, because I have a plain old non-clustered index on BirthYear which covers the query as well. I guess the optimizer is really excited about that nonclustered columnstore.

Click for larger image


The columnstore index isn’t the best choice for this query:

  • Duration using nonclustered rowstore index on computed BirthYear: 2.722 seconds
  • Duration using nonclustered columnstore index: 5.5 seconds

Where’s BirthYear? Let’s look at the Compute Scalar farthest to the right

Clicking on that compute scalar operator and looking at the properties window, we can see that SQL Server looked up the definition for the computed column, and figured out that the computation is based on columns in our nonclustered index– so it could scan that index, then run the computation for each row.

(Image: defined values on the compute scalar operator)

SQL Server is waiting until the third operator, a filter, to filter out the rows for BirthYear between 2001 and 2015:

(Image: the first three operators in the execution plan)

The cost estimate on that Compute Scalar is waaaayyy low…

This is an actual execution plan, so I have Actual Time Statistics, and I can see exactly how much CPU was burned to compute BirthYear for every row. Scrolling up in the properties window, I find that this took almost five seconds for each thread that worked on the compute scalar. That’s more than 80% of the query’s duration just to figure out BirthYear.

Oops!

(Image: actual time statistics on the compute scalar operator)

 

I can rewrite my query a bit to push that filter down…

My original query has the predicate "BirthYear BETWEEN 2001 and 2015". Let's change that predicate to a non-computed column:

SELECT TOP 3
    BirthYear,
    COUNT(*) as NameCount
FROM dbo.FirstNameByBirthDate_1976_2015 
WHERE FakeBirthDateStamp >= CAST('2001-01-01' AS DATETIME2(0))
    and FakeBirthDateStamp < CAST('2016-01-01' AS DATETIME2(0))
GROUP BY 
    BirthYear
ORDER BY COUNT(*) DESC;
GO

I’m still using the computed column BirthYear in my SELECT and GROUP BY.

SQL Server still chooses the columnstore index for this query, but now there is a predicate on the columnstore index scan itself:

(Image: predicate on the columnstore index scan)

This means far fewer rows are flowing into the compute scalar operator — we don’t have to calculate BirthYear for any of the rows from 1976 through the end of 2000.

Sure enough, it’s faster

Making this change to the query text makes our nonclustered columnstore index highly competitive with Ye Olde covering rowstore b-tree index:

  • Duration using nonclustered rowstore index on computed BirthYear: 2.722 seconds
  • Duration using nonclustered columnstore index with original query: 5.5 seconds
  • Duration using nonclustered columnstore index with predicate re-written to not reference computed column: 2.2 seconds

If we couldn’t re-write the predicate easily for whatever reason, we might choose to keep the non-clustered rowstore index on BirthYear around and use OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX) in our query.
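
That hint looks like this in practice (a minimal sketch using the original form of the query):

SELECT TOP 3
    BirthYear,
    COUNT(*) as NameCount
FROM dbo.FirstNameByBirthDate_1976_2015
WHERE BirthYear BETWEEN 2001 and 2015
GROUP BY 
    BirthYear
ORDER BY COUNT(*) DESC
OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX);
GO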

Be careful with computed columns and columnstore

I had assumed the optimizer would be reluctant to create a plan for a computed column, since that column can't be in the columnstore index. But it turned out to be pretty eager to do it.

If you’ve got computed columns and are testing out columnstore, look carefully at your queries and check to make sure you don’t have any super-expensive compute scalar operators showing up in your plans where you might not want them.

Vote to allow computed columns in columnstore indexes

Wouldn’t this all be easier if you could just put the computed column in the columnstore, anyway? Vote up this Connect item.

 


Parallelism and tempdb data file usage in SQL Server


I'm sometimes asked if the number of CPU cores used by a query determines the number of tempdb files that the query can use.

Good news: even a single threaded query can use multiple tempdb data files.

First, tempdb is a tricksy place!

My first version of this post used a flawed methodology: I configured an instance of SQL Server 2016 with four equally sized, small tempdb files. I then tested a set of queries that qualify for parallelism, alternately running them with 4 processors and running them single-threaded (using OPTION (MAXDOP 1) as a query hint).

I observed that the queries always made all four files grow.

However, in that first version of the post, I forgot that the default behavior in SQL Server 2016 is to grow all files in tempdb simultaneously when one grows. Basically, one small feature of SQL Server 2016 is that trace flag 1117 is always enabled by default for tempdb. Just because all the data files grew doesn’t mean they all got used!

(For more on tempdb behavior changes in SQL Server 2016, read Aaron Bertrand’s post here.)
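
If you want to run a similar test yourself, it's worth checking first how many tempdb data files you have and whether they're evenly sized. A quick sketch:

SELECT 
    name, 
    type_desc, 
    size * 8 / 1024 AS size_mb, 
    physical_name
FROM tempdb.sys.database_files
ORDER BY type_desc, name;
GO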

Can we just turn off TF 1117 for tempdb?

For most databases, you can control whether all the files in a filegroup grow at once by changing a setting on the filegroup.

But not tempdb. If you run this command:

ALTER DATABASE tempdb MODIFY FILEGROUP [PRIMARY] AUTOGROW_SINGLE_FILE;
GO

You get the error:

Msg 5058, Level 16, State 12, Line 1
Option 'AUTOGROW_SINGLE_FILE' cannot be set in database 'tempdb'.

Let’s use science. Otherwise known as Extended Events.

I admit: I didn’t do this in the first place because it was a bunch of work. The devil is in the detail with stuff like this. A lot of events seem like they’d work, but then you try to use them and either you don’t get the data you’d expect, or the trace generates such a massive amount of data that it’s hard to work with.

I used the sqlserver.file_read event and a histogram target

I love the histogram target because it doesn’t generate a huge file of data to confuse myself with. I set up my new test this way…

Extended Events trace for sqlserver.file_read

  • Collect file paths (just to make it easy to see the data file names)
  • filter on database_id=2 (tempdb)
  • filter on session_id = 57 (the session I happened to be testing under)
  • The histogram target was set to “bucket” file_read by each file path

I set up a bunch of queries in a batch, and for each query:

  1. Dumped everything out of memory (ran a checkpoint in tempdb and dbcc dropcleanbuffers)
  2. Ran a short waitfor in case the checkpoint took a bit of time
  3. Started the trace
  4. Ran the query
  5. Dumped the data from the histogram into a table
  6. Stopped the trace (this clears the histogram)

I ran the whole thing a couple of times to make sure I got consistent results. And I did!

Results: single threaded queries CAN use multiple tempdb data files

One of my queries does a large sort operation. Using maxdop 4 and maxdop 1, it still evenly used my tempdb data files.

(Image: sort operator using multiple tempdb files)

I saw similar patterns with a query using a merge join, a query using a lazy spool, and a query with a sort that was engineered to be under-estimated (and has a memory spill).

Results: queries may not use tempdb files in the same ways

As you might guess, things may not always get evenly accessed, even if you have evenly sized tempdb files. One of my queries did a select into a temp table. Although it used all four tempdb files whether or not it went parallel, there were more file_read events against the first tempdb file than against the other three.

(Image: multiple tempdb files with different access patterns)

Could my methodology still have problems?

It’s possible. I haven’t used the sqlserver.file_read event much, or tested it super thoroughly. It seemed to give me consistent results.

So while it’s possible that I’ll suddenly realize that there’s a better way to measure this, I think at least my results are better than they were the first time I wrote this post!

Administering COTS databases (ISVs / Third Party Vendors)


I recently received this question from a reader…

I just moved from an in-house software development company to a new environment that most of the software used here are COTS (Commercial off-the-shelf)

This is totally new to me. I’m a little bit lost since I don’t know anything on the applications, users (security), or the schema. What is your recommendation to manage such an environment and perform performance tuning?

Part of the confusion is terminology

I hadn't heard the acronym "COTS" before — but that doesn't surprise me. Companies have wildly different terms they use for this. If you manage these types of databases, you may need to use multiple search terms to find more resources:

  • Third party vendor database / application
  • Independent Software Vendor database / application (ISV)

And also COTS!

Here’s a great article on administering COTS databases for DBAs

Tim Ford wrote a terrific article on this – it just doesn't come up if you search for "COTS SQL Server", because he used the term "third party" in the heading. It's great advice, and it covers a ton of bases:

Read 17 Questions Every SQL DBA Should Ask Before Supporting a New Third-Party Database by Tim Ford over on SQLMag.com.

What about Performance Tuning these databases?

Some additional questions I ask when getting to know a COTS environment are:

  1. How important is performance for each application — ranked by the users.  Performance tuning can be time consuming, and this will help you prioritize and focus your efforts.
    • List out all the applications and ask users to rank how important performance is for them, from most important to least important (no ties for rank)
    • You’ll need to figure out the name of the applications and how they map to the database names to do this (often they are different). That’ll come in VERY handy for you down the road when someone reports that “the Zebra app is slow”, so it’s totally worth it.
  2. Which vendors let you create non-clustered indexes in the databases, and which don’t? Index tuning is a major tool, and some vendors don’t mind if you add non-clustered indexes, but some forbid it.
    • First, I’d check if I have documentation from the vendor and if it covers this. If it doesn’t tell you, I’d ask the vendor.
    • Even if a vendor says “no indexes allowed”, if you hit a case where an index fixes a major performance problem when run against a restored backup of the database where you made the change, you can bring it up again with them then. Sometimes “no” turns into “just this once.”
  3. Which vendors let you create plan guides in the databases, and which don’t? Plan guides are like duct tape for query plans — they can help you change a behavior when you can’t change the query itself.
    • Many vendors may not know what a plan guide is and react with “what?” to this question.
    • If you do create plan guides without permission, make sure to remove them before running any upgrade scripts. Not that I suggested you do that 😉
  4. Does the vendor make suggestions for parallelism settings, or other SQL Server configuration settings?
    • Don't assume that the suggestions are being honored on the instance. It's worth checking what the vendor recommends, comparing it to existing settings, and investigating if they differ.
  5. Does the vendor forbid you from turning on Query Store (SQL Server 2016+)?
    • Honestly, I wouldn’t even ask this question straight out — I’d review the vendor documentation and just see if turning on Query Store is explicitly forbidden. If it’s not forbidden, I’d enable Query Store. If it is forbidden, I’d ask why.
    • Query Store is a little different from a schema change like creating indexes — you should be careful about plan freezing / pinning on sensitive vendor databases, but I can’t think of a reason why a vendor wouldn’t allow you to turn this on. It’s basically like asking, “Can I monitor this database?” Generally, yes, absolutely.

When performance tuning COTS databases (aka third party databases), I use the same performance tuning methodology I do for other databases — but I know that re-writing the queries is typically a last resort as a fix, as it’s something I’d need to request from the vendor which may take a very long time to roll out (if it happens at all).
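
Related to items 3 and 4 in the list above, here are two quick checks I run. These are simple sketches; run the plan guide query in each vendor database, since sys.plan_guides is database-scoped:

/* Any plan guides to know about (and to remove before running vendor upgrade scripts)? */
SELECT name, is_disabled, scope_type_desc, create_date
FROM sys.plan_guides;
GO

/* Current instance-level settings, to compare against the vendor's recommendations */
SELECT name, value_in_use, description
FROM sys.configurations
ORDER BY name;
GO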

 

How to Find Queries Using an Index (and Queries Using Index Hints)



Sometimes you know a query is out there, but it’s hard to find the exact query.

SQL Server stores query execution plans in cache, but it can be difficult to query the XML it stores. And there’s always a chance that the query plan won’t be there, due to memory pressure, recompile hints, or the plan cache being cleared by setting changes or other administrative actions.

I can’t guarantee that your query can always be in the plan cache. But I can make it a bit easier to find the query you’re looking for.

In this post I give example code to find queries using a specific index, or using an index hint. But you’ll find that it’s pretty easy to adapt these queries for whatever you’re looking for.

How I like to Search the Plan Cache

If I’m looking in SQL Server’s Execution Plan Cache, I like to use the sys.dm_exec_text_query_plan dynamic management view. This stores those XML query plans as text.

I learned about using this DMV from Grant Fritchey in his post, “Querying the Plan Cache, Simplified.” Grant points out that while doing wildcard searches in the text version of a query plan isn’t fast, querying it as XML is often even slower.

Things to remember when searching the plan cache:

  • Queries will be missing from the plan cache
  • The larger your plan cache and the slower your CPUs, the longer this will take: handle with care

Search Query Store, if You’ve Got It

If you’ve enabled the SQL Server 2016+ Query Store on your databases, you’ve got something better to search than the plan cache. I’m including code to search Query Store as well.

How to Find Queries Using a Specific Index

Search for queries in the execution plan cache

Simply plug the name of the index you’re looking for into this query. If you have multiple databases with the same index name, you’ll need to add additional criteria to get just the database you’re looking for.

In many cases, this will return more queries than you’re looking for, because inserts and deletes will reference all nonclustered indexes on the table. You can either add additional predicates to this query, or just look through everything on the list, depending on what you’re doing.

/* Execution plan cache */
SELECT 
    querystats.plan_handle,
    querystats.query_hash,
    SUBSTRING(sqltext.text, (querystats.statement_start_offset / 2) + 1, 
                (CASE querystats.statement_end_offset 
                    WHEN -1 THEN DATALENGTH(sqltext.text) 
                    ELSE querystats.statement_end_offset 
                END - querystats.statement_start_offset) / 2 + 1) AS sqltext, 
    querystats.execution_count,
    querystats.total_logical_reads,
    querystats.total_logical_writes,
    querystats.creation_time,
    querystats.last_execution_time,
    CAST(query_plan AS xml) as plan_xml
FROM sys.dm_exec_query_stats as querystats
CROSS APPLY sys.dm_exec_text_query_plan
    (querystats.plan_handle, querystats.statement_start_offset, querystats.statement_end_offset) 
    as textplan
CROSS APPLY sys.dm_exec_sql_text(querystats.sql_handle) AS sqltext 
WHERE 
    textplan.query_plan like '%PK_Sales_Invoices%'
ORDER BY querystats.last_execution_time DESC
OPTION (RECOMPILE);
GO

Find queries using the index in Query Store

Here’s a starter query to get you going in Query Store when you’re looking to see who’s using an index.

This query groups by the query_id and query_hash because Query Store records runtime stats for a query over multiple intervals.

Similar to the previous query, just plug in the index name you’re looking for into the query plan text:

/* Query Store */
SELECT
    qsq.query_id,
    qsq.query_hash,
    (SELECT TOP 1 qsqt.query_sql_text FROM sys.query_store_query_text qsqt
        WHERE qsqt.query_text_id = MAX(qsq.query_text_id)) AS sqltext,    
    SUM(qrs.count_executions) AS execution_count,
    SUM(qrs.count_executions) * AVG(qrs.avg_logical_io_reads) as est_logical_reads,
    SUM(qrs.count_executions) * AVG(qrs.avg_logical_io_writes) as est_writes,
    MIN(qrs.last_execution_time AT TIME ZONE 'Pacific Standard Time') as min_execution_time_PST,
    MAX(qrs.last_execution_time AT TIME ZONE 'Pacific Standard Time') as last_execution_time_PST,
    SUM(qsq.count_compiles) AS sum_compiles,
    TRY_CONVERT(XML, (SELECT TOP 1 qsp2.query_plan from sys.query_store_plan qsp2
        WHERE qsp2.query_id=qsq.query_id
        ORDER BY qsp2.plan_id DESC)) AS query_plan
FROM sys.query_store_query qsq
JOIN sys.query_store_plan qsp on qsq.query_id=qsp.query_id
CROSS APPLY (SELECT TRY_CONVERT(XML, qsp.query_plan) AS query_plan_xml) AS qpx
JOIN sys.query_store_runtime_stats qrs on qsp.plan_id = qrs.plan_id
JOIN sys.query_store_runtime_stats_interval qsrsi on qrs.runtime_stats_interval_id=qsrsi.runtime_stats_interval_id
WHERE    
    qsp.query_plan like N'%PK_Sales_Invoices%'
    AND qsp.query_plan not like '%query_store_runtime_stats%' /* Not a query store query */
    AND qsp.query_plan not like '%dm_exec_sql_text%' /* Not a query searching the plan cache */
GROUP BY 
    qsq.query_id, qsq.query_hash
ORDER BY est_logical_reads DESC
OPTION (RECOMPILE);
GO

How to Find Queries Using an Index Hint (*any* index hint)

Sometimes you just want to know if index hints are in play. If code is around hinting specific indexes, that means you need to be careful dropping or renaming those indexes– or queries may fail.

Search the execution plan cache for index hints

To find forced indexes in the plan cache, look for plans that contain '%ForcedIndex="1"%', like this:

/* Execution plan cache */
SELECT 
    querystats.plan_handle,
    querystats.query_hash,
    SUBSTRING(sqltext.text, (querystats.statement_start_offset / 2) + 1, 
                (CASE querystats.statement_end_offset 
                    WHEN -1 THEN DATALENGTH(sqltext.text) 
                    ELSE querystats.statement_end_offset 
                END - querystats.statement_start_offset) / 2 + 1) AS sqltext, 
    querystats.execution_count,
    querystats.total_logical_reads,
    querystats.total_logical_writes,
    querystats.creation_time,
    querystats.last_execution_time,
    CAST(query_plan AS xml) as plan_xml
FROM sys.dm_exec_query_stats as querystats
CROSS APPLY sys.dm_exec_text_query_plan
    (querystats.plan_handle, querystats.statement_start_offset, querystats.statement_end_offset) 
    as textplan
CROSS APPLY sys.dm_exec_sql_text(querystats.sql_handle) AS sqltext 
WHERE 
    textplan.query_plan like N'%ForcedIndex="1"%'
    and UPPER(sqltext.text) like N'%INDEX%'
OPTION (RECOMPILE);
GO

I also specify that the text of the query needs to have the word ‘INDEX’ in it (which is part of an index hint), to rule out false positives in the plan cache of queries running against system tables.

Find index hints in Query Store

To find forced indexes in Query Store, you can similarly look for plans with '%ForcedIndex="1"%', like this:

/* Query Store */
SELECT
    qsq.query_id,
    qsq.query_hash,
    (SELECT TOP 1 qsqt.query_sql_text FROM sys.query_store_query_text qsqt
        WHERE qsqt.query_text_id = MAX(qsq.query_text_id)) AS sqltext,    
    SUM(qrs.count_executions) AS execution_count,
    SUM(qrs.count_executions) * AVG(qrs.avg_logical_io_reads) as est_logical_reads,
    SUM(qrs.count_executions) * AVG(qrs.avg_logical_io_writes) as est_writes,
    MIN(qrs.last_execution_time AT TIME ZONE 'Pacific Standard Time') as min_execution_time_PST,
    MAX(qrs.last_execution_time AT TIME ZONE 'Pacific Standard Time') as last_execution_time_PST,
    SUM(qsq.count_compiles) AS sum_compiles,
    TRY_CONVERT(XML, (SELECT TOP 1 qsp2.query_plan from sys.query_store_plan qsp2
        WHERE qsp2.query_id=qsq.query_id
        ORDER BY qsp2.plan_id DESC)) AS query_plan
FROM sys.query_store_query qsq
JOIN sys.query_store_plan qsp on qsq.query_id=qsp.query_id
CROSS APPLY (SELECT TRY_CONVERT(XML, qsp.query_plan) AS query_plan_xml) AS qpx
JOIN sys.query_store_runtime_stats qrs on qsp.plan_id = qrs.plan_id
JOIN sys.query_store_runtime_stats_interval qsrsi on qrs.runtime_stats_interval_id=qsrsi.runtime_stats_interval_id
WHERE    
    qsp.query_plan like N'%ForcedIndex="1"%'
GROUP BY 
    qsq.query_id, qsq.query_hash
ORDER BY est_logical_reads DESC
OPTION (RECOMPILE);
GO

 

What’s that Garbage in my Execution Plan? (Dear SQL DBA Episode 27)



Today I was working on some code samples for a user question, and I hit a weird roadblock.

There was a bunch of garbage in my execution plan that I couldn’t explain. And by ‘garbage’, I mean a nested loop to a whole branch of code that I hadn’t asked SQL Server to run — and a warning about an implicit conversion possibly causing problems with the quality of my execution plan.

It took me a while to figure out the issue, and along the way I asked the following questions:

Am I really querying a table, or am I accidentally querying a view? (It’s a table! I checked at least three times.)

Is there some weird computed column in the table that I don’t know about? (Nope, nothing involved was a computed column, I checked that twice.)

Am I awake right now? (Pinched myself, appeared awake. I have had weird dreams about SQL Server before, though.)

Do I actually know anything about SQL Server? (Just about everyone has imposter syndrome sometimes.)

Having gone through this checklist, I decided that I was awake, and that I could figure things out by looking through the execution plan carefully.

Sure enough, calming down and stepping through the plan did the trick. Just like it always has.

Watch the 18 minute video to find out the mysteries in this execution plan, or scroll on down to read about what I discovered. If you enjoy the video, you might like to subscribe to the podcast. A review on iTunes will help other folks find out about the show.

Here’s what the weirdness looked like

I was using the WideWorldImporters sample database from Microsoft. And I was running a stupid simple query.

Stupid simple like this:

SELECT 
    CustomerName
FROM Sales.Customers;
GO

I’m selecting one column. There are no predicates at all. Sales.Customers is a normal old rowstore table, and CustomerName is nvarchar(100).

For a query like this, I’d expect a very simple plan: an index seek or scan operator to pull back the data, and a SELECT operator.

Instead, I saw an execution plan with 20 nodes, and a big old warning on the SELECT operator.

Here’s what it looked like in SQL Sentry Plan Explorer:

Click to see a larger image

If you'd like to play along in a more interactive version, here's the query over at Paste the Plan. (The website view doesn't show all the operator properties I'm going to talk about here, but you can grab the XML and use it in SSMS or Plan Explorer to see the detail if you'd like.)

Hovering over the warning on the SELECT operator, here’s the warning (this is the tooltip from SSMS):

Weird, it’s warning about the SalesTerritory column. I didn’t ask for it to do anything at all with SalesTerritory. Why is it doing a type conversion on that column?

Let’s start at the top right of the query

When I’m in doubt about an execution plan, I like to just start at the top right of the query plan and work my way through. I think of that top right operator as the “driver”.

In this case, it’s a clustered index scan of Sales.Customers. That makes sense: I asked for all the customer names. Hovering over that operator, though, there is something funny. When I look at the ‘output’ columns, it is outputting not only CustomerName, but also DeliveryCityID!

So what’s it doing with DeliveryCityID?

Moving one step to the left in the plan, there’s a nested loop operator. Hovering over that operator, it says that it outputs the CustomerName column to the select operator. (Good, because that’s what we asked for!)

It also says that the Outer References for the nested loop are based on DeliveryCityID. OK, so it’s pulling back that column because it needs it to run the nested loop. We still don’t know why, but if we hunt around in that branch of the plan, maybe there’ll be a clue.

At this point, I started hovering over operators in that branch of the plan

As in life, when you’re lost in an execution plan, move around slowly and carefully, observe your surroundings, and look for your mom. I mean, look for inspiration.

I could see that the query was pulling from the Cities and StateProvinces tables. And there were a bunch of filter operators as well.

Click to see a larger image

Here’s what the filters are doing:

  • is_rolemember(N’db_owner’)<>(0)
  • is_rolemember([Expr1011]+N’ Sales’)<>(0)
  • [Expr1012]=session_context(N’SalesTerritory’)
  • original_login()=N’Website’

This is security garbage! Err… a security feature!

Aha! This is a definite clue. Some sort of security wizardry has been applied to this table, so that when I query it, a bunch of junk gets tacked onto my query.

I have no shame in admitting that I couldn’t remember at all what feature this was and how it works. A lot of security features were added in SQL Server 2016, and the whole point of a sample database like this to kick the tires of the features.

I did a little remembering, and a little searching, and figured out that this is the Row Level Security feature (RLS) in SQL Server 2016.

Row Level Security (RLS) adds predicates to your query execution plans

Here’s how row level security works. You create a table valued function that can be “inlined” (meaning merged) into your query execution plan. That function determines who can see the data.

Then you create a Security Policy for Row Level Security which defines when the table valued function will be applied to queries against a table.

The whole point of Row Level Security is, actually, that it adds these predicates to your execution plans.
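
The two pieces look roughly like this. This is a generic sketch with made-up names, not the actual WideWorldImporters definitions, just to show the shape of an RLS filter predicate and the policy that applies it:

CREATE FUNCTION dbo.CustomerAccessPredicate (@SalesTerritory nvarchar(50))
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
    /* Return a row (allow access) for db_owner, or when the session's
       SalesTerritory matches the row's SalesTerritory */
    SELECT 1 AS AccessResult
    WHERE IS_ROLEMEMBER(N'db_owner') <> 0
        OR @SalesTerritory = CAST(SESSION_CONTEXT(N'SalesTerritory') AS nvarchar(50));
GO

CREATE SECURITY POLICY dbo.CustomerAccessPolicy
    ADD FILTER PREDICATE dbo.CustomerAccessPredicate(SalesTerritory)
    ON Sales.Customers
    WITH (STATE = ON);
GO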

How do I tell if Row Level Security changed my plan?

There’s a really easy way to tell if your plan was modified by RLS, I just didn’t know to look for it.

Click on the ‘SELECT’ operator in the plan and look down in the properties pane. If you see ‘SecurityPolicyApplied’ = True, then parts of your execution plan may have come from a table valued function that Row Level Security added in.

Should you use row level security?

Wellll…. maybe. If you’re interested, read up on the possible loopholes in RLS as it stands now, and consider if those would impact you or not. Aaron Bertrand has a great article to get you started: read SQL Server 2016 Row Level Security Limitations, Performance and Troubleshooting.

Everyone feels dumb looking at Execution Plans sometimes

I look at plans a lot, and still, they had me questioning my sanity today. When I first started doing performance tuning in SQL Server, I understood so little about plans that I gave up pretty easily.

I’m really happy that I kept going, though. Because as confusing as they are, most of the time the answers you’re looking for are right there in the plan. Somewhere.

Just keep swimming.

Should I Learn Fulltext Indexing? (Dear SQL DBA Episode 29)


This week’s question is about a longstanding feature in SQL Server that sounds really cool: full-text search. If you’re learning performance tuning, how much time should you invest in researching and learning about full-text indexes?

Watch this 18 minute video, or scroll on down to read the written scoop on full-text search.

Dear SQL DBA…

I have been doing performance tuning for about 9 months now. It puzzles me that one type of index never gets much attention: full text indexes. Are fulltext indexes a cool feature that can really help performance (all that LIKE ‘%blabla%’ predicates application developers seem to love 🙂 ) or are they quite the opposite and not worth investing time in ?

Best regards,

Puzzled about fulltext

The “dirty little secret” about full-text search indexes is that they don’t help with ‘%blabla%’ predicates.

Well, it’s not a secret, it’s right there in the documentation.

A lot of us get the impression that full-text search is designed to handle “full wildcard” searches, probably just because of the name. “Full-Text Searches” sounds like it means “All The Searches”. But that’s not actually what it means.

What is full-text search good for?

Full-text indexes can help with:

  • Prefix searches. It’s good for ‘bla%’
  • Phrases containing words. So it’s good for ‘So blabla to you’
  • Different forms of a word / synonyms (is there a synonym for blabla? I don’t know!)
  • Words near one another in a document (‘bla’ is in a document in proximity to ‘blop’)

Full-text search also has special features like stoplists and stopwords to keep the index from becoming more bloated than it has to be, and help searches be more efficient.

One way to think about this is that full-text search is designed to be smart about language: it thinks about phrases, synonyms, how words are used, things like that.

A pure wildcard search of ‘%blabla%’ isn’t really about language. That’s just looking for a pattern somewhere in a string.
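
To make that distinction concrete, here's a hedged sketch. It assumes a hypothetical dbo.Documents table with a full-text index already created on the Body column:

/* Full-text handles prefix searches like this efficiently */
SELECT DocumentId
FROM dbo.Documents
WHERE CONTAINS(Body, N'"bla*"');
GO

/* But a leading-wildcard LIKE still has to scan every row's text --
   the full-text index doesn't help here */
SELECT DocumentId
FROM dbo.Documents
WHERE Body LIKE N'%blabla%';
GO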

For wildcard searches and regular expression queries, secondary applications like Lucene are attractive, and these days in the cloud there are options like Lucene Query in Azure Search.

Aside: Azure Search is easy to play with for free

A while back I wrote a post called Wildcard vs Regular Expressions – Lucene Query in Azure Search.

It shows how easy it is to play around with testing non-sargable wildcard searches like '%blabla%' against online sample data in Azure. All you need is a browser, it's totally free, and you don't even have to create an Azure account.

Fulltext indexes and performance

I’ve run into quite a few companies using full-text search. Most of them were using it pretty lightly, and it rarely was something they asked me for help with: they set it up following the documentation, and it just worked. There were quite a few cases where I’d say something about seeing a full-text index when looking over an instance, and my client laughed and said they’d forgotten they even used full-text. (If you think about it, that’s a compliment to the feature.)

I’ve also run into some folks who’ve used full-text search so heavily that they pushed the boundaries of the feature: very large multi-terabyte databases pulling in large volumes of data.

Keeping data in sync with heavy update rates

With heavy to ultra-heavy usage, one issue with full-text indexes is that they don’t update synchronously with the base table. This is helpful for performance for inserts, updates, deletes into the base table, because updating a large full-text index can take time. But it does mean that if your application allows both queries of the base table AND the full-text index, people could see different, contradictory data if the full-text index is behind.

What if corruption strikes?

And as with any other index, you can get corruption in a full-text index. That’s not necessarily the SQL Server’s fault: corruption can come from the storage subsystem. If your full-text index gets corrupt, you’re probably going to have to rebuild it.

If you’re working with giant full-text indexes, recreating the index can add up to a lot of downtime. Thinking about how your tables are laid out and breaking your indexes into manageable chunks becomes very important at scale.

I think full-text search is here to stay, it’s just getting interesting company

This is an older feature, so there’s always that question as to how “fresh” it is.

Microsoft has invested in making full-text indexes perform better over the years. The feature was revamped in 2008 and has received a variety of performance fixes since then. A new DMV, sys.dm_fts_index_keywords_position_by_document, was added in SQL Server 2016 and also backported to previous versions.

Full-text search is well maintained by Microsoft. I don’t think it’s going anywhere.

What about semantic search?

In SQL Server 2012, Microsoft added the semantic search feature built on top of full-text search. Semantic search helps identify the main phrases in a document and can find and compare similar/related documents.

Semantic is one of those features that dropped in and then seemed to disappear from the conversation, though.

I haven’t heard of its capabilities being strongly expanded in later versions, and I know people who evaluated it in SQL Server 2012 who found it to be too much of a “v1 feature” to fit their needs, compared to features offered by third-party vendors with semantic search tools.  (Of course, they were evaluating native semantic search because not everything was perfect with their third party app, either.)

Here is one such investigation into semantic search by Joe Sack – Exploring Semantic Search Key Term Relevance.

If you use semantic search in production and know about improvements that I’m unaware of, I’d love to hear about it in the comments!

How much time should you invest in learning full-text indexes?

To sum up, full-text indexing is fairly widely used, but most of the folks using it are doing so on a small scale where it “just works.” Those companies are unlikely to have a high bar on full-text index skills when it comes to hiring, and they may not even ask you questions about it at all in a job interview.

For most folks, I think it’s worth knowing the basic limitations of full-text and what the feature does.

A one-time investment of an hour to read and make notes for yourself is generally enough to get you to a point where you can identify potential use cases. If you ever find those use cases, at that point you can invest more time in evaluating how well full-text fits that implementation.

After getting the big picture from this post, reading the Books Online page on full-text search is probably good enough for most people. That’s where I’d spend the rest of your hour.

After that, I wouldn’t invest a bunch of time learning about full-text indexes unless you’ve got a specific reason. You’re better off investing your time learning about wait statistics, tuning TSQL using execution plans, rowstore indexes, columnstore indexes, Query Store, and In-Memory indexes.

Some fun related topics: building your own type of full-text index, and querying with regular expressions

Aaron Bertrand writes about building your own word-part index: One way to get an index seek for a leading %wildcard

Dev Nambi created an open-source project, sql-server-regex, that uses the SQLCLR and "lets you run regular expressions in T-SQL queries using scalar and table-valued functions." I know for a fact that Dev is crazy good at this stuff, because I worked with him for several years out there in the real world. He's a unicorn.
