en-UShe-IL
You are here:   Blog
Register   |  Login

Blog Archive:

Maximize
* Can be used in order to search for older blogs Entries

Search in blogs


Blog Categories:

Maximize
* Can be used in order to search for blogs Entries by Categories

Blog Tags:

Maximize
* Can be used in order to search for blogs by keywords

TNWikiSummit


Awared MVP

 


Microsoft® Community Contributor 


Microsoft® Community Contributor


 Read first, before you you use the blog! Maximize

Recent Entries

Minimize
מאי25

Written by: ronen ariely
25/05/2015 13:10 RssIcon

Introduction

In some cases the SQL Server query optimizer engine fail to build the best execution plan. This is the place where our abilities as DBAs can be handles. One of these cases is when we are using windows functions like ROW_NUMBER, which use "ORDER BY" the clustered index. Since our data is already sorted by the clustered index, there is no reason to sort the data again. In simple query the SQL Server query optimizer "understand" it, but in more complex queries it might fail. In this short post, I will show a small trick to improve our query and make sure that the windows function do not waist resource on sorting the data.

* This post followed an answer that was suggested in forum, and it is based on their question case study (therefore the columns names for example, which use a reserved name)

DDL+DML

 Let's create new database for our testing, with new tables and some sample data.

create database Test
GO
 
use Test
GO
 
/***************************************  DDL - queries to create the tables! */
 
CREATE TABLE tblWords(
    ID       INT Primary key,
    name     NVARCHAR(100)
)
GO
 
CREATE TABLE tblFiles(
    ID       INT Primary key,
    name     NVARCHAR(100)
)
GO
 
CREATE TABLE tblWordsAndFiles(
    word_id       INT FOREIGN KEY REFERENCES tblWords(ID) NOT NULL,
    [file_id]     INT FOREIGN KEY REFERENCES tblFiles(ID) NOT NULL -- This is very bad name for column! This is a reserved name
)
GO
 
/***************************************  DML - queries to insert some sample data */
insert tblWords (ID,name)
values
(1, 'hello'),
(2, 'world'),
(3, 'something')
GO
 
insert tblFiles (ID,name)
values
(1, 'myfile.txt'),
(2, 'file.txt')
GO
 
insert tblWordsAndFiles (word_id,[file_id])
values
(1, 1),
(2, 1),
(2, 2),
(3, 2)
GO
 
create CLUSTERED INDEX IX_tblWordsAndFiles_fileId
    ON tblWordsAndFiles([file_id]);
GO
 
select * from tblWordsAndFiles
select * from tblWords
select * from tblFiles
GO


Simple solution which use redundant sorting operation

 

/***************************************************************   */
-- The Windows function sort the data even if we have cloustered index!
-- it is not recognize that the data is alredy sorted :-(
select *,
    tblWords.ID, ROW_NUMBER() over (partition by tblWordsAndFiles.[file_id] order by tblWordsAndFiles.[file_id]) RN
from tblWords
left join tblWordsAndFiles on tblWordsAndFiles.word_id = tblWords.ID
GO


 

Small trick to improve the query

In the original query the SQL Server query optimization engine fails and use sorting for the windows function. but once we use a simple query with the windows function the SQL Server optimization engine find that we do not need to sort the data, since the data ordered by the clustered index. Therefore, we can use sub query for the windows function and use the result for the JOIN operation.

-- Next trick :-)
select *, tblWords.ID
from tblWords
left join (select *, ROW_NUMBER() over (partition by [file_id] order by [file_id]) RN from tblWordsAndFiles) as tblWordsAndFiles
    on tblWordsAndFiles.word_id = tblWords.ID
GO

 

Comparing Execution Plan

The execution plan show us that the second query use 28% vs 72% resources comparing to the original query. There is no Sorting operation, which is the most expensive operation in the first query. 

Execution Plan