en-UShe-IL
You are here:   Blog
Register   |  Login

Blog Archive:

Maximize
* Can be used in order to search for older blogs Entries

Search in blogs


Blog Categories:

Maximize
* Can be used in order to search for blogs Entries by Categories

Blog Tags:

Maximize
* Can be used in order to search for blogs by keywords

TNWikiSummit


Awared MVP 


 


Microsoft® Community Contributor 


Microsoft® Community Contributor


Read first, before you you use the blog

Maximize
פבר11

Written by: ronen ariely
11/02/2019 08:02 RssIcon

Good day guys,

In this post I want to present several tips which can help in daily queries. This post does not present an advanced topic and it does not bring some high insights, but these tips can be very useful and I already wrote them, so why not present them.

* You can understand more about what I mean by "I already wrote them" in the off-topic section, or simply skip the stories and move to the technical part.


Off-topic: I already wrote it, so why not present it

Well, if you follow my work online and my contribution to the technical community, then you probably know that my main focus is helping people in forums. In fact, I serve as Moderator in different forums from 1999. I just noticed that I am celebrating 20 years as Moderator in forums.

You might think that in 20 years I have seen all sorts of people, all kinds of habits, but believe me that from time to time I see something new. It might be something unique which someone does good for the community, unfortunately it can also be a new low level of behavior, which drives me crazy, and it might be a result of misunderstanding the way online community works. This is basically what happened to me which led me post this blog.

yesterday, I saw a question at stackoverflow and I decided to help the OP (OP = Original Poster = the person who start a new thread in forum). The content of the question implied that the OP has a lot to learn, and I love to teach. I wanted to provide him a detailed answer with explanation.

I spent several hours to prepare the answer (spent some time to format the answer in a nice readable way, collect images, external links which can help, and writing the content), and I posted my first version of the answer. I waited for the OP to respond and fill some missing information in his original question, while I already prepare more materials that I would love to provide him.

This is probably not the worst behavior which I encountered in the forums during the last 20 years, but it's certainly on the top. What happened next was annoying, and it caused to the time I spent writing the answer to be a wasted.

I want to clarify an important point! When we write an answer and/or participate in discussions in communities' forums, the assumption is that not only the OP will see the answer but other people as well. This mean that our contribution has more impact than answering a question in private, which a big reason to why I prefer to help in forums. Deleting a thread after the discussion started takes from the community that option.

Well, I already wrote it, so I thought to myself why not present it, hoping it will be useful for others.


Technical scenario - Original Question

The OP provided DDL+DML(well done!) and a query which according to him is working well but he needed a second step or in other words he need to get more information in the result SET which he did not know how to get.

This is the DDL+DML which the OP provided

CREATE TABLE [test_table] ([ID] INTEGER NULL,[Name] VARCHAR(MAX) NULL,[Price] VARCHAR(100) NULL,[Quantity] INTEGER NULL, Total as Price * Quantity
)
GO
INSERT INTO test_table([ID],[Name],[Price],[Quantity]) VALUES
(1,'hafi','0200',3),(2,'nec','0210',4),(3,'non','0542',6),(4,'Integer','0122',4),(5,'eget','0789',2),(6,'ultrices','0392',15),(7,'dui.','0168',3)
,(8,'elit.','0655',21),(9,'semper','0042',3),(10,'cursus','0551',4),(11,'Nunc','0674',10),(12,'Mauris','056',14),(13,'facilisis.','063',2)
,(14,'eget','0170',10),(15,'egestas.','0574',26),(16,'aliquet','0276',8),(17,'Morbi','0108',3),(18,'diam.','0073',6)
,(19,'amet','0120',7),(20,'sociis','0160',9),(21,'mauris','0220',12),(22,'magna.','0549',16),(23,'et','0570',15);
GO

And this is the original query which the OP provided

declare @target int = 700
 
--first i will filter items with total less than or equal to target
declare  @table table (Name varchar(50), Price money, Quantity int, Total money, ID int)
insert into @table (Name, Price, Quantity, Total) select Name, Price, Quantity, Total
from dbo.test_table where total <= @target
 
--then i will set quantity lower for items with total more than target but price less than or equal to target
--ex. quantity in value that quantity*price <= @target
insert into @table (Name, Price, Quantity, Total) (select Name, Price,
(case when
(Price *(SELECT ROUND(@target / Price, 0))) > @target then (SELECT ROUND(@target / Price, 0)-1)
else ((SELECT ROUND(@target / Price, 0)))end),
(case when
(Price *(SELECT ROUND(@target / Price, 0))) > @target then (Price *(SELECT ROUND(@target / Price, 0)-1))
else (Price *(SELECT ROUND(@target / Price, 0))) end
)from dbo.test_table where Price <= @target and Total > @target)
 
--declare @id int = 0
--update @tableZ set ID = @id, @id = @id+   case when ... then 1 else 0
 
select * from @table

The Result SET which the query returns:

In addition to the above results the OP requested to get another column, which present a calculated integer values using the above result SET, based on the order of the result SET we see in the image above.

The final requested result was presented in the following image:

So, can you point what is/are the problem(s) with this question, from the technical perspective?

I highly recommend the readers to take a two minutes to think about it, before you keep reading.
"Don't cheat! you only cheating yourself" - this is what my teachers always used to say in school. It did not helped🤫

Well, this what I what to discuss with you in the next chapter, since this the above question includes several very problematic points which people should understand and avoid


Point 1: using Sub-Queries for constant is a BAD idea!

You can noticed that the OP query include sub-queries like "(SELECT ROUND(@target / Price, 0))". 

These sub-queries will lead the SQL Server Query Optimizer Engine to use a very bad Execution Plan, since it assume that these are a real tables which needed to query and . 

The original query includes several statements.  The above image shows the Execution Plan (EP) of the second statement which includes multiple sub-queries "(SELECT ROUND(@target / Price, 0))". This sub-query does not point any table or SET of data but simply returns the calculated value "ROUND(@target / Price, 0)". Therefore, we can use this calculation directly.

Check what happens when I simply remove the word "SELECT" and use the value directly:

Instead of using "(SELECT ROUND(@target / Price, 0))" we simply need to use the content without the "SELECT" like this: "(ROUND(@target / Price, 0))".


Point 2: Do not re-invent what has already been invented!

The bellow complex code is simply equivalent of using the built-in function FLOOR

(
    case
    when (Price *(SELECT ROUND(@target / Price, 0))) > @target then (SELECT ROUND(@target / Price, 0)-1)
    else ((SELECT ROUND(@target / Price, 0)))
    end
)

We can replace the above code with:

FLOOR(@target / Price)


Point 3: By default, don't add steps when you can solve the issue with in-line query! 

SQL Server does not "understand" English and it does not Executes the query as it is. First, the server needs to parse the query, build multiple optional Execution Plans (EP), and select the EP based on built-in algorithms and multiple parameters (like which indexes exists, statistics of the data in the table and in the indexed, session properties, hints, and more). 

The EP is built according to the entire in-line statement. The server tries to find an Execution Plan which give good performance according to the entire in-line statement. By spiting the query into multiple statements, the server does not evaluate the entire solution but each statement is evaluated separately. The Execution Plan which the server create will focus on performance of the specific statement ignoring the other statements and probably the final EP for the entire query will be poor (there are exceptions like always, but this is the default you should follow). 

In the original post the OP used a variable table, and he filled it twice using two separate statements which is very bad idea.


Conclusion of points 1-3: here is my equivalent solution with DRAMATICALLY better performance

declare @target int = 700
select
    ID , Name, Price,
    -- set quantity lower for items with total more than target but price less than or equal to target
    Quantity = CASE
        WHEN Price <= @target and Total > @target THEN FLOOR(@target / Price)
        ELSE Quantity
    END,
    Total  = CASE
        WHEN Price <= @target and Total > @target THEN Price * FLOOR(@target / Price)
        ELSE Total
    END
from dbo.test_table
where (total <= @target) or (Total > @target and Price <= @target)
GO

And the result SET which I get is the same as the one the OP posted in his image


Note! in the above image of my result set I re-ordered the result using Excel, so it will be simpler to compare and see that we got the exact same result SET. If you will execute my query, then you will noticed that the order of the rows is different from the order of the rows in the image, which the OP provided. This takes us to the next IMPORTANT POINT!


Point 4! Order is not guaranteed without ORDER BY

"The order in which rows are returned in a result set are not guaranteed unless an ORDER BY clause is specified."

SQL Server like other RDBMS is designed to work with SET of data and NOT row by row. In order to provide best performance, the server will not sort the data in the result SET, unless we tell him to do so EXPLICITLY using the statement "ORDER BY"!

Moreover, the data is not necessarily sorted in the data file in specific order, and in addition when the server read the data it might use a single thread or multiple threads. The server does not read the data in specific order and even if 99.999% of the times he did read the data in specific order, it does not guaranty that it will do the same next time!

What next?

The original request of the OP was to provide another column in the result SET named ID, which include ranking which is based on the order of the result set. But the result SET which we have does not include any column which fit the order which the OP present in the image. The requirement based on calculation of the values in specific order!  

but, I have just said that "The order in which rows are returned in a result set are not guaranteed unless an ORDER BY clause is specified". 

This mean that there is no way to solve this request using only the information that we got from the OP! We must have a clear logic to sort the rows BEFORE we can calculate the new column.

Let me summarize the conclusions which we should take from this case


Conclusions

First of all let me ask you NOT to change a message in forums once it got responses. Edit the message might lead the responses to be seem irrelevant. Moreover, it reduce the value of the discussion and prevent reader to follow the progress of the discussion. But even if you had to edit your original message DO NOT CLEAN/REMOVE THE MESSAGE and delete the thread! Respect the free support you got and remember to say thanks even if you think that it was useless.

From the Technical side of the conclusions let's remember these four tips

  1. using Sub-Queries for constant is a BAD idea!
  2. Do not re-invent what has already been invented - this is why we have search Engines.
  3. By default, don't add steps when you can solve the issue with in-line query!
  4. Do not count on the order in which rows are returned in a result set unless an ORDER BY clause is specified.

And last but not least...
keep reading blogs 😍