51

I have to add a row number to my existing query so that I can track how much data has been added to Redis. If the query fails, I can then restart from the row number that was last recorded in another table.

Query to get the data starting after row 1000 of the table:

SELECT * FROM (SELECT *, ROW_NUMBER() OVER (Order by (select 1)) as rn ) as X where rn > 1000

The query works fine. Is there any way to get the row number without using ORDER BY?

What is select 1 here?

Is the query optimized, or can I do it another way? Please suggest a better solution.

gotqn
lucy

4 Answers

91

There is no need to worry about specifying a constant in the ORDER BY expression. The following is quoted from Microsoft SQL Server 2012 High-Performance T-SQL Using Window Functions by Itzik Ben-Gan (it was available as a free download from the Microsoft free e-books site):

As mentioned, a window order clause is mandatory, and SQL Server doesn’t allow the ordering to be based on a constant—for example, ORDER BY NULL. But surprisingly, when passing an expression based on a subquery that returns a constant—for example, ORDER BY (SELECT NULL)—SQL Server will accept it. At the same time, the optimizer un-nests, or expands, the expression and realizes that the ordering is the same for all rows. Therefore, it removes the ordering requirement from the input data. Here’s a complete query demonstrating this technique:

SELECT actid, tranid, val,
 ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS rownum
FROM dbo.Transactions;


Observe in the properties of the Index Scan iterator that the Ordered property is False, meaning that the iterator is not required to return the data in index key order.


The above means that when you order by a constant, no sorting is performed. I strongly recommend reading the book, as Itzik Ben-Gan describes in depth how window functions work and how to optimize various cases in which they are used.
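To check this behaviour yourself, a minimal sketch along the following lines may help. The table `dbo.Demo` and its columns are hypothetical names made up for illustration (they are not from the book); the idea is to run both SELECT statements and compare their actual execution plans.

```sql
-- Hypothetical demo table; all names here are assumptions for illustration.
CREATE TABLE dbo.Demo
(
    id  INT NOT NULL PRIMARY KEY,   -- becomes the clustered index by default
    val INT NOT NULL
);

INSERT INTO dbo.Demo (id, val)
VALUES (1, 30), (2, 10), (3, 20);

-- Constant ordering: no ordering requirement is placed on the input,
-- so row numbers are assigned in whatever order the rows arrive.
SELECT id, val,
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rownum
FROM dbo.Demo;

-- Ordering by a column with no supporting index: the plan typically
-- needs a Sort operator to satisfy the window order.
SELECT id, val,
       ROW_NUMBER() OVER (ORDER BY val) AS rownum
FROM dbo.Demo;
```

The first query should show no Sort in its plan, but the row numbers it produces are not guaranteed to be the same from one execution to the next.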

gotqn
  • So will the query SELECT * FROM (SELECT *, ROW_NUMBER() OVER (Order by (select NULL)) as rn ) as X where rn > 1000 always give the correct result? That is, does the order remain the same for every execution of the query? – lucy May 22 '17 at 07:01
  • 2
    @lucy No, the order does not remain the same for every execution of the query. Using `(SELECT constant)` means that there is NO ORDER. You are selecting a specific portion of the data: you can execute the query 1 million times and it may return the same data every time, but there is no guarantee of this. Without a specific ordering, the result is not deterministic. – gotqn May 22 '17 at 07:34
11

Try just order by 1. Read the error message. Then reinstate the order by (select 1). Realise that whoever wrote this has, at some point, read the error message and then decided that the right thing to do is to trick the system into not raising an error rather than realising the fundamental truth that the error was trying to alert them to.

Tables have no inherent order. If you want some form of ordering that you can rely upon, it's up to you to provide enough deterministic expression(s) to any ORDER BY clause such that each row is uniquely identified and ordered.

Anything else, including tricking the system into not emitting errors, is hoping that the system will do something sensible without using the tools provided to you to ensure that it does something sensible - a well specified ORDER BY clause.
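As a rough sketch of what a well-specified ORDER BY could look like for the resume-after-failure case in the question: assume a hypothetical table `dbo.SourceData` with a unique `id` column (neither name appears in the question). Ordering by that key makes the numbering repeatable, provided the data does not change between runs.

```sql
-- Hypothetical names: dbo.SourceData and id. Assumes id is unique
-- (for example, the primary key), so every row has one fixed position.
DECLARE @LastRow INT = 1000;   -- last row number recorded as processed

SELECT X.*
FROM (
    SELECT S.*,
           ROW_NUMBER() OVER (ORDER BY S.id) AS rn   -- unique key gives deterministic numbering
    FROM dbo.SourceData AS S
) AS X
WHERE X.rn > @LastRow
ORDER BY X.rn;                 -- the outer ORDER BY fixes the order of the returned rows
```

If rows can be inserted or deleted between runs, the row numbers shift. In that case, recording the last processed `id` and filtering on `id > @LastId` may be a more robust way to resume.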

Damien_The_Unbeliever
  • So am I doing it right by using select 1? I did not find anything else for this problem. – lucy May 22 '17 at 06:37
  • I partially disagree with the statement, "_Tables have no inherent order._" I've found (when selecting from a Table) it will return its Records using the Clustered-Index Order every time. Now start joining tables and making a more complicated query, then I think the Default-Order may be up to whatever route the Query-Optimizer takes. BTW: The Questioner is **not** asking about leaving the Order-By Clause out altogether. You could still have one, she just wanted to know if it could be left unspecified in the `Row_Number()` expression, which it can: `ROW_NUMBER() OVER (ORDER BY (SELECT NULL))` – MikeTeeVee Jul 12 '18 at 12:48
  • 1
    @MikeTeeVee - it's a topic that has been *done to death*. There is no inherent user observable order. *Often* a select from a single table will follow the clustered index order but SQL Server offers no *guarantee* that it's the case. Even in this case, though, you can end up with a carousel scan - your query piggy-backs on another query that has already started a table scan. Your query starts getting the results from the middle of the table because that's where the scan was and then will start a second scan to finish off. Out of order results even with just the clustered index defined. – Damien_The_Unbeliever Jul 12 '18 at 13:11
  • Wow, I had no clue. Thanks for the super-informative response! After all these years, I feel like a newb again. – MikeTeeVee Jul 12 '18 at 13:42
9

You can use any literal value, for example:

order by (select 0)

order by (select null)

order by (select 'test')

etc

Refer to this for more information: https://exploresql.com/2017/03/31/row_number-function-with-no-specific-order/

Madhivanan
  • gotqn said in a comment that select 1 means a random order. So if I add the condition row number > 1000 to the query, it may return previously processed records because the order is random. Or is it correct to use? – lucy May 22 '17 at 06:44
  • Will select 0 give the same order every time I execute the query? – lucy May 22 '17 at 06:47
  • @lucy - maybe it will sometimes, but it is not guaranteed since there is then no order. To guarantee the same results, use some data that is unique for each record in the ORDER BY clause. – Reversed Engineer May 11 '19 at 11:04
0

What is select 1 here?

In this scenario, the author of the query does not have any particular sort order in mind. ROW_NUMBER requires an ORDER BY clause, so providing one is a way to satisfy the parser.

Sorting by a "constant" produces a non-deterministic order (the query optimizer is free to choose whatever order it finds suitable).

The easiest way to think about it is that a plain constant is rejected:

ROW_NUMBER() OVER(ORDER BY 1)    -- error
ROW_NUMBER() OVER(ORDER BY NULL) -- error

There are a few ways to supply a constant expression that "tricks" the query optimizer:

ROW_NUMBER() OVER(ORDER BY (SELECT 1)) -- already presented

Other options:

ROW_NUMBER() OVER(ORDER BY 1/0)       -- should not be used
ROW_NUMBER() OVER(ORDER BY @@SPID)
ROW_NUMBER() OVER(ORDER BY DB_ID())
ROW_NUMBER() OVER(ORDER BY USER_ID())

db<>fiddle demo

Lukasz Szozda