RSS Feed

ROW_NUMBER()

ROW_NUMBER()

The ROW_NUMBER function returns the number of a row within a result set starting with 1 for the first row.

Syntax of ROW_NUMBER()

Row_NUMBER() OVER ([partition_by_clause] order_by_clause)

partition_by_clause divides the result set into groups to which the ROW_NUMBER() function is applied. The function is applied to each partition separately; computation restarts for each partition.

order_by_clause specifies the order in which the sequential ROW_NUMBER() values are assigned.
[Source: Programming SQL Server 2005 by Bill Hamilton]

Example of ROW_NUMBER()

Lets create a table and insert data into it.
CREATE TABLE Match_IPL
(
  Player_ID INT IDENTITY(1,1) NOT NULL,
  Player_Name varchar(30),
  Team varchar(30)
)

 Insert into Match_IPL values('Sachin', 'Mumbai')
 Insert into Match_IPL values('Mahender', 'Chennai')
 Insert into Match_IPL values('Yuvraj', 'Punjab')
 Insert into Match_IPL values('Nehra', 'Chennai')
 Insert into Match_IPL values('Bhajji', 'Mumbai')
 Insert into Match_IPL values('Ricky', 'Mumbai')
 Insert into Match_IPL values('Patel', 'Punjab')

Select * from Match_IPL

Player_ID Player_Name Team
1 Sachin Mumbai
2 Mahender Chennai
3 Yuvraj Punjab
4 Nehra Chennai
5 Bhajji Mumbai
6 Ricky Mumbai
7 Patel Punjab


Lets write a query using ROW_NUMBER():
SELECT  *,
  ROW_NUMBER() OVER (ORDER BY Player_Name DESC) AS RN,
  ROW_NUMBER() OVER (PARTITION BY Team ORDER BY Player_Name DESC) AS RNP
FROM    Match_IPL
ORDER BY Player_Name DESC

Player_ID Player_Name Team RN RNP
3 Yuvraj Punjab 1 1
1 Sachin Mumbai 2 1
6 Ricky Mumbai 3 2
7 Patel Punjab 4 2
4 Nehra Chennai 5 1
2 Mahender Chennai 6 2
5 Bhajji Mumbai 7 3

Above, sequential numbers are assigned to each partition in a result set. The result set is ordered by Player_Name in descending order. Values in the RN column are simply based on the sort of Player_Name but the values in the RNP column are first grouped by (partitioned by) Team and then numbered by Player_Name, with the row number resetting on change of Team.

MSDN says There is no guarantee that the rows returned by a query using ROW_NUMBER() will be ordered exactly the same with each execution unless the following conditions are true.

a. Values of the partitioned column are unique.
b. Values of the ORDER BY columns are unique.
c. Combinations of values of the partition column and ORDER BY columns are unique.

ROW_NUMBER() is nondeterministic.

As per 'Beginning SQL Server 2012 for Developers by Robin Dewson', Ranking functions like ROW_NUMBER can be used only with the SELECT and ORDER BY statements. Sadly, they can’t be used directly in a WHERE OR GROUP BY clause, but you can use them in a CTE or derived table.

Benefits of ROW_NUMBER
1. ROW_NUMBER, allows your code to guarantee an ascending sequence of numbers to give each row a unique number. Until now, it has not been possible to guarantee sequencing of numbers, although an IDENTITY-based column could potentially give a sequence providing all INSERTs succeeded and no DELETEs took place. This function is ideal for giving your output a reference point, for example, "Please take a look at row 10 and you’ll see...".

2. Another use for this function is to break the data into exact chunks for scrolling purposes in GUI systems. For example, if five rows of data are returned, row 1 could be displayed, and then “Next” would allow the application to move to row 2 easily rather than using some other method.
[Source: Beginning SQL Server 2012 for Developers by Robin Dewson]

3. Removing duplicate records: If you have data loaded into an SQL Server table - possibly a staging table - and you want to remove any duplicate records then you can use the ROW_NUMBER() function and CTE to reduplicate records. [Source: SQL Server 2012 Data Integration Recipes: Solutions for Integration Services by Adam Aspin]

Row_Number VS Identity ()
The new ROW_NUMBER() function provides a much more flexible way to number rows in our result sets that does the IDENTITY() function. The IDENTITY() function (similar to the IDENTITY property in a table column) provides a way to automatically number rows in a result set. However, the IDENTITY() function can only be used in SELECT ... INTO queries. The ROW_NUMBER() function, on the other hand, can be used in any SELECT statement and is quite configurable via the partition and order by clauses. [Source: The Real MCTS SQL Server 2008 Exam 70-433 Prep Kit: Database Design]