Showing posts with label CTE. Show all posts
Showing posts with label CTE. Show all posts

Thursday, May 30, 2013

SQL Query to return a table containing all dates falling within a date range along with their week day names

Following is the T-SQL code to return all dates falling within a given date range. This uses CTE for populating all dates, and store them in a temporary table.

DECLARE @StartDate datetime
DECLARE @EndDate datetime
SET @StartDate = '01-Apr-2013'
SET @EndDate = '30-Apr-2013'

;WITH DatesList
AS
(
SELECT @StartDate [fldDate]
UNION ALL
SELECT fldDate + 1 FROM DatesList WHERE fldDate < @EndDate
)

SELECT fldDate, DATENAME(dw, fldDate) as fldDayFullName INTO #DatesList FROM DatesList option (maxrecursion 0)

SELECT * FROM #DatesList

Tuesday, May 15, 2012

Find pattern-matched duplicate rows within a group using PARTITION BY in SQL Server.


This article aims to explain how to find duplicate rows within a group matching patterns.

Scenario:
Following is the structure and data of “tblPeople” table:

PersonID
CityName
ForeNames    
LastName
1
Ahmedabad
Bhavin
Parikh
2
Ahmedabad
Bharat
Soni
3
Ahmedabad
Jagdish
Shah
4
Ahmedabad
Byomesh
Soni
5
Ahmedabad
Jagdish
Shyam
6
Vadodara
Bhavin
Patel
7
Vadodara
Amay
Kapoor
8
Vadodara
Ajit
Shah
9
Vadodara
Ajit
Shah
10
Vadodara
Soham
Patel
11
Vadodara
Amay
Shah
12
Surat
Rajesh
Desai
13
Surat
Bhavin
Parikh
14
Surat
Bhaumik
Shah

Our SQL should have input parameters for ForeNames, LastName, and CityName to define search pattern for duplicate items search.
Parameters - @foreNames, @lastName, @cityName
Our objective is as follow:
1.       If none of the above parameters are supplied, find all persons having same ForeNames and LastName within same City.
Expected result:

2.       If @cityName is only supplied, then find all persons having same ForeNames within particular City.
Expected result: (input >> @cityName = ‘%Vad%’)


3.       If @foreNames is supplied (with/ without @cityName supplied), but @lastName is not supplied.
In that case, consider all persons matching @foreNames within same city as duplicates, and return them.
Expected result: (input >> @foreNames=’%Bh%’)


4.       If @lastName is supplied (with /without @cityName supplied), but @foreNames is not supplied.
In that case, consider all persons matching @lastName within same city as duplicates, and return them.
Expected result: (input >> @lastName=’%Sh%’)



5.       If @foreNames and @lastName are supplied (with/without @cityName supplied).
In that case, consider all persons matching @foreNames, and @lastName within same city as duplicates, and return them.
Expected result: (input >> @foreNames=’%A%’, @lastName=’%S%’, @cityName=’%Ahm%’)

Solution:
First of all, we cannot use GROUP BY here, because we need also need ID, and other details of the rows which are duplicated. GROUP BY simply cannot allow selecting those columns which are not used in grouping or aggregate.

To achieve this, we need to partition the rows in groups of “City”, “Forenames”, and “LastName”. Here, while partitioning the rows we also need to consider pattern-matching for ForeNames and LastName.

This can be done using PARTITION BY in SELECT query, and within single T-SQL statement only.

Following is the T-SQL serving the purpose:

WITH CTE AS
(SELECT ID, City, ForeNames, LastName,
COUNT(*) OVER(PARTITION BY City,
CASE WHEN @foreNames <> '' THEN @foreNames
ELSE
CASE WHEN @lastName <> '' THEN '' ELSE ForeNames END
END,
CASE WHEN @lastName <> ''  THEN @lastName
ELSE
CASE WHEN @foreNames <> '' THEN ''
ELSE
CASE WHEN @cityName <> '' THEN ''
ELSE LastName
END
END
END) as MAXCount
FROM tblPeople
WHERE ForeNames LIKE  
CASE WHEN @foreNames <> '' THEN @foreNames ELSE '%' END
AND LastName LIKE CASE WHEN @lastName <> '' THEN @lastName ELSE '%' END
AND City LIKE CASE WHEN @cityName <> '' THEN @cityName ELSE '%' END
)

SELECT * FROM CTE WHERE MAXCount > 1

Here, I have used Common Table Expression (but you can use temporary table to store the result of SELECT statement) before further narrowing down the resultset to exclude non-duplicates.

The point of interest in above T-SQL is how the field “MAXCount” gets generated.
COUNT(*) – indicates the total number of rows in each partition. This is what ultimately will be stored in “MAXCount” field.

Now, each partition is built using following values:
City >> Because we need to find duplicates within the same city always, so this column is an obvious choice in partition.

ForeNames >> Partition on this column depends upon @foreNames parameter.
·         If it is not supplied, and neither @lastName is supplied it will create partition by grouping rows having exact ForeNames.
·         If it is not supplied, but @lastName is supplied, it will ignore ForeNames column in partitioning.
·         If supplied, it will create partition by grouping all matching rows in a single partition.

LastName >> Partition on this column depends upon @lastName parameter.
·         If it is not supplied, and not even @foreNames, and @cityName, then it will create partition by grouping rows having same LastName.
·         If it is not supplied, but either @foreNames, or @cityName is supplied, it will ignore LastName column in partitioning.
·         If supplied, it will create partition by grouping all matching rows in single partition.

Wednesday, April 11, 2012

SQL Query to return each date of month in each different row

Sometimes we need a resultset in SQL that should list all dates of a particular month.
For example, you want to return a resultset containing total of all expenses done against each date. But for that your expense table not necessary to contain entry of each and every date.
There may be some dates for which there is no expense, and simply that date will not be in the expense table.
In such scenarios you would always need to have one resultset (virtual table) containing list of all dates, which you can use to join (Left Join, ideally) with your expenses table by dates to get your desired result.

This is possible using CTE by performing recursive SELECT.

Following is the complete SQL Query to generate list of all dates for a particular month:



Image: Resultset after executing below SELECT Query
P.S. - There are actually 31 rows, its just due to screen truncate it showing 27 in image.


DECLARE @MonthName VARCHAR(3)
DECLARE @Year VARCHAR(4)
DECLARE @MaxDay INT

SET @MonthName = 'MAR'
SET @Year = '2012'
SET @MaxDay = 31;

WITH CTE_MonthDays AS (
SELECT 1 AS fldDay,
CONVERT(datetime, '1/' + @MonthName + '/' + @Year) AS fldDate
UNION ALL
SELECT fldDay + 1,
CONVERT(Datetime, CONVERT(varchar(2), fldDay + 1) + '/' + @MonthName + '/' + @Year) AS fldDate
FROM CTE_MonthDays
WHERE
fldDay + 1 <= @MaxDay
)

SELECT fldDate FROM CTE_MonthDays