SQL for Beginners Series - Session 9
GROUP BY | HAVING | DISTINCT in SQL
Please click here for you tube video Answer
--**********************************************************************************
-- Session 9 - GROUP BY / HAVING / DISTINCT data from a table (Microsoft SQL Server)
--**********************************************************************************
DROP TABLE IF EXISTS tblEmployee
CREATE TABLE tblEmployee ( -- Drop table tblEmployee
Empid INT NOT NULL,
EmpName Varchar(20) NULL,
EmpLocation Varchar(20),
DOJ DATE,
Salary INT)
INSERT INTO tblEmployee (Empid,EmpName,EmpLocation,DOJ,Salary)
Values (1,'Smith','India', '2020-01-02',20000),(1,'Smith','India', '2020-01-02',20000),(1,'Smith','India', '2020-01-02',20000)
INSERT INTO tblEmployee (Empid,EmpName,EmpLocation,DOJ,Salary)
Values (2,'George Mark','US', '2019-04-28',50000),(2,'George Mark','US', '2019-04-28',50000)
INSERT INTO tblEmployee (Empid,EmpName,EmpLocation,DOJ,Salary)
Values (3,'Emp3','UK', '2018-04-28',25000),(4,'Emp4','UK', '2016-04-25',60000), (3,'Emp3','UK', '2018-04-28',25000),(4,'Emp4','UK', '2016-04-25',60000)
--> DISTINCT - eliminated the duplicate rows
SELECT * FROM tblEmployee ORDER BY Empid
SELECT DISTINCT Empid,EmpName,EmpLocation,DOJ,Salary FROM tblEmployee ORDER BY Empid
By using Distinct we can get the distinct combination of rows based on the Column list specified
--> GROUP BY / HAVING
SELECT Empid,EmpName,EmpLocation,DOJ,Salary, COUNT(*) AS Dups_Cnt FROM tblEmployee
GROUP BY Empid,EmpName,EmpLocation,DOJ,Salary
HAVING COUNT(*) > 2 --> Apply after the grouping
ORDER BY Empid
In the above query it will provide the Unique combination of records and how many times they are repeated.
Having would filter the grouped rows. In the above query it will get the records which are duplicated more than twice.
SELECT * FROM tblEmployee ORDER BY Empid
SELECT Empid,EmpName,EmpLocation,DOJ,Salary, COUNT(*) AS Dups_Cnt FROM tblEmployee
WHERE EmpLocation <> 'US' --> Apply before grouping
GROUP BY Empid,EmpName,EmpLocation,DOJ,Salary
HAVING COUNT(*) > 1 --> Apply after the grouping
ORDER BY Empid
--> WHERE Clause filters before grouping
--> HAVING Clause filters after grouping
-- Logical processing of SELECT Statement
For this please watch my you tube video to know the order of execution of select statement
SELECT Empid,EmpName,EmpLocation,DOJ,Salary, COUNT(*) AS Dups_Cnt FROM tblEmployee
WHERE EmpLocation <> 'US' --> Apply before grouping
GROUP BY Empid,EmpName,EmpLocation,DOJ,Salary
HAVING COUNT(*) > 1 --> Apply after the grouping
ORDER BY Empid