Structured Query Language (SQL) is the standard language for managing and manipulating data in relational database management systems (RDBMS). The GROUP BY clause groups rows that share the same values so that aggregates can be computed for each group. In this article, we will explore the GROUP BY clause in detail, including its syntax and practical examples of its most common use cases.
Group By SQL Statement
The GROUP BY clause groups rows that have the same values in the specified columns into summary rows. It’s most often used with aggregate functions such as COUNT, SUM, AVG, MAX, and MIN to perform calculations on each group.
Syntax
GROUP BY SQL Syntax
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
ORDER BY column_name(s);
Demo Database
For demonstration purposes, let’s consider a sample table called Employees with the following schema:
SQL code to create the Employees table
CREATE TABLE Employees (
    EmployeeID INT,
    FirstName VARCHAR(255),
    LastName VARCHAR(255),
    Department VARCHAR(255),
    Salary DECIMAL(10, 2),
    HireDate DATE
);
Employees Table
EmployeeID | FirstName | LastName | Department | Salary | HireDate |
---|---|---|---|---|---|
1 | John | Doe | Sales | 50000.00 | 2020-01-15 |
2 | Jane | Smith | Marketing | 60000.00 | 2019-07-10 |
3 | Michael | Johnson | Sales | 55000.00 | 2021-02-20 |
4 | Emily | Brown | HR | 48000.00 | 2020-11-05 |
5 | David | Martinez | IT | 65000.00 | 2018-05-12 |
Let’s explore the most common use cases of GROUP BY with examples.
1. Finding the Average Salary in Each Department:
To find the average salary for each department in our company, group the rows by Department and apply AVG to the Salary column:
SQL Code
SELECT Department,AVG(Salary) AS AvgSalary
FROM Employees
GROUP BY Department;
Result
Department | AvgSalary |
---|---|
Sales | 52500.00 |
Marketing | 60000.00 |
HR | 48000.00 |
IT | 65000.00 |
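To try the average-salary query without a database server, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The function name and the added ORDER BY (used only to make the row order predictable) are assumptions for illustration and are not part of the original query.

```python
import sqlite3

# The five sample rows from the Employees table above.
EMPLOYEES = [
    (1, "John", "Doe", "Sales", 50000.00, "2020-01-15"),
    (2, "Jane", "Smith", "Marketing", 60000.00, "2019-07-10"),
    (3, "Michael", "Johnson", "Sales", 55000.00, "2021-02-20"),
    (4, "Emily", "Brown", "HR", 48000.00, "2020-11-05"),
    (5, "David", "Martinez", "IT", 65000.00, "2018-05-12"),
]


def average_salary_by_department():
    """Return (Department, AvgSalary) rows, sorted by department name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Employees (EmployeeID INT, FirstName VARCHAR(255), "
        "LastName VARCHAR(255), Department VARCHAR(255), "
        "Salary DECIMAL(10, 2), HireDate DATE)"
    )
    conn.executemany("INSERT INTO Employees VALUES (?, ?, ?, ?, ?, ?)", EMPLOYEES)
    rows = conn.execute(
        "SELECT Department, AVG(Salary) AS AvgSalary "
        "FROM Employees GROUP BY Department "
        "ORDER BY Department"  # assumption: added for a stable row order
    ).fetchall()
    conn.close()
    return rows


for department, avg_salary in average_salary_by_department():
    print(f"{department}: {avg_salary:.2f}")
```

The two Sales rows collapse into a single row whose average is 52500.00, while the departments with one employee simply return that employee's salary.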
2. Counting the Number of Employees in Each Department:
SQL Code
SELECT Department, COUNT(*) AS NumEmployees
FROM Employees
GROUP BY Department;
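As a quick check of what the count looks like on the sample data, the sketch below runs the query in an in-memory SQLite database from Python. The reduced two-column table, the function name, and the ORDER BY are assumptions made to keep the example short.

```python
import sqlite3

# (Department, Salary) pairs copied from the sample Employees rows.
SAMPLE_ROWS = [
    ("Sales", 50000.00),
    ("Marketing", 60000.00),
    ("Sales", 55000.00),
    ("HR", 48000.00),
    ("IT", 65000.00),
]


def count_by_department():
    """Return (Department, NumEmployees) rows, sorted by department name."""
    conn = sqlite3.connect(":memory:")
    # Assumption: a reduced schema with only the columns this query needs.
    conn.execute("CREATE TABLE Employees (Department TEXT, Salary REAL)")
    conn.executemany("INSERT INTO Employees VALUES (?, ?)", SAMPLE_ROWS)
    rows = conn.execute(
        "SELECT Department, COUNT(*) AS NumEmployees "
        "FROM Employees GROUP BY Department ORDER BY Department"
    ).fetchall()
    conn.close()
    return rows


print(count_by_department())
# [('HR', 1), ('IT', 1), ('Marketing', 1), ('Sales', 2)]
```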
3. Finding the Maximum Salary in Each Department:
SQL Code
SELECT Department, MAX(Salary) AS MaxSalary
FROM Employees
GROUP BY Department;
4. Grouping by Month to Analyze Sales Data:
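Suppose we have a Sales table with columns SaleID, SaleDate, and Amount. We can analyze sales data by grouping transactions by month and calculating the total sales amount for each month. The query below uses DATE_FORMAT, which is a MySQL function; other databases provide their own equivalents (for example, TO_CHAR in PostgreSQL and Oracle, or strftime in SQLite):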
SQL Code
SELECT DATE_FORMAT(SaleDate, '%Y-%m') AS Month, SUM(Amount) AS TotalSales
FROM Sales
GROUP BY DATE_FORMAT(SaleDate, '%Y-%m');
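The monthly grouping can be sketched in SQLite as well, where strftime('%Y-%m', SaleDate) plays the role of MySQL's DATE_FORMAT. The sales rows below are invented purely for illustration, and the function name is hypothetical.

```python
import sqlite3

# Hypothetical sales rows (SaleID, SaleDate, Amount), invented for this example.
SALES = [
    (1, "2024-01-05", 100.0),
    (2, "2024-01-20", 250.0),
    (3, "2024-02-03", 75.5),
    (4, "2024-02-28", 24.5),
    (5, "2024-03-10", 300.0),
]


def total_sales_by_month():
    """Return (Month, TotalSales) rows, one per calendar month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Sales (SaleID INT, SaleDate TEXT, Amount REAL)")
    conn.executemany("INSERT INTO Sales VALUES (?, ?, ?)", SALES)
    rows = conn.execute(
        # strftime is SQLite's counterpart to MySQL's DATE_FORMAT.
        "SELECT strftime('%Y-%m', SaleDate) AS Month, SUM(Amount) AS TotalSales "
        "FROM Sales "
        "GROUP BY strftime('%Y-%m', SaleDate) "
        "ORDER BY Month"
    ).fetchall()
    conn.close()
    return rows


print(total_sales_by_month())
# [('2024-01', 350.0), ('2024-02', 100.0), ('2024-03', 300.0)]
```

Grouping on the formatted string rather than on the raw date is what merges all sales from the same month into one row.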
5. Grouping by Multiple Columns:
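For instance, let’s say we want to count the number of employees for each combination of department and job title. This assumes the Employees table also has a JobTitle column, which the demo schema above does not include: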
GROUP BY SQL Code
SELECT Department, JobTitle,
COUNT(*) AS NumEmployees
FROM Employees
GROUP BY Department, JobTitle;
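Since the demo schema has no JobTitle column, the following sketch builds a small hypothetical table that includes one. The rows and job titles are invented for illustration only. It shows that each distinct (Department, JobTitle) pair becomes its own group.

```python
import sqlite3

# Hypothetical (Department, JobTitle) rows; JobTitle is not in the demo schema.
STAFF = [
    ("Sales", "Representative"),
    ("Sales", "Representative"),
    ("Sales", "Manager"),
    ("IT", "Engineer"),
]


def count_by_department_and_title():
    """Return (Department, JobTitle, NumEmployees) rows, one per pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employees (Department TEXT, JobTitle TEXT)")
    conn.executemany("INSERT INTO Employees VALUES (?, ?)", STAFF)
    rows = conn.execute(
        "SELECT Department, JobTitle, COUNT(*) AS NumEmployees "
        "FROM Employees "
        "GROUP BY Department, JobTitle "
        "ORDER BY Department, JobTitle"
    ).fetchall()
    conn.close()
    return rows


print(count_by_department_and_title())
# [('IT', 'Engineer', 1), ('Sales', 'Manager', 1), ('Sales', 'Representative', 2)]
```

Sales appears twice in the output because it contains two different job titles; grouping by Department alone would have merged those rows.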
6. Combining SELECT, FROM, WHERE, and GROUP BY:
We’ll demonstrate how to select data from the Employees table with a filtering condition and then group the results.
Let’s say we want to retrieve, for each department, the total salary of the employees who earn more than $50,000. Here’s how you can do it:
GROUP BY SQL Code
SELECT Department, SUM(Salary) AS TotalSalary
FROM Employees
WHERE Salary > 50000
GROUP BY Department;
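This query returns a result set with two columns, Department and TotalSalary. Because WHERE filters rows before they are grouped, each total includes only the employees whose individual salary exceeds $50,000, and a department with no such employees does not appear in the result at all.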
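The effect of filtering before grouping is easiest to see by running the query on the sample data. The sketch below does so in an in-memory SQLite database and, for contrast, also runs a variant that filters on the group total with the standard HAVING clause. The HAVING variant, the reduced schema, and the names used here are assumptions added for comparison.

```python
import sqlite3

# (Department, Salary) pairs copied from the sample Employees rows.
SAMPLE_ROWS = [
    ("Sales", 50000.00),
    ("Marketing", 60000.00),
    ("Sales", 55000.00),
    ("HR", 48000.00),
    ("IT", 65000.00),
]

# Filters individual rows first, then sums what is left in each department.
WHERE_QUERY = (
    "SELECT Department, SUM(Salary) AS TotalSalary FROM Employees "
    "WHERE Salary > 50000 GROUP BY Department ORDER BY Department"
)

# Assumption for contrast: sums every row, then filters on the group total.
HAVING_QUERY = (
    "SELECT Department, SUM(Salary) AS TotalSalary FROM Employees "
    "GROUP BY Department HAVING SUM(Salary) > 50000 ORDER BY Department"
)


def run(query):
    """Load the sample rows into a fresh in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employees (Department TEXT, Salary REAL)")
    conn.executemany("INSERT INTO Employees VALUES (?, ?)", SAMPLE_ROWS)
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


print(run(WHERE_QUERY))
# [('IT', 65000.0), ('Marketing', 60000.0), ('Sales', 55000.0)]
print(run(HAVING_QUERY))
# [('IT', 65000.0), ('Marketing', 60000.0), ('Sales', 105000.0)]
```

With WHERE, the Sales total is 55000 because the 50000 salary is removed before the sum is computed. With HAVING, both Sales salaries are summed first and only the department total is tested.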
Q. How can I use the GROUP BY clause to group by multiple columns?
A. To group by multiple columns, simply list them separated by commas in the GROUP BY clause. For example, assuming the Employees table also has a City column:
Code
SELECT Department, City, COUNT(*) AS NumEmployees
FROM Employees
GROUP BY Department, City;
Q. What does the GROUP BY clause do in SQL?
A. The GROUP BY clause is used to group rows with identical values into summary rows. It allows you to perform aggregate functions like SUM, COUNT, AVG, etc., on each group.
Q. When should I use the GROUP BY clause in SQL queries?
A. You should use the GROUP BY clause whenever you need to aggregate data based on common attributes or values in a column. It's useful for generating summary reports or performing calculations on grouped data.
Q. How does the GROUP BY clause differ from the WHERE clause?
A. The WHERE clause filters individual rows based on specified conditions, and it is applied before any grouping takes place. The GROUP BY clause then groups the remaining rows that share the same values. It's typically paired with aggregate functions in the SELECT list to perform calculations on each group.
Q. What happens if I forget to include all non-aggregated columns in the GROUP BY clause?
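A. In standard SQL, every non-aggregated column in the SELECT list must also appear in the GROUP BY clause, and most database systems reject a query that breaks this rule with an error. Some systems are more permissive (for example, MySQL with the ONLY_FULL_GROUP_BY mode disabled, or SQLite); in that case the value returned for the ungrouped column comes from an arbitrary row in the group, so the result is not reliable.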
Q. Can I use the GROUP BY clause without any aggregate functions?
A. Yes, you can use the GROUP BY clause without aggregate functions, but it's less common. In such cases, the result contains one row for each distinct combination of values in the grouped columns, which is the same set of rows SELECT DISTINCT would return for those columns.
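A minimal sketch of that last point, using the Department values from the sample table in an in-memory SQLite database: grouping without an aggregate and SELECT DISTINCT yield the same rows. The function name and the ORDER BY clauses are assumptions added so the two results can be compared directly.

```python
import sqlite3

# Department values copied from the sample Employees rows.
DEPARTMENTS = [("Sales",), ("Marketing",), ("Sales",), ("HR",), ("IT",)]


def grouped_and_distinct():
    """Return the department list via GROUP BY and via SELECT DISTINCT."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employees (Department TEXT)")
    conn.executemany("INSERT INTO Employees VALUES (?)", DEPARTMENTS)
    grouped = conn.execute(
        "SELECT Department FROM Employees GROUP BY Department ORDER BY Department"
    ).fetchall()
    distinct = conn.execute(
        "SELECT DISTINCT Department FROM Employees ORDER BY Department"
    ).fetchall()
    conn.close()
    return grouped, distinct


grouped, distinct = grouped_and_distinct()
print(grouped)
# [('HR',), ('IT',), ('Marketing',), ('Sales',)]
print(grouped == distinct)
# True
```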