In this tutorial, we’ll be discussing the different aggregate functions in SQL and how we can use them. In SQL, aggregate functions are used to accumulate data from multiple rows into a single summary row.
Aggregate functions in SQL can be very helpful for analysts when doing data summarization.
Commonly Used Aggregate Functions In SQL
Shown below are some of the commonly used aggregate functions in SQL.
Do note that the MIN and MAX functions are commonly used in currencies or integer values. The AVG function returns not only the average but also the mean of a column. Also, the aggregate function COUNT is the only one that doesn’t ignore null values when performing calculations.
Let’s make some sample commands using our five aggregate functions with the sample table below.
First, let’s make a command using the MAX function. Let’s say we want to find the highest valued product under column ProductName. Our command should look something like the image below.
Since we’re finding the highest valued product, we need to get their prices, which is under the column ProductPrice. Always put an alias to provide a name for the output column. If we execute that command, it will show us the highest valued product with only one value.
Next, let’s make a command using the MIN function. Since the MIN function is somewhat similar to the MAX function, we’ll just use the same example we used in the MAX function but this time, we’ll find the lowest valued product.
When we execute this command, it will fetch the lowest valued product in our sample table.
Now, let’s make a command using the COUNT function. Let’s say we want to find the quantity of the products under the column ProductID. Our command should look something like the image below.
If we execute this command, it will show us the quantity of each product under the ProductID column and their respective price.
Aggregate Functions And GROUP BY Clause
Now, let’s discuss how we can use multiple columns with aggregate functions. To do so, we need to use the GROUP BY clause.
The GROUP BY function is mostly used in aggregate functions as it returns one record from each group. Also, all columns in the SELECT statement even without aggregate should still be included.
Let’s discuss how we’ll use aggregate functions with the GROUP BY clause with our sample table below.
First, let’s try to execute a command without using GROUP BY clause to see the difference. Let’s find the total SaleAmount within the ProductName column. Our command should look something like the image below.
As you noticed, the command is similar to the previous examples we mentioned at the beginning of this tutorial. Once we execute this command, it will simply calculate the total amount under column SaleAmount, which will be 28.
Now, let’s have another example using the GROUP BY clause.
Since we’re using GROUP BY clause, we’re now able to use multiple columns. This time, we want to see the total SaleAmount per category.
The GROUP BY clause will make sure that the result will show a summation of sales for each category and if executed, the result will be the image below.
In this example, it’s important to use the GROUP BY clause because if not, an error will occur. This is because the Category is not contained in a GROUP BY clause or aggregate function.
Let’s have the same example but this time, we’ll be getting the total SaleAmount by ProductName as well. Our command should look something like the image below.
Once we execute our command, the result will show the summation of sales per Category and by ProductName.
Remember that you can’t use or access multiple columns without using the GROUP BY clause because if you did, it will cause an error once you run your command.
Aggregate Functions In SQL Server Management Studio (SSMS)
Let’s now discuss how we can use aggregate functions in SQL Server Management Studio. We’ll use the table below in our examples.
We have 31,465 rows of data in our SalesOrderHeader table. For our first example, we’ll demonstrate how we can use COUNT function by counting the rows in our table. Our command should look like the image below.
We’ve successfully counted all the rows there are in our SalesOrderHeader and named it as countofrows. Once we execute this command, we’ll have a result of 31,465 rows.
Let’s go back to our dataset. While using the same aggregate function, let’s count how many SalesPersonID there is in our table. Our command should be:
When we execute our command, it will specifically fetch data under column SalesPersonsID and will give us a column named as countofsalespersons with the result of 3806.
Using Aggregate Functions In A Single Query
Moving on, let’s discuss how we can use multiple aggregate functions in a single query. We’ll be using the SUM, AVG, and MAX functions in aggregating our columns TotalDue and TaxAmt.
We can use multiple aggregate functions in a single query by putting a comma after every function.
Once we execute this command, our result should look like the image below.
Now, we can see the different columns along with their respective value using multiple aggregate functions.
Let’s try and execute a command where it will cause an error to emphasize why we should use GROUP BY clause when selecting columns. We’ll use the same example using multiple functions but this time we’ll be selecting CustomerID.
As you can see, it says the command is invalid because CustomerID is not contained in either aggregate function or GROUP BY clause.
Sample Scenario With GROUP BY Clause
For our next example, let’s fix our error by providing the GROUP BY clause in our command.
We just added CustomerID in a GROUP BY clause and now we can see the summation, average, and maximum value of our TotalDue and TaxAmt by CustomerID. We can also see that we have a total of 19119 rows.
Let’s have our last example by discussing why it’s important for us to make sure that non-aggregate columns and expressions are included in the GROUP BY clause.
In our SalesOrderHeader table, we have the column OrderDate. Let’s say we want to get the summation of TotalDue by year. If we won’t indicate all non-aggregate columns and expressions, we need to use the command from the image below.
Notice how we’re able to get a result rather than an error. However, we have multiple reports for the same year when our goal is to have the reports be summarized into one specific value per year.
To correct our mistake, we have to make sure that we include the expression in our GROUP BY clause as well. Our new command should be:
After executing the command, we can see that we’ve successfully get the summation of OrderDate per year.
***** Related Links *****
HAVING Clause In SQL Aggregate Functions
Stored Procedures In SQL | An Overview
Effective Ways To Use The GROUPBY Function In Power BI
Conclusion
To summarize, you’ve learned some of the basic knowledge regarding the use of aggregate functions in SQL. Aggregate functions can be very ideal when working with large reports or datasets as they provide convenience to analysts during data analysis and summarization.
Hopefully, this tutorial can help you have better practice in using different aggregate functions in SQL. If you’d like to know more about this topic and other related content, you can certainly check out the list of relevant links below.
All the best,
Hafiz