MariaDB - Aggregate functions
In MariaDB Aggregate functions are the functions used to perform calculations on multiple values and they return single value result for example average sum of the row values in the table.
I. Avg() Function:
In MariaDB, we can find the average of the numbers by using the avg() function. This function accepts the set of numbers, and it returns the average of those numbers.
Syntax:
avg([DISTINCT] exp);
- The DISTINCT keyword is used to return the average of the distinct values from the exp.
- If we pass an empty set, this function returns NULL.
- If there are NULL values in the input, it ignores them.
- This is an aggregate function so that it can be used with the GROUP BY clause.
Let's see an example:
We will be using the 'Country' table, which looks like this:
MariaDB [student]> select avg(population) avg_population from country;
In this example we have used avg() function which returns the average of population in the table.
Example with GROUP BY clause:
The avg() function is usually used along with the group by clause. It returns the average value for every group.
MariaDB [student]> select state, avg(population) avg_population from country group by state order by state;
We can see the average population of each state. Here we have grouped the cities based on the state, and then the avg() function calculates the average population of all states. And finally, the output is ordered alphabetically based on the state.
MariaDB [student]> select state, cast(avg(population) as dec(10,0))avg_population from country group by state;
We can make the result more readable; we can use cast expression as above. It truncates all the numbers after the decimal point.
Example with HAVING clause:
MariaDB [student]> select country.state as state_name, avg(population) from country group by state having avg(population) between 100000 and 1000000 order by state;
In the above example, we have used the HAVING clause.
- Initially, the cities are grouped based on the state through the GROUP BY clause.
- Next, The avg() function determines the average population for each state.
- Using having clause, we are displaying only the avg population, which falls between the range mentioned with the BETWEEN keyword.
- And finally, the output is ordered alphabetically based on the state.
Example with other aggregate function
In the following query, we will use the count() function.
MariaDB [student]> select state, count(*) state_count, cast(avg(population) as dec(10,2))avg_population from country group by state having count(*)>2 and avg(population)>100000 order by avg_population;
In the above example
- We count the number of cities in each state using count(*)
- Using group by we group the cities and using having clause we retrieve the states with the cities greater than two and an average population greater than 100000.
- And finally, the output is ordered alphabetically based on the state.
II. BIT_AND Function
Syntax:
BIT_AND(exp) [over_clause]
- This function returns the bitwise AND of all the bits in exp provided.
- The calculations are performed using 64-bit precision, i.e., BIGINT.
- It can be used with the GROUP BY clause
- If there are no matches, it returns a value with all the bits set to 1.
- NULL values do not affect the result until all the results are NULL, which is considered no match.
Let's see an example:
We will use the following table:
MariaDB [student]> select BIT_AND(i), BIT_OR(i) from tab;
In the above table, we have calculated BIT_AND.
Let's see with the aggregate function:
Following table, we will use:
MariaDB [student]> select category, BIT_AND(x) From tab3 group by category;
In the above example, we have grouped the values based on category, and on those groups, BIT_AND() function is imposed.
MariaDB [student]> select BIT_AND(NULL);
In this example, we have passed NULL as an argument, and the result is:
III. BIT_OR(exp)
Syntax:
BIT_OR(exp)
- It will return bitwise OR of all bits in the exp provided.
- The calculations are performed using 64-bit precision, i.e., BIGINT.
- This function can be used with the GROUP BY clause.
- If there are no matches, it returns a value with all the bits set to 0.
- NULL values do not affect the result until all the results are NULL, which is considered no match.
Lets see example:
MariaDB [student]> select BIT_OR(i) from tab;
In the above example, we have calculated the BIT_OR operation on i values.
MariaDB [student]> select category, BIT_OR(x), BIT_AND(x) From tab3 group by category;
In the above example, we have grouped the values based on category, and on those groups, BIT_OR() function is imposed.
MariaDB [student]> select BIT_OR(NULL);
In this example, we have passed NULL as an argument, and the result is:
IV. BIT_XOR Function
Syntax:
BIT_XOR(exp);
- It will return bitwise XOR of all bits in the exp provided.
- The calculations are performed using 64-bit precision, i.e., BIGINT.
- This function can be used with the GROUP BY clause.
- If there are no matches, it returns a value with all the bits set to 0.
- NULL values do not affect the result until all the results are NULL, which is considered no match.
Let’s see example:
MariaDB [student]> select BIT_AND(i), BIT_OR(i),BIT_XOR(i) from tab;
In the above example we have calculate BIT_XOR operation on i values.
MariaDB [student]> select category, BIT_OR(x), BIT_AND(x), BIT_XOR(x) From tab3 group by category;
In the above example, we have grouped the values based on category, and on those groups, BIT_XOR() function is imposed.
MariaDB [student]> select BIT_XOR(NULL);
In this example, we have passed NULL as an argument, and the result is:
V. COUNT() Function
Syntax:
COUNT(ALL | DISTINCT exp))
- This function, written as count(exp) or count(all exp), returns the count of the number of non-NULL values of exp in the rows retrieved by the SELECT statement.
- The result of the function is the BIGINT value.
- This function can be used with the GROUP BY clause.
- COUNT(*) returns the total number of rows in the table.
- COUNT() if there are no matching rows, it returns 0.
Let's see an example:
In the examples, we will be using the following table:
Let's retrieve the row count of the table using count(*)
MariaDB [student]> select count(*) row_count from tab2;
Next to retrieve the rows which are null
MariaDB [student]> select count(*) names from tab2 where names IS NULL;
In the above example, we have retrieved the row count where row values are NULL using where clause and IS NULL clause checks with all the rows whether they are NULL or NOT.
- Next, we will retrieve the number of non_null rows in the table using count(exp)
MariaDB [student]> select count(ID) row_count from tab2;
Here we have passed the id column, and as there is one-row containing NULL, so row count is 9.
- Next, we will retrieve a number of distinct, i.e., unique as well as non-null, rows from the table:
MariaDB [student]> select count(DISTINCT ID) row_count from tab2;
- Example with GROUP BY clause:
In this example, we will use the GROUP BY clause with the count() function.
MariaDB [student]> select ID, count(id) names from tab2 group by id order by id;
Here initially, the rows are grouped based on the ID, then the row count of each group is calculated, and at the end, groups are displayed in ascending order which is performed by ORDER BY clause.
- Example with HAVING clause:
In this example, we will use the HAVING clause with the count() function.
MariaDB [student]> select ID, count(id)count_names from tab2 group by id HAVING count(count_names)>1 order by id;
Here initially, the rows are grouped based on the ID. The row count of each group is calculated. Finally, using the HAVING clause, we display groups having ids greater than 1, And then groups are displayed in ascending order which is performed by the ORDER BY clause.
VI. STD() Function
Syntax:
STD(exp)
- This function returns the population standard deviation of exp.
- This function is an extension to standard SQL to use STDDEV_POP() instead of the function.
- We can use this with GROUP BY clauses.
- If there are no matching rows, it returns NULL.
Let's see an example:
We will be using the following table:
MariaDB [student]> select * from tab4;
MariaDB [student]> select category, STD(x) FROM tab4 GROUP BY category;
In the above example, we have used the GROUP BY clause, which groups data based on the category and calculates the standard deviation.
VII. STDDEV() Function
Syntax:
STDDEV(exp)
- This function returns the population standard deviation of exp.
- This function is provided for compatibility with Oracle. This function is an extension to standard SQL to use STDDEV_POP() instead of the function.
- We can use this with GROUP BY clauses.
Lets see example:
MariaDB [student]> select category, STDDEV(x) FROM tab4 GROUP BY category;
In the above example, we have used the GROUP BY clause, which groups rows based on the category and calculates the standard deviation using the STDDEV() function.
VIII. STDDEV_POP() Function
Syntax:
STD_POP(exp)
This function returns the population standard deviation of exp. We can use STD() or STDDEV(), which are similar but not the standard SQL.
MariaDB [student]> select category, STDDEV_POP(x) FROM tab4 GROUP BY category;
In the above example, we have used the GROUP BY clause, which groups rows based on the category and calculates the standard deviation using STDDEV_POP() function.
MariaDB [student]> select category, STDDEV_POP(x),STD(x),STDDEV(x) FROM tab4 GROUP BY category;
All three functions are equivalent, so the results are the same; the difference is STD(), and STDDEV() are not standard SQL, unlike STDDEV_POP.
IX. VAR_POP() Function
Syntax:
VAR_POP(exp)
It returns the population standard Variance of the exp provided.
The provided rows are considered the whole population and not as a sample, so the number of rows is the denominator.
We can also use the VARIANCE() function, which is similar and not the standard SQL.
Variance is calculated by following:
1. Calculation of the mean of the set
2. For each number in the set, subtracting the means and then squaring the result.
3. And calculating the average of the resulting differences.
Let's see an example:
MariaDB [student]> select VAR_POP(x) FROM tab3;
Let's see with the group by clause:
MariaDB [student]> select category, VAR_POP(x) FROM tab4 GROUP BY category;
X. VARIANCE Function
Syntax:
VARIANCE(exp)
- This function returns the standard Variance of exp. This function is an extension to the standard SQL. The standard SQL function VAR_POP can also be used instead of this function.
- The calculation of Variance is the same as the VAR_POP function.
- This function can be used with the GROUP BY clause.
MariaDB [student]> select category, VARIANCE(x) FROM tab4 GROUP BY category;
Let's see for both VARIANCE() and VAR_POP()
MariaDB [student]> select category, VAR_POP(x), VARIANCE(x) FROM tab4 GROUP BY category;
Here we can see the results are the same, so both the functions are equivalent.
XI. SUM Function:
Syntax:
SUM([DISTINCT] exp)
- This function returns the sum of the exp provided.
- If the return set has no rows, then the function returns NULL.
- The DISTINCT keyword is used to calculate the sum of the distinct values of the exp.
Let's see an example:
We will be using the following table:
- First, we will use sum() to calculate the sum of values of all rows.
MariaDB [student]> select sum(X) from tab4;
This function returns the sum of values, including duplicates.
- Use of DISTINCT keyword in sum() function.
MariaDB [student]> select sum(distinct X) from tab4;
Here sum() function ignores the duplicate and calculates sum of remaining.
- Next, let's use GROUP BY
We will use the following table:
MariaDB [student]> select state, sum(population) sum_population from country group by state order by state;
We can see the sum of the population of each state. Here we have grouped the cities based on the state, and then something () function calculates the sum of the population of all states. And finally, the output is ordered alphabetically based on the state.
- Example with HAVING clause:
MariaDB [student]> MariaDB [student]> select state,sum(population) sum_population from country group by state having sum(population) between 1000000 and 10000000 order by state;
In the above example, we have used the HAVING clause.
- Initially, the cities are grouped based on the state through the GROUP BY clause.
- Next, The sum() function determines the sum population for each state.
- Using the having clause, we display only the sum population, which falls between 1000000 and 10000000 mentioned with the BETWEEN keyword.
- And finally, the output is ordered alphabetically based on the state.
XII. GROUP_CONCAT Function
Syntax:
GROUP_CONCAT([DISTINCT] exp [ORDER BY col_name,..[seprator sep]);
- This function will return a string with the concatenated non-NULL values from the group.
- The distinct keyword eliminates the duplicate strings before concatenating the strings
- The separator clause specified appears as a separator between each string element in the final result. The default separator is a comma(,)
- If we provide non-NULL values, NULL is returned.
Let's see an example:
We will be using the 'Country' table
Which looks like this:
Example with the group by clause:
MariaDB [student]> select state, group_concat(city) city from country group by state;
Here we can see the cities are grouped based on states. All the cities are concatenated.
Example with an order by clause:
MariaDB [student]> select state, group_concat(city) city from country group by state order by state;
In the above example, the cities are grouped based on states. All the cities are concatenated.
And finally, the result is ordered based on state alphabetically in ascending order.
Using different separators – '/.'
By default, comma(,) is a separator, but we can also mention the separator we wish to as follows:
MariaDB [student]> select state, group_concat(city order by city separator '/' )from country group by state order by city;
Here we have mentioned the / with the separator keyword. As we can see, the cities are separated by /.
Next, lets use another separator:
MariaDB [student]> select state, group_concat(DISTINCT city order by city separator '\n' )from country group by state order by city;
Here we can see the '\n' is used, which is used for the new line as we can see the cities are showed in the new line.
We have used DISTINCT, which shows unique value.