Calculate the median in a table?

Posted By: Shashi Ray    Posted Date: January 05, 2009    Points: 25    Category: DataBase    URL: http://www.dotnetspark.com

MEAN is easy (using the aggregate function AVG). MEDIAN, on the other hand, can be a little more difficult to calculate. (MEDIAN is one of the aggregate functions typically handled by statistics packages or by OLAP / Analysis Services.)

SQL Server

Say we have the following table:

    CREATE TABLE blat
    (
        splunge INT NOT NULL
    )
    GO

    SET NOCOUNT ON
    INSERT blat VALUES(1)
    INSERT blat VALUES(2)
    INSERT blat VALUES(3)
    INSERT blat VALUES(5)
    INSERT blat VALUES(7)
    INSERT blat VALUES(8)
    INSERT blat VALUES(8)
    INSERT blat VALUES(9)
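For comparison, the mean of this sample data needs only a single aggregate. The following is a minimal sketch, assuming the blat table above has been created and populated; the column aliases are illustrative names only. Note that AVG over an INT column returns an INT in SQL Server, which is why the later queries multiply by 1.0.

```sql
-- Mean of the sample data (the eight values sum to 43).
-- AVG over an INT column returns an INT in SQL Server, so the first
-- column is truncated; multiplying by 1.0 forces a decimal result.
SELECT AVG(splunge)       AS mean_as_int,      -- 43 / 8 truncated to 5
       AVG(splunge * 1.0) AS mean_as_decimal   -- 5.375 as a decimal value
FROM blat
```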

If we had an odd number of rows, we could calculate the median by simply finding the row where the number of values greater than that splunge value is equal to the number of values less than that splunge value:

    SELECT splunge FROM blat b
    WHERE
    (
        SELECT count(splunge) FROM blat
        WHERE splunge < b.splunge
    )
    =
    (
        SELECT count(splunge) FROM blat
        WHERE splunge > b.splunge
    )

Or this way, by just taking the greatest value from the top 50% of the table, ordered by splunge:

    SELECT TOP 1 splunge FROM
    (
        SELECT TOP 50 PERCENT splunge
        FROM blat ORDER BY splunge
    ) sub
    ORDER BY splunge DESC
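To see both odd-row approaches at work, here is a small sketch that runs them against a seven-row sample. The table variable @odd and its contents are assumptions made for illustration; they are not part of the original example. The whole script must run as a single batch, because a table variable does not survive a GO.

```sql
-- Hypothetical 7-row sample (the name @odd is an assumption).
DECLARE @odd TABLE (splunge INT NOT NULL)
INSERT @odd VALUES(1)
INSERT @odd VALUES(2)
INSERT @odd VALUES(3)
INSERT @odd VALUES(5)
INSERT @odd VALUES(7)
INSERT @odd VALUES(8)
INSERT @odd VALUES(9)

-- Counting approach: 5 has three smaller and three larger values,
-- so it is the only row that satisfies the equality.
SELECT splunge FROM @odd b
WHERE
(
    SELECT COUNT(*) FROM @odd
    WHERE splunge < b.splunge
)
=
(
    SELECT COUNT(*) FROM @odd
    WHERE splunge > b.splunge
)

-- TOP approach: 50 percent of 7 rows is rounded up to 4 rows
-- (1, 2, 3, 5), and the largest of those is 5.
SELECT TOP 1 splunge FROM
(
    SELECT TOP 50 PERCENT splunge
    FROM @odd ORDER BY splunge
) sub
ORDER BY splunge DESC
```

Both queries should return 5 here. One caveat for the counting approach: duplicates around the middle can unbalance the counts. With the values 1, 2, 2, 3, 4, for example, no row has equally many smaller and larger values, so that query returns nothing, while the TOP approach still returns 2.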

With an even number of rows, however, the first query returns no results, and the second query might not return the desired result (in the case above, it will return the 4th of 8 rows, where splunge = 5). We need to do a little more work to calculate the *true* median: multiply by 1.0 to force a conversion to decimal, find the two middle values with a pair of nested subqueries, combine them with UNION ALL, and take the average:

    SELECT AVG(splunge) FROM
    (
        SELECT splunge FROM
        (
            SELECT TOP 1 splunge = splunge * 1.0 FROM
            (
                SELECT TOP 50 PERCENT splunge
                FROM blat ORDER BY splunge
            ) sub_a
            ORDER BY 1 DESC
        ) sub_1
        UNION ALL
        SELECT splunge FROM
        (
            SELECT TOP 1 splunge = splunge * 1.0 FROM
            (
                SELECT TOP 50 PERCENT splunge
                FROM blat ORDER BY splunge DESC
            ) sub_b
            ORDER BY 1
        ) sub_2
    ) median
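It can help to run the two halves of that query on their own. The sketch below assumes the eight-row blat table from above; the aliases lower_middle and upper_middle are illustrative names only.

```sql
-- Lower middle value: the largest value in the bottom half.
-- For the sample data the bottom half is 1, 2, 3, 5, so this returns 5.
SELECT TOP 1 splunge AS lower_middle FROM
(
    SELECT TOP 50 PERCENT splunge
    FROM blat ORDER BY splunge
) sub_a
ORDER BY splunge DESC

-- Upper middle value: the smallest value in the top half.
-- For the sample data the top half is 9, 8, 8, 7, so this returns 7.
SELECT TOP 1 splunge AS upper_middle FROM
(
    SELECT TOP 50 PERCENT splunge
    FROM blat ORDER BY splunge DESC
) sub_b
ORDER BY splunge
```

Averaging 5 and 7 gives a median of 6 for the sample data. With an odd number of rows, TOP 50 PERCENT rounds up, so both halves include the middle row and the average is simply that row's value.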

No, it's not pretty, but it gets the job done. There are similar ways to approach this problem that require fewer subqueries, but add the requirement of dynamic SQL.
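As one possible alternative that needs neither the UNION nor dynamic SQL, the rows can be numbered with a window function and the middle one or two picked out directly. This is a sketch under stated assumptions, not the author's method: the first query assumes SQL Server 2005 or later (for ROW_NUMBER and the OVER clause), the second assumes SQL Server 2012 or later (for PERCENTILE_CONT), and the aliases rn, cnt, and ranked are illustrative names.

```sql
-- Assumes SQL Server 2005 or later.
-- Number the rows in sorted order and keep the middle one (odd count)
-- or the middle two (even count). Integer division makes both
-- expressions equal when cnt is odd.
SELECT AVG(splunge * 1.0) AS median
FROM
(
    SELECT splunge,
           ROW_NUMBER() OVER (ORDER BY splunge) AS rn,
           COUNT(*) OVER () AS cnt
    FROM blat
) ranked
WHERE rn IN ((cnt + 1) / 2, (cnt + 2) / 2)

-- Assumes SQL Server 2012 or later.
-- PERCENTILE_CONT interpolates between the two middle values itself.
-- It is an analytic function, so DISTINCT collapses the per-row copies.
SELECT DISTINCT
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY splunge) OVER () AS median
FROM blat
```

For the eight sample rows, the first query keeps rows 4 and 5 (values 5 and 7), and both queries should report a median of 6.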

For added fun, you could determine beforehand which method you need, so that when there is an odd number of rows the procedure runs only the simpler query and will likely do less work overall.

    IF (SELECT COUNT(*) % 2 FROM blat) = 1
        -- use easy query for odd # of rows
    ELSE
        -- use more complex query for even # of rows
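Filled in with the queries from this article, that skeleton might look like the following sketch. It assumes the blat table from above; the BEGIN/END blocks, the median column alias, and the middle_rows alias are additions for illustration.

```sql
IF (SELECT COUNT(*) % 2 FROM blat) = 1
BEGIN
    -- Odd number of rows: the single middle row is the median.
    SELECT TOP 1 splunge * 1.0 AS median FROM
    (
        SELECT TOP 50 PERCENT splunge
        FROM blat ORDER BY splunge
    ) sub
    ORDER BY splunge DESC
END
ELSE
BEGIN
    -- Even number of rows: average the two middle rows.
    SELECT AVG(splunge) AS median FROM
    (
        SELECT splunge FROM
        (
            SELECT TOP 1 splunge = splunge * 1.0 FROM
            (
                SELECT TOP 50 PERCENT splunge
                FROM blat ORDER BY splunge
            ) sub_a
            ORDER BY 1 DESC
        ) sub_1
        UNION ALL
        SELECT splunge FROM
        (
            SELECT TOP 1 splunge = splunge * 1.0 FROM
            (
                SELECT TOP 50 PERCENT splunge
                FROM blat ORDER BY splunge DESC
            ) sub_b
            ORDER BY 1
        ) sub_2
    ) middle_rows
END
```

An empty table has a row count of zero, so it falls into the ELSE branch, where AVG over no rows yields NULL.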

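Keep in mind that the table definition here uses a NOT NULL constraint, which prevents splunge from containing a NULL (this could break some of the logic above). If the column you are using can contain NULL values, you'll want to exclude them in every query that reads the table: add AND splunge IS NOT NULL where a WHERE clause already exists, and add WHERE splunge IS NOT NULL to the subqueries that have none.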
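The effect of that filter can be seen with a small sketch. The table blat_nullable and its contents are hypothetical, introduced only to illustrate the point.

```sql
-- Hypothetical variant of the sample table whose column allows NULL.
CREATE TABLE blat_nullable
(
    splunge INT NULL
)
GO

INSERT blat_nullable VALUES(NULL)
INSERT blat_nullable VALUES(NULL)
INSERT blat_nullable VALUES(1)
INSERT blat_nullable VALUES(4)
INSERT blat_nullable VALUES(9)

-- With the filter, only 1, 4, 9 are considered. 50 percent of 3 rows
-- rounds up to 2 rows (1, 4), so the result is the true median, 4.
SELECT TOP 1 splunge FROM
(
    SELECT TOP 50 PERCENT splunge
    FROM blat_nullable
    WHERE splunge IS NOT NULL
    ORDER BY splunge
) sub
ORDER BY splunge DESC
```

Without the WHERE clause, the NULLs sort first in ascending order and take up two of the three rows in the bottom half (NULL, NULL, 1), so the query would return 1 instead of 4.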

Microsoft Access

In Access, we can run the same single subquery example as above, and the same CREATE TABLE statement.

    SELECT TOP 1 splunge FROM
    (
        SELECT TOP 50 PERCENT splunge
        FROM blat ORDER BY splunge
    )
    ORDER BY splunge DESC

With an even number of rows, if we would rather take the 5th value instead of the 4th, we can simply reverse the two sort orders:

    SELECT TOP 1 splunge FROM
    (
        SELECT TOP 50 PERCENT splunge
        FROM blat ORDER BY splunge DESC
    )
    ORDER BY splunge

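However, we can also calculate the mathematical median in Access, which is the average of the two middle values. Instead of nesting a bunch of subqueries and a union, let's create a peripheral table to store the results temporarily. In the statements below, table1 (with a column named id) is the source table, and blat is the name of the peripheral table. Access executes one SQL statement per query, so run each statement separately.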

    CREATE TABLE blat(id INT)

    INSERT INTO blat(id)
        SELECT TOP 1 id FROM
        (
            SELECT TOP 50 PERCENT id
            FROM table1 ORDER BY id
        )
        ORDER BY id DESC

    INSERT INTO blat(id)
        SELECT TOP 1 id FROM
        (
            SELECT TOP 50 PERCENT id
            FROM table1 ORDER BY id DESC
        )
        ORDER BY id

    SELECT AVG(id) FROM blat

    DROP TABLE blat

Shashi Ray
