Get the row with a max value in mysql

Get the row with a max value in mysql
G.Morreale

Introduction


Problem: Find the row with the max value of a column, for example 
For each article, find the dealer or dealers with the most expensive price.

The solution is around mysql queries.

Tests Environment

Make an example db with a test table in mysql:

CREATE DATABASE `mytest`;

DROP TABLE IF EXISTS `mytest`.`shop`;
CREATE TABLE  `mytest`.`shop` (
  `article` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `dealer` varchar(45) DEFAULT NULL,
  `price` int(11) DEFAULT NULL,
  PRIMARY KEY (`article`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=87158 DEFAULT CHARSET=latin1;

Insert some rows in the database (I inserted 6800 row)

INSERT INTO mytest.shop (dealer,price) values ('aaa',round(rand()*100));

the test was done on windows xp, mysql 5.1 standard installation

Solutions

  • SELECT price from shop s1 order by price desc limit 1))

It sort the prices and take the first. The best sorting cannot be done in less than O(n log n) time.

  • SELECT price FROM shop s1 WHERE price=(SELECT MAX(s2.price) FROM shop s2))

It take a linear (O(n)) time for max computation and the other linear time for price comparison with max value.

  • SELECT s1.price FROM shop s1 LEFT JOIN shop s2 ON s1.article <> s2.article and s1.price < s2.price WHERE s2.article IS NULL))

This last solution works on the basis that when s1.price is the maximum value there is no s2.price with a greater value so the s2 rows values will be null.
It seems also a linear time computation.

(I don't know about mysql internals algorithm so take my discussion  with a grain of salt)


Results

select benchmark(800000000,(select price from shop s1 order by price desc limit 1));

37.1 seconds

select benchmark(800000000,(SELECT s1.price FROM shop s1 LEFT JOIN shop s2 ON s1.article <> s2.article and s1.price < s2.price WHERE s2.article IS NULL));

37.3 seconds

select benchmark(800000000,(SELECT price FROM shop s1 WHERE price=(SELECT MAX(s2.price) FROM shop s2)));

38.45 seconds

Conclusion

From the test the fastest query seems to be the first (with the order by), but I'm not sure if it is correct.
Before the tests I was thinking that the other solutions was better.

So if anyone wish to express an opinion, he will be welcome.

No comments: