#25377 closed Cleanup/optimization (fixed)
Regression in expressions refactor causes database queries to run COUNT('*') instead of COUNT(*)
| Reported by: | Adam Johnson | Owned by: | nobody |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 1.8 |
| Severity: | Release blocker | Keywords: | |
| Cc: | Triage Stage: | Accepted | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
Performance regression on 1.8 expression refactor. Count stopped outputting COUNT(*) and started outputting COUNT('*') due to using a Value. Possibly unnoticed until now because it doesn't affect the most popular database options much - I can't measure a performance drop for this on MySQL / InnoDb.
However I have managed to measure a performance drop on MariaDB + Aria (=MySQL fork + MyISAM fork):
adamj@localhost [3]> select count(*) from count_test;
+----------+
| count(*) |
+----------+
| 10000 |
+----------+
1 row in set (0.00 sec)
adamj@localhost [10]> select benchmark(100 * 1000 * 1000, (select count(*) from count_test));
+-----------------------------------------------------------------+
| benchmark(100 * 1000 * 1000, (select count(*) from count_test)) |
+-----------------------------------------------------------------+
| 0 |
+-----------------------------------------------------------------+
1 row in set (0.86 sec)
adamj@localhost [11]> select benchmark(100 * 1000 * 1000, (select count('*') from count_test));
+-------------------------------------------------------------------+
| benchmark(100 * 1000 * 1000, (select count('*') from count_test)) |
+-------------------------------------------------------------------+
| 0 |
+-------------------------------------------------------------------+
1 row in set (1.23 sec)
This is because MyISAM / Aria store the count in a metadata variable but once you introduce the expression it figures it has to do a table scan. The situation would of course only get worse with more rows, 10000 is tiny.
Change History (6)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
| Patch needs improvement: | set |
|---|---|
| Severity: | Normal → Release blocker |
| Summary: | COUNT(*) not COUNT('*') → Regression in expressions refactor causes database queries to run COUNT('*') instead of COUNT(*) |
| Triage Stage: | Unreviewed → Accepted |
comment:3 by , 10 years ago
I'm not 100% confident in my performance measurement there, I just tried on a multi-million row table on my production DB converted to MyISAM and COUNT(*) measured the same as COUNT('*'). However since there are so many database versions and storage engines out there I think it's probable this caused a regression somewhere.
comment:4 by , 10 years ago
| Patch needs improvement: | unset |
|---|
Pull request: https://github.com/django/django/pull/5260