Opened 6 years ago
Closed 6 years ago
#30099 closed Bug (fixed)
Filter by Count annotated inside Subquery
Reported by: | MrFus10n | Owned by: | Nasir Hussain |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 2.1 |
Severity: | Normal | Keywords: | subquery annotate filter |
Cc: | Simon Charette | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
I want to get authors and annotate minimal count of books in category if it is greater than three.
Book and Author models and are not connected with ForeignKey fields (this is abstract and simplified, there is a reason):
Author(models.Model): name = models.CharField(max_length=250) Book(models.Model): author_name = models.CharField(max_length=250) book_category = models.CharField(max_length=250)
Here is simplest query I can get to reproduce:
(Author.objects .annotate(min_valuable_count=Subquery( Book.objects .filter(author_name=OuterRef('name')) .annotate(cnt=Count('book_category')) .filter(cnt__gt=3) .order_by('cnt') .values('cnt')[:1], output_field=models.IntegerField() )))
And I get an error:
psycopg2.ProgrammingError: missing FROM-clause entry for table "U0" LINE 1: ... "core_author" GROUP BY "core_author"."id", "U0"."id" ... ^
Here is SQL:
SELECT "core_author"."id", "core_author"."name", ( SELECT COUNT(U0."book_category") AS "cnt" FROM "core_book" U0 WHERE U0."id" = ("core_author"."chat_id") GROUP BY U0."id" HAVING COUNT(U0."book_category") > 3 ORDER BY "cnt" ASC LIMIT 1) AS "min_valuable_count" FROM "core_author" GROUP BY "core_author"."id", "U0"."id"
If I remove line .filter(cnt__gt=3)
, last GROUP BY
disappears and query stops raising error:
SELECT "core_author"."id", "core_author"."name", ( SELECT COUNT(U0."book_category") AS "cnt" FROM "core_book" U0 WHERE U0."id" = ("core_author"."chat_id") GROUP BY U0."id" ORDER BY "cnt" ASC LIMIT 1) AS "min_valuable_count" FROM "core_author"
Is there any way to remove GROUP BY
in outer query without removing .filter(cnt__gt=3)
in subquery?
Is qs.query.group_by = None
ok?
Change History (18)
comment:1 by , 6 years ago
Description: | modified (diff) |
---|---|
Keywords: | Subquery annotate Count filter added |
comment:2 by , 6 years ago
Description: | modified (diff) |
---|
comment:3 by , 6 years ago
Description: | modified (diff) |
---|
comment:4 by , 6 years ago
Description: | modified (diff) |
---|
comment:5 by , 6 years ago
Description: | modified (diff) |
---|
comment:6 by , 6 years ago
Cc: | added |
---|---|
Keywords: | subquery added; Subquery Count removed |
Triage Stage: | Unreviewed → Accepted |
comment:7 by , 6 years ago
Thank you for response. Yes, setting contains_aggregate = False
in subquery did the job. I will see your links and try to make a patch.
comment:8 by , 6 years ago
I ran tests before changing anything and got FAILED (failures=3, errors=3, skipped=948, expected failures=4)
. Is this ok?
comment:9 by , 6 years ago
It'd be great if you could submit a Github Pull Request with your changes but if it's not possible could you report the traceback of failures?
comment:10 by , 6 years ago
I didn't change anything yet. Just cloned repo, setup virtualenv and executed tests/runtests.py
. I am using windows 7, python 3.5.3.
This is in the middle of testing:
Exception in thread Thread-805: Traceback (most recent call last): File "C:\Python\35x32\lib\threading.py", line 914, in _bootstrap_inner self.run() File "c:\dev\projects\django\django\test\testcases.py", line 1364, in run connections.close_all() File "c:\dev\projects\django\django\db\utils.py", line 224, in close_all connection.close() File "c:\dev\projects\django\django\db\backends\sqlite3\base.py", line 237, in close self.validate_thread_sharing() File "c:\dev\projects\django\django\db\backends\base\base.py", line 531, in validate_thread_sharing % (self.alias, self._thread_ident, _thread.get_ident()) django.db.utils.DatabaseError: DatabaseWrapper objects created in a thread can only be used in that same thread. The object with alias 'other' was created in thread id 11452 and this is thread id 1148.
And this is after it's done:
====================================================================== ERROR: test_unicode_file_name (i18n.test_extraction.BasicExtractorTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\dev\projects\django\tests\i18n\test_extraction.py", line 216, in test_unicode_file_name management.call_command('makemessages', locale=[LOCALE], verbosity=0) File "c:\dev\projects\django\django\core\management\__init__.py", line 148, in call_command return command.execute(*args, **defaults) File "c:\dev\projects\django\django\core\management\base.py", line 364, in execute output = self.handle(*args, **options) File "c:\dev\projects\django\django\core\management\commands\makemessages.py", line 384, in handle potfiles = self.build_potfiles() File "c:\dev\projects\django\django\core\management\commands\makemessages.py", line 426, in build_potfiles self.process_files(file_list) File "c:\dev\projects\django\django\core\management\commands\makemessages.py", line 519, in process_files self.process_locale_dir(locale_dir, files) File "c:\dev\projects\django\django\core\management\commands\makemessages.py", line 583, in process_locale_dir input_files_list.write(('\n'.join(input_files))) File "C:\Python\35x32\lib\encodings\cp1251.py", line 19, in encode return codecs.charmap_encode(input,self.errors,encoding_table)[0] UnicodeEncodeError: 'charmap' codec can't encode character '\xe9' in position 241: character maps to <undefined> ====================================================================== ERROR: test_strip_tags_files (utils_tests.test_html.TestUtilsHtml) (filename='strip_tags1.html') ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\dev\projects\django\tests\utils_tests\test_html.py", line 103, in test_strip_tags_files content = fp.read() File "C:\Python\35x32\lib\encodings\cp1251.py", line 23, in decode return codecs.charmap_decode(input,self.errors,decoding_table)[0] UnicodeDecodeError: 'charmap' codec can't decode byte 0x98 in position 33114: character maps to <undefined> ====================================================================== ERROR: test_content_saving (file_storage.tests.ContentFileStorageTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\dev\projects\django\tests\file_storage\tests.py", line 968, in test_content_saving self.storage.save('unicode.txt', ContentFile("espa\xf1ol")) File "c:\dev\projects\django\django\core\files\storage.py", line 52, in save return self._save(name, content) File "c:\dev\projects\django\django\core\files\storage.py", line 274, in _save _file.write(chunk) File "C:\Python\35x32\lib\encodings\cp1251.py", line 19, in encode return codecs.charmap_encode(input,self.errors,encoding_table)[0] UnicodeEncodeError: 'charmap' codec can't encode character '\xf1' in position 4: character maps to <undefined> ====================================================================== FAIL: test_all (admin_scripts.tests.DiffSettings) The all option also shows settings with the default value. ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\dev\projects\django\tests\admin_scripts\tests.py", line 2247, in test_all self.assertNoOutput(err) File "C:\dev\projects\django\tests\admin_scripts\tests.py", line 189, in assertNoOutput self.assertEqual(len(stream), 0, "Stream should be empty: actually contains '%s'" % stream) AssertionError: 1064 != 0 : Stream should be empty: actually contains 'Traceback (most recent call last): File "./manage.py", line 21, in <module> main() File "./manage.py", line 17, in main execute_from_command_line(sys.argv) File "C:\dev\projects\django\django\core\management\__init__.py", line 381, in execute_from_command_line utility.execute() File "C:\dev\projects\django\django\core\management\__init__.py", line 375, in execute self.fetch_command(subcommand).run_from_argv(self.argv) File "C:\dev\projects\django\django\core\management\base.py", line 323, in run_from_argv self.execute(*args, **cmd_options) File "C:\dev\projects\django\django\core\management\base.py", line 373, in execute self.stdout.write(output) File "C:\dev\projects\django\django\core\management\base.py", line 145, in write self._out.write(style_func(msg)) File "C:\Python\35x32\lib\encodings\cp1251.py", line 19, in encode return codecs.charmap_encode(input,self.errors,encoding_table)[0] UnicodeEncodeError: 'charmap' codec can't encode character '\xe5' in position 4960: character maps to <undefined> ' ====================================================================== FAIL: test_unified_all (admin_scripts.tests.DiffSettings) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\dev\projects\django\tests\admin_scripts\tests.py", line 2285, in test_unified_all self.assertNoOutput(err) File "C:\dev\projects\django\tests\admin_scripts\tests.py", line 189, in assertNoOutput self.assertEqual(len(stream), 0, "Stream should be empty: actually contains '%s'" % stream) AssertionError: 1064 != 0 : Stream should be empty: actually contains 'Traceback (most recent call last): File "./manage.py", line 21, in <module> main() File "./manage.py", line 17, in main execute_from_command_line(sys.argv) File "C:\dev\projects\django\django\core\management\__init__.py", line 381, in execute_from_command_line utility.execute() File "C:\dev\projects\django\django\core\management\__init__.py", line 375, in execute self.fetch_command(subcommand).run_from_argv(self.argv) File "C:\dev\projects\django\django\core\management\base.py", line 323, in run_from_argv self.execute(*args, **cmd_options) File "C:\dev\projects\django\django\core\management\base.py", line 373, in execute self.stdout.write(output) File "C:\dev\projects\django\django\core\management\base.py", line 145, in write self._out.write(style_func(msg)) File "C:\Python\35x32\lib\encodings\cp1251.py", line 19, in encode return codecs.charmap_encode(input,self.errors,encoding_table)[0] UnicodeEncodeError: 'charmap' codec can't encode character '\xe5' in position 4888: character maps to <undefined> ' ====================================================================== FAIL: test_accent (dbshell.test_postgresql_psycopg2.PostgreSqlDbshellCommandTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\dev\projects\django\tests\dbshell\test_postgresql_psycopg2.py", line 100, in test_accent pgpass_string, AssertionError: Tuples differ: (['ps[32 chars]st', '-p', '444', 'dbname'], None) != (['ps[32 chars]st', '-p', '444', 'dbname'], 'somehost:444:dbname:r\xf4le:s\xe9same') First differing element 1: None 'somehost:444:dbname:r\xf4le:s\xe9same' - (['psql', '-U', 'r\xf4le', '-h', 'somehost', '-p', '444', 'dbname'], None) ? ------ + (['psql', '-U', 'r\xf4le', '-h', 'somehost', '-p', '444', 'dbname'], + 'somehost:444:dbname:r\xf4le:s\xe9same') ---------------------------------------------------------------------- Ran 12577 tests in 2925.839s FAILED (failures=3, errors=3, skipped=948, expected failures=4) Destroying test database for alias 'default'\u2026 Destroying test database for alias 'other'\u2026
follow-up: 14 comment:11 by , 6 years ago
Alright, don't worry about these failures they are likely related to your Windows configuration. Do you feel comfortable submitting a Github PR with your changes?
comment:12 by , 6 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:14 by , 6 years ago
Replying to Simon Charette:
I've pushed a fix with tests in PR. Can you please review it?
Thanks
follow-up: 16 comment:15 by , 6 years ago
Patch needs improvement: | set |
---|
Thanks for taking the time to submit a PR Nasir! I left a few comments for improvements on the PR, please uncheck patch needs improvement once they are addressed.
comment:16 by , 6 years ago
Patch needs improvement: | unset |
---|
Replying to Simon Charette:
Thanks for taking the time to submit a PR Nasir! I left a few comments for improvements on the PR, please uncheck patch needs improvement once they are addressed.
Hi, thanks for reviewing. I've pushed the fixed according to the review. Can you please review it again?
Thanks,
comment:17 by , 6 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
I'm pretty sure it's caused by
Subquery.contains_aggregate
returningTrue
when it shouldn't; this attribute shouldn't cross the subquery boundary. Right now it defaults toBaseExpression.contains_aggregate
which is based of the filters of the subquery and would explain why it's the.filter(cnt__gt=3)
that triggers the issue.Could you try setting
Subquery.contains_aggregate = False
and confirm it drops the outerGROUP BY
?Also, would you be interested in submitting a patch if it does fix it? It should consist of setting this attribute and adding a test in tests/expressions/tests.py that uses the existing models. Happy to walk you through a review if you are interested.