Opened 6 years ago

Closed 6 years ago

#11107 closed (fixed)

ManyToManyFields defined with keyword 'through' should not be manipulated as auto-generated m2m tables.

Reported by: jcd@… Owned by: jcd
Component: Database layer (models, ORM) Version: master
Severity: Keywords:
Cc: gabor@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

This ticket is a generalization of #10881. The problem is that ManyToManyFields defined using the through keyword are being handled twice by some of the sql generating subcommands of django-admin.py, once for the Model table that is passed to the through keyword, and again when sql is generated for all many_to_manys on each field. The second pass is incorrect, as ManyToManyField(through=x) relationships don't auto-create an intermediate table. If the intermediate table's primary key is manually specified, it can lead to errors that cause ./manage.py flush and ./manage.py syncdb to fail. See comments on #10881, for examples.

Attachments (6)

many_to_many.rel.through.diff (2.2 KB) - added by jcd@… 6 years ago.
Fixes handling of sql generation for ManyToManyField(through=MyModel) relations in postgres
initial_data.json (313 bytes) - added by jcd 6 years ago.
Minimal fixture to demonstrate problem.
many_to_many.rel.through.2.diff (1.5 KB) - added by jcd 6 years ago.
Smaller patch, without ineffectual hunk
many_to_many.rel.through.3.diff (2.8 KB) - added by jcd 6 years ago.
New patch includes fix for oracle
many_to_many.rel.through.4.diff (4.8 KB) - added by jcd 6 years ago.
Patch now includes test. Test passes against postgresql and postgresql_psycopg2, needs to be run against oracle.
many_to_many.rel.through.5.diff (5.3 KB) - added by jcd 6 years ago.
Patch with test that doesn't rely on exact sql syntax of particular backends.

Download all attachments as: .zip

Change History (24)

Changed 6 years ago by jcd@…

Fixes handling of sql generation for ManyToManyField(through=MyModel) relations in postgres

comment:1 Changed 6 years ago by jcd@…

  • Has patch set
  • Needs documentation unset
  • Needs tests set
  • Patch needs improvement unset

comment:2 Changed 6 years ago by mir

I fail to verify the issue. I used the following models:

from django.db import models

class Member(models.Model):
    name = models.CharField(max_length=30)

class Venue(models.Model):
    # [... snip fields ...]
    member_set = models.ManyToManyField('Member',
                                        through='VenueMember',
                                        blank=True)

class VenueMember(models.Model):
    id = models.AutoField(db_column='venue_member_id', primary_key=True)
    venue = models.ForeignKey(Venue)
    member = models.ForeignKey(Member)

when I do manage.py sql testapp, I get the following result for postgresql 8.3, psycopg2:

BEGIN;
CREATE TABLE "testapp_member" (
    "id" serial NOT NULL PRIMARY KEY,
    "name" varchar(30) NOT NULL
)
;
CREATE TABLE "testapp_venue" (
    "id" serial NOT NULL PRIMARY KEY
)
;
CREATE TABLE "testapp_venuemember" (
    "venue_member_id" serial NOT NULL PRIMARY KEY,
    "venue_id" integer NOT NULL REFERENCES "testapp_venue" ("id") DEFERRABLE INITIALLY DEFERRED,
    "member_id" integer NOT NULL REFERENCES "testapp_member" ("id") DEFERRABLE INITIALLY DEFERRED
)
;
COMMIT;

I'm using current trunk (rev. 10756) with python 2.4.

So everything looks fine to me. Can you give me a step-by-step instruction on how to reproduce the bug?

Changed 6 years ago by jcd

Minimal fixture to demonstrate problem.

comment:3 Changed 6 years ago by jcd

Yes. The tables get created properly, the errors show up when running loaddata (which gets run by ./manage.py flush or ./manage.py syncdb, if you have an initial_data fixture). I'm attaching a minimal fixture which goes with the models you are using. To reproduce:

1) Install this fixture as yourapp/fixtures/initial_data.json then do one of the following
2) Run ./manage.py loaddata initial_data

or one of:

2a) Run ./manage.py syncdb yourapp (Calls the same code as 2a)
2b) Run ./manage.py flush yourapp (Calls the same code as 2a)

This (No. 2 above) yields the following output:

Installing json fixture 'initial_data' from '/cgi/django/apps/cdla_apps/venues_minimal/fixtures'.
Traceback (most recent call last):
  File "./manage.py", line 11, in <module>
    execute_manager(settings)
  File "/usr/home/jcdyer/src/django-trunk/trunk/django/core/management/__init__.py", line 362, in execute_manager
    utility.execute()
  File "/usr/home/jcdyer/src/django-trunk/trunk/django/core/management/__init__.py", line 303, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/home/jcdyer/src/django-trunk/trunk/django/core/management/base.py", line 195, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/usr/home/jcdyer/src/django-trunk/trunk/django/core/management/base.py", line 222, in execute
    output = self.handle(*args, **options)
  File "/usr/home/jcdyer/src/django-trunk/trunk/django/core/management/commands/loaddata.py", line 196, in handle
    cursor.execute(line)
  File "/usr/home/jcdyer/src/django-trunk/trunk/django/db/backends/util.py", line 19, in execute
    return self.cursor.execute(sql, params)
psycopg2.ProgrammingError: column "id" does not exist
LINE 1: ...venues_minimal_venuemember_id_seq"', coalesce(max("id"), 1),...

On closer inspection, the other hunk of the patch (in db.backends.init) is not as relevant, as it is just adding the table names to a set, so when it tries to re-add an m2m table, it's already in the set. I'll resubmit without that half of the patch.

Changed 6 years ago by jcd

Smaller patch, without ineffectual hunk

comment:4 Changed 6 years ago by ramiro

I could be totally off mark, but I think ticket #11114 is at least slightly related to this one.

comment:5 Changed 6 years ago by jcd

BEGIN;
SELECT setval('"venues_minimal_member_id_seq"', coalesce(max("id"), 1), max("id") IS NOT null) FROM "venues_minimal_member";
SELECT setval('"venues_minimal_venue_id_seq"', coalesce(max("id"), 1), max("id") IS NOT null) FROM "venues_minimal_venue";
SELECT setval('"venues_minimal_venuemember_id_seq"', coalesce(max("id"), 1), max("id") IS NOT null) FROM "venues_minimal_venuemember";
SELECT setval('"venues_minimal_venuemember_venue_member_id_seq"', coalesce(max("venue_member_id"), 1), max("venue_member_id") IS NOT null) FROM "venues_minimal_venuemember";
COMMIT;
}}}

Note that the third select tries to reset a sequence (venues_minimal_venuemember_id_seq) that doesn't exist using a primary key (venues_minimal_venuemember.id) that doesn't exist. The fourth line is the proper sequence reset statement for that particular table.

comment:6 Changed 6 years ago by jcd

Sorry about that. Another way to reproduce this (without fixtures) is to run ./manage.py sqlsequencereset <appname>

Output:

BEGIN;
SELECT setval('"venues_minimal_member_id_seq"', coalesce(max("id"), 1), max("id") IS NOT null) FROM "venues_minimal_member";
SELECT setval('"venues_minimal_venue_id_seq"', coalesce(max("id"), 1), max("id") IS NOT null) FROM "venues_minimal_venue";
SELECT setval('"venues_minimal_venuemember_id_seq"', coalesce(max("id"), 1), max("id") IS NOT null) FROM "venues_minimal_venuemember";
SELECT setval('"venues_minimal_venuemember_venue_member_id_seq"', coalesce(max("venue_member_id"), 1), max("venue_member_id") IS NOT null) FROM "venues_minimal_venuemember";
COMMIT;

Note that the third select tries to reset a sequence (venues_minimal_venuemember_id_seq) that doesn't exist using a primary key (venues_minimal_venuemember.id) that doesn't exist. The fourth line is the proper sequence reset statement for that particular table.

And yes #11114 is the same problem, but it's not quite accurate to say that the metadata is getting ignored, because the through table is being handled as a model in its own right. The problem is that it's *also* being handled as an auto-generated many_to_many intermediary table.

Changed 6 years ago by jcd

New patch includes fix for oracle

comment:7 Changed 6 years ago by jcd

Can someone test the new patch on an oracle database?

Changed 6 years ago by jcd

Patch now includes test. Test passes against postgresql and postgresql_psycopg2, needs to be run against oracle.

comment:8 Changed 6 years ago by jcd

  • Needs tests unset
  • Owner changed from nobody to jcd
  • Status changed from new to assigned

comment:9 follow-up: Changed 6 years ago by ikelly

I've confirmed that the patch fixes the bug against Oracle.

However, the test case fails miserably because it's looking for sql specific to the postgresql backend, and the sql generated for the oracle backend is completely different.

comment:10 Changed 6 years ago by ikelly

  • Patch needs improvement set

comment:11 in reply to: ↑ 9 Changed 6 years ago by jcd

Replying to ikelly:

I've confirmed that the patch fixes the bug against Oracle.

However, the test case fails miserably because it's looking for sql specific to the postgresql backend, and the sql generated for the oracle backend is completely different.

Duh. That makes sense. Can you send me the output Oracle gives you (with the patch applied), so I can add it to the tests?

I don't see an easy way to handle this without testing the backends conditionally using settings.DATABASE_ENGINE.

comment:12 Changed 6 years ago by Alex

Testing the raw SQL output is the wrong way to do this. Using the introspection mechanisms makes significantly more sense.

comment:13 Changed 6 years ago by ikelly

I agree, inspecting the sql is too brittle. A better approach would be to actually perform the sequence reset and make sure that it works as expected.

comment:14 Changed 6 years ago by jcd

New patch. This one has a test that runs loaddata with a fixture that includes objects in the models necessary to trigger the appropriate sequence resets to test the bug. It also patches relevant documentation on sqlsequencereset fixing a broken link to Simon Willison's blog.

Changed 6 years ago by jcd

Patch with test that doesn't rely on exact sql syntax of particular backends.

comment:15 Changed 6 years ago by ikelly

  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Ready for checkin

Test passes in PostgreSQL and Oracle, and fails without the rest of the patch.

comment:16 Changed 6 years ago by jcd

Cool. It also doesn't break anything on mysql, FWIW.

comment:17 Changed 6 years ago by gabor

  • Cc gabor@… added

comment:18 Changed 6 years ago by russellm

  • Resolution set to fixed
  • Status changed from assigned to closed

(In [11215]) Fixed #11107 -- Corrected the generation of sequence reset SQL for m2m fields with an intermediate model. Thanks to J Clifford Dyer for the report and fix.

Note: See TracTickets for help on using tickets.
Back to Top