Opened 8 years ago
Closed 6 years ago
#27090 closed Cleanup/optimization (worksforme)
pg_get_serial_sequence is broken on postgres, use a lookup in information_schema.columns instead
Reported by: | Hanne Moa | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.10 |
Severity: | Normal | Keywords: | postgres sequence |
Cc: | chris.jerdonek@… | Triage Stage: | Someday/Maybe |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
pg_get_serial_sequence() is broken because it does not look up the actual sequence in use for a table-column pair, it only concats the tablename and columnname together to yield "tablename_columnname_seq", which is what django used to do itself prior to postgres 8.1 or so.
To verify: manually alter which sequence to use for a table-column pair by changing the default value for the column, then manually run "select pg_get_serial_sequence(tablename, columname);". You will not get back the sequence you just set, at least on postgres 9.3 and 9.5. If you try pg_get_serial_sequence() on a child table utilizing postgres' table inheritance you'll get a single row back with an empty string.
This is not a problem for databases created by django but turns up again and again on legacy systems. It also means existing django-built columns cannot switch to a different sequence.
The following methods in django.db.backends.postgresql_psycopg2.base.DatabaseOperations use the postgres function pg_get_serial_sequence() to fetch a sequence name:
- last_insert_id
- sequence_reset_by_name_sql
- sequence_reset_sql
I use the function in the attached file in a fork of the postgres-backend instead. It looks up the sequence name that is actually in use in the "information_schema.columns"-table. The SQL's been tested on postgres 9.3 and 9.5, which is what I currently have available.
It would be nice if postgres would fix pg_get_serial_sequence() but we don't have to wait for that.
Attachments (1)
Change History (20)
by , 8 years ago
Attachment: | sequence_name_getter.py added |
---|
comment:1 by , 8 years ago
Description: | modified (diff) |
---|
comment:2 by , 8 years ago
I don't have expertise here. Can you submit a pull request to verify this doesn't break the Django test suite? Is it possible to write a regression test?
comment:3 by , 8 years ago
I need this for older django versions so I'll be making a standalone package regardless, with tests.
I'm not too experienced in making tests that depend on a specific database though, so a pull request will not be forthoming before 1) I'm content with the tests in the package 2) I have a working test-setup for Django itself.
follow-up: 6 comment:4 by , 8 years ago
Component: | Uncategorized → Database layer (models, ORM) |
---|---|
Type: | Uncategorized → Cleanup/optimization |
Is there an open issue to correct the behavior in PostgreSQL? I found #13735 but the reply looks like the behavior is expected (again, I'm not an expert on PG behavior, so I may have misunderstood).
follow-up: 8 comment:5 by , 8 years ago
I also found some possibly related Django tickets (e.g. ticket:8901#comment:25) that were closed as a duplicate of #13295. Maybe that would address this?
comment:6 by , 8 years ago
Replying to timgraham:
Is there an open issue to correct the behavior in PostgreSQL? I found #13735 but the reply looks like the behavior is expected (again, I'm not an expert on PG behavior, so I may have misunderstood).
That something works as documented doesn't necessarily mean that it is useful :)
Interesting that it checks owner, with an inherited table only the ancestor can be the owner of an inherited sequence so pg_get_serial_sequence() cannot ever work on them as it is defined today.
I'm guessing what would be useful is a new postgres function that always picks the currently in use sequence.
comment:7 by , 8 years ago
Discovered during testing: the standard Django TestCase does not zero out sequences on teardown/end of test-method/between classes, which complicates matters.
comment:8 by , 8 years ago
Replying to timgraham:
I also found some possibly related Django tickets (e.g. ticket:8901#comment:25) that were closed as a duplicate of #13295. Maybe that would address this?
That would also solve the problem but I think the solution I'm working on involves less code. On the other hand, explicit is better than implicit...
This ticket would make more postgres databases just work out of the box. Ticket #13295 could use the lookup in information_schema as part of its own solution. Seen that way, #13295 depends on this one.
comment:9 by , 8 years ago
Link to code of current standalone backend: https://github.com/hmpf/django-postgres-backend
comment:10 by , 8 years ago
Could you open a PostgreSQL feature request and get confirmation that your proposed solution is the best way to do introspection in the meantime?
comment:11 by , 8 years ago
Another thought: more generally, it might be good to think about whether the backends can be made more friendly to subclassing for cases like this. That way the whole backend doesn't need to be forked just to change one aspect. It's also good design in general.
I don't whether this is already easy or doable for the case at hand. It would also reduce the importance / need for Django to "take a stance" in cases where the best implementation is unknown or unclear, etc.
comment:13 by , 8 years ago
Triage Stage: | Unreviewed → Someday/Maybe |
---|
comment:14 by , 8 years ago
The postgres-general list has some very interesting spam filters, took a while to get through. (Don't have a line in the body of the email starting with the word "which"...)
https://www.postgresql.org/message-id/nu7ebq%24a2h%241%40blaine.gmane.org
A different query is suggested, the result need not be parsed:
"Personally, I'd try looking in pg_depend to see if the column's default
expression has a dependency on a relation of type sequence. That avoids
all the fun of parsing the expression and turns it into a simple SQL
join problem." -- Tom Lane
It's a very simple join =):
select sn.nspname as sequence_schema, s.relname as sequence_name, col.attname from pg_class s join pg_namespace sn on sn.oid = s.relnamespace join pg_depend d on d.refobjid = s.oid and d.refclassid='pg_class'::regclass join pg_attrdef ad on ad.oid = d.objid and d.classid = 'pg_attrdef'::regclass join pg_attribute col on col.attrelid = ad.adrelid and col.attnum = ad.adnum join pg_class tbl on tbl.oid = ad.adrelid join pg_namespace n on n.oid = tbl.relnamespace where s.relkind = 'S' and d.deptype in ('a', 'n') and n.nspname = 'public' and tbl.relname = 'THE_TABLE_NAME' and col.attname = 'THE_COLUMN_NAME'
Anyway, even using the information_schema, it might return a schema in addition to the name so adjustments need to be made to my existing code.
comment:16 by , 8 years ago
I think the first step is to add real sequence introspection support. This PR (missing Oracle) does that. Then we should be able to build on it to fix this ticket's issue.
comment:18 by , 7 years ago
I'm trying to create a failing test, but on the PostgreSQL 9.4 and 9.6 instances I'm using to test, the pg_get_serial_sequence
function does return the real sequence name even after renaming the sequence with ALTER SEQUENCE ... RENAME TO
.
I would be interested to get more feedback from other testers...
comment:19 by , 6 years ago
Resolution: | → worksforme |
---|---|
Status: | new → closed |
Closing per absence of reaction after my latest comment. Reopen if you can reproduce with recent versions.
Python-function looking up sequence_name in postgres