#15586 closed Bug (wontfix)
Fields with missing columns populated when querying on PostgreSQL
Reported by: | Sebastian Żurek | Owned by: | nobody |
---|---|---|---|
Component: | Documentation | Version: | 1.2 |
Severity: | Normal | Keywords: | postgresql |
Cc: | lrekucki@…, botondus@… | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
With PostgreSQL as backend, when I use CharField field and name it "name", i.e.:
class MyModel(models.Model): name = models.CharField(max_length=255, ....)
I can reference to a 'name' field even though the appropriate db column is missing... (I forgot to create it
manually that's how I've noticed that strange issue)
The issue went out when I updated the model (no 'name' field before), I did not create a DB column, but that did not
stop Django from filling the field with a value! On .save() everything is OK and expected DatabseError
exception is raised by psycopg2.
The value stored in the field looks like a tuple, namely its: '(A, B, C)' where A-C are integers.
Attachments (1)
Change History (19)
comment:1 by , 14 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
comment:2 by , 14 years ago
Of course testing the synchronization between the Django models & db schema every time would be a stupid idea - that was not my point.
Also, I was not asking for automated schema evolution, which is quite a tricky issue and in most cases, for complex DB schema updates, You just have to handle
it manually on your own.
I was just curious HOW 'SELECT' statement could retrieve something from database, when there was no column to get from?
Once again I try to describe the proces:
- model MyMode was created and db was synced
- to model was added a CharField and I did not create corresponding DB table on my own
- I performed qs = MyMode.objects.all(), that I had something in: qs[0].name, qs[1].name and so on ... which I do not understand how this field was field when there was no data in DB? It looks to me like an unexpected behavior....
I would appreciate Your further comments.
comment:3 by , 14 years ago
sorry I made a mistake - of course in second point of process workflow should be:
to model was added a CharField and I did not create corresponding DB *column* on my own
comment:4 by , 14 years ago
Keywords: | postgresql added |
---|---|
Resolution: | wontfix |
Status: | closed → reopened |
Summary: | Field populated when no DB column present → Fields with missing columns populated when querying on PostgreSQL |
Triage Stage: | Unreviewed → Design decision needed |
This is a PostgreSQL related issue. The query looks something like this:
SELECT "ticket15586_mymodel"."id", "ticket15586_mymodel"."name" FROM "ticket15586_mymodel"
And with absence of name
column on the table, one would expect it to fail (and it does on SQLite, for example). But PostgreSQL seems to let you select the table as a ROW object (or something like that - I'm not an expert on that). So the following queries are valid:
SELECT "ticket15586_mymodel" FROM "ticket15586_mymodel"; SELECT "ticket15586_mymodel".name FROM "ticket15586_mymodel"; SELECT ("ticket15586_mymodel")::name FROM "ticket15586_mymodel";
With the second being treated as the third, which is a type cast. Don't know if there is a way to make it unambiguous that we want a table column, not some weird type thing. I'm going to reopen, so maybe someone with more PostgreSQL-related knowledge can fix this.
comment:5 by , 14 years ago
Cc: | added |
---|---|
Component: | Uncategorized → Database layer (models, ORM) |
comment:6 by , 14 years ago
Cc: | added |
---|
comment:7 by , 14 years ago
Resolution: | → needsinfo |
---|---|
Status: | reopened → closed |
I can't reproduce this. Tried with Django trunk and 1.2.5
- PostgresSQL version is 8.3.14
- psycopg2 version is 2.0.7-4
# t15586/models.py from django.db import models class MyModel(models.Model): name = models.CharField(max_length=255)
$ ./manage.py dbshell Welcome to psql 8.3.14, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands \g or terminate with semicolon to execute query \q to quit dbname=> select * from t15586_mymodel; id ---- (0 rows) dbname=>
$ ./manage.py shell Python 2.5.2 (r252:60911, Jan 24 2010, 14:53:14) Type "copyright", "credits" or "license" for more information. IPython 0.8.4 -- An enhanced Interactive Python. ? -> Introduction and overview of IPython's features. %quickref -> Quick reference. help -> Python's own help system. object? -> Details about 'object'. ?object also works, ?? prints more. In [1]: from django.db import connection In [2]: from t15586.models import MyModel In [3]: MyModel.objects.all() Out[3]: --------------------------------------------------------------------------- DatabaseError Traceback (most recent call last) dtest07/<ipython console> in <module>() /var/lib/python-support/python2.5/IPython/Prompts.pyc in __call__(self, arg) 549 550 # and now call a possibly user-defined print mechanism --> 551 manipulated_val = self.display(arg) 552 553 # user display hooks can change the variable to be stored in /var/lib/python-support/python2.5/IPython/Prompts.pyc in _display(self, arg) 575 return IPython.generics.result_display(arg) 576 except TryNext: --> 577 return self.shell.hooks.result_display(arg) 578 579 # Assign the default display method: /var/lib/python-support/python2.5/IPython/hooks.pyc in __call__(self, *args, **kw) 133 #print "prio",prio,"cmd",cmd #dbg 134 try: --> 135 ret = cmd(*args, **kw) 136 return ret 137 except ipapi.TryNext, exc: /var/lib/python-support/python2.5/IPython/hooks.pyc in result_display(self, arg) 163 164 if self.rc.pprint: --> 165 out = pformat(arg) 166 if '\n' in out: 167 # So that multi-line strings line up with the left column of /usr/lib/python2.5/pprint.pyc in pformat(self, object) 109 def pformat(self, object): 110 sio = _StringIO() --> 111 self._format(object, sio, 0, 0, {}, 0) 112 return sio.getvalue() 113 /usr/lib/python2.5/pprint.pyc in _format(self, object, stream, indent, allowance, context, level) 127 self._readable = False 128 return --> 129 rep = self._repr(object, context, level - 1) 130 typ = _type(object) 131 sepLines = _len(rep) > (self._width - 1 - indent - allowance) /usr/lib/python2.5/pprint.pyc in _repr(self, object, context, level) 193 def _repr(self, object, context, level): 194 repr, readable, recursive = self.format(object, context.copy(), --> 195 self._depth, level) 196 if not readable: 197 self._readable = False /usr/lib/python2.5/pprint.pyc in format(self, object, context, maxlevels, level) 205 and whether the object represents a recursive construct. 206 """ --> 207 return _safe_repr(object, context, maxlevels, level) 208 209 /usr/lib/python2.5/pprint.pyc in _safe_repr(object, context, maxlevels, level) 290 return format % _commajoin(components), readable, recursive 291 --> 292 rep = repr(object) 293 return rep, (rep and not rep.startswith('<')), False 294 django/db/models/query.py in __repr__(self) 65 66 def __repr__(self): ---> 67 data = list(self[:REPR_OUTPUT_SIZE + 1]) 68 if len(data) > REPR_OUTPUT_SIZE: 69 data[-1] = "...(remaining elements truncated)..." django/db/models/query.py in __len__(self) 80 self._result_cache = list(self.iterator()) 81 elif self._iter: ---> 82 self._result_cache.extend(list(self._iter)) 83 return len(self._result_cache) 84 django/db/models/query.py in iterator(self) 269 model = self.model 270 compiler = self.query.get_compiler(using=db) --> 271 for row in compiler.results_iter(): 272 if fill_cache: 273 obj, _ = get_cached_row(model, row, django/db/models/sql/compiler.py in results_iter(self) 675 fields = None 676 has_aggregate_select = bool(self.query.aggregate_select) --> 677 for rows in self.execute_sql(MULTI): 678 for row in rows: 679 if resolve_columns: django/db/models/sql/compiler.py in execute_sql(self, result_type) 730 731 cursor = self.connection.cursor() --> 732 cursor.execute(sql, params) 733 734 if not result_type: django/db/backends/util.py in execute(self, sql, params) 13 start = time() 14 try: ---> 15 return self.cursor.execute(sql, params) 16 finally: 17 stop = time() django/db/backends/postgresql_psycopg2/base.py in execute(self, query, args) 42 def execute(self, query, args=None): 43 try: ---> 44 return self.cursor.execute(query, args) 45 except Database.IntegrityError, e: 46 raise utils.IntegrityError, utils.IntegrityError(*tuple(e)), sys.exc_info()[2] DatabaseError: column t15586_mymodel.name does not exist LINE 1: SELECT "t15586_mymodel"."id", "t15586_mymodel"."name" FROM "... ^
comment:8 by , 14 years ago
Could you try with PostgreSQL >= 8.4 ? I was using PostgreSQL 8.4.7 and psycopg2 2.3.2. Only reference to a similar problem I could find is: http://archives.postgresql.org/pgsql-general/2010-05/msg01126.php.
comment:9 by , 14 years ago
I was using Django 1.2.3, Postgres 9.0.1, psycopg2 v. 2.2.2 (dt dec ext pq3). In a few hours I'll try once again, with a raw SQL query and I'll report the results. I'll try to test it also with Postgres 8.
comment:10 by , 14 years ago
Resolution: | needsinfo |
---|---|
Status: | closed → reopened |
I can reproduce this with PostgreSQL 8.4.7. Reopening the ticket.
comment:11 by , 14 years ago
This is an interesting one! I was able to reproduce it in Postgres 8.4.1. The issue seems to be isolated to the column called "name". I tried selecting non-existent columns with other names besides "name", and it worked as expected, raising an error. Here's an example:
db=# CREATE TABLE test (id INTEGER PRIMARY KEY); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "test_pkey" for table "test" CREATE TABLE db=# INSERT INTO test VALUES (1), (2), (3); INSERT 0 3 db=# SELECT "test"."id", "test"."name" FROM "test"; id | name ----+------ 1 | (1) 2 | (2) 3 | (3) (3 rows) db=# SELECT "test"."id", "test"."name2" FROM "test"; ERROR: column test.name2 does not exist
Here is a thread from the Postgres mailing list with more info: http://archives.postgresql.org/pgsql-general/2010-02/msg01038.php. "Calling t.name is the same as name(t) if a column reference is not found."
I agree with the previous comments that we should avoid introspecting the database, and I can't immediately think of a user-friendly solution to this beyond documenting it for Postgres users. Any ideas?
comment:12 by , 14 years ago
Type: | → Bug |
---|
comment:13 by , 14 years ago
Severity: | → Normal |
---|
comment:14 by , 13 years ago
Easy pickings: | unset |
---|
After a few weeks I'm returning to this issue. To summarize what we see, let me point out the two elements:
- the issue has rather low probability to appear: You have to use PG backend + You have to have Your model defining non-existing (in DB schema sense) field/column
- I feel this is also a kind of a issue in PostgreSQL itself
What we should do now, is for sure to document this behavior --- while it's not clear in DB engine itself what does it mean to select column from table
, I believe Django should not be smarter in handling it in other way than the engine does it.
comment:15 by , 13 years ago
Component: | Database layer (models, ORM) → Documentation |
---|---|
Triage Stage: | Design decision needed → Accepted |
UI/UX: | unset |
Since no one came up with a better plan, let's document it in ref/databases.txt.
We have comparable limitations documented for other databases, for instance, "don't create a field called timestamp
if you're using Oracle".
comment:16 by , 12 years ago
The usefulness of adding something like the following seems very limited to me:
"If you are using PostgreSQL and add a field called 'name' to an existing model but forget to add the column in your database, SELECT statements will appear to work even though the column doesn't exist (since NAME is a Postgres function and SELECT foo.name FROM foo is equivalent to SELECT name(foo) FROM foo) ."
but since the ticket has been accepted, is that the essence of what is being suggested?
comment:17 by , 12 years ago
Note that "name" is just an example here. From my understanding, you can swap it to any other datatype that PostgreSQL supports and instead of raising an error about a missing column, it will treat it as a cast of the whole row to that datatype.
Anyways, I tried this on PostgreSQL 9.2 and the behavior is no longer there. I'm guessing it was removed in version 9, but don't have an instance to verify.
comment:18 by , 12 years ago
Resolution: | → wontfix |
---|---|
Status: | reopened → closed |
Since this doesn't happen on PostgreSQL 9.2 and seems like it has a very low probability of happening, I don't think it's worth documenting at this point. If someone really wants to see this added and can provide a patch, I'll be happy to review and commit it.
It is not completely clear to me what is the behavior you expect here.
If what you are suggesting is that Django flags the divergence between the model design and its corresponding database table structure as soon as possible (e.g. at Python model field population time) it would be a waste of resources (it would need to introspect the DB table every time). IMHO current behavior is a good compromise: Let the DB-API driver/database engine flag the error when actual interaction with the DB is performed. Remember, we are using a RDBMS plus an ORM, not a OODB, there are two separate realms, the model objects one and the database one.
If what you are asking for is database schema evolution so it follows the model design, it is a feature that hopefully will be added in a future version and there are third party tools that implement that now (but AFAIK none of them perform automatic schema evolution so they won't be of help in cases where the developer wants to get away with changing the model design at will and hoping the software will take care of the database changes).
See also http://docs.djangoproject.com/en/1.2/faq/models/#if-i-make-changes-to-a-model-how-do-i-update-the-database