#27365 closed Bug (invalid)
Fields suddenly not found.
Reported by: | Brandon | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.9 |
Severity: | Normal | Keywords: | FieldDoesNotExist |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Hello,
We just moved from Django 1.7.10 to 1.9.9. After the move a new error began occurring. The exception is:
FieldDoesNotExist: No related field named '...'
We see the exception when accessing the reverse OneToOne of a non-abstract models.Model subclass that has been subclassed.
class Case(models.Model): ...lots of fields... class SubclassCase(Case): ...lots more fields... SubclassCase.objects.get(id=12345)
generates traceback
Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py", line 149, in get_response response = self.process_exception_by_middleware(e, request) File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py", line 147, in get_response response = wrapped_callback(request, *callback_args, **callback_kwargs) File "/usr/local/lib/python2.7/dist-packages/django/contrib/auth/decorators.py", line 23, in _wrapped_view return view_func(request, *args, **kwargs) File "/var/www/vhosts/prod/lib/workpath/workflow/views.py", line 1730, in view_case additional_view = case_module.view(case) File "/var/www/vhosts/prod/lib/workpath/workflow/cases/subclasscase/views.py", line 286, in view case = SubclassCase.objects.get(id=case.id) File "/usr/local/lib/python2.7/dist-packages/django/db/models/manager.py", line 122, in manager_method return getattr(self.get_queryset(), name)(*args, **kwargs) File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 378, in get clone = self.filter(*args, **kwargs) File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 790, in filter return self._filter_or_exclude(False, *args, **kwargs) File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 808, in _filter_or_exclude clone.query.add_q(Q(*args, **kwargs)) File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py", line 1243, in add_q clause, _ = self._add_q(q_object, self.used_aliases) File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py", line 1269, in _add_q allow_joins=allow_joins, split_subq=split_subq, File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py", line 1149, in build_filter lookups, parts, reffed_expression = self.solve_lookup_type(arg) File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py", line 1035, in solve_lookup_type _, field, _, lookup_parts = self.names_to_path(lookup_splitted, self.get_meta()) File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py", line 1345, in names_to_path targets = (final_field.remote_field.get_related_field(),) File "/usr/local/lib/python2.7/dist-packages/django/db/models/fields/reverse_related.py", line 244, in get_related_field self.field_name) FieldDoesNotExist: No related field named 'subclasscase'
Whenever the exception occurs, it is exactly this exception every time once _filter_or_exclude
is reached. We've seen it in the context of function based views, class based views, the views of our tastypie backed API (XML and JSON) and from view functions that support light-weight JSON functions.
We see it across multiple disconnected deployments with completely separate machines, datastores, databases, etc.
We do not see it all the time.
We do not see it when running under the development server.
An apache restart fixes it... for a little while.
We are using mpm_event with WSGI.
Our mpm_event configuration looks like this:
<IfModule mpm_event_module> StartServers 2 MinSpareThreads 25 MaxSpareThreads 75 ThreadLimit 64 ThreadsPerChild 25 MaxRequestWorkers 150 MaxConnectionsPerChild 5000 </IfModule>
Our WSGI (libapache2-mod-wsgi 3.4-4ubuntu2.1.14.04.2
) configuration looks like this:
WSGIDaemonProcess prod processes=2 python-path=/usr/local/lib/python2.7/dist-packages:/var/www/vhosts/prod/lib
Based on our experience that an apache restart fixed it we changed our WSGI configuration to this:
WSGIDaemonProcess prod processes=10 threads=1 maximum-requests=50 python-path=/usr/local/lib/python2.7/dist-packages:/var/www/vhosts/prod/lib
This causes the apache children to cycle quickly and the error goes away but as you can imagine this slows down the site considerably.
The error occurs much more frequently with 1.8.15 than with 1.9.9 - but again it *never* occurred in 1.7.10. Based on this, and looking at git diff 1.7.10...1.9.9 -- django/db/models/options.py
; I believe it is somehow related to the _meta API change and the introduction of _get_fields_cache in the Options class.
However, I am very unfamiliar with core django code... so that could be completely wrong.
The real difficulty here is that it seems to be time or number-of-requests based. A apache child, immediately after a restart, hums along nicely. But it *seems* that if any particular code path that needs to do that kind of "reverse OneToOne" lookup is crossed too many times the error, that exact exception, occurs. Once it has occurred for a child, that child is done. If you access the site and get that child looking for that object, the exception occurs. But if you get another apache child or are looking for another object, the error has a good chance of not occurring. Again, if that child cycles out or if Apache is restarted (not reloaded) the error vanishes completely... for a little while, seemingly dependent on site traffic.
I'll be glad to supply any debug information I can. As pervasive as this error was in production, I have been unable to reproduce the problem in a development environment. As soon as I have, I'll update the ticket with whatever information I've found.
Any insight anyone could provide in debugging this problem would be much appreciated. I would not hesitate to submit a patch if I had one.
Thanks,
--Brandon
Change History (13)
comment:1 by , 8 years ago
Component: | Uncategorized → Database layer (models, ORM) |
---|---|
Type: | Uncategorized → Bug |
comment:2 by , 8 years ago
Resolution: | → needsinfo |
---|---|
Status: | new → closed |
comment:3 by , 8 years ago
This looks like it could be a bug in Django, but I'm not sure. Is there any chance you're defining Django models and not importing them in an application's models
module or package? This might explain what you're seeing (in that case the answer is "don't do that").
comment:4 by , 8 years ago
I don't understand what you mean by "defining Django models and not importing them in an application's models module or package".
Every model that is being used is beneath a package that is listed in INSTALLED_APPS
.
Let me describe the process.
Given this structure:
class Case(models.Model): ...lots of fields... class SubclassCase(Case): ...lots more fields...
- A user accesses a generic page that retrieves and renders the state of a
Case
instance. - The user then, if they so choose, accesses a subclass specific page that uses the
id
of theCase
instance to retrieve the subclass instance via aSubclassCase.objects.get(id=case.id)
. The classSubclassCase
was imported via a__import__
just moments before the {{.get}} is done. - The subclass instance is then used to render a subclass specific page that includes both
Case
andSubclassCase
parameters.
This is one of 3 different but similar situations where I have seen this bug expressed. Each situation centers around following an FK backwards via on a subclass instance using SubclassCase.objects.get(id=case.id)
or SubclassCase.objects.filter(id=case.id)
, where id
is an AutoField
. As was noted in the original issue and in my post to the forums (https://groups.google.com/d/msg/django-users/S4AnygYSJqU/JttIqqTLAQAJ) once you reach _filter_or_exclude
the exception is *always* the same and the manner of the error is always the same (i.e. The field
variable seems to be populated by self
in get_related_field
at line 244). The difference between the 3 situations is, 2 of the 3 situations do *not* do the __import__
, only 1 of the 3 situations do the dynamic import.
comment:5 by , 8 years ago
Every model that is being used is beneath a package that is listed in
INSTALLED_APPS
.
This is good. If I may ask a more precise question:
- are all your models in simple
models.py
module? or do you have amodels/__init__.py
package withmodels/cases.py
, etc. submodules? - if the latter are you importing all your models in
__init__.py
? If not, that could solve your problem.
You mention __import__
. Did you mean import
? Or do you really use the low level __import__
? Could you try using apps.get_models
instead? apps.get_models
is intended to raise an exception in situations that may lead to such inconsistencies.
comment:6 by , 8 years ago
Yes we do actually use __import__
. We are bringing in models and views dynamically. Like this:
__import__("workpath.workflow.cases.%s.views" % casetype, fromlist=["workpath.workflow.cases"])
and
__import__("workpath.workflow.cases.%s.models" % casetype, fromlist=["workpath.workflow.cases"])
But again, this was *never* a problem before 1.9.9. This basic structure and usage has been around since early 1.4. It has actually worked quite well! =)
comment:7 by , 8 years ago
Resolution: | needsinfo |
---|---|
Status: | closed → new |
I reopened this issue because this bug is still occurring after 4 months of our 2-week development cycles. We have refactored and gone entirely to apps.get_models
and even reduced most all instances of field__id
to field_id
, because that seems to express the bug *less* often. We also could not tell if this was Apache so we moved several customer environments over from Apache2+mod_wsgi to NGINX+uWSGI (still Ubuntu). And yet the bug still occurs in *exactly* the way as described above.
Some important things to note:
- It occurs seemingly at random. But when it happens it will happen in waves.
- It occurs equally in multi-processed + single-thread or multi-processed + multi-threaded web server configurations.
- Recycling Apache children or NGINX vassals quickly (every 50-150 requests) causes the bug to occur *less* but it will still occur.
- The stacktrace is *always* the same, has remained the same, and is the same in *every* instance. There is zero deviation.
- It *never* occurred from Django 1.4.x to 1.7.x; but has not stopped in 1.9.9 and occurred *more* frequently in 1.8.x
Any attention at all would be most appreciated.
--Brandon
comment:8 by , 8 years ago
Given the esoteric nature of the problem, I think the onus is on you to explain why Django is at fault and to propose a fix. I'm not sure if I could reproduce the problem given the information you provided so far.
comment:9 by , 8 years ago
Resolution: | → needsinfo |
---|---|
Status: | new → closed |
comment:10 by , 8 years ago
Resolution: | needsinfo |
---|---|
Status: | closed → new |
After stepping completely through the call stack down from MySubclassOfaModelSubclass.objects.filter(id=1234)
we arrive at get_related_field
in reverse_related.py Line# 237
The call in this function that fails is field = self.model._meta.get_field(self.field_name)
It fails because self.field_name
has somehow been set to self.name
.
I cannot directly reproduce the error on demand. But setting self.field_name = self.name
from a breakpoint at Line# 237 reproduces *exactly* the stack trace I am (and have been) receiving on multiple independent production instances running first under Apache+mod_wsgi then nginx+uWSGI since moving from 1.7 to 1.9 months ago.
I replicated the production stack trace via pudb checking every function scoped local variable along the way against a production stack trace that logged those variables on a failure. I checked them down from .filter
to .get_related_field
; all populated variables are identical down the complete call stack. The only way to reproduce that complete call stack and exact exception is to set self.field_name = self.name
where self
is a OneToOneRel
representing MySubclassOfaModelSubclass
- note that it actually shows up from repr as <OneToOneRel: myapp.mysubclassofamodelsubclass>
I apologize for continually reopening this ticket. If there is another (or better) avenue to continue delivering detail around this issue, I'm certainly open to it. This has been a major thorn in my side. I'll provide any and all information necessary to get to the bottom of this very subtle and difficult issue. I'll be happy to work-up and submit a patch once the root cause is located.
Thanks,
--Brandon
P.S. Please note that it is not just MySubclassOfaModelSubclass.objects.filter(id=1234)
that fails. Any forward FK relation can fail (so MySubclassOfaModelSubclass.objects.filter(foo_id=55)
) - and fails in exactly the same way. The MySubclassOfaModelSubclass class has many forward FKs in it. It doesn't seem to matter which FK field is accessed, the stack trace is always the same.
comment:11 by , 8 years ago
Resolution: | → needsinfo |
---|---|
Status: | new → closed |
You can continue to comment on the ticket without reopening it. Until you provide the root cause or a way for someone else to reproduce the issue, I don't think there's value in keeping the ticket open since no one else can do anything about it.
comment:12 by , 8 years ago
Further down the rabbit hole...
After figuring out that it was the occurance of self.field_name == self.name
that was producing the problem I instrumented field_name
in ForeignObjectRel
so that I could compare it to name
.
Here is the instrumentation I added to ForeignObjectRel
def _set_field_name(self,value): name = getattr(self,'name',None) field_name = value if field_name and name and field_name == name: import traceback stacktrace = traceback.format_stack() message = ''.join(["Bug#27365 - self.field_name == self.name\n"]+stacktrace) warnings.warn(message, RuntimeWarning, stacklevel=2) self._field_name=value def _get_field_name(self): name = getattr(self,'name',None) field_name = getattr(self,'_field_name',None) if field_name and name and field_name == name: import traceback stacktrace = traceback.format_stack() message = ''.join(["Bug#27365 - self.field_name == self.name\n"]+stacktrace) warnings.warn(message, RuntimeWarning, stacklevel=2) return self._field_name field_name = property(_get_field_name,_set_field_name)
This allowed me to see if *anything* in the system was setting self.field_name = self.name
, plus the exact trace to it in-situ.
As it turns out, it seems an old 3rd party application that adds metadata to fields as attributes was reusing the attribute name .field_name
, unaware that django was using it. I believe that the introduction of the aggressive caching to relational objects that occurred between 1.7.x and 1.9.x allowed this errant use of .field_name
to persist all the way up until django attempted to reuse that object. Then when django reused the object, self.field_name
had been *sometimes* been incorrectly set based on the caching + 3rd party app usage + life-time of the wsgi process.
I have since deployed a monkey patch that alters the name of the metadata field used by the 3rd party application so that it does not collide with the relational object attribute .field_name
.
I have not seen the error since.
--Brandon
comment:13 by , 8 years ago
Resolution: | needsinfo → invalid |
---|
Thanks for posting about the resolution. Glad to hear that Django isn't at fault.
Please see TicketClosingReasons/UseSupportChannels for ways to get help and reopen the ticket if you can say why Django is at fault.