Opened 12 months ago
Last modified 11 months ago
#34989 new Cleanup/optimization
Set Content-Length where possible for HttpResponses.
Reported by: | Jeppe Fihl-Pearson | Owned by: | Anders Kaseorg |
---|---|---|---|
Component: | HTTP handling | Version: | 4.2 |
Severity: | Normal | Keywords: | CommonMiddleware uWSGI |
Cc: | Florian Apolloner | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
We upgraded from Django 4.1.13 to 4.2.7 this week, and as part of this, we noticed redirects for URLs that added a missing trailing slash took 4-5 seconds. Before the upgrade, the redirects only took a fraction of a second to return.
Debugging this, I've found out it's related to the change made for #33700. If I revert that change the delay goes away.
I've also found out this requires a slightly specific setup. So far I've only seen it when using uWSGI as the application server using the http11-socket
option. I am not able to reproduce the issue when using http-socket
for configuring the listening socket.
I have adapted the reproduction case from #33700 to show this issue. It can be found at https://github.com/Tenzer/django-slow-redirect. When making a request against the http-socket
port, it returns immediately:
% time curl -si http://127.0.0.1:8010/url HTTP/1.1 301 Moved Permanently Content-Type: text/html; charset=utf-8 Location: /url/ curl -si http://127.0.0.1:8010/url 0.01s user 0.01s system 65% cpu 0.033 total
but when I make a request against the http11-socket
port, it instead takes at least 4 seconds extra:
% time curl -si http://127.0.0.1:8011/url HTTP/1.1 301 Moved Permanently Content-Type: text/html; charset=utf-8 Location: /url/ curl -si http://127.0.0.1:8011/url 0.01s user 0.01s system 0% cpu 4.026 total
I am assuming this might be hitting a timeout somewhere but I can't work out where that might be or how the changes in https://github.com/django/django/pull/15689 might have caused it.
We have for now had to switch to using the http-socket
option in uWSGI, but it would be ideal if the http11-socket
option also would work, as that is required for uWSGI to support keep-alive.
Change History (13)
comment:1 by , 12 months ago
Cc: | added |
---|
comment:2 by , 12 months ago
Has patch: | set |
---|---|
Owner: | changed from | to
Status: | new → assigned |
The difference is that we stopped setting Content-Length: 0
on these redirects. It is probably a uWSGI bug that removing Content-Length
causes uWSGI to delay for socket-timeout
seconds in this scenario; however, restoring it is easy and makes this symptom disappear.
comment:3 by , 12 months ago
I am against fixing this if a Content-Length is not required on redirects (which I doubt it is). A bug like this is perfect to expose issues in WSGI implementations.
comment:4 by , 12 months ago
Case in point, it allowed me to identify a possible issue in my WSGI changes to hypercorn: https://github.com/pgjones/hypercorn/pull/155/files#r1402530266
comment:5 by , 12 months ago
A redirect response is allowed to have a body, so in order for keep-alive to work, the HTTP server must send either Content-Length
or Transfer-Encoding: chunked
. (Hardly anything in HTTP is really “required”—the server can also indicate the empty body by closing the connection, but then the client might be unable to distinguish that from an error.) It’s less clear that the WSGI application should be responsible for sending one of these headers explicitly. I would have expected uWSGI to automatically add Transfer-Encoding
if Content-Length
is not provided, but evidently it doesn’t.
This is easy to reproduce without Django, so I don’t think we should feel obligated to avoid fixing this just to have a reproduction recipe.
def application(environ, start_response): body = b"Hello, world!" start_response( "200 OK", [ # ("Content-Length", str(len(body))), # uncomment to fix ], ) yield body
comment:6 by , 12 months ago
the HTTP server must send either
Content-Length
orTransfer-Encoding: chunked
I do not think this is true, a third option is most likely to simply close the connection.
It’s less clear that the WSGI application should be responsible for sending it explicitly.
It is explicitly clear that the WSGI application is not required to send this in PEP 3333 (https://peps.python.org/pep-3333/#handling-the-content-length-header):
If the application does not supply a Content-Length header, a server or gateway may choose one of several approaches to handling it. The simplest of these is to close the client connection when the response is completed.
So I would argue that this is a bug in uWSGI. This is also not the first WSGI incompatibility in uWGSI, for instance I fixed a rather annoying issue with close: https://github.com/unbit/uwsgi/commit/1746254fec076bd52e7682187041e82fafad5427
This is easy to reproduce without Django, so I don’t think we should feel obligated to avoid fixing this just to have a reproduction recipe.
The thing is, if we fix this in Django just to workaround about a spec incompatibility in uWSGI (imo), then the next framework has to do the same and we just might hide a harder to fix bug. Fixing this in uWSGI would properly fix this for all frameworks.
comment:7 by , 12 months ago
Resolution: | → invalid |
---|---|
Status: | assigned → closed |
I agree with Florian that it's not an issue in Django itself. This should be reported and fixes in uWSGI.
comment:8 by , 12 months ago
So let’s fix it in both Django and uWSGI? Even after uWSGI is fixed, we want Django to send Content-Length
anyway when possible, because Content-Length
is more efficient than the alternatives, and uWSGI isn’t in a position to generate Content-Length
because HttpResponse
doesn’t have a len()
.
comment:9 by , 12 months ago
I just looked at the PR and there might be a point to move https://github.com/andersk/django/blob/de1ee5ee0ba51229ec8ceb7146173533f45ace51/django/middleware/common.py#L114 into our WSGI handler because the middleware does not have to be active. Whether this would mean that HTTPResponse
should have a __len__
or we simply add logic in the wsgi handler, I am not sure yet. The PR as it stands now is not the proper fix though I think.
comment:10 by , 12 months ago
To be clear, uWSGI doesn’t currently implement the __len__
optimization and the spec doesn’t require it. So that’s another path that would require changing both Django and uWSGI, with the disadvantage that only some WSGI gateways would benefit.
comment:11 by , 12 months ago
Resolution: | invalid |
---|---|
Status: | closed → new |
Yeah, in general I'd expect the server to optimize where possible, so it is rather surprising that uWSGI doesn't do that.
It is unfortunate that HttpResponse
does not have a __len__
and we cannot monkey patch it onto it while also supporting streaming via generators without a known length because Python does something along the lines of type(x).__len__(obj)
, so even if hasattr()
sees __len__
, calling len()
on it works only if the class has __len__
.
I also guess we might not exactly be able to get rid of setting the Content-Length
in the CommonMiddleware
because we document it: https://docs.djangoproject.com/en/4.2/ref/middleware/#module-django.middleware.common
So the way I see it we have two options:
- Improve the
CommonMiddleware
as in your PR (I am saying improve and not fix, because technically Django does nothing spec incompatible, so it is rather a performance improvement. No matter how we spin it, there is a deeper bug in uWSGI I think -- this also means there will be no backports).
- Also add a similar code to our WSGI/ASGI handlers and set the
Content-Length
where possible. I am in favor of doing this, since Anders is right that a server cannot optimize given thatHttpResponse
has no__len__
. I am slighty worried about backwards compatibility here though, but I honestly cannot think of a case where setting theContent-Length
would cause problems.
As such I am preliminarily reopening the ticket and if Mariusz is fine with it let's set it to accepted. The title etc will be a little bit off then, but I guess that is fine.
Would you mind opening a ticket against uWSGI and cross-link it here?
comment:12 by , 12 months ago
Has patch: | unset |
---|---|
Summary: | APPEND_SLASH redirects much slower in Django 4.2 when used with uWSGI's http11-socket → Set Content-Length where possible for HttpResponses. |
Triage Stage: | Unreviewed → Accepted |
Type: | Bug → Cleanup/optimization |
Tentatively accepted.
comment:13 by , 11 months ago
I have been digging a bit more in the RFC and I think setting the Content-Length
to 0
is actually preferred since it indicates that there is really no content (which a redirect is allowed to have).
See https://www.rfc-editor.org/rfc/rfc9110#field.content-length:
Aside from the cases defined above, in the absence of Transfer-Encoding, an origin server SHOULD send a Content-Length header field when the content size is known prior to sending the complete header section. This will allow downstream recipients to measure transfer progress, know when a received message is complete, and potentially reuse the connection for additional requests.
Can you compare tcpdumps between the two options, that might provide some indication of what is at fault.