#33865 closed Cleanup/optimization (fixed)

Optimize django.core.handlers.wsgi.LimitedStream

Reported by: Nick Pope Owned by: Nick Pope
Component: HTTP handling Version: dev
Severity: Normal Keywords: wsgi, limitedstream, performance
Cc: Ivan Sagalaev, Florian Apolloner, Anvesh Mishra Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The current implementation of LimitedStream is slow because .read() performs an extra copy into a buffer and .readline() performs two extra copies. We're already typically wrapping a BytesIO object so this is unnecessary.

This implementation has largely been untouched for 12 years and, inspired by a simpler implementation in werkzeug, I was able to achieve the following performance improvement:

LimitedStream.read() (single line): Mean +- std dev: [bench_limitedstream-main] 286 ns +- 6 ns -> [bench_limitedstream-patch] 227 ns +- 6 ns: 1.26x faster
LimitedStream.readline() (single line): Mean +- std dev: [bench_limitedstream-main] 507 ns +- 11 ns -> [bench_limitedstream-patch] 232 ns +- 8 ns: 2.18x faster
LimitedStream.read(8192) (single line): Mean +- std dev: [bench_limitedstream-main] 360 ns +- 8 ns -> [bench_limitedstream-patch] 297 ns +- 6 ns: 1.21x faster
LimitedStream.readline(8192) (single line): Mean +- std dev: [bench_limitedstream-main] 602 ns +- 10 ns -> [bench_limitedstream-patch] 305 ns +- 10 ns: 1.98x faster
LimitedStream.read() (multiple lines): Mean +- std dev: [bench_limitedstream-main] 290 ns +- 5 ns -> [bench_limitedstream-patch] 236 ns +- 6 ns: 1.23x faster
LimitedStream.readline() (multiple lines): Mean +- std dev: [bench_limitedstream-main] 517 ns +- 19 ns -> [bench_limitedstream-patch] 239 ns +- 7 ns: 2.16x faster
LimitedStream.read(8192) (multiple lines): Mean +- std dev: [bench_limitedstream-main] 363 ns +- 8 ns -> [bench_limitedstream-patch] 311 ns +- 11 ns: 1.17x faster
LimitedStream.readline(8192) (multiple lines): Mean +- std dev: [bench_limitedstream-main] 601 ns +- 12 ns -> [bench_limitedstream-patch] 308 ns +- 7 ns: 1.95x faster
    
Geometric mean: 1.59x faster

Attachments (3)

bench_limitedstream.py (1.8 KB ) - added by Nick Pope 20 months ago.
bench_limitedstream-a46dfa87d0.json (76.2 KB ) - added by Nick Pope 20 months ago.
bench_limitedstream-patch.json (76.2 KB ) - added by Nick Pope 20 months ago.

Download all attachments as: .zip

Change History (17)

comment:1 by Nick Pope, 20 months ago

Has patch: set

by Nick Pope, 20 months ago

Attachment: bench_limitedstream.py added

by Nick Pope, 20 months ago

by Nick Pope, 20 months ago

comment:2 by Nick Pope, 20 months ago

Attached benchmark script and output.

(Although output attached is from a different run/machine to that in the description, it's in the same ballpark. Very much an improvement.)

comment:3 by Mariusz Felisiak, 20 months ago

Triage Stage: UnreviewedAccepted

Tentatively accepted for future investigation.

comment:4 by Mariusz Felisiak, 20 months ago

Patch needs improvement: set

comment:5 by Nick Pope, 20 months ago

Patch needs improvement: unset

Issues should now be resolved.

comment:6 by Mariusz Felisiak, 20 months ago

Cc: Ivan Sagalaev Florian Apolloner added

comment:7 by Mariusz Felisiak, 20 months ago

Patch needs improvement: set

Per Florian's and Nick's comments.

comment:8 by Anvesh Mishra, 17 months ago

Cc: Anvesh Mishra added

comment:9 by Nick Pope, 15 months ago

Patch needs improvement: unset

Marking ready for review again. See my comments on the ticket. I don't think we need to do anything further, and if we do it should be a separate ticket.

comment:10 by Carlton Gibson, 15 months ago

Triage Stage: AcceptedReady for checkin

comment:11 by Mariusz Felisiak <felisiak.mariusz@…>, 15 months ago

In 7a1543d:

Refs #33865 -- Made RequestsTests.test_set_encoding_clears_GET use FakePayload.

The input stream, wsgi.input, must be a file-like object. The existing
implementation of LimitedStream was lax and allowed an empty string to
be passed incorrectly.

See https://wsgi.readthedocs.io/en/latest/definitions.html#envvar-wsgi.input

comment:12 by Mariusz Felisiak <felisiak.mariusz@…>, 15 months ago

In 95182a85:

Refs #33865 -- Corrected signature of ExplodingBytesIO.read().

These subclasses of io.BytesIO should inherit the correct signature.

comment:13 by Mariusz Felisiak <felisiak.mariusz@…>, 15 months ago

In 57f5669d:

Refs #33865 -- Improved implementation of FakePayload.

FakePayload is a wrapper around io.BytesIO and is expected to
masquerade as though it is a file-like object. For that reason it makes
sense that it should inherit the correct signatures from io.BytesIO
methods.

Crucially an implementation of .readline() is added which will be
necessary for this to behave more like the expected file-like objects as
LimitedStream will be changed to defer to the wrapped stream object
rather than rolling its own implementation for improved performance.

It should be safe to adjust these signatures because FakePayload is
only used internally within test client helpers, is undocumented, and
thus private.

comment:14 by Mariusz Felisiak <felisiak.mariusz@…>, 15 months ago

Resolution: fixed
Status: assignedclosed

In b47f2f5:

Fixed #33865 -- Optimized LimitedStream wrapper.

The current implementation of LimitedStream is slow because .read()
performs an extra copy into a buffer and .readline() performs two
extra copies. The stream being wrapped is already typically a BytesIO
object so this is unnecessary.

This implementation has largely been untouched for 12 years and,
inspired by a simpler implementation in werkzeug, it was possible to
achieve the following performance improvement:

LimitedStream.read() (single line):

Mean +- std dev: [bench_limitedstream-main] 286 ns +- 6 ns
-> [bench_limitedstream-patch] 227 ns +- 6 ns: 1.26x faster

LimitedStream.readline() (single line):

Mean +- std dev: [bench_limitedstream-main] 507 ns +- 11 ns
-> [bench_limitedstream-patch] 232 ns +- 8 ns: 2.18x faster

LimitedStream.read(8192) (single line):

Mean +- std dev: [bench_limitedstream-main] 360 ns +- 8 ns
-> [bench_limitedstream-patch] 297 ns +- 6 ns: 1.21x faster

LimitedStream.readline(8192) (single line):

Mean +- std dev: [bench_limitedstream-main] 602 ns +- 10 ns
-> [bench_limitedstream-patch] 305 ns +- 10 ns: 1.98x faster

LimitedStream.read() (multiple lines):

Mean +- std dev: [bench_limitedstream-main] 290 ns +- 5 ns
-> [bench_limitedstream-patch] 236 ns +- 6 ns: 1.23x faster

LimitedStream.readline() (multiple lines):

Mean +- std dev: [bench_limitedstream-main] 517 ns +- 19 ns
-> [bench_limitedstream-patch] 239 ns +- 7 ns: 2.16x faster

LimitedStream.read(8192) (multiple lines):

Mean +- std dev: [bench_limitedstream-main] 363 ns +- 8 ns
-> [bench_limitedstream-patch] 311 ns +- 11 ns: 1.17x faster

LimitedStream.readline(8192) (multiple lines):

Mean +- std dev: [bench_limitedstream-main] 601 ns +- 12 ns
-> [bench_limitedstream-patch] 308 ns +- 7 ns: 1.95x faster

Geometric mean: 1.59x faster

Note: See TracTickets for help on using tickets.
Back to Top