Opened 14 years ago

Last modified 3 years ago

#14035 assigned Bug

Cannot access POST after request.encoding was set to a custom value

Reported by: Piotr Czachur Owned by: Piotr Czachur
Component: HTTP handling Version: dev
Severity: Normal Keywords:
Cc: net147 Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Issue happens for multipart/form-data and WSGi handler.

#Django trunk r13353.

class DetectEncodingMiddleware(object):
   def process_request(request): 
       # Lets say we define data encoding in POST param called "enc"
       enc = request.POST['enc'] # request._load_post_and_files() 
       request.encoding = enc # request._set_encoding(): del self._post
       request.POST # request._load_post_and_files() # raise AttributeError("You cannot set the upload handlers after the upload has been processed.");

This issue is related: #12522

Attachments (2)

raising_exception.diff (2.3 KB ) - added by Piotr Czachur 11 years ago.
Raise exception when try to re-parse multipart/form-data
raising_exception.2.diff (3.2 KB ) - added by Piotr Czachur 11 years ago.
Raise exception when try to re-parse multipart/form-data

Download all attachments as: .zip

Change History (22)

comment:1 by Russell Keith-Magee, 14 years ago

Resolution: invalid
Status: newclosed

Since you shouldn't be modifying the request anyway, I'm afraid I don't see the problem here. If you need some sort of custom payload handling, then you should be getting the raw POST data and parsing it yourself, not trying to munge Django's POST handling.

comment:2 by Piotr Czachur, 14 years ago

Resolution: invalid
Status: closedreopened

What you mean by modifying request here? I'm just setting request.encoding to encoding which I assume client will use. It's common issue if you obtain forms "from outside" your pages.

In fact Django HttpRequest is designed to handle encoding switching on fly:

    def _set_encoding(self, val):   
        """                    
        Sets the encoding used for GET/POST accesses. If the GET or POST
        dictionary has already been created, it is removed and recreated on the
        next access (so that it is decoded correctly).
        """
        self._encoding = val   
        if hasattr(self, '_get'):       
            del self._get      
        if hasattr(self, '_post'):      
            del self._post     

    encoding = property(_get_encoding, _set_encoding)

In fact this encoding switching works fine when form data is encoded using "application/x-www-form-urlencoded". My point is that it doesn't work for "multipart/form-data".

comment:3 by Russell Keith-Magee, 14 years ago

Resolution: wontfix
Status: reopenedclosed

You are taking the request object you are given, and you are changing it. That is, you are *modifying* the request. You shouldn't be doing that.

Django receives a HTTP request and parses it. If the content type specified in the HTTP request can be interpreted into multiple values, Django will parse the raw content into a POST dictionary.

You have a use case where the content encoding isn't being transmitted as part of the HTTP request. Therefore, it's up to you to parse the raw request data.

comment:4 by Piotr Czachur, 14 years ago

You say I shouldn't be doing that (modifying encoding of the HttpRequest), while Django docs says:

===

HttpRequest.encoding

A string representing the current encoding used to decode form submission data (or None, which means the DEFAULT_CHARSET setting is used). You can write to this attribute to change the encoding used when accessing the form data. Any subsequent attribute accesses (such as reading from GET or POST) will use the new encoding value. Useful if you know the form data is not in the DEFAULT_CHARSET encoding.

===

My case fits exactly in this scenario. So do we get each other ?

comment:5 by Russell Keith-Magee, 14 years ago

Triage Stage: UnreviewedAccepted

My apologies. You've obviously read (and paid closer attention to) the docs than I have :-)

In which case, accepting the ticket.

comment:6 by Piotr Czachur, 14 years ago

Nice to hear that finally...
It's in docs for quite a long time, it's really surprising you didn't read docs carefully while so easily closing such low-level tickets.

P.S.
You forgot to reopen the ticket.

comment:7 by Russell Keith-Magee, 14 years ago

Resolution: wontfix
Status: closedreopened

I have triaged approx 40 tickets today. I need to triage another 120 before I can start work on fixing bugs in the 1.3 release. I hope you'll understand that occasionally, human beings make errors when they're doing labor intensive work. I hope you've also noted that if you explain yourself clearly, human beings are usually willing to admit their mistakes and move on.

On the other hand, human beings don't generally take too kindly to being snarked at when they are volunteering their time to help out others. :-)

comment:8 by Piotr Czachur, 14 years ago

Other human beings, after spending several hours on investigating a problem, are pretty unhappy when their work gets wontfix'ed in a single click without a simple question ;-)

Have a nice day!

comment:9 by Julien Phalip, 13 years ago

Severity: Normal
Type: Bug

comment:10 by Aymeric Augustin, 13 years ago

UI/UX: unset

Change UI/UX from NULL to False.

comment:11 by Aymeric Augustin, 13 years ago

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:12 by net147, 12 years ago

Cc: net147 added

comment:13 by Preston Holmes, 12 years ago

Taking a bit of a look at this - and the problem ends up not being trivial.

First and obvious approach to a fix would be to add the deletion of _files to the property setter for encoding - as is done for _post

But taking that step just uncovers a bit more of a yak

First off is that WSGIRequest and HttpRequest differ in the way the POST attribute works - the former will refresh it if _post is deleted, the latter will not. This is perhaps by design - but the differences between these classes is not documented at all.

Second - there are problems with re-parsing the multipart data, as the first time parsing involves reading the post data - which sets the _read_started flag

This prevents _load_post_and_files from being able to reparse with the new encoding - instead it shortcuts to _mark_post_parse_error.

Even if one changes that detail - by allowing the object to set the full _body attribute on the first parse, you still have at least this test fail:

test_body_after_POST_multipart

Which is there to specifically verify that accessing body is not allowed after parsing post - because we don't want to repeat the expensive operation of parsing the multipart data

basically there are aspects here that assume parts of this machinery will only run through once, in a certain order

comment:14 by Claude Paroz, 12 years ago

Generally speaking, content encoding should be set with the charset attribute of the Content-Encoding header. So we speak of corner cases here. I'd be more enclined to fix the docs and not support changing encoding after parsing, unless we have a common use case showing a clear need for this.

comment:15 by mark@…, 12 years ago

I think I've stumbled on the same bug, or a related side affect. I'm receiving some data ("Content-Type: multipart/alternative;") in a POST from an external site (SendGrid) site as per their API ( http://sendgrid.com/docs/API_Reference/Webhooks/parse.html ) . The encoding type for one of the fields (text) in the POST varies, and is given in another POST field (charsets). If read the encoding type in charsets and then set request.encoding to this encoding type, the QueryDict vanishes when I try to get to request.POSTtext, as in

MultiValueDictKeyError: "Key 'text' not found in <QueryDict: {}>"

This sounds like an edge use case to me. I'm going to assume that this will not get fixed, and look for a work around, but I thought it's worth having a more helpful exception, telling that you can't change the encoding in this scenario, and updating the docs.

comment:16 by Aymeric Augustin, 12 years ago

Status: reopenednew

comment:17 by Piotr Czachur, 11 years ago

Consider following scenario:

1) read HttpRequest.POST (raw request body is consumed)
2) change HttpRequest.encoding (discards parsed POST entirely)
3) read HttpRequest.POST again

Current behaviour for 3) depends on content-type:

  • for multipart/form-data: returns {}
  • for application/x-www-form-urlencoded: returns brand new dictionary with values encoded in brand new encoding, just like docs say

For "multipart/form-data" behaviour is against documentation and gives false impression that everything went fine. Only way for user to notice such error, is checking HttpRequest._post_parse_error attribute, but I doubt it's a way to go, because it's not pulic nor documented.

My proposition is to always raise exception when we try to re-parse POST for multipart/form-data. Patch addtached.

comment:18 by Piotr Czachur, 11 years ago

Owner: changed from nobody to Piotr Czachur
Status: newassigned

by Piotr Czachur, 11 years ago

Attachment: raising_exception.diff added

Raise exception when try to re-parse multipart/form-data

by Piotr Czachur, 11 years ago

Attachment: raising_exception.2.diff added

Raise exception when try to re-parse multipart/form-data

comment:19 by Silvan Spross, 8 years ago

hi all, I just found this ticket after I created a related question on stackoverflow regarding the same sendgrid problem:

http://stackoverflow.com/questions/42862082/reencode-django-request-body-with-another-encoding

is there a way to get around this at the moment? thank you for your help!

comment:20 by Stephane Poss, 3 years ago

Hello,
5 years later and 3h of debugging... I stumbled on this. There was a discrepancy between my production code and my unittests: works in prod, not in test. My code listens to Paypal IPNs, they do not necessarily encode their payload in utf-8, it's given in a field of the POST data. They do not document the content-type used. The unittests use multipart/form-data.
Using https://docs.djangoproject.com/en/3.2/ref/unicode/#form-submission lead me to believe I could change the encoding after reading the POST data -> cannot work in unittest without changing the content type (all good once the actual content type to use is 'guessed'). I believe a note must be added to https://docs.djangoproject.com/en/3.2/ref/unicode/#form-submission explaining that changing the encoding based on POST data content only works if the content-type is application/x-www-form-urlencoded.

Note: See TracTickets for help on using tickets.
Back to Top