Opened 5 years ago

Last modified 19 months ago

#14035 assigned Bug

Cannot access POST after request.encoding was set to a custom value

Reported by: zimnyx Owned by: zimnyx
Component: HTTP handling Version: master
Severity: Normal Keywords:
Cc: net147 Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Issue happens for multipart/form-data and WSGi handler.

#Django trunk r13353.

class DetectEncodingMiddleware(object):
   def process_request(request): 
       # Lets say we define data encoding in POST param called "enc"
       enc = request.POST['enc'] # request._load_post_and_files() 
       request.encoding = enc # request._set_encoding(): del self._post
       request.POST # request._load_post_and_files() # raise AttributeError("You cannot set the upload handlers after the upload has been processed.");

This issue is related: #12522

Attachments (2)

raising_exception.diff (2.3 KB) - added by zimnyx 19 months ago.
Raise exception when try to re-parse multipart/form-data
raising_exception.2.diff (3.2 KB) - added by zimnyx 19 months ago.
Raise exception when try to re-parse multipart/form-data

Download all attachments as: .zip

Change History (20)

comment:1 Changed 5 years ago by russellm

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Resolution set to invalid
  • Status changed from new to closed

Since you shouldn't be modifying the request anyway, I'm afraid I don't see the problem here. If you need some sort of custom payload handling, then you should be getting the raw POST data and parsing it yourself, not trying to munge Django's POST handling.

comment:2 Changed 5 years ago by zimnyx

  • Resolution invalid deleted
  • Status changed from closed to reopened

What you mean by modifying request here? I'm just setting request.encoding to encoding which I assume client will use. It's common issue if you obtain forms "from outside" your pages.

In fact Django HttpRequest is designed to handle encoding switching on fly:

    def _set_encoding(self, val):   
        """                    
        Sets the encoding used for GET/POST accesses. If the GET or POST
        dictionary has already been created, it is removed and recreated on the
        next access (so that it is decoded correctly).
        """
        self._encoding = val   
        if hasattr(self, '_get'):       
            del self._get      
        if hasattr(self, '_post'):      
            del self._post     

    encoding = property(_get_encoding, _set_encoding)

In fact this encoding switching works fine when form data is encoded using "application/x-www-form-urlencoded". My point is that it doesn't work for "multipart/form-data".

comment:3 Changed 5 years ago by russellm

  • Resolution set to wontfix
  • Status changed from reopened to closed

You are taking the request object you are given, and you are changing it. That is, you are *modifying* the request. You shouldn't be doing that.

Django receives a HTTP request and parses it. If the content type specified in the HTTP request can be interpreted into multiple values, Django will parse the raw content into a POST dictionary.

You have a use case where the content encoding isn't being transmitted as part of the HTTP request. Therefore, it's up to you to parse the raw request data.

comment:4 Changed 5 years ago by zimnyx

You say I shouldn't be doing that (modifying encoding of the HttpRequest), while Django docs says:

===

HttpRequest.encoding

A string representing the current encoding used to decode form submission data (or None, which means the DEFAULT_CHARSET setting is used). You can write to this attribute to change the encoding used when accessing the form data. Any subsequent attribute accesses (such as reading from GET or POST) will use the new encoding value. Useful if you know the form data is not in the DEFAULT_CHARSET encoding.

===

My case fits exactly in this scenario. So do we get each other ?

comment:5 Changed 5 years ago by russellm

  • Triage Stage changed from Unreviewed to Accepted

My apologies. You've obviously read (and paid closer attention to) the docs than I have :-)

In which case, accepting the ticket.

comment:6 Changed 5 years ago by zimnyx

Nice to hear that finally...
It's in docs for quite a long time, it's really surprising you didn't read docs carefully while so easily closing such low-level tickets.

P.S.
You forgot to reopen the ticket.

comment:7 Changed 5 years ago by russellm

  • Resolution wontfix deleted
  • Status changed from closed to reopened

I have triaged approx 40 tickets today. I need to triage another 120 before I can start work on fixing bugs in the 1.3 release. I hope you'll understand that occasionally, human beings make errors when they're doing labor intensive work. I hope you've also noted that if you explain yourself clearly, human beings are usually willing to admit their mistakes and move on.

On the other hand, human beings don't generally take too kindly to being snarked at when they are volunteering their time to help out others. :-)

comment:8 Changed 5 years ago by zimnyx

Other human beings, after spending several hours on investigating a problem, are pretty unhappy when their work gets wontfix'ed in a single click without a simple question ;-)

Have a nice day!

comment:9 Changed 4 years ago by julien

  • Severity set to Normal
  • Type set to Bug

comment:10 Changed 4 years ago by aaugustin

  • UI/UX unset

Change UI/UX from NULL to False.

comment:11 Changed 4 years ago by aaugustin

  • Easy pickings unset

Change Easy pickings from NULL to False.

comment:12 Changed 3 years ago by net147

  • Cc net147 added

comment:13 Changed 3 years ago by ptone

Taking a bit of a look at this - and the problem ends up not being trivial.

First and obvious approach to a fix would be to add the deletion of _files to the property setter for encoding - as is done for _post

But taking that step just uncovers a bit more of a yak

First off is that WSGIRequest and HttpRequest differ in the way the POST attribute works - the former will refresh it if _post is deleted, the latter will not. This is perhaps by design - but the differences between these classes is not documented at all.

Second - there are problems with re-parsing the multipart data, as the first time parsing involves reading the post data - which sets the _read_started flag

This prevents _load_post_and_files from being able to reparse with the new encoding - instead it shortcuts to _mark_post_parse_error.

Even if one changes that detail - by allowing the object to set the full _body attribute on the first parse, you still have at least this test fail:

test_body_after_POST_multipart

Which is there to specifically verify that accessing body is not allowed after parsing post - because we don't want to repeat the expensive operation of parsing the multipart data

basically there are aspects here that assume parts of this machinery will only run through once, in a certain order

comment:14 Changed 3 years ago by claudep

Generally speaking, content encoding should be set with the charset attribute of the Content-Encoding header. So we speak of corner cases here. I'd be more enclined to fix the docs and not support changing encoding after parsing, unless we have a common use case showing a clear need for this.

comment:15 Changed 3 years ago by mark@…

I think I've stumbled on the same bug, or a related side affect. I'm receiving some data ("Content-Type: multipart/alternative;") in a POST from an external site (SendGrid) site as per their API ( http://sendgrid.com/docs/API_Reference/Webhooks/parse.html ) . The encoding type for one of the fields (text) in the POST varies, and is given in another POST field (charsets). If read the encoding type in charsets and then set request.encoding to this encoding type, the QueryDict vanishes when I try to get to request.POSTtext?, as in

MultiValueDictKeyError: "Key 'text' not found in <QueryDict: {}>"

This sounds like an edge use case to me. I'm going to assume that this will not get fixed, and look for a work around, but I thought it's worth having a more helpful exception, telling that you can't change the encoding in this scenario, and updating the docs.

comment:16 Changed 2 years ago by aaugustin

  • Status changed from reopened to new

comment:17 Changed 19 months ago by zimnyx

Consider following scenario:

1) read HttpRequest.POST (raw request body is consumed)
2) change HttpRequest.encoding (discards parsed POST entirely)
3) read HttpRequest.POST again

Current behaviour for 3) depends on content-type:

  • for multipart/form-data: returns {}
  • for application/x-www-form-urlencoded: returns brand new dictionary with values encoded in brand new encoding, just like docs say

For "multipart/form-data" behaviour is against documentation and gives false impression that everything went fine. Only way for user to notice such error, is checking HttpRequest._post_parse_error attribute, but I doubt it's a way to go, because it's not pulic nor documented.

My proposition is to always raise exception when we try to re-parse POST for multipart/form-data. Patch addtached.

comment:18 Changed 19 months ago by zimnyx

  • Owner changed from nobody to zimnyx
  • Status changed from new to assigned

Changed 19 months ago by zimnyx

Raise exception when try to re-parse multipart/form-data

Changed 19 months ago by zimnyx

Raise exception when try to re-parse multipart/form-data

Note: See TracTickets for help on using tickets.
Back to Top