Context Navigation

← Previous Ticket
Next Ticket →

#10571 closed Bug (fixed)

FakePayload Truncates Unicode Content

Reported by:	rwagner@…	Owned by:	nobody
Component:	Testing framework	Version:	dev
Severity:	Normal	Keywords:
Cc:	Anand Kumria, Peter van Kampen	Triage Stage:	Ready for checkin
Has patch:	yes	Needs documentation:	no
Needs tests:	no	Patch needs improvement:	no
Easy pickings:	no	UI/UX:	no

Description

Hi,

While testing an API for posting new objects using JSON, I think I've tripped over a problem with cStringIO and Unicode. My test had a few line like this, and the raw_post_data in the receiving view was truncated.

    json = u'{"name": "Rick"}'
    response = self.c.post('/api/person/', json,
                            content_type="application/json")

Looking a little closer, it seems that cStringIO doesn't treat Unicode objects very well. In particular, because the codec is utf-16, the number of bytes required for reading is doubled.

Here's a demonstration (I hope it's sufficient):

>>> from django.test.client import FakePayload
>>> json = u'{"name": "Rick"}'
>>> payload = FakePayload(json)
>>> len(json)
16
>>> payload._FakePayload__len
16
>>> payload.read()
'\x00{\x00"\x00n\x00a\x00m\x00e\x00"\x00:'
>>> '\x00{\x00"\x00n\x00a\x00m\x00e\x00"\x00:'.decode("utf-16")
u'{"name":'
>>> import cStringIO
>>> import StringIO
>>> cStringIO.StringIO(json).read()
'\x00{\x00"\x00n\x00a\x00m\x00e\x00"\x00:\x00 \x00"\x00R\x00i\x00c\x00k\x00"\x00}'
>>> len(cStringIO.StringIO(json).read())
32
>>> StringIO.StringIO(json).read()
u'{"name": "Rick"}'
>>> len(StringIO.StringIO(json).read())
16
>>> cStringIO.StringIO(json).read(16)
'\x00{\x00"\x00n\x00a\x00m\x00e\x00"\x00:'
>>> cStringIO.StringIO(json).read(16).decode("utf-16")
u'{"name":'
>>>

A direct fix would be to use StringIO instead of cStringIO, but I appreciate the desire to keep the speed-up, especially for unit testing.

Please correct me if I've misinterpreted the use POST in the test client.

Thanks,
Rick

PS: This may be related to #9480.

Attachments (3)

unicode_payload.patch (5.6 KB ) - added by rwagner@… 16 years ago.: Patch to encode test client post content and tests.
unicode_payload_put.patch (6.5 KB ) - added by Peter van Kampen 14 years ago.
unicode_payload_put.2.patch (5.7 KB ) - added by Peter van Kampen 14 years ago.: Updated patch addressing wildfire's comments. The deletion of the query_string handling in the put method still needs to be looked at, as for the deletion of 'put' , 'delete' in test_get_like_requests.

Download all attachments as: .zip

Change History (27)

comment:1 by Jacob, 16 years ago

milestone:	→ 1.1
Triage Stage:	Unreviewed → Accepted

comment:2 by Russell Keith-Magee, 16 years ago

Thanks for going to the trouble of trying to work out a test case, but there is something else going on here. I don't get the same behavior that you are reporting for your demonstration case:

In [1]: from django.test.client import FakePayload

In [2]: json = u'{"name": "Rick"}'

In [3]: payload = FakePayload(json)

In [4]: len(json)

Out[4]: 16

In [5]: payload._FakePayload__len

Out[5]: 16

In [6]: payload.read()

Out[6]: '{"name": "Rick"}'

Now, this could be caused by a discrepancy in the runtime environment, but you haven't provided any details on how your environment is configured. I've fiddled around for a bit with the test case that I provided for #9480, but I couldn't get that to break either.

I don't doubt that you're seeing a truncation problem, and I can certainly see how UTF-16 could be the missing piece. Your hunch that this is related to #9480 is probably correct.

The best way to help close out this bug is to provide a failing test case against Django itself. If you can drop a failing test case into the end of
tests/regressiontests/test_client_regress/models.py (see the docs for details on how to run the Django test suite), then it will be much easier to narrow down and fix the problem.

comment:3 by Jacob, 16 years ago

Resolution:	→ worksforme
Status:	new → closed

I can't reproduce this, and since Russ can't either I'm marking worksforme. If you can provide more information along the lines Russ asked please feel free to reopen.

comment:4 by rwagner@…, 16 years ago

I suspect Russ is right that it's something to do with my environment. I added a test to the client regression tests as suggested, and it's failing only on one of my installed version of Python. For reference, it's the Enthought Python Distribution, which is a big bundle of scientific packages. My build of Python 2.5 from source is fine. I take this as an indicator that it's probably not Django, but the Enthought build. I may be headed over to their bug tracker shortly.

Thanks for looking at this!

--Rick

P.S. Here's the unnecessary test case:

class FakePayloadTests(TestCase):
    def test_fake_payload_stringio(self):
        # Ensure that StringIO doesn't truncate Unicode strings
        json = u'{"name": "Rick"}'
        payload = FakePayload(json)
        new_json = payload.read()
        self.assertEqual(json, new_json)

comment:5 by rwagner@…, 16 years ago

Resolution:	worksforme
Status:	closed → reopened

After a bit more work, I think this is an issue that is possible for others to trip over. According to this feature request, cStringIO returns the raw binary data. For whatever reason, both the Enthought Python build, and production 2.6 release use UTF-16 as the default encoding for Unicode strings. This makes me think that cStringIO is behaving correctly.

Here's the setup I'm using:

Machine: PowerBook G4
OS: OS X 10.4.11
Python: 2.6.1 python-2.6.1-macosx2008-12-06.dmg

As a test, I wrote a view that deserializes a JSON string, and then regurgitates it back as a file. This ensures that the JSON data isn't getting mangled on the way, and mirrors what I'm doing.

def return_json_file(request):
    "A view that parses and returns a JSON string as a file."    
    obj_dict = simplejson.loads(request.raw_post_data)
    obj_json = simplejson.dumps(obj_dict, 
                                cls=DjangoJSONEncoder,
                                ensure_ascii=False)
    response =  HttpResponse(obj_json, status=200, 
                             mimetype='application/json')
    response['Content-Disposition'] = 'attachment; filename=testfile.json'
    return response

For the test, I tried two separate cases, one with a simple string as in my previous test case, the other with some Japanese characters pulled from the unicode.html template in the regression tests. I tested these using both the post and response mechanism, and FakePayload directly.

class UnicodePostTests(TestCase):
    def test_simple_fakepayload(self):
        json = u'{"english": "mountain pass"}'
        payload = FakePayload(json)
        self.assertEqual(payload.read(), json)

    def test_fakepayload(self):
        json = u'{"japanese": "\u5ce0 (\u3068\u3046\u3052 t\u014dge)"}'
        payload = FakePayload(json)
        self.assertEqual(payload.read(), json)

    def test_simple_unicode_payload(self):
        json = u'{"english": "mountain pass"}'
        response = self.client.post("/test_client_regress/parse_unicode_json/", json,
                                    content_type="application/json")
        # N.B: This relys on the simplejson formatting
        self.assertEqual(response.content, json)

    def test_unicode_payload(self):
        json = u'{"japanese": "\u5ce0 (\u3068\u3046\u3052 t\u014dge)"}'
        response = self.client.post("/test_client_regress/parse_unicode_json/", json,
                                    content_type="application/json")
        # N.B: This relys on the simplejson formatting
        self.assertEqual(response.content, json)

Here are the results, using cStringIO (all tests fail).

cable:~/Projects/Django/tests rpwagner$ ./runtests.py --settings=testsite.settings test_client_regress
======================================================================
ERROR: test_simple_unicode_payload (regressiontests.test_client_regress.models.UnicodePostTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/rpwagner/Projects/Django/tests/regressiontests/test_client_regress/models.py", line 634, in test_simple_unicode_payload
    content_type="application/json")
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/django/test/client.py", line 306, in post
    response = self.request(**r)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/django/core/handlers/base.py", line 92, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/Users/rpwagner/Projects/Django/tests/regressiontests/test_client_regress/views.py", line 71, in return_json_file
    obj_dict = simplejson.loads(request.raw_post_data)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py", line 319, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py", line 338, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

======================================================================
ERROR: test_unicode_payload (regressiontests.test_client_regress.models.UnicodePostTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/rpwagner/Projects/Django/tests/regressiontests/test_client_regress/models.py", line 642, in test_unicode_payload
    content_type="application/json")
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/django/test/client.py", line 306, in post
    response = self.request(**r)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/django/core/handlers/base.py", line 92, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/Users/rpwagner/Projects/Django/tests/regressiontests/test_client_regress/views.py", line 71, in return_json_file
    obj_dict = simplejson.loads(request.raw_post_data)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py", line 319, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py", line 338, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

======================================================================
FAIL: test_fakepayload (regressiontests.test_client_regress.models.UnicodePostTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/rpwagner/Projects/Django/tests/regressiontests/test_client_regress/models.py", line 638, in test_fakepayload
    self.assertEqual(payload.read(), json)
AssertionError: '\x00{\x00"\x00j\x00a\x00p\x00a\x00n\x00e\x00s\x00e\x00"\x00:\x00 \x00"' != u'{"japanese": "\u5ce0 (\u3068\u3046\u3052 t\u014dge)"}'

======================================================================
FAIL: test_simple_fakepayload (regressiontests.test_client_regress.models.UnicodePostTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/rpwagner/Projects/Django/tests/regressiontests/test_client_regress/models.py", line 633, in test_simple_fakepayload
    self.assertEqual(payload.read(), json)
AssertionError: '\x00{\x00"\x00e\x00n\x00g\x00l\x00i\x00s\x00h\x00"\x00:\x00 \x00"\x00m' != u'{"english": "mountain pass"}'

----------------------------------------------------------------------
Ran 53 tests in 1.432s

FAILED (failures=2, errors=2)

Now, from experience, I know that the data showing up in the view is truncated. So I go in to django/test/client.py, and hack out the import of cStringIO to force the use of the pure-Python StingIO, and the error moves to the wsgi handler. I'm not worried about that (yet), because this shows me that in my case it's cStringIO that's causing problems.

cable:~/Projects/Django/tests rpwagner$ ./runtests.py --settings=testsite.settings test_client_regress
======================================================================
ERROR: test_unicode_payload (regressiontests.test_client_regress.models.UnicodePostTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/rpwagner/Projects/Django/tests/regressiontests/test_client_regress/models.py", line 642, in test_unicode_payload
    content_type="application/json")
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/django/test/client.py", line 306, in post
    response = self.request(**r)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/django/core/handlers/base.py", line 92, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/Users/rpwagner/Projects/Django/tests/regressiontests/test_client_regress/views.py", line 71, in return_json_file
    obj_dict = simplejson.loads(request.raw_post_data)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/django/core/handlers/wsgi.py", line 205, in _get_raw_post_data
    size=content_length)
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/django/core/handlers/wsgi.py", line 72, in safe_copyfileobj
    fdst.write(buf)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u5ce0' in position 14: ordinal not in range(128)

----------------------------------------------------------------------
Ran 51 tests in 3.237s

FAILED (errors=1)

I'm hoping that my test with the Japanese characters will cause the same failures on other machines. I suspect any multi-byte characters would reproduce this.

Thanks again for the help, and please let me if I need to be more specific, or if I'm not misunderstanding some functionality.

--Rick

follow-up: 7 comment:6 by Russell Keith-Magee, 16 years ago

Rick - thanks so much for taking the trouble to work out such a complete set of demonstration cases.

I'm going to qualify my comments with the following: I would not, on my best day, call Unicode handling one of my strengths. It's entirely possible I've got everything bass-ackwards here. I'll be the first to defer to anyone with superior knowledge in this case.

That said, there appears to be several problems at play here.

Firstly, FakePayload itself. FakePayload isn't really intended to be an externally visible container - it's an internal facility for storing file content so it can be passed to the WSGI input handler. This handler expects a byte stream, so it isn't really appropriate to be putting unencoded unicode data into the FakePayload. Instead, what you should be putting into the FakePayload is an encoded byte string; what you get back will be a similarly encoded string. Internally, the file handler uses the FakePayload to compose a wsgi input stream; unicode content isn't appropriate in this context. As a result, your payload tests should look more like:
```
    def test_fakepayload(self):
        json = u'{"japanese": "\u5ce0 (\u3068\u3046\u3052 t\u014dge)"}'
        payload = FakePayload(json.encode('utf-8'))
        self.assertEqual(payload.read().decode('utf-8'), json)
```
I've used utf-8 here, but the actual encoding shouldn't matter (as long as it is symmetrical between encode and decode) - the point is that you need to explicitly encode from unicode to a byte string. On my machine, which uses a default charset of ascii, I don't need to make this modification to the simple_fakepayload test - I suspect that your default charset of utf-16 will mean that you will need to make the same change on both tests.

However, that said, Django's test client should be doing this encoding for you when you provided unicode data as file content. This means that the post() method of the test client (django/test/client.py, line 285), which currently starts:
```
        if content_type is MULTIPART_CONTENT:
            post_data = encode_multipart(BOUNDARY, data)
        else:
            post_data = data
```
should, in fact, read:
```
        if content_type is MULTIPART_CONTENT:
            post_data = encode_multipart(BOUNDARY, data)
        else:
            post_data = data.encode('utf-8')
```
This encoding would match the Content-Type header that is transmitted with the POST; ideally, we should be checking for a user-provided Content-Type header so we can encode to a different charset if required.

response.content holds the raw encoded byte stream from the response. If you want to compare against an original unicode input, you will need to encode that byte stream to the same encoding. This means your final assertion in the test_unicode_payload test should read
```
        self.assertEqual(response.content.decode('utf-8'), json)
```
In this case, it needs to be utf-8, because that's the content type that is being returned in the response (again, this is the default Content-Type header on HTTP responses).

You are correct that using StringIO rather than cStringIO makes problem (1) go away, but as far as I can make out, that is just hiding the deeper problems (2) and (3).

So - to confirm whether or not I understand anything about unicode... if you do the following:

Add the .encode() and .decode() calls to the FakePayload tests.
Add the .encode('utf-8') to the post() method on the test client.
Add the .decode('utf-8') to the response.content check in the payload tests.

I believe that all your tests should pass. Can you confirm that this is the case? Or do I need to go back to the books and read up on unicode some more?

in reply to: 6 comment:7 by rwagner@…, 16 years ago

Replying to russellm:

Rick - thanks so much for taking the trouble to work out such a complete set of demonstration cases.

Gladly. I respect the time you put in developing and maintaining Django.

I'm going to qualify my comments with the following: I would not, on my best day, call Unicode handling one of my strengths. It's entirely possible I've got everything bass-ackwards here. I'll be the first to defer to anyone with superior knowledge in this case.

I assure you, that would not be me. This has been one of those "learning experiences" we all appreciate.

So - to confirm whether or not I understand anything about unicode... if you do the following:

Add the .encode() and .decode() calls to the FakePayload tests.
Add the .encode('utf-8') to the post() method on the test client.
Add the .decode('utf-8') to the response.content check in the payload tests.
I believe that all your tests should pass. Can you confirm that this is the case? Or do I need to go back to the books and read up on unicode some more?

You were correct in all cases. I went ahead and added the check for a charset on the content-type argument to set the encoding based on a user-supplied encoding. I also added a test using KOI8-R, just to step away from ASCII and UTF-8.

The check for the charset is not the cleanest, but it's working:

        if content_type is MULTIPART_CONTENT:
            post_data = encode_multipart(BOUNDARY, data)
        else:
            # Encode the content so that the byte representation is correct.
            charset = settings.DEFAULT_CHARSET
            if 'charset=' in content_type:
                left, charset = content_type.split('charset=')
            post_data = data.encode(charset)

Likewise, I updated my test view to use the correct encoding when dumping the dict to JSON, and in the response:

def return_json_file(request):
    "A view that parses and returns a JSON string as a file."    
    charset = settings.DEFAULT_CHARSET
    if 'charset=' in request.META['CONTENT_TYPE']:
        left, charset = request.META['CONTENT_TYPE'].split('charset=')

    # This just checks that the uploaded data is JSON
    obj_dict = simplejson.loads(request.raw_post_data)
    obj_json = simplejson.dumps(obj_dict, encoding=charset,
                                cls=DjangoJSONEncoder,
                                ensure_ascii=False)
    response =  HttpResponse(obj_json, status=200, 
                             mimetype='application/json; charset=' + charset)
    response['Content-Disposition'] = 'attachment; filename=testfile.json'
    return response

Finally, I added your decode suggestions to the regression tests, and the new test using KOI-8.

class UnicodePostTests(TestCase):
    # FakePayload is not intended to be externally accessible.
    # These are also exercised using the _payload_ tests.
    def test_simple_fakepayload(self):
        "Make sure that FakePayload returns what it gets"
        #Regression test for #10571
        json = u'{"english": "mountain pass"}'
        payload = FakePayload(json.encode('utf-8'))
        self.assertEqual(payload.read().decode('utf-8'), json)

    def test_fakepayload(self):
        "Make sure that FakePayload returns what it gets, outside of ASCII"
        #Regression test for #10571
        json = u'{"japanese": "\u5ce0 (\u3068\u3046\u3052 t\u014dge)"}'
        payload = FakePayload(json.encode('utf-8'))
        self.assertEqual(payload.read().decode('utf-8'), json)

    def test_simple_unicode_payload(self):
        "Test POSTing a simple JSON document"
        #Regression test for #10571
        json = u'{"english": "mountain pass"}'
        response = self.client.post("/test_client_regress/parse_unicode_json/", json,
                                    content_type="application/json")
        # N.B: This relys on the simplejson formatting
        self.assertEqual(response.content, json)

    def test_unicode_payload_utf8(self):
        "Test POSTing data outside of ASCII"
        #Regression test for #10571
        json = u'{"dog": "собака"}'
        response = self.client.post("/test_client_regress/parse_unicode_json/", json,
                                    content_type="application/json; charset=utf-8")
        # N.B: This relys on the simplejson formatting
        self.assertEqual(response.content.decode('utf-8'), json)

    def test_unicode_payload_kio8(self):
        "Test POSTing data outside of ASCII and UTF-8"
        #Regression test for #10571
        json = u'{"dog": "\u044f\u2502\u043f\u256c\u043f\u2560\u043f\u255f\u043f\u2568\u043f\u255f"}'
        response = self.client.post("/test_client_regress/parse_unicode_json/", json,
                                    content_type="application/json; charset=koi8-r")
        # N.B: This relys on the simplejson formatting
        self.assertEqual(response.content.decode('koi8-r'), json)

As you point out, FakePayload isn't really meant to be exposed, so the first two tests should probably be removed. After all, the test client's functionality is being tested by the use of the post method. Which probably means that the correct title for this ticket should have been "Test Client Does Not Properly Encode POST Data".

I hope this is progress, since all of the tests pass (really all, I checked to ensure that I didn't break anything). All of the above code is in a patch I am about to attach to this ticket. If there is some particular cleaning or improvement you see, let me know, and I'll try to oblige.

Thanks,
Rick

comment:8 by rwagner@…, 16 years ago

Has patch:	set

Adding patch. I hope I don't end up doing this a bunch of times and spamming everyone and the RSS feed.

by rwagner@…, 16 years ago

Attachment:	unicode_payload.patch added

Patch to encode test client post content and tests.

comment:9 by Russell Keith-Magee, 16 years ago

Resolution:	→ fixed
Status:	reopened → closed

(In [10513]) Fixed #10571 -- Ensured that unicode POST data is correctly encoded by the test client. Thanks to Rick Wagner for his help identifying and fixing this problem.

comment:10 by Russell Keith-Magee, 16 years ago

(In [10514]) [1.0.X] Fixed #10571 -- Ensured that unicode POST data is correctly encoded by the test client. Thanks to Rick Wagner for his help identifying and fixing this problem.

Merge of r10513 from trunk.

comment:11 by Russell Keith-Magee, 16 years ago

Rick - thanks for your most excellent assistance tracking this problem. I owe you one (1) ISO standard, correctly UTF-16 encoded lollipop. :-)

comment:12 by Malcolm Tredinnick, 16 years ago

For the record, the test added here using KOI-8R isn't particularly correct. The content-type of an HTTP request has no expected influence on the encoding of the HTTP response and in lieu of an Accept-Charset header, the server is permitted to return any encoding at all, so testing for a particular encoding is only checking the view implementation works, not the Django's HTTP implementation or test client is any way correct. So this test is not incorrect, but it doesn't actually show a regression if the assertion fails.

Only noting this here because the test is going to change as part of the http-wsgi-improvements branch work (Chris Cahoon's summer of code work) and if anybody traces the change back to this ticket, there's now an explanation (to be clear: the test itself isn't enforcing incorrect behaviour at the moment. It's a valid test, but it's not adding any value currently, since it's not testing anything general about Django.)

comment:13 by Kenneth Falck, 14 years ago

Resolution:	fixed
Severity:	→ Normal
Status:	closed → reopened
Type:	→ Uncategorized

This problem seems to still occur in Django 1.3 when using client.put() to PUT fake data to the application. POST works correctly.

It's reproducible with a simple app that echoes back request.raw_post_data from a view:

def index(request):
    return HttpResponse(request.raw_post_data)

class SimpleTest(TestCase):
    def test_basic_addition(self):
        response = self.client.post('/', data=u'{"str":"Hello World"}', content_type='text/javascript')
        print response.content
        response = self.client.put('/', data=u'{"str":"Hello World"}', content_type='text/javascript')
        print response.content

Test result:

Creating test database for alias 'default'...
{"str":"Hello World"}
{"str"
.
----------------------------------------------------------------------
Ran 1 test in 0.084s

comment:14 by droberts@…, 14 years ago

Easy pickings:	unset

I just encountered this same problem (Django 1.2.3), but it looks like it's still not fully solved in Django 1.3 from the above comment. It seems like the real problem here is that FakePayload only works when the content is a str and not when it's unicode, and yet it doesn't advertise this correctly. If it's passed a unicode, it happily accepts it, failing later when .read() is called. This seems like a classic duck-typing bug, but since it's very common to expect that functions will accept either str or unicode, it seems like the correct behavior here would be to fail hard when FakePayload.init is passed a unicode object.

In my particular case, I was using a StringIO as a the file-like object. StringIO.read() returns a str if it's been given strings and unicode if it's been given unicode. I seeded it with what to me looked like a string but turned out to be a unicode, and no one along the way checked to see that the output of f.read() (where f is the StringIO object) was returning unicode. A simple check in FakePayload would have saved me a couple of hours of debugging what seemed like very mysterious behavior.

comment:15 by patchhammer, 14 years ago

Patch needs improvement:	set

unicode_payload.patch fails to apply cleanly on to trunk

comment:16 by Julien Phalip, 14 years ago

Type:	Uncategorized → Bug

by Peter van Kampen, 14 years ago

Attachment:	unicode_payload_put.patch added

comment:17 by Anand Kumria, 14 years ago

UI/UX:	unset

@pterk

comments:

in post; why not just call it 'post_data' to make it clear rather than replacing 'data'

in put; why not call is 'put_data' also you have a commented out debug line

Also I think you want to explain the removal of query_string? Whilst I can see it is not being used, that removal needs someone who understands that section better to confirm that it is correct.

tests/regressiontests/test_client_regress/models.py

trailing whitespace
unnecessary line with whitespace
I think you should also exclude 'delete' since a GET-like request is one without side-effects

by Peter van Kampen, 14 years ago

Attachment:	unicode_payload_put.2.patch added

Updated patch addressing wildfire's comments. The deletion of the query_string handling in the put method still needs to be looked at, as for the deletion of 'put' , 'delete' in test_get_like_requests.

comment:18 by Anand Kumria, 14 years ago

Patch looks a lot better. Time to hassle someone with more knowledge about query_string handling and if PUT and DELETE are GET-like requests.

comment:19 by Andrew Godwin, 14 years ago

Patch needs improvement:	unset
Triage Stage:	Accepted → Ready for checkin

I've taken a look at the change to query_string in PUT and entirely agree with it - the body data shouldn't be put there - and the removal of put and delete from get_like_requests, since I'm taking it to mean "requests with side effects".

Thus, I'm marking this RFC.

comment:20 by Anand Kumria, 14 years ago

Cc:	Anand Kumria added

comment:21 by Peter van Kampen, 14 years ago

Cc:	Peter van Kampen added

comment:22 by Russell Keith-Magee, 14 years ago

Resolution:	→ fixed
Status:	reopened → closed

In [16651]:

Fixed #10571 -- Factored out the payload encoding code to make sure it is used for PUT requests. Thanks to kennu for the report, pterk for the patch, and wildfire for the review comments.

comment:23 by Russell Keith-Magee, 14 years ago

In [16672]:

[1.3.X] Fixed #10571 -- Factored out the payload encoding code to make sure it is used for PUT requests. Thanks to kennu for the report, pterk for the patch, and wildfire for the review comments.

Backport of r16651 from trunk.

comment:24 by Jacob, 14 years ago

milestone:	1.1

Milestone 1.1 deleted

Note: See TracTickets for help on using tickets.

Download in other formats:

Issues

Context Navigation

#10571 closed Bug (fixed)

FakePayload Truncates Unicode Content

Description

Attachments (3)

Change History (27)

comment:1 by Jacob, 16 years ago

comment:2 by Russell Keith-Magee, 16 years ago

comment:3 by Jacob, 16 years ago

comment:4 by rwagner@…, 16 years ago

comment:5 by rwagner@…, 16 years ago

follow-up: 7 comment:6 by Russell Keith-Magee, 16 years ago

in reply to: 6 comment:7 by rwagner@…, 16 years ago

comment:8 by rwagner@…, 16 years ago

by rwagner@…, 16 years ago

comment:9 by Russell Keith-Magee, 16 years ago

comment:10 by Russell Keith-Magee, 16 years ago

comment:11 by Russell Keith-Magee, 16 years ago

comment:12 by Malcolm Tredinnick, 16 years ago

comment:13 by Kenneth Falck, 14 years ago

comment:14 by droberts@…, 14 years ago

comment:15 by patchhammer, 14 years ago

comment:16 by Julien Phalip, 14 years ago

by Peter van Kampen, 14 years ago

comment:17 by Anand Kumria, 14 years ago

by Peter van Kampen, 14 years ago

comment:18 by Anand Kumria, 14 years ago

comment:19 by Andrew Godwin, 14 years ago

comment:20 by Anand Kumria, 14 years ago

comment:21 by Peter van Kampen, 14 years ago

comment:22 by Russell Keith-Magee, 14 years ago

comment:23 by Russell Keith-Magee, 14 years ago

comment:24 by Jacob, 14 years ago

Download in other formats: