Opened 7 years ago

Closed 3 years ago

Last modified 3 years ago

#28628 closed Cleanup/optimization (fixed)

Audit for and abolish all use of '\d' in regexes

Reported by: James Bennett Owned by: Ad Timmering
Component: Core (Other) Version: dev
Severity: Normal Keywords:
Cc: Ad Timmering, security@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Mariusz Felisiak)

Now that we're in the 2.0 release cycle and Python 3 only, any examples or code in Django using a \d in a regex should be replaced with [0-9], as \d on Python 3 matches any character with Unicode category [Nd], which is almost certainly not what people expect. Changing to explicit [0-9], perhaps with a note about why it should be preferred, would be better.

Inventory of current uses & recommendations (with recommendation & updating table reflecting feedback):

Module Description
django.contrib.admin.options In internal function _get_edited_object_pks.
Obtains PKs for edited objects through the ModelAdmin. POST should be machine-generated (through the admin interface) and not based on human input, so typically will be ASCII input.
Changing to more restrictive regex unlikely to hurt any users, but doesn't add value either.
django.contrib.gis.geometry Used to process GIS geometry data.
Changing to more restrictive regex unlikely to hurt any users, but doesn't add value either.
django.contrib.gis.db.backends.postgis.operations Used to obtain individual decimal components from the gettext version, eg. 1.2.3.
Changing to a more restrictive regex is unlikely to hurt any users, but doesn't add value either.
django.contrib.gis.gdal.libgdal Used to obtain individual decimal components from the gettext version, eg. 1.2.3.

Changing to a more restrictive regex is unlikely to hurt any users, but doesn't add value either.
django.contrib.gis.geos.geometry Used to process GIS geometry EWKT data. Resulting string is cast with int().

Changing to more restrictive regex unlikely to hurt any users, but doesn't add value either.
django.contrib.humanize.templatetags.humanize Used in intcomma filter, that formats ints like 3000 as 3,000.

Tested in tests.humanize_tests.tests.HumanizeTests.test_intcomma.

Using a more restrictive regex could possibly hurt users. Instead added tests to ensure this works with non-ASCII unicode.
django.core.validators (1) Used in URLValidator for ipv4_re port component in regex. Being more restrictive to capture valid IPv4 and ports seems useful, and not likely to be a breaking change. Change to [0-9] for consistency with other parts of the regex.
django.core.validators (2) Used in integer_validator, validate_integer, and int_list_validator. Any number captured by \d can be cast to an int and therefore no value is added (and likely users might be hurt) by restricting this to [0-9]). Do not change.
django.core.management.commands.makemessages Used to obtain individual decimal components from the gettext version, eg. 1.2.3.
Changing to a more restrictive regex is unlikely to hurt any users, but doesn't add value either.
django.core.management.commands.runserver Used to determine valid IPv4 addresses and valid port numbers.
WSGI server seems to work fine even if for example 127.0.0.1 (incl a non-ASCII 0) is provided, and it seems the integers are properly cast -- so there would be little advantage to users to change this to a more restrictive regex.
django.db.backends.mysql.base Used to obtain individual decimal components from a version, eg. 1.2.3. Changing to a more restrictive regex is unlikely to hurt any users, but doesn't add value either.
django.db.backends.sqlite3.introspection Extracts the size number from a varchar(11) type name.

Changing to a more restrictive regex is unlikely to hurt any users, but doesn't add value either.
django.db.migrations.autodetector Used to extract the number from a migration filename. Unlikely to contain non-ASCII unless intentionally changed by a dev.
Changing to a more restrictive regex is unlikely to hurt any users, but doesn't add value either.
django.db.migrations.writer Used to sort imports in migrations, tested in tests.migrations.test_writer, see test_sorted_imports. Unclear to me what the decimal portion of the regex represents, however I don't see value in restricting to more restrictive regex.
django.forms.widgets Used by SelectDateWidget to match a "YYYY-MM-DD" date pattern. When matched integers are cast with int(val) and therefore correctly processed if input was in non-ASCII Unicode decimals.

Changing to a more restrictive regex does not seem to be of much value, because values that don't much date_re are passed to the datetime.datetime.strptime() and will be accepted anyway.
django.http.request Used in split_domain_port to split a domain name and port ("www.example.com:8080").
The port number should be cast to an int (in which case, no problem!) - however in reality isn't currently being used in any code path, and both domain and port are returned as str. Changed to match other components of the regex.
django.template.base Used in FilterExpression regex to match filters in templates. Arguments are passed on to individual filters. Where an int is expected, filters will cast using int(x) regardless so would process correctly. Using a more restrictive regex could possibly hurt users inadvertently using non-ASCII decimals in filter parameters. Do not change.
django.template.defaultfilters Used in default title filter to change a letter after a decimal back back to lower case. Eg. 53RD ⇒ 53Rd ⇒ 53rd. Changing this could have unexpected results for users, and no value is gained. Do not change.
django.test.client Parses charset within the Content-Type header of test.client.RequestFactory. Strictly spoken the HTTP spec requires ASCII characters. Change for consistency as it reflects HTTP specs, however does not add much value.
django.utils.dateparse Used for parsing date and time fields, in parsing string representations in the to_python implementations on DateField and DateTimeField in db.models. Decimals are extracted with regex, and downcast with int(v). In normal usage these are unlikely to contain non-ASCII numbers, but if using legacy DBs it is quite possible data might have non-ASCII versions of numbers as well. Using a more restrictive regex could possibly hurt users. Do not change.
django.utils.http Used in data parsing according to RFC1123, RFC850 and asctime. Requires ASCII decimals as defined in RFC822: https://datatracker.ietf.org/doc/html/rfc822#appendix-D with "DIGIT = <any ASCII decimal digit> ; ( 60- 71, 48.- 57.)". The additional value of a more restrictive regex is not 100% evident, but it doesn't hurt to change the implementation to be closer to the specifications.
django.utils.version Used to obtain individual decimal components from a version, eg. 1.2.3. Changing to a more restrictive regex is unlikely to hurt any users, but doesn't add value either.
django.utils.translation.trans_real Used in regex to parse HTTP Accept-Language header. HTTP requires ASCII decimals. Example: Accept-Language: pt,en-US;q=0.8,en;q=0.6,ru;q=0.4. Change to more restrictive regex to match specs in https://datatracker.ietf.org/doc/html/rfc2616#section-3.9 .
django.views.i18n JavaScriptCatalog._num_plurals(). Used to parse nplurals setting (see doc). Regex match is cast with int(x). Intent is likely an ASCII decimal number ("nplurals value must be a decimal number which …"), however restricting the regex does not add any value. Do not change.

Change History (28)

comment:1 by James Bennett, 7 years ago

Here's a tentative list of places that should change, from grepping on current master:

django/contrib/gis/db/backends/postgis/operations.py:344:        proj_regex = re.compile(r'(\d+)\.(\d+)\.(\d+)')
django/contrib/gis/gdal/libgdal.py:87:version_regex = re.compile(r'^(?P<major>\d+)\.(?P<minor>\d+)(\.(?P<subminor>\d+))?')
django/contrib/gis/geometry.py:7:wkt_regex = re.compile(r'^(SRID=(?P<srid>\-?\d+);)?'
django/contrib/gis/geometry.py:11:                       r'[ACEGIMLONPSRUTYZ\d,\.\-\+\(\) ]+)$',
django/contrib/gis/geos/geometry.py:116:            match = re.match(b'SRID=(?P<srid>\-?\d+)', srid_part)
django/contrib/gis/templates/gis/admin/openlayers.js:4:{{ module }}.map = null; {{ module }}.controls = null; {{ module }}.panel = null; {{ module }}.re = new RegExp("^SRID=\\d+;(.+)", "i"); {{ module }}.layers = {};
django/contrib/humanize/templatetags/humanize.py:48:    new = re.sub(r"^(-?\d+)(\d{3})", r'\g<1>,\g<2>', orig)
django/core/management/commands/makemessages.py:402:        m = re.search(r'(\d+)\.(\d+)\.?(\d+)?', out)
django/core/management/commands/runserver.py:18:    (?P<ipv4>\d{1,3}(?:\.\d{1,3}){3}) |         # IPv4 address
django/core/management/commands/runserver.py:21:):)?(?P<port>\d+)$""", re.X)
django/core/validators.py:78:    ipv4_re = r'(?:25[0-5]|2[0-4]\d|[0-1]?\d?\d)(?:\.(?:25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}'
django/core/validators.py:99:        r'(?::\d{2,5})?'  # port
django/core/validators.py:136:            host_match = re.search(r'^\[(.+)\](?::\d{2,5})?$', urlsplit(value).netloc)
django/core/validators.py:153:    _lazy_re_compile(r'^-?\d+\Z'),
django/core/validators.py:296:    regexp = _lazy_re_compile(r'^%(neg)s\d+(?:%(sep)s%(neg)s\d+)*\Z' % {
django/db/backends/mysql/base.py:49:server_version_re = re.compile(r'(\d{1,2})\.(\d{1,2})\.(\d{1,2})')
django/db/backends/sqlite3/introspection.py:8:field_size_re = re.compile(r'^\s*(?:var)?char\s*\(\s*(\d+)\s*\)\s*$')
django/db/migrations/autodetector.py:1233:        match = re.match(r'^\d+', name)
django/db/migrations/writer.py:170:            if re.match(r"^import (.*)\.\d+[^\s]*$", line):
django/forms/widgets.py:925:    date_re = re.compile(r'(\d{4}|0)-(\d\d?)-(\d\d?)$')
django/http/request.py:21:host_validation_re = re.compile(r"^([a-z0-9.-]+|\[[a-f0-9]*:[a-f0-9\.:]+\])(:\d+)?$")
django/template/base.py:606:    'num': r'[-+\.]?\d[\d\.e]*',
django/template/defaultfilters.py:238:    return re.sub(r"\d([A-Z])", lambda m: m.group(0).lower(), t)
django/test/client.py:34:CONTENT_TYPE_RE = re.compile(r'.*; charset=([\w\d-]+);?')
django/utils/dateparse.py:14:    r'(?P<year>\d{4})-(?P<month>\d{1,2})-(?P<day>\d{1,2})$'
django/utils/dateparse.py:18:    r'(?P<hour>\d{1,2}):(?P<minute>\d{1,2})'
django/utils/dateparse.py:19:    r'(?::(?P<second>\d{1,2})(?:\.(?P<microsecond>\d{1,6})\d{0,6})?)?'
django/utils/dateparse.py:23:    r'(?P<year>\d{4})-(?P<month>\d{1,2})-(?P<day>\d{1,2})'
django/utils/dateparse.py:24:    r'[T ](?P<hour>\d{1,2}):(?P<minute>\d{1,2})'
django/utils/dateparse.py:25:    r'(?::(?P<second>\d{1,2})(?:\.(?P<microsecond>\d{1,6})\d{0,6})?)?'
django/utils/dateparse.py:26:    r'(?P<tzinfo>Z|[+-]\d{2}(?::?\d{2})?)?$'
django/utils/dateparse.py:31:    r'(?:(?P<days>-?\d+) (days?, )?)?'
django/utils/dateparse.py:32:    r'((?:(?P<hours>-?\d+):)(?=\d+:\d+))?'
django/utils/dateparse.py:33:    r'(?:(?P<minutes>-?\d+):)?'
django/utils/dateparse.py:34:    r'(?P<seconds>-?\d+)'
django/utils/dateparse.py:35:    r'(?:\.(?P<microseconds>\d{1,6})\d{0,6})?'
django/utils/dateparse.py:44:    r'(?:(?P<days>\d+(.\d+)?)D)?'
django/utils/dateparse.py:46:    r'(?:(?P<hours>\d+(.\d+)?)H)?'
django/utils/dateparse.py:47:    r'(?:(?P<minutes>\d+(.\d+)?)M)?'
django/utils/dateparse.py:48:    r'(?:(?P<seconds>\d+(.\d+)?)S)?'
django/utils/dateparse.py:58:    r'(?:(?P<days>-?\d+) (days? ?))?'
django/utils/dateparse.py:60:    r'(?P<hours>\d+):'
django/utils/dateparse.py:61:    r'(?P<minutes>\d\d):'
django/utils/dateparse.py:62:    r'(?P<seconds>\d\d)'
django/utils/dateparse.py:63:    r'(?:\.(?P<microseconds>\d{1,6}))?'
django/utils/html.py:28:unencoded_ampersands_re = re.compile(r'&(?!(\w+|#\d+);)')
django/utils/http.py:30:__D = r'(?P<day>\d{2})'
django/utils/http.py:31:__D2 = r'(?P<day>[ \d]\d)'
django/utils/http.py:33:__Y = r'(?P<year>\d{4})'
django/utils/http.py:34:__Y2 = r'(?P<year>\d{2})'
django/utils/http.py:35:__T = r'(?P<hour>\d{2}):(?P<min>\d{2}):(?P<sec>\d{2})'
django/utils/translation/trans_real.py:35:        (?:\s*;\s*q=(0(?:\.\d{,3})?|1(?:\.0{,3})?))?  # Optional "q=1.00", "q=0.8"
django/views/i18n.py:249:        match = re.search(r'nplurals=\s*(\d+)', self._plural_string or '')

comment:2 by James Bennett, 7 years ago

Description: modified (diff)
Summary: Audit for and abolish all use of '\d' in URL patternsAudit for and abolish all use of '\d' in regexes

comment:3 by James Bennett, 7 years ago

A few more:

tests/gis_tests/geo3d/tests.py:218:        ref_kml_regex = re.compile(r'^<Point><coordinates>-95.363\d+,29.763\d+,18</coordinates></Point>$')
tests/gis_tests/geoapp/test_functions.py:108:                r'<gml:coordinates decimal="\." cs="," ts=" ">-104.60925\d+,38.25500\d+ '
tests/gis_tests/geoapp/test_functions.py:114:                r'-104\.60925\d+,38\.255001</gml:coordinates></gml:Point>'
tests/i18n/test_extraction.py:100:        (or `#: .\dirA\dirB\foo.py:42` on Windows)
tests/i18n/tests.py:524:            self.assertEqual(r'j \d\e F \d\e Y', get_format('DATE_FORMAT'))
tests/logging_tests/tests.py:527:            self.assertRegex(logger_output.getvalue(), r'^\[[-:,.\s\d]+\] %s' % log_msg)
tests/sitemaps_tests/urls/http.py:198:    url(r'^i18n/testmodel/(?P<id>\d+)/$', testmodelview, name='i18n_testmodel'),
tests/urlpatterns_reverse/urls.py:28:    url(r'^optional/(?P<arg1>\d+)/(?:(?P<arg2>\d+)/)?', absolute_kwargs_view, name="named_optional"),
tests/urlpatterns_reverse/urls.py:29:    url(r'^optional/(?P<arg1>\d+)/(?:(?P<arg2>\d+)/)?$', absolute_kwargs_view, name="named_optional_terminated"),

comment:4 by Aymeric Augustin, 7 years ago

Triage Stage: UnreviewedAccepted

Definitely a good idea! Also one of my pet peeves.

comment:5 by Sergey Fedoseev, 7 years ago

Cc: Sergey Fedoseev added

comment:6 by JunyiJ, 7 years ago

Owner: changed from nobody to JunyiJ
Status: newassigned

comment:7 by Sergey Fedoseev, 7 years ago

I'm not sure if this change should be done for bytes, also the usage of https://docs.python.org/3/library/re.html#re.ASCII flag could be better alternative to explicit [0-9].

comment:8 by JunyiJ, 7 years ago

I searched and replace '/d' using bash and the following places are changed:

django/django/http/request.py:21:host_validation_re = re.compile(r"^([a-z0-9.-]+|\[[a-f0-9]*:[a-f0-9\.:]+\])(:\d+)?$")
django/django/contrib/gis/geometry.py:7:wkt_regex = re.compile(r'^(SRID=(?P<srid>\-?\d+);)?'
django/django/contrib/gis/gdal/libgdal.py:87:version_regex = re.compile(r'^(?P<major>\d+)\.(?P<minor>\d+)(\.(?P<subminor>\d+))?')
django/django/contrib/gis/geos/geometry.py:116:            match = re.match(b'SRID=(?P<srid>\-?\d+)', srid_part)
django/django/contrib/gis/db/backends/postgis/operations.py:344:        proj_regex = re.compile(r'(\d+)\.(\d+)\.(\d+)')
django/django/contrib/humanize/templatetags/humanize.py:48:    new = re.sub(r"^(-?\d+)(\d{3})", r'\g<1>,\g<2>', orig)
django/django/core/management/commands/makemessages.py:402:        m = re.search(r'(\d+)\.(\d+)\.?(\d+)?', out)
django/django/core/validators.py:78:    ipv4_re = r'(?:25[0-5]|2[0-4]\d|[0-1]?\d?\d)(?:\.(?:25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}'
django/django/core/validators.py:99:        r'(?::\d{2,5})?'  # port
django/django/core/validators.py:136:            host_match = re.search(r'^\[(.+)\](?::\d{2,5})?$', urlsplit(value).netloc)
django/django/core/validators.py:153:    _lazy_re_compile(r'^-?\d+\Z'),
django/django/core/validators.py:296:    regexp = _lazy_re_compile(r'^%(neg)s\d+(?:%(sep)s%(neg)s\d+)*\Z' % {
django/django/template/defaultfilters.py:238:    return re.sub(r"\d([A-Z])", lambda m: m.group(0).lower(), t)
django/django/views/i18n.py:249:        match = re.search(r'nplurals=\s*(\d+)', self._plural_string or '')
django/django/utils/dateparse.py:14:    r'(?P<year>\d{4})-(?P<month>\d{1,2})-(?P<day>\d{1,2})$'
django/django/utils/dateparse.py:18:    r'(?P<hour>\d{1,2}):(?P<minute>\d{1,2})'
django/django/utils/dateparse.py:19:    r'(?::(?P<second>\d{1,2})(?:\.(?P<microsecond>\d{1,6})\d{0,6})?)?'
django/django/utils/dateparse.py:23:    r'(?P<year>\d{4})-(?P<month>\d{1,2})-(?P<day>\d{1,2})'
django/django/utils/dateparse.py:24:    r'[T ](?P<hour>\d{1,2}):(?P<minute>\d{1,2})'
django/django/utils/dateparse.py:25:    r'(?::(?P<second>\d{1,2})(?:\.(?P<microsecond>\d{1,6})\d{0,6})?)?'
django/django/utils/dateparse.py:26:    r'(?P<tzinfo>Z|[+-]\d{2}(?::?\d{2})?)?$'
django/django/utils/dateparse.py:31:    r'(?:(?P<days>-?\d+) (days?, )?)?'
django/django/utils/dateparse.py:32:    r'((?:(?P<hours>-?\d+):)(?=\d+:\d+))?'
django/django/utils/dateparse.py:33:    r'(?:(?P<minutes>-?\d+):)?'
django/django/utils/dateparse.py:34:    r'(?P<seconds>-?\d+)'
django/django/utils/dateparse.py:35:    r'(?:\.(?P<microseconds>\d{1,6})\d{0,6})?'
django/django/utils/dateparse.py:44:    r'(?:(?P<days>\d+(.\d+)?)D)?'
django/django/utils/dateparse.py:46:    r'(?:(?P<hours>\d+(.\d+)?)H)?'
django/django/utils/dateparse.py:47:    r'(?:(?P<minutes>\d+(.\d+)?)M)?'
django/django/utils/dateparse.py:48:    r'(?:(?P<seconds>\d+(.\d+)?)S)?'
django/django/utils/dateparse.py:58:    r'(?:(?P<days>-?\d+) (days? ?))?'
django/django/utils/dateparse.py:60:    r'(?P<hours>\d+):'
django/django/utils/dateparse.py:61:    r'(?P<minutes>\d\d):'
django/django/utils/dateparse.py:62:    r'(?P<seconds>\d\d)'
django/django/utils/dateparse.py:63:    r'(?:\.(?P<microseconds>\d{1,6}))?'
django/django/utils/http.py:30:__D = r'(?P<day>\d{2})'
django/django/utils/http.py:31:__D2 = r'(?P<day>[ \d]\d)'
django/django/utils/http.py:33:__Y = r'(?P<year>\d{4})'
django/django/utils/http.py:34:__Y2 = r'(?P<year>\d{2})'
django/django/utils/http.py:35:__T = r'(?P<hour>\d{2}):(?P<min>\d{2}):(?P<sec>\d{2})'
django/django/utils/html.py:27:unencoded_ampersands_re = re.compile(r'&(?!(\w+|#\d+);)')
django/django/test/client.py:34:CONTENT_TYPE_RE = re.compile(r'.*; charset=([\w\d-]+);?')
django/django/forms/widgets.py:925:    date_re = re.compile(r'(\d{4}|0)-(\d\d?)-(\d\d?)$')
django/django/db/backends/sqlite3/introspection.py:8:field_size_re = re.compile(r'^\s*(?:var)?char\s*\(\s*(\d+)\s*\)\s*$')
django/django/db/backends/mysql/base.py:49:server_version_re = re.compile(r'(\d{1,2})\.(\d{1,2})\.(\d{1,2})')
django/django/db/migrations/writer.py:170:            if re.match(r"^import (.*)\.\d+[^\s]*$", line):
django/django/db/migrations/autodetector.py:1233:        match = re.match(r'^\d+', name)
django/tests/i18n/tests.py:524:            self.assertEqual(r'j \d\e F \d\e Y', get_format('DATE_FORMAT'))
django/tests/logging_tests/tests.py:527:            self.assertRegex(logger_output.getvalue(), r'^\[[-:,.\s\d]+\] %s' % log_msg)
django/tests/gis_tests/geo3d/tests.py:218:        ref_kml_regex = re.compile(r'^<Point><coordinates>-95.363\d+,29.763\d+,18</coordinates></Point>$')
django/tests/sitemaps_tests/urls/http.py:198:    url(r'^i18n/testmodel/(?P<id>\d+)/$', testmodelview, name='i18n_testmodel'),
django/tests/urlpatterns_reverse/urls.py:28:    url(r'^optional/(?P<arg1>\d+)/(?:(?P<arg2>\d+)/)?', absolute_kwargs_view, name="named_optional"),
django/tests/urlpatterns_reverse/urls.py:29:    url(r'^optional/(?P<arg1>\d+)/(?:(?P<arg2>\d+)/)?$', absolute_kwargs_view, name="named_optional_terminated"),

However, when I sent a pull request, I found the checks are not successful. I am still trying to figure it out. Help are welcome**

in reply to:  7 comment:9 by JunyiJ, 7 years ago

Based on https://docs.python.org/3/library/re.html:

\d
For Unicode (str) patterns:

Matches any Unicode decimal digit (that is, any character in Unicode character category [Nd]). This includes [0-9], and also many other digit characters. If the ASCII flag is used only [0-9] is matched (but the flag affects the entire regular expression, so in such cases using an explicit [0-9] may be a better choice).

I think using [0-9] may be better than ASCII. Correct me if I am wrong.

Replying to Sergey Fedoseev:

I'm not sure if this change should be done for bytes, also the usage of https://docs.python.org/3/library/re.html#re.ASCII flag could be better alternative to explicit [0-9].

comment:10 by Tim Graham, 7 years ago

Has patch: set
Needs tests: set

PR. I think tests should be added to demonstrate the benefit of each change.

comment:11 by Mariusz Felisiak, 4 years ago

Component: DocumentationCore (Other)
Owner: JunyiJ removed
Status: assignednew

comment:12 by Sergey Fedoseev, 4 years ago

Cc: Sergey Fedoseev removed

comment:13 by Ad Timmering, 3 years ago

Cc: Ad Timmering added

comment:14 by Ad Timmering, 3 years ago

Owner: set to Ad Timmering
Status: newassigned

comment:15 by Ad Timmering, 3 years ago

I went through each of the use cases in the Django code - and found little reason/benefit to update most.

Important background to this is that in all unicode matches for \d, int(x) actually properly casts the number back to a normal int. Most cases where a decimal is expected and extracted, it will be casted with int(x) so the problem goes away -- or might actually be beneficial to users (eg. I live in Japan where people frequently use full-width decimals, such as 012345 instead of 012345). Eg.

>>> int('\uABF9')  # MEETEI MAYEK DIGIT NINE
9

Most use cases to me seem to fall in one of the below:

a) We're processing user input which might actually be (inadvertently) input in non-ASCII; so result is likely desired - and the very least changing it could mean it's a braking change for users. ==> DON'T CHANGE

b) Changing to a more restrictive regex seems harmless enough, but also doesn't add much value. Eg. when parsing a version number like "1.2.3" with something like (\d)\.(\d)\.(\d).

c) To me there was only one case of Django code where it might be beneficial to change, which is in django.utils.http processing of dates/times in HTTP headers - and the spec clearly requires ASCII digits.

Inventory of use cases with thoughts in this Google doc. I should add I skipped *.js in my inventory.

Curious to thoughts of others.

Last edited 3 years ago by Ad Timmering (previous) (diff)

comment:16 by Ad Timmering, 3 years ago

Doc was not set to be openable by all - apologies.
PTAL, comments welcome here on the code tracker as well of course (but on the doc also fine).

Last edited 3 years ago by Ad Timmering (previous) (diff)

comment:17 by Jacob Walls, 3 years ago

Needs tests: unset

comment:18 by Mariusz Felisiak, 3 years ago

Patch needs improvement: set

Ad, thanks for the investigation and all details. Please see my comments in the doc. I think we should:

  • make changes in few more cases, and
  • change URL patterns in tests.

PS Can you transfer your doc to the table in the ticket description? (or in the comment). For future references, we should keep the discussion in the Trac.

comment:19 by Ad Timmering, 3 years ago

Thanks for the detailed review and comments Marius. Will process your comments (and update PR) over the next few week or so, and transfer the table to Trac as well.
(Doc is just easier to comment on for first iteration:))

comment:20 by Ad Timmering, 3 years ago

Description: modified (diff)

Description updated.

comment:21 by Ad Timmering, 3 years ago

Patch needs improvement: unset

comment:22 by Mariusz Felisiak, 3 years ago

Triage Stage: AcceptedReady for checkin

comment:23 by Mariusz Felisiak, 3 years ago

Description: modified (diff)

comment:24 by Mariusz Felisiak, 3 years ago

Cc: security@… added

comment:25 by Mariusz Felisiak, 3 years ago

Description: modified (diff)

comment:26 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

In fe769442:

Refs #28628 -- Added tests for intcomma with non-ASCII digits.

comment:27 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

Resolution: fixed
Status: assignedclosed

In bdf3e156:

Fixed #28628 -- Changed \d to [0-9] in regexes where appropriate.

comment:28 by Adam Johnson, 3 years ago

Thank you for the dedication and patience doing this Ad! And for your thorough review, Mariusz. Looking through the table there are a lot of clear improvements, that who knows, could have become security reports :)

Note: See TracTickets for help on using tickets.
Back to Top