id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
15529	GeoJSON regex doesn't accept some characters within a quoted string	Wouter Klein Heerenbrink	jbronn	"The `json_regex` that GeoJSON uses to determine whether a string is in JSON format does not accept some characters that are allowed within a quoted string, for instance brackets: '(', ')', '<', '>'. As a result, the following perfectly valid JSON file is rejected:

{{{
{
  ""abc"": ""123 (bug)"",
  ""abc"": ""123 <bug>""
}
}}}

The problem is in the `json_regex` definition in [http://code.djangoproject.com/browser/django/trunk/django/contrib/gis/geometry/regex.py contrib.gis.geometry.regex] where `json_regex` is defined as
{{{
json_regex = re.compile(r'^(\s+)?\{[\s\w,\[\]\{\}\-\.""\':]+\}(\s+)?$')
}}}
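
To make the failure concrete, here is a minimal sketch that runs the pattern above against the sample values. The variable names are only illustrative, the JSON text is generated with `json.dumps`, and the double-quote character in the pattern is written as `\x22`, which `re` treats the same as the literal character.

```python
import json
import re

# Pattern as quoted above; \x22 is the double-quote character.
current_json_regex = re.compile(r'^(\s+)?\{[\s\w,\[\]\{\}\-\.\x22\':]+\}(\s+)?$')

# Generate JSON text rather than typing it by hand.
plain = json.dumps({'abc': '123'})
with_parens = json.dumps({'abc': '123 (bug)'})
with_angles = json.dumps({'abc': '123 <bug>'})

print(bool(current_json_regex.match(plain)))        # True
print(bool(current_json_regex.match(with_parens)))  # False: '(' is not in the character class
print(bool(current_json_regex.match(with_angles)))  # False: '<' is not in the character class
```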

The first solution to the problem is to change the regex in such a way that it accepts any character within a quoted string:
{{{
json_regex = re.compile(r'^(\s+)?\{([\s\w,\[\]\{\}\-\.:]|(\""[^\""]*\""))+\}(\s+)?$')
}}}
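
As a quick sanity check of this first solution, the sketch below applies it to the two values from the example at the top. It is only an illustration: the variable names are made up, and the double-quote character is again written as `\x22` inside the pattern.

```python
import json
import re

# First proposed pattern: a double-quoted string may contain any character
# except the double quote itself (\x22).
quoted_json_regex = re.compile(
    r'^(\s+)?\{([\s\w,\[\]\{\}\-\.:]|(\x22[^\x22]*\x22))+\}(\s+)?$'
)

with_parens = json.dumps({'abc': '123 (bug)'})
with_angles = json.dumps({'abc': '123 <bug>'})

print(bool(quoted_json_regex.match(with_parens)))  # True
print(bool(quoted_json_regex.match(with_angles)))  # True
```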

I think, though, that the regex is too complicated for its purpose. As far as I can see, it is only used to determine 'roughly' whether something is JSON or not (e.g. in [http://code.djangoproject.com/browser/trunk/djanog/contrib/gis/gdal/geometries.py django.contrib.gis.gdal.geometries]). Real validation is done by other methods; the regex only checks whether the input looks somewhat like JSON.

Perhaps even more importantly, the current json_regex does not give us any reliable information about whether the JSON is valid or not. For example, it accepts the following invalid input:
{{{
 {
   } 'reversed': 'i am' {,
   [[[ 'i am invalid json }}}
 }
}}}
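
A small sketch of that mismatch, assuming the standard `json` module stands in for the real validation step: the current pattern matches the text, while an actual parser rejects it. Variable names are illustrative only.

```python
import json
import re

# Current pattern; \x22 is the double-quote character.
current_json_regex = re.compile(r'^(\s+)?\{[\s\w,\[\]\{\}\-\.\x22\':]+\}(\s+)?$')

invalid = '''
 {
   } 'reversed': 'i am' {,
   [[[ 'i am invalid json }}}
 }
'''

looks_like_json = bool(current_json_regex.match(invalid))

try:
    json.loads(invalid)
    parses = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    parses = False

print(looks_like_json, parses)  # True False
```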

This is not a problem, because we only want to know whether the input looks like JSON before we start parsing it in detail. But I think the following, less complicated, piece of code would do:

{{{
# re.DOTALL lets . match newlines, which multi-line JSON needs.
json_regex = re.compile(r'^(\s+)?\{.*\}(\s+)?$', re.DOTALL)
}}}

It just checks whether the file content starts with a { and ends with a } (ignoring surrounding whitespace). That may not seem like much of a check, but it is probably the most a regex can usefully check: regular expressions are not really designed to handle arbitrarily deep nesting (recursion).
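
A rough sketch of how such a cheap pre-check could sit in front of a real parser. The helper `parse_if_json` is hypothetical and not part of GeoDjango, and `json.loads` merely stands in for whatever does the real validation. The sketch compiles the pattern with `re.DOTALL` so that `.` also matches newlines, which multi-line input needs.

```python
import json
import re

# Braces-only check; DOTALL so that . also matches newlines.
simple_json_regex = re.compile(r'^(\s+)?\{.*\}(\s+)?$', re.DOTALL)


def parse_if_json(text):
    # Hypothetical helper: cheap shape check first, real parsing second.
    if not simple_json_regex.match(text):
        return None
    return json.loads(text)  # raises ValueError if the text is not valid JSON


multi_line = json.dumps({'abc': '123 (bug)'}, indent=2)

print(parse_if_json(multi_line))     # {'abc': '123 (bug)'}
print(parse_if_json('POINT (1 2)'))  # None
```

Without the flag, the same pattern returns no match for `multi_line`, because `.` stops at the first newline.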

Last but not least, the simplified regex won't match any of the other formats that are described within GIS (HEX and WKT)."	Bug	closed	GIS	1.2	Normal	fixed	gis geojson gdal json_regex		Accepted	1	0	0	0	0	0
