Opened 3 weeks ago
Last modified 2 weeks ago
#36988 assigned Bug
Limitation of supported GeoIP databases is too tight
| Reported by: | rami | Owned by: | SnippyCodes |
|---|---|---|---|
| Component: | GIS | Version: | 6.0 |
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Accepted | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | yes |
| Easy pickings: | no | UI/UX: | no |
Description
Starting in Django 5.1, Django ships with a [list of allowed GeoIP database types](https://github.com/django/django/blame/main/django/contrib/gis/geoip2.py):
SUPPORTED_DATABASE_TYPES = {
"DBIP-City-Lite",
"DBIP-Country-Lite",
"GeoIP2-City",
"GeoIP2-Country",
"GeoLite2-City",
"GeoLite2-Country",
}
It seems weird that Django seems to be enforcing where I am getting my databases from. We've always been using the database freely available from [here](https://github.com/geoacumen/geoacumen-country), which has the type "Geoacumen-Country".
Is it really intended that I need to monkeypatch Django to use this GeoIP database from a source that is not known to Django? Should this list extensible in some official way?
Change History (8)
comment:1 by , 3 weeks ago
comment:2 by , 3 weeks ago
| Triage Stage: | Unreviewed → Accepted |
|---|
The commit in question is 3fad712a91a8a8f6f6f904aff3d895e3b06b24c7. Feel free to make a proposal, but unfortunately it's too late to backport a fix to Django 5.2 since per our supported versions policy, it only receives security fixes at this time.
comment:3 by , 3 weeks ago
| Owner: | set to |
|---|---|
| Status: | new → assigned |
comment:4 by , 3 weeks ago
| Has patch: | set |
|---|
comment:5 by , 3 weeks ago
| Patch needs improvement: | set |
|---|
comment:6 by , 3 weeks ago
Looking at this more deeply (since a few suggestions in the PR by @SnippyCodes have been rejected), I'm wondering if the root cause is that the changes in proposed in #35100 were misguided.
While that proposal simplified the code, it effectively broke the ability to use the existing GEOIP_PATH with one of GEOIP_COUNTRY or GEOIP_CITY to tell Django "this is a country DB" or "this is a city DB". It was instead replaced with a system that tries to infer that information from the "DB type" somehow, completely ignoring that the information as to whether it is a city or country database may already be encoded in the choice of those settings. See the Django 3.2 code here for how it used to be handled. The repeated cycle of regressions and fixes that introduce new regressions all seem to stem from trying to simplify that Django 3.2 code in #35100.
Which brings me to the question of "what is the supported database list actually for?". Because it seems like right now it's acting as a list of "DB types" that satisfy the existing string matching being done to determine if it is a city or country file (see here).
A proposed solution:
- If the DB is provided via a
paththat points to a file (either via apathkeyword argument when instantiatingGeoIP2or via theGEOIP_PATHsetting), then it must also meet the existing check viaSUPPORTED_DATABASE_TYPES. E.g. this maintains the current behaviour for a path pointing directly to a file. - If the DB is provided via a combination of
paththat points to a directory (either via apathargument when instantiatingGeoIP2or via theGEOIP_PATHsetting), and one ofcountryorcity(either via thecountry/citykeyword argument when instantiatingGeoIP2or via theGEOIP_COUNTRY/GEOIP_CITYsettings), then theSUPPORTED_DATABASE_TYPEScheck is skipped (since as far as I can see it doesn't serve a purpose). - The
GeoIP2.is_city/GeoIP2.is_countryproperties returnTrue/Falsedepending on whether the file was specified as a country/city via the algorithm in (2) and only fall back to the existing string matching behaviour on the "DB type" if the database was loaded via the algorithm in (1)
This would allow users to, via instantiation arguments or the existing global settings, provide a city or country database regardless of what the specific format of the "DB type" is set as by the database authors, while negating the need for monkey patching the SUPPORTED_DATABASE_TYPES. It would also maintain the existing behaviour (e.g. it should be backwards compatible)
Feedback on this idea is welcome - I may very well have missed something
comment:7 by , 3 weeks ago
Following up on the PR review, we need a way to allow users to utilize valid paid/custom GeoIP databases (like DBIP-Country or Geoacumen-Country) without relying on fragile substring matching or adding a new global Django setting.
Here are two potential API approaches for the community to consider:
Option A: Instance-Level Configuration
Allow developers to pass a custom list of supported types directly when initializing the GeoIP2 object.
API: GeoIP2(supported_types=["DBIP-Country", "Geoacumen-Country"])
Pros: Explicit, localized to where the object is used, doesn't pollute global settings.
Cons: Users might have to wrap the GeoIP2 instantiation if they use it in multiple places across their app.
Option B: "Bless" the AppConfig Modification
Keep SUPPORTED_DATABASE_TYPES as a module-level set (as it currently is), but officially document the recommended way for users to extend it inside their AppConfig.ready() method.
API: `python
from django.contrib.gis.geoip2 import SUPPORTED_DATABASE_TYPES
class MyGISAppConfig(AppConfig):
def ready(self):
SUPPORTED_DATABASE_TYPES.add("DBIP-Country")
I am happy to implement either of these, or another approach if the community prefers. What are your thoughts?
comment:8 by , 2 weeks ago
Thank you for the incredible deep dive, @PhilS! Tracing this back to the regressions introduced in #35100 makes perfect sense.
Your proposed solution is much cleaner than adding new settings. Trusting the explicit country/city kwargs (or GEOIP_COUNTRY/GEOIP_CITY settings) to bypass the strict SUPPORTED_DATABASE_TYPES check is an elegant, backward-compatible approach. The underlying geoip2 library will still naturally catch any invalid file formats.
I am going to pivot the PR to implement your exact algorithm. I will update the GeoIP2 initialization logic to trust the explicit routing and only use SUPPORTED_DATABASE_TYPES as a fallback. I'll push an updated patch shortly!
I have also just hit this, except I'm using the paid version of the country database from DBIP which has a database type of
DBIP-Country(no-Litesuffix). I have difficulty believing the free version is supported but the paid version is not.For me personally, it would be ideal if a fix was back-ported to 5.2 LTS as this feels like a regression from 4.2 LTS.