Opened 3 weeks ago

Last modified 2 weeks ago

#36988 assigned Bug

Limitation of supported GeoIP databases is too tight

Reported by: rami Owned by: SnippyCodes
Component: GIS Version: 6.0
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Starting in Django 5.1, Django ships with a [list of allowed GeoIP database types](https://github.com/django/django/blame/main/django/contrib/gis/geoip2.py):

SUPPORTED_DATABASE_TYPES = {
    "DBIP-City-Lite",
    "DBIP-Country-Lite",
    "GeoIP2-City",
    "GeoIP2-Country",
    "GeoLite2-City",
    "GeoLite2-Country",
}

It seems weird that Django seems to be enforcing where I am getting my databases from. We've always been using the database freely available from [here](https://github.com/geoacumen/geoacumen-country), which has the type "Geoacumen-Country".

Is it really intended that I need to monkeypatch Django to use this GeoIP database from a source that is not known to Django? Should this list extensible in some official way?

Change History (8)

comment:1 by PhilS, 3 weeks ago

I have also just hit this, except I'm using the paid version of the country database from DBIP which has a database type of DBIP-Country (no -Lite suffix). I have difficulty believing the free version is supported but the paid version is not.

For me personally, it would be ideal if a fix was back-ported to 5.2 LTS as this feels like a regression from 4.2 LTS.

comment:2 by Tim Graham, 3 weeks ago

Triage Stage: UnreviewedAccepted

The commit in question is 3fad712a91a8a8f6f6f904aff3d895e3b06b24c7. Feel free to make a proposal, but unfortunately it's too late to backport a fix to Django 5.2 since per our supported versions policy, it only receives security fixes at this time.

comment:3 by SnippyCodes, 3 weeks ago

Owner: set to SnippyCodes
Status: newassigned

comment:4 by SnippyCodes, 3 weeks ago

Has patch: set

comment:5 by David Smith, 3 weeks ago

Patch needs improvement: set

comment:6 by PhilS, 3 weeks ago

Looking at this more deeply (since a few suggestions in the PR by @SnippyCodes have been rejected), I'm wondering if the root cause is that the changes in proposed in #35100 were misguided.

While that proposal simplified the code, it effectively broke the ability to use the existing GEOIP_PATH with one of GEOIP_COUNTRY or GEOIP_CITY to tell Django "this is a country DB" or "this is a city DB". It was instead replaced with a system that tries to infer that information from the "DB type" somehow, completely ignoring that the information as to whether it is a city or country database may already be encoded in the choice of those settings. See the Django 3.2 code here for how it used to be handled. The repeated cycle of regressions and fixes that introduce new regressions all seem to stem from trying to simplify that Django 3.2 code in #35100.

Which brings me to the question of "what is the supported database list actually for?". Because it seems like right now it's acting as a list of "DB types" that satisfy the existing string matching being done to determine if it is a city or country file (see here).

A proposed solution:

  1. If the DB is provided via a path that points to a file (either via a path keyword argument when instantiating GeoIP2 or via the GEOIP_PATH setting), then it must also meet the existing check via SUPPORTED_DATABASE_TYPES. E.g. this maintains the current behaviour for a path pointing directly to a file.
  2. If the DB is provided via a combination of path that points to a directory (either via a path argument when instantiating GeoIP2 or via the GEOIP_PATH setting), and one of country or city (either via the country/city keyword argument when instantiating GeoIP2 or via the GEOIP_COUNTRY/GEOIP_CITY settings), then the SUPPORTED_DATABASE_TYPES check is skipped (since as far as I can see it doesn't serve a purpose).
  3. The GeoIP2.is_city/GeoIP2.is_country properties return True/False depending on whether the file was specified as a country/city via the algorithm in (2) and only fall back to the existing string matching behaviour on the "DB type" if the database was loaded via the algorithm in (1)

This would allow users to, via instantiation arguments or the existing global settings, provide a city or country database regardless of what the specific format of the "DB type" is set as by the database authors, while negating the need for monkey patching the SUPPORTED_DATABASE_TYPES. It would also maintain the existing behaviour (e.g. it should be backwards compatible)

Feedback on this idea is welcome - I may very well have missed something

comment:7 by SnippyCodes, 3 weeks ago

Following up on the PR review, we need a way to allow users to utilize valid paid/custom GeoIP databases (like DBIP-Country or Geoacumen-Country) without relying on fragile substring matching or adding a new global Django setting.
​Here are two potential API approaches for the community to consider:

​Option A: Instance-Level Configuration
Allow developers to pass a custom list of supported types directly when initializing the GeoIP2 object.

​API: GeoIP2(supported_types=["DBIP-Country", "Geoacumen-Country"])
​Pros: Explicit, localized to where the object is used, doesn't pollute global settings.
​Cons: Users might have to wrap the GeoIP2 instantiation if they use it in multiple places across their app.

Option B: "Bless" the AppConfig Modification
Keep SUPPORTED_DATABASE_TYPES as a module-level set (as it currently is), but officially document the recommended way for users to extend it inside their AppConfig.ready() method.

API: `python
from django.contrib.gis.geoip2 import SUPPORTED_DATABASE_TYPES
​class MyGISAppConfig(AppConfig):
def ready(self):
SUPPORTED_DATABASE_TYPES.add("DBIP-Country")

I am happy to implement either of these, or another approach if the community prefers. What are your thoughts?

comment:8 by SnippyCodes, 2 weeks ago

Thank you for the incredible deep dive, @PhilS! Tracing this back to the regressions introduced in #35100 makes perfect sense.
Your proposed solution is much cleaner than adding new settings. Trusting the explicit country/city kwargs (or GEOIP_COUNTRY/GEOIP_CITY settings) to bypass the strict SUPPORTED_DATABASE_TYPES check is an elegant, backward-compatible approach. The underlying geoip2 library will still naturally catch any invalid file formats.

I am going to pivot the PR to implement your exact algorithm. I will update the GeoIP2 initialization logic to trust the explicit routing and only use SUPPORTED_DATABASE_TYPES as a fallback. I'll push an updated patch shortly!

Note: See TracTickets for help on using tickets.
Back to Top