Opened 11 years ago

Closed 10 years ago

#20401 closed Bug (fixed)

get_for_model queries the wrong database

Reported by: Antonis Christofides Owned by: Antonis Christofides
Component: contrib.contenttypes Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

The django.contrib.contenttypes.models.ContentTypeManager.get_for_model method, if it doesn't find the requested content type in the cache, uses get_or_create to get it from the database. In multi-database environments, get_or_create always queries the db_for_write and therefore I think it is inappropriate for this case, where there is an overwhelming probability that the object requested already exists and therefore no write will occur.

Here is a copy of the method for reference:

    def get_for_model(self, model, for_concrete_model=True):
        """
        Returns the ContentType object for a given model, creating the
        ContentType if necessary. Lookups are cached so that subsequent lookups
        for the same model don't hit the database.
        """
        opts = self._get_opts(model, for_concrete_model)
        try:
            ct = self._get_from_cache(opts)
        except KeyError:
            # Load or create the ContentType entry. The smart_text() is
            # needed around opts.verbose_name_raw because name_raw might be a
            # django.utils.functional.__proxy__ object.
            ct, created = self.get_or_create(
                app_label = opts.app_label,
                model = opts.model_name,
                defaults = {'name': smart_text(opts.verbose_name_raw)},
            )
            self._add_to_cache(self.db, ct)

        return ct

Change History (9)

comment:1 by Antonis Christofides, 11 years ago

comment:2 by Antonis Christofides, 11 years ago

Has patch: set

comment:3 by Russell Keith-Magee, 11 years ago

Resolution: needsinfo
Status: newclosed

Can you clarify where this is an issue? Under master/slave replication, it doesn't matter if you query the master or the slave; under sharding, your read and write database are the same. What's the use case where this manifests as a problem?

Closing needsinfo; please reopen if you can provide more details of why this is an issue.

comment:4 by Antonis Christofides, 11 years ago

Resolution: needsinfo
Status: closednew

ContentTypeManager.get_for_model() uses the master (the db_for_write) for a read operation. The load on the master is next to negligible because of the caching. However, master/slave can also be used for a kind of high availability where, if the master goes down, the system can continue to be used in read-only mode using the slave. So this is an issue because it prevents my application from running in read-only mode when the master goes down.

comment:5 by Tim Graham, 11 years ago

Patch needs improvement: set
Triage Stage: UnreviewedAccepted

comment:6 by anonymous, 10 years ago

I improved the patch with the help of bmispelon. However, it turns out there's (most probably) no use case after all, for the following reasons:

  • If the system is put in read-only mode, it is likely that the Django processes will need to be reconfigured and restarted. Reconfiguration would probably alter the DATABASES setting so that the (former) db_for_write wouldn't be there at all.
  • If the system is put in read-only mode without reconfiguring and restarting Django (we can do that because our Django code checks for the existence of a file in order to determine if it's in read-only mode), it means that the db_for_write is still up (or at least that it can accept connections), otherwise the Django processes would try to connect and cause an error.

The patch could be useful if there is a chance that the db_for_write somehow becomes unable to respond to queries, but still allows connections. It's really an edge case, if it exists at all.

I would certainly not do the work I did in this patch had I realized that, after all, it (most probably) doesn't affect us. However, now that the patch is ready, the question is whether it should be accepted. The argument for accepting it is that the db_for_read, according to the documentation, "suggest[s] the database that should be used for read operations"; so I can argue that if the database router suggests something, we should be following the suggestion, regardless of the reason behind the suggestion. So I think that the patch makes the behaviour cleaner and more conformant.

comment:7 by Antonis Christofides, 10 years ago

Owner: changed from nobody to Antonis Christofides
Status: newassigned

comment:8 by jarshwah, 10 years ago

I think this patch should be included. I can't see any problems with including it, and it does solve a small problem.

The following is some code taken from a ReadonlyDatabaseRouter (my particular case includes checking specific app-labels also) that we use during database migrations. The vast majority of calls to get_for_model should be able to fetch from the database without requiring a create (I would imagine, correct me if I'm wrong).

def db_for_write(self, model, **hints):
        is_read_only = getattr(settings, 'SITE_READ_ONLY', False)
        if is_read_only:
            raise DatabaseWriteDenied
        return None

This code will fail currently if get_for_model is called and there is an instance available to the database. Admittedly it'd be a pretty obscure edge case, but this read only router was adapted from something else I found online so there are definitely 2 users out there that this could possibly accommodate.

comment:9 by Loic Bistuer <loic.bistuer@…>, 10 years ago

Resolution: fixed
Status: assignedclosed

In 62f9508ade5233fc7bf8af5b3afcd2f5b57d8288:

Fixed #20401 -- ContentTypeManager.get_for_model reads from db_for_read.

Thanks Simon Charette and Tim Graham for the reviews.

Note: See TracTickets for help on using tickets.
Back to Top