Opened 9 years ago

Closed 9 years ago

#25399 closed Cleanup/optimization (invalid)

Improve performance for user lookups by adding database index

Reported by: SimonSteinberger
Owned by: nobody
Component: contrib.auth
Version: dev
Severity: Normal
Keywords: user, database, index
Cc:
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Markus Holtermann)

With a user table containing millions of rows, even a simple lookup by username (or email address) takes a rather long time and puts significant load on the servers. For example, the query Django issues when a user signs in looks roughly like this:

SELECT * FROM "accounts_user" WHERE UPPER("accounts_user"."email"::text) = UPPER('foo@bar.com')

The problem is that, without a matching database index, the string transformation and comparison consume a lot of resources. Even on powerful infrastructure, such a query may take over a second with a million or more users, and it gets slower and slower as the user base grows.

With a database index in place, the lookup takes only about a microsecond. On PostgreSQL, we simply added the index manually, which works fine:

CREATE INDEX accounts_user_username_upper ON accounts_user (UPPER(username));
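To illustrate the effect of such an expression index, here is a minimal sketch that uses Python's built-in sqlite3 module instead of PostgreSQL (an assumption made only so the example is self-contained; it needs SQLite 3.9 or newer, and PostgreSQL's planner and timings will differ). The table layout and the helper query_plan are hypothetical. The sketch compares the query plan for a case-insensitive username lookup before and after creating the index.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the last column of each row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts_user ("
    "id INTEGER PRIMARY KEY, username TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO accounts_user (username, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)

# Case-insensitive lookup, shaped like the query in the description.
lookup = "SELECT id FROM accounts_user WHERE UPPER(username) = UPPER(?)"

# Without an index on UPPER(username), the whole table has to be scanned.
plan_before = query_plan(conn, lookup, ("User42",))

# Same expression index as in the ticket, in SQLite syntax.
conn.execute(
    "CREATE INDEX accounts_user_username_upper "
    "ON accounts_user (UPPER(username))"
)

# The planner can now search the index instead of scanning the table.
plan_after = query_plan(conn, lookup, ("User42",))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan text varies between SQLite versions, but after the index is created the plan should mention accounts_user_username_upper. Note that the index only helps when the WHERE clause uses the same expression as the index definition.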

Since user lookup is a crucial part of most web applications, it would probably make sense for Django to create the index automatically for the built-in User and AbstractUser classes. At the very least, the issue could be pointed out clearly in the docs.

Change History (2)

comment:1 by SimonSteinberger, 9 years ago

Description: modified (diff)

comment:2 by Markus Holtermann, 9 years ago

Description: modified (diff)
Resolution: invalid
Status: new → closed
Version: 1.8 → master

The two examples you give are invalid: both emails and usernames are case-sensitive, either by their standard or by Django's default implementation.

Furthermore, Django already has a unique constraint on the username, which is the primary identifying data point after the user's id. I don't think adding an index on the email makes much sense for Django's default implementation. If email is your primary field to authenticate a user against, you should already make sure the value is unique, which implies having an index.
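The point that a unique constraint implies an index can be checked with a small sketch. It again uses Python's sqlite3 module as a stand-in for the real database (an assumption; index names and plan output are backend-specific), with a hypothetical accounts_user table whose email column is declared UNIQUE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts_user ("
    "id INTEGER PRIMARY KEY, email TEXT UNIQUE)"
)

# PRAGMA index_list rows start with (seq, name, unique, ...).
index_rows = conn.execute("PRAGMA index_list('accounts_user')").fetchall()
index_names = [row[1] for row in index_rows]
unique_flags = [row[2] for row in index_rows]
print(index_names)

# An exact-match lookup on email can use the automatically created index.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM accounts_user WHERE email = ?",
    ("foo@bar.com",),
).fetchall()
plan = " | ".join(row[-1] for row in plan_rows)
print(plan)
```

In SQLite the automatically created index has a name starting with sqlite_autoindex_; other databases use their own naming schemes for the index that backs a unique constraint.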
