Opened 9 years ago
Closed 9 years ago
#25399 closed Cleanup/optimization (invalid)
Improve performance for user lookups by adding database index
Reported by: | SimonSteinberger | Owned by: | nobody |
---|---|---|---|
Component: | contrib.auth | Version: | dev |
Severity: | Normal | Keywords: | user, database, index |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
With a user table holding millions of rows, even a simple lookup by username (or email address) takes a rather long time and puts significant load on the servers. For example, the database query Django issues when a user signs in looks roughly like this:
SELECT * FROM "accounts_user" WHERE UPPER("accounts_user"."email"::text) = UPPER('foo@bar.com')
The problem is that, without a matching database index, the string transformation and comparison must be performed for every row. Even on powerful infrastructure, such a query may take over a second with a million or more users, and it gets slower and slower as the user base grows.
By adding a database index, the process takes only about a microsecond. Using PostgreSQL, we've simply added the index manually, which works fine:
CREATE INDEX accounts_user_username_upper ON accounts_user (UPPER(username));
Yet, this is a crucial part of most web applications and therefore, it would probably make sense if Django created the index automatically for the integrated User and AbstractUser classes. Or at least, the issue could be pointed out clearly in the docs.
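The effect of an expression index like the one above can be sketched without a PostgreSQL server. The following is a minimal illustration, not the reporter's setup: it swaps in SQLite (via Python's standard `sqlite3` module) as a stand-in, the table layout and the `query_plan` helper are assumptions made for the example, and the planner output on PostgreSQL will look different.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for `sql` as a single string (hypothetical helper)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each plan row is (id, parent, notused, detail); only the detail text is kept.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")

# Assumed, simplified table layout; a real user table has many more columns.
conn.execute(
    "CREATE TABLE accounts_user ("
    "id INTEGER PRIMARY KEY, username TEXT NOT NULL, email TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO accounts_user (username, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)

# Case-insensitive lookup in the same shape as the query in the description.
lookup = "SELECT * FROM accounts_user WHERE UPPER(username) = UPPER(?)"

# Without an expression index, every row has to be visited and transformed.
plan_before = query_plan(conn, lookup, ("User42",))

# Same statement as the manual index from the description.
conn.execute(
    "CREATE INDEX accounts_user_username_upper ON accounts_user (UPPER(username))"
)

# With the index in place, the planner can search the index instead.
plan_after = query_plan(conn, lookup, ("User42",))

print("before:", plan_before)
print("after: ", plan_after)

# The lookup still matches regardless of the case of the input.
row = conn.execute(lookup, ("User42",)).fetchone()
print(row)
```

The index only helps when the expression in the `WHERE` clause matches the indexed expression, which is why an index on `UPPER(username)` would not speed up a lookup on `UPPER(email)`; that case would need its own index.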
Change History (2)
comment:1 by , 9 years ago
Description: | modified (diff) |
---|
comment:2 by , 9 years ago
Description: | modified (diff) |
---|---|
Resolution: | → invalid |
Status: | new → closed |
Version: | 1.8 → master |
The two examples you give are invalid: both emails and usernames are case sensitive, either by their standard or by Django's default implementation.
Furthermore, Django already has a unique constraint on the username, which is the primary identifying data point after the user's id. I don't think adding an index on the email makes much sense for Django's default implementation. If email is your primary field to authenticate a user against, you should already make sure the value is unique, which implies having an index.
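The point that a unique constraint implies an index can be checked with a small sketch. As an assumption for illustration only, it again uses SQLite through Python's `sqlite3` module rather than PostgreSQL, with a simplified table; the automatically created index name shown is specific to SQLite.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Assumed, simplified table: declaring the column UNIQUE is the only index-related step.
db.execute(
    "CREATE TABLE accounts_user (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
)

# List the indexes on the table; rows are (seq, name, unique, origin, partial).
unique_index_names = [
    row[1] for row in db.execute("PRAGMA index_list(accounts_user)")
]
print(unique_index_names)

# An exact-match (case-sensitive) lookup can use the index backing the constraint.
exact_plan = " | ".join(
    row[3]
    for row in db.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM accounts_user WHERE email = ?",
        ("foo@bar.com",),
    )
)
print(exact_plan)
```

This covers exact-match lookups on the unique column, which is the case the comment describes.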