#17266 closed Cleanup/optimization (fixed)
psycopg2 backend should use get_parameter_status to check if SET TIME ZONE is needed
Reported by: | Anssi Kääriäinen | Owned by: | Aymeric Augustin |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
In ticket #17062 Aymeric Augustin suggested the idea for using get_parameter_status to check if the database default time zone is already correct. If it is, we can skip the SET TIME ZONE and associated isolation level handling for every new connection. Ticket #17062 has more details, but I will copy-paste the most relevant parts here. If this is not done, for every new connection we need to do these queries:
26639: LOG: connection authorized: user=akaj database=testdb1 26639: LOG: statement: SHOW default_transaction_isolation 26639: LOG: statement: SET default_transaction_isolation TO DEFAULT 26639: LOG: statement: SET TIME ZONE 'America/Chicago' 26639: LOG: statement: SET default_transaction_isolation TO 'read uncommitted' 26639: LOG: statement: BEGIN 26639: LOG: statement: first real query...
If we use get_parameter_status, we can get it down to this:
27163: LOG: connection authorized: user=akaj database=testdb1 27163: LOG: statement: SHOW default_transaction_isolation 27163: LOG: statement: BEGIN 27163: LOG: statement: first real query
The get_parameter_status was added in psycopg2 version 2.0.12 http://initd.org/psycopg/docs/connection.html#connection.get_parameter_status. I can't find a mention if we need to support versions prior to this. If we need to support versions prior to 2.0.12, we can just add a check for the psycopg2 version in use and not use get_parameter_status if the version doesn't support it.
I will wait until ticket #17062 gets fixed before working on this.
Attachments (2)
Change History (13)
comment:1 by , 13 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 13 years ago
Has patch: | set |
---|---|
Type: | Uncategorized → Cleanup/optimization |
by , 13 years ago
Attachment: | 17266.diff added |
---|
comment:3 by , 13 years ago
I have tested this on psycopg 2.0.1, which does not work at all with Django (unrelated to this ticket - expects port as an integer), psycopg2 2.0.7 - works, psycopg2 2.0.11 - works, psycopg 2.3.2 works and finally the current psycopg 2.4.2 - works. Results in above post are with 2.4.2. In no case did the patch result in worse behavior. If get_parameter_status is not available, the patch doesn't change anything for new connections. If it is available, and time zone is correct for the connection, you can get rid of 3 non-necessary queries.
My opinion is that the patch is ready for committer. A summary for reviewer / committer:
- Please review the documentation changes with extra care. Writing documentation is not my strongest area...
- The patch will get rid of all non-necessary queries for new connections when settings for the connection are correct (in postgresql.conf or for the user). The documentation mentions the necessary changes to connection settings.
- The patch should work with all supported psycopg2 versions.
- There are no tests, as there is no easy way to test what queries psycopg2 has actually executed. See above post for how to test this manually.
comment:4 by , 13 years ago
Owner: | changed from | to
---|
comment:5 by , 13 years ago
Patch needs improvement: | set |
---|
In the documentation, you suggest setting client_encoding
, timezone
, and default_transaction_isolation
. However, in the code, I only see changes for timezone
.
Did you intend to make similar changes for the two other parameters? Or did I misunderstand something?
comment:6 by , 13 years ago
The documentation talks about postgresql.conf. That is, setting those parameters is done in postgresql.conf, or for the user you use for taking the connection. That is, those are hints for things you should do to your PostgreSQL setup. If you do change those, you will get fewer queries (including a removed extra commit for timezone) in connection setup. The parameters are set up for each new connection, but late versions (at least 2.4.2) of psycopg are smart enough to skip the SET queries for client_encoding and transaction_isolation if they are already correct. And the included get_parameter_status change allows for skipping the SET timezone.
I don't know if it is a good idea to include this hint in the documentation, but I figured that it is better to mention it than keep it as hidden knowledge. Also, the documentation probably needs to be more clear. If you missed this, then so will others. Me need better writing skills :)
comment:7 by , 13 years ago
Thanks, I was just missing this piece of information:
late versions (at least 2.4.2) of psycopg are smart enough to skip the SET queries for client_encoding and transaction_isolation if they are already correct
I'm going to rework the patch slightly and commit it.
by , 13 years ago
Attachment: | 17266.2.diff added |
---|
comment:8 by , 13 years ago
Patch needs improvement: | unset |
---|---|
Triage Stage: | Accepted → Ready for checkin |
Attached patch is probably good to go.
Since we don't have control over PostgreSQL's configuration in the test suite, there are no tests. I tested manually and reproduced Anssi's results.
comment:9 by , 13 years ago
Code looks good, and the documentation is much better than my version. Too bad there isn't any way to test the generated queries. Would it make sense to add a reference to this ticket in the code as a guard against future regressions? On the other hand, that part of the code begins to look like Trac ticket listing soon :)
Provided is a patch that does use get_parameter_status to sniff the connection's default time zone. I did a little testing of this. Queries generated for a new connection with patch (using psycopg2 version 2.4.2):
Without patch:
The precondition for this is that the connections default time zone is actually the wanted timezone (settings TIME_ZONE if not USE_TZ, else UTC). Attached patch contains some documentation mentioning that it is a good idea to have the database or user defaults as mentioned above. Also, I think you will need psycopg2 version 2.4.2 or later (release notes) or you will get more queries than in the examples posted above. The patch should work with all versions of psycopg2, though.
The speed difference for making single query and then disconnecting is about 10% on a localhost database. It would be likely more if the database was on another machine, as there is no TCP/IP overhead for localhost connections. The test was:
So, there is some gain from this.
I have guarded against psycopg2 versions not having get_parameter_status in the patch (versions prior to 2.0.12).
I don't see any easy way to test if you actually get less queries - psycopg2 will not report all the internal queries made. Manually it is easy to test: just set log_statement to 'all' and log_connections to on in postgresql.conf (or for the db user) and see PostgreSQL's log to check what queries are made.
I haven't build the documentation, dunno if it really works. The reviewer/committer might want to check the language used in the documentation, my English isn't too good. Also, I haven't tested this on any other psycopg version than 2.4.2.