Opened 5 years ago

Closed 3 weeks ago

#30224 closed Bug (wontfix)

Mysql Datetime value is string instead of datetime object

Reported by: Martin Kuhn Owned by:
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: mysql
Cc: Simon Charette, Sarah Boyce Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Martin Kuhn)

The Problem
Datetimes are sometimes returned as objects or strings. It depends on the value. E.g. '2019-03-01 00:00:00.000000' is returned as a string, while2018-12-12 22:02:53.134000 is provided as a datetime object in the sql/compiler.py.

My Table

CREATE TABLE myTable (

id char(32) COLLATE utf8mb4_unicode_ci NOT NULL,
created_on datetime(6) NOT NULL,
modified_on datetime(6) NOT NULL,
first_spot_broadcast datetime(6) DEFAULT NULL,
expiry_date datetime(6) DEFAULT NULL

)

DB driver: django.db.backends.mysql
Django version: 2.0.4 tried 2.2 as well
DRF 3.8.2
mysqlclient = "==1.4.2"

This issue only occurred for me when I execute this query. I tried upgrading to 2.2 and the issue persisted.

        .annotate(
        minExpiryDate=Min('buyout__expiry_date', filter=Q(buyout__is_archived=False, buyout__is_deleted=False))) \
        .filter(Q(minExpiryDate__gte=(now + expiryInterval))
                & Q(minExpiryDate__lte=(now + expiryInterval + timedelta(days=1))))

My stack trace is:

Traceback (most recent call last):
  File "/usr/share/pycharm/helpers/pydev/pydevd.py", line 1741, in <module>
    main()
  File "/usr/share/pycharm/helpers/pydev/pydevd.py", line 1735, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/usr/share/pycharm/helpers/pydev/pydevd.py", line 1135, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/usr/share/pycharm/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/server/manage.py", line 15, in <module>
    execute_from_command_line(sys.argv)
  File "site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
    utility.execute()
  File "site-packages/django/core/management/__init__.py", line 375, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "site-packages/django/core/management/base.py", line 316, in run_from_argv
    self.execute(*args, **cmd_options)
  File "site-packages/django/core/management/base.py", line 353, in execute
    output = self.handle(*args, **options)
  File "/server/buyout/management/commands/send_expiry_reminder_email.py", line 8, in handle
    send_expiry_reminder_summary()
  File "site-packages/celery/local.py", line 191, in __call__
    return self._get_current_object()(*a, **kw)
  File "site-packages/celery/app/task.py", line 375, in __call__
    return self.run(*args, **kwargs)
  File "/server/buyout/tasks/expiry_reminder.py", line 46, in send_expiry_reminder_summary
    if len(projects_expiring) > 0:
  File "site-packages/django/db/models/query.py", line 250, in __len__
    self._fetch_all()
  File "site-packages/django/db/models/query.py", line 1186, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "site-packages/django/db/models/query.py", line 63, in __iter__
    for row in compiler.results_iter(results):
  File "site-packages/django/db/models/sql/compiler.py", line 1009, in apply_converters
    value = converter(value, expression, connection)
  File "site-packages/django/db/backends/mysql/operations.py", line 252, in convert_datetimefield_value
    value = timezone.make_aware(value, self.connection.timezone)
  File "site-packages/django/utils/timezone.py", line 264, in make_aware
    return timezone.localize(value, is_dst=is_dst)
  File "site-packages/pytz/__init__.py", line 222, in localize
    if dt.tzinfo is not None:
AttributeError: 'str' object has no attribute 'tzinfo'
Exception ignored in: <generator object cursor_iter at 0x7f7ab3f88390>
Traceback (most recent call last):
  File "site-packages/django/db/models/sql/compiler.py", line 1469, in cursor_iter
    cursor.close()
  File "site-packages/MySQLdb/cursors.py", line 86, in close
    while self.nextset():
  File "site-packages/MySQLdb/cursors.py", line 136, in nextset
    nr = db.next_result()
MySQLdb._exceptions.OperationalError: (2006, '')

Change History (27)

comment:1 by Martin Kuhn, 5 years ago

Description: modified (diff)

comment:2 by Martin Kuhn, 5 years ago

Component: UncategorizedDatabase layer (models, ORM)

comment:3 by Martin Kuhn, 5 years ago

Type: UncategorizedBug

comment:4 by Martin Kuhn, 5 years ago

Could it be related to this bug ?

https://sourceforge.net/p/mysql-python/bugs/325/

comment:5 by Simon Charette, 5 years ago

Resolution: needsinfo
Status: newclosed

Hello Martin,

There's effectively something broken here but nothing seems to prove Django is at fault; as you've noticed it looks like it's mysqlclient that returns str for some datetimes.

What I suggest you do to confirm the origin of the bug is to use mysqlclient directly to connect to your database and retrieve the problematic rows. If the returned data contains string for datetimes columns then it's likely a mysqlclient bug and there's not much Django can do.

I'll close the ticket for now but please reopen if you can provide more details about how Django is at fault.

comment:6 by Martin Kuhn, 5 years ago

Ok, I managed to narrow down the problem. My Query is using annotations to use the aggregation function MIN.

Project.objects \
        .prefetch_related('buyout_set') \
        .prefetch_related('teammember_set') \
        .annotate(min_expiry_date=Min('buyout__expiry_date',
                                      filter=Q(buyout__is_archived=False, buyout__is_deleted=False))
                  ) \
        .filter(Q(min_expiry_date__gte=(now + expiryInterval))
                & Q(min_expiry_date__lte=(now + expiryInterval + timedelta(days=1))))

I tested the generated query below using mysqlclient and found that it does return the annotated value as a string. The problem is a custom annotation "MIN(Date)". Probably mysqlclient has no type info for this custom value, since it could be any/multiple types. (E.g. using an IF statement.)
However this auto generated query could be rewritten to use CAST( "annotation" as DateTime).

The resulting query is


SELECT 
    `advertisement_project`.`id`,
    `advertisement_project`.`created_on`,
    `advertisement_project`.`modified_on`,
    `advertisement_project`.`is_deleted`,
    `advertisement_project`.`created_by_id`,
    `advertisement_project`.`modified_by_id`,
    `advertisement_project`.`name`,
    `advertisement_project`.`code`,
    `advertisement_project`.`spot_length`,
    `advertisement_project`.`eta`,
    `advertisement_project`.`spot_id`,
    `advertisement_project`.`spot_id_suffix`,
    `advertisement_project`.`folder_id`,
    `advertisement_project`.`system_id`,
    `advertisement_project`.`status`,
    `advertisement_project`.`type`,
    `advertisement_project`.`ad_type`,
    `advertisement_project`.`parent_project_id`,
    `advertisement_project`.`original_project_id`,
    `advertisement_project`.`territory_id`,
    `advertisement_project`.`language_id`,
    `advertisement_project`.`campaign_id`,
    `advertisement_project`.`preview_id`,
    `advertisement_project`.`links_id`,
    `advertisement_project`.`producer_names`,
    MIN(CASE
        WHEN
            (`buyout_buyout`.`is_archived` = 0
                AND `buyout_buyout`.`is_deleted` = 0)
        THEN
            `buyout_buyout`.`expiry_date`
        ELSE NULL
    END) AS `minExpiryDate`
FROM
    `advertisement_project`
        LEFT OUTER JOIN
    `buyout_buyout` ON (`advertisement_project`.`id` = `buyout_buyout`.`project_id`)
GROUP BY `advertisement_project`.`id`
HAVING (MIN(CASE
    WHEN
        (`buyout_buyout`.`is_archived` = 0
            AND `buyout_buyout`.`is_deleted` = 0)
    THEN
        `buyout_buyout`.`expiry_date`
    ELSE NULL
END) >= '2019-03-01 00:00:00'
    AND MIN(CASE
    WHEN
        (`buyout_buyout`.`is_archived` = 0
            AND `buyout_buyout`.`is_deleted` = 0)
    THEN
        `buyout_buyout`.`expiry_date`
    ELSE NULL
END) <= '2019-03-02 00:00:00')
ORDER BY NULL;

comment:7 by Martin Kuhn, 5 years ago

I think the most straightforward and graciously solution is to patch Django's django/db/backends/mysql/operations.py convert_datetimefield_value function. It should simply check for type string and try to convert it manually.

comment:8 by Martin Kuhn, 5 years ago

Resolution: needsinfo
Status: closednew

comment:9 by Simon Charette, 5 years ago

Triage Stage: UnreviewedAccepted

Thanks for the investigation Martin, it's greatly appreciated.

I agree that the most straightforward solution for the datetime would be to adapt convert_datetimefield_value but it'd be great to figure out if other types are affected as well (e.g. decimal, floats, ...) so we can figure out if the solution needs to live at the Aggregate(filter).as_mysql level as well.

Could you possible try to figure if it's the case?

I also wonder if it could be possible to let mysqlclient know what the expected type is for a specific cursor instead though. We have this information in the compiler (through .output_field) so if we can provide per-cursor converter hints I assume it would be even more efficient because it would be performed in C. Another option is figuring out if mysqlclient can somehow figure out how to introspect such return value on it's own (e.g. maybe MySQL returns a different field type in this case) but in all cases we'll have to ship a Django workaround anyway.

comment:11 by Martin Kuhn, 5 years ago

The problem seems to be Aggregation functions have a beef with Datetime in general. They convert them into a string. :(
So truly the best workaround would be to Wrap these functions with a CAST operation.

c.execute(""" SELECT MIN(CAST('2018-01-01 00:00:00' AS datetime)); """)
1
c.fetchall()[0][0]
'2018-01-01 00:00:00'

By the way mysqlclient's response was "The answer is noooo!" Defiantly not their problem.

We could ask Mysql to fix it, but the same seems to be the case for mariaDB. I experimented with different MySql flavours and versions, before opening this bug. That means a improvement on the server would probably be around 2055.

comment:12 by Simon Charette, 5 years ago

Version: 2.2master

Ah that's too bad. I guess we'll have to fix it on the Django side in this case.

Does it only happen for Min and Max functions? If it's the case I'm tempted to suggest we address the issue by implementing as_mysql methods on these two functions that wrap the inner expression with a CAST like you did here when self.output_field.internal_type() == 'DateTimeField. The easiest way is probably to provide a different template to the super functioné

Here's an example of how this was done for a duration handling of Avg on MySQL

Happy to review your PR if you're interested in submitting one.

comment:13 by Martin Kuhn, 5 years ago

Yes, I can take a loop in my free time. I just need to make the following correction.
Seems like this is only a problem for MySql and Percona. Looks like mariadb is working fine. I just checked against their corresponding latest versions.

comment:14 by Rohit Jha, 4 years ago

Owner: changed from nobody to Rohit Jha
Status: newassigned

comment:15 by Rohit Jha, 4 years ago

I am trying to recreate this issue, I have written a (test case)https://github.com/django/django/compare/master...rohitjha941:ticket_30224 which generates the following query:

SELECT "expressions_experiment"."id", 
       "expressions_experiment"."name", 
       "expressions_experiment"."assigned", 
       "expressions_experiment"."completed", 
       "expressions_experiment"."estimated_time", 
       "expressions_experiment"."start", 
       "expressions_experiment"."end", 
       Min(CASE 
             WHEN "expressions_result"."experiment_id" = 1 THEN 
             "expressions_result"."result_time" 
             ELSE NULL 
           END) AS "min_expiry_date" 
FROM   "expressions_experiment" 
       LEFT OUTER JOIN "expressions_result" 
                    ON ( "expressions_experiment"."id" = 
                         "expressions_result"."experiment_id" ) 
GROUP  BY "expressions_experiment"."id", 
          "expressions_experiment"."name", 
          "expressions_experiment"."assigned", 
          "expressions_experiment"."completed", 
          "expressions_experiment"."estimated_time", 
          "expressions_experiment"."start", 
          "expressions_experiment"."end" 

and output is

  [{'id': 1, 'name': 'Rohit', 'assigned': datetime.date(2020, 3, 1), 'completed': datetime.date(2020, 3, 21), 'estimated_time': datetime.timedelta(days=20, seconds=36000), 'start': datetime.datetime(2020, 3, 1, 11, 23, 36, 34995), 'end': datetime.datetime(2020, 3, 21, 21, 23, 36, 34998), 'min_expiry_date': datetime.datetime(2020, 4, 20, 21, 23, 36, 35778)}] 

comment:16 by Sergey Fedoseev, 4 years ago

Cc: Sergey Fedoseev added

comment:17 by Tobias Krönke, 22 months ago

Could we proceed with this issue with this very minimalistic failing example?

Failing with AttributeError

>>> SomeModel.objects.annotate(bla=Value(timezone.now(), models.DateTimeField())).first().bla
AttributeError: 'str' object has no attribute 'utcoffset'

Working with Cast

>>> SomeModel.objects.annotate(bla=Cast(Value(timezone.now(), models.DateTimeField()), models.DateTimeField())).first().bla
datetime.datetime(2022, 6, 21, 8, 25, 39, 678166, tzinfo=datetime.timezone.utc)

Are we asking everyone to always wrap all kinds of expressions involving timestamps with Cast? There is no such hint in https://docs.djangoproject.com/en/4.0/ref/models/expressions/#value-expressions. We use Value(<some datetime>) a lot as fallbacks in When, Coalesce, Least, ... expressions.

comment:18 by Sergey Fedoseev, 22 months ago

Cc: Sergey Fedoseev removed

comment:19 by aleaforny, 12 months ago

I confirm that this issue is related to mysql, in my case with version 5.7

Upgrading to MariaDB solved this issue. However, it can be tricky to do this kind of upgrade, so I'll give my feedback if it can help others that struggle to proceed to the upgrade.

Disclaimer/Environment

First and foremost, please take into consideration the following environment/context as I have NOT tested this process for other OS/version, so if you're not under this environment, you should proceed CAREFULLY and double-check steps!

  • CentOS 7.5
  • MySQL version 5.7
  • Django 2.2.28
  • Python 3.6.9
  • pip 21.0.1

Prerequisites

I suggest you first to put your server into maintenance. It depends on your WSGI application/webserver, but the goal here is to prevent any inbound HTTP connection to your server. In my case, I just configured a maintenance view from the NGINX server (we have a reverse proxy) in order to let the users know, and we just stopped all redirections to our gunicorn instances.

Then :

  1. Stop mysql service
    systemctl stop mysql
    
  1. Take a backup of the databases (it will save all of your Django applications/projects if you have several). PS: For the sake of the example, I've stored it on /root but feel free to store anywhere you want, and also care to specify the root user of your mysql instance to get the dump.
    cp -apf /var/lib/mysql /var/lib/mysql_backup
    cd /root/
    mysqldump --all-databases -u <your_user_goes_here> -p > all_databases.sql
    
  1. Finally, remove all mysql packages
    yum remove mysql-server mysql-client mysql-community-*
    

Install MariaDB server 10.4

  1. You need to use the MariaDB official repositories for YUM
    yum install wget
    wget https://downloads.mariadb.com/MariaDB/mariadb_repo_setup
    chmod +x mariadb_repo_setup
    sudo ./mariadb_repo_setup
    
  1. Before you can install MariaDB, you may need to cleanup your ius and epel releases repos in your server (in my case, I had to, otherwise it would generate errors during installing MariaDB). If you feel you can skip this step, go ahead, you can go back to this step if needed.
    yum erase ius-release
    rm /etc/yum.repos.d/ius*.repo*
    yum install https://repo.ius.io/ius-release-el7.rpm https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
    
  1. Then, install all the packages required for Django to work with MariaDB (MariaDB-devel was required in my case, otherwise some libraries were missing for Django to work)
    yum install MariaDB-server MariaDB-client MariaDB-devel
    
  1. Start the MariaDB services afterwards
    systemctl start mariadb
    systemctl enable mariadb
    

Proceed to migration

Welcome to the tricky steps! Good luck...

  1. You should start by using the MariaDB's command to upgrade the database. If you get no errors, good for you.. :)
    mariadb-upgrade -u <your_db_user> -p
    
  1. If errors appeared at some steps, you can always try to add the --force flag as per suggested by MariaDB:
    mariadb-upgrade -u <your_db_user> -p --force
    
  1. If you had no luck with some tables using JSON data type, you may encounter Unknown data type "MYSQL_JSON" errors when selecting data in your DB. Then you need to manually fix all the tables with that JSON type columns. First, you need to stop the MariaDB instance
    systemctl stop mariadb
    
  1. Then, open a new mysql instance with the following settings (in my case, datadir was /var/lib/mysql/ and plugin-dir was /usr/lib64/mysql/plugin)
    mysqld --no-defaults --datadir=<Your data dir directory> --lc-messages_dir=./share --plugin-dir=<path to directory containing type_mysql_json.so> --plugin-maturity=alpha -u <your_sql_user>
    
  1. On another terminal, you can then install the mysql_json plugin and fix the tables with
    install soname 'type_mysql_json';
    ALTER TABLE `database_name`.`table_name` FORCE;  
    
  1. You can also use this command if you have lots of tables with JSON type columns
    SELECT CONCAT("ALTER TABLE `", TABLE_SCHEMA,"`.`", TABLE_NAME, "` FORCE;") AS MySQLCMD FROM TABLES WHERE TABLE_SCHEMA = "<YOUR_DATABASE_NAME>";
    
  1. If this, still, does not fix all of your SELECT commands in your DB, you can also try to retrieve the dump directly:
    mysql -u <your_db_user> -p < all_databases.sql
    
  1. Don't forget to restart the MariaDB instance, if it was shutdown :)
    systemctl start mariadb
    

Testing & troubleshooting

You should deeply test requests (read, create, update and delete) once everything seem to work again. You can, for instance:

  • Run random SELECT on several tables of your databases.
  • Start a Django instance
  • Dive deeply into your Django Admin, and do some CRUD operations from there

Here a few bugs that I've encountered after the migration, if it helps:

  • Error loading MySQLdb module. Did you install mysqlclient? when launching a Django instance

To solve this, I just had to reinstall mysql client from pip

pip install --force-reinstall --ignore-installed --no-binary :all: mysqlclient
  • EnvironmentError: mysql_config not found when launching a Django instance

Make sure that you have all the MariaDB yum required packages (especially the MariaDB-devel package)

  • For no apparent reasons, some of the tables didn't recovered from the upgrade (Celery-related issue)

I have several django-celery-beat instances running, and for some, the periodic tasks did not recovered from the upgrade. I simply had to manually restore the tables from Celery (days after migration), and here some useful tricks if you want to recover from a previous SQL dump without erasing all your current and accurate data from other tables.

  1. From your SQL server, create a temp restore user and assign permissions on the only tables you'd want to restore
    GRANT SELECT ON django_project.* TO 'admin_restore_temp'@'localhost'  IDENTIFIED BY 'its_pwd';
    GRANT ALL ON django_project.table1 TO 'admin_restore_temp'@'localhost';
    
  1. Restore the data (assuming you're getting a recover from /data/backup/db1.sql dump file)
    mysql --user admin_restore_temp --password --force < /data/backup/db1.sql
    
  1. Check that everything is okay, and don't forget to drop the temp user
    DROP USER admin_restore_temp;
    

Useful resources

Hope these instructions can help you to migrate safely from mysql to MariaDB in order to workaround this issue. Here's some of the resources I've used:

Good luck!

comment:20 by Weiyi Yang, 3 weeks ago

I experienced the exact same issue with Django 4.1.13, mysqlclient 2.2.4 and MySQL 5.7.44-48. It seems setting USE_TZ=False will bypass this issue but of course it creates other problems by losing time zone info. The issue can not be reproduced with MySQL 8.0.32-24.

Last edited 3 weeks ago by Weiyi Yang (previous) (diff)

comment:21 by Natalia Bidart, 3 weeks ago

Cc: Simon Charette added

For the record, Django supports MySQL 8.0.11 and higher.

Simon, do you think this issue is still valid (assuming that in fact is not reproduceable in MySQL 8 and above)?

comment:22 by Simon Charette, 3 weeks ago

If we've confirmed that the issue is only reproducible on versions on MySQL that Django no longer support then it should effectively be closed.

comment:23 by Natalia Bidart, 3 weeks ago

Cc: Sarah Boyce added

Adding Sarah as cc since we agreed she may use their MySQL setup to check on this.

comment:24 by Sarah Boyce, 3 weeks ago

As long as the PR covers the issue as you all understand it, I can confirm that this is issue is resolved for MySQL 8.0.22+ but still exists from 8.0.11-8.0.21.

In MySQL 8.0.21

FAIL [0.010s]: test_filtered_aggregate_datetime (aggregation.test_filter_argument.FilteredAggregateTests.test_filtered_aggregate_datetime)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tests/django/tests/aggregation/test_filter_argument.py", line 223, in test_filtered_aggregate_datetime
    self.assertEqual(
AssertionError: {'max_filtered_pubdate': '2008-06-23'} != {'max_filtered_pubdate': datetime.date(2008, 6, 23)}
- {'max_filtered_pubdate': '2008-06-23'}
+ {'max_filtered_pubdate': datetime.date(2008, 6, 23)}

but from MySQL 8.0.22+

.
----------------------------------------------------------------------
Ran 1 test in 0.023s

OK

(happy days)

comment:25 by Simon Charette, 3 weeks ago

Thanks for the sleuthing Sarah!

In this case I think we should either keep this ticket open, as MySQL 8.0 EOL in April 2025, in hope that someone will take a look at it and provide a not too invasive patch that can make the cut for the 5.1 release in August as this won't qualify for 5.2.

Given how long this ticket has been opened and that upgrading to MySQL 8.0.22+ (released 4 years ago) should be relatively trivial if you're on 8.0.11+ already I think we should simply wont-fix this one.

comment:26 by Natalia Bidart, 3 weeks ago

Keywords: mysql added; Mysql removed
Owner: Rohit Jha removed
Status: assignednew
Triage Stage: AcceptedUnreviewed

I think it's best to take the second approach which is to wontfix this ticket since it's very unlikely that we could get a fix in before feature freeze.

comment:27 by Natalia Bidart, 3 weeks ago

Resolution: wontfix
Status: newclosed
Note: See TracTickets for help on using tickets.
Back to Top