Opened 16 years ago

Closed 10 years ago

#7074 closed Bug (worksforme)

MySQL error/warning when 'gt' field lookup with a datetime field and fulltext search.

Reported by: anonymous Owned by: ANUBHAV JOSHI
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: qsrf-cleanup mysql, fulltext search, gt
Cc: cortland@…, anubhav9042@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Malcolm Tredinnick)

This query produces the following traceback:

Keyword.objects.filter(keyword__search=term, keyworddata__updated__gt=datetime.datetime.now(), keyworddata__source="1")
term = 'test'

In [28]: Keyword.objects.filter(keyword__search=term, keyworddata__updated__gt=datetime.datetime.now(), keyworddata__source="1"
).select_related()
Out[29]: ---------------------------------------------------------------------------
Warning                                   Traceback (most recent call last)

/home/lybp/dev/lybp/<ipython console> in <module>()

/usr/local/lib/python2.5/site-packages/ipython-0.8.2-py2.5.egg/IPython/Prompts.py in __call__(self, arg)
    533
    534             # and now call a possibly user-defined print mechanism
--> 535             manipulated_val = self.display(arg)
    536
    537             # user display hooks can change the variable to be stored in

/usr/local/lib/python2.5/site-packages/ipython-0.8.2-py2.5.egg/IPython/Prompts.py in _display(self, arg)
    559             return IPython.generics.result_display(arg)
    560         except TryNext:
--> 561             return self.shell.hooks.result_display(arg)
    562
    563     # Assign the default display method:

/usr/local/lib/python2.5/site-packages/ipython-0.8.2-py2.5.egg/IPython/hooks.py in __call__(self, *args, **kw)
    132             #print "prio",prio,"cmd",cmd #dbg
    133             try:
--> 134                 ret = cmd(*args, **kw)
    135                 return ret
    136             except ipapi.TryNext, exc:

/usr/local/lib/python2.5/site-packages/ipython-0.8.2-py2.5.egg/IPython/hooks.py in result_display(self, arg)
    160
    161     if self.rc.pprint:
--> 162         out = pformat(arg)
    163         if '\n' in out:
    164             # So that multi-line strings line up with the left column of

/usr/local/lib/python2.5/pprint.py in pformat(self, object)
    109     def pformat(self, object):
    110         sio = _StringIO()
--> 111         self._format(object, sio, 0, 0, {}, 0)
    112         return sio.getvalue()
    113
/usr/local/lib/python2.5/pprint.py in _format(self, object, stream, indent, allowance, context, level)
    127             self._readable = False
    128             return
--> 129         rep = self._repr(object, context, level - 1)
    130         typ = _type(object)
    131         sepLines = _len(rep) > (self._width - 1 - indent - allowance)

/usr/local/lib/python2.5/pprint.py in _repr(self, object, context, level)
    193     def _repr(self, object, context, level):
    194         repr, readable, recursive = self.format(object, context.copy(),
--> 195                                                 self._depth, level)
    196         if not readable:
    197             self._readable = False

/usr/local/lib/python2.5/pprint.py in format(self, object, context, maxlevels, level)
    205         and whether the object represents a recursive construct.
    206         """
--> 207         return _safe_repr(object, context, maxlevels, level)
    208
    209

/usr/local/lib/python2.5/pprint.py in _safe_repr(object, context, maxlevels, level)
    290         return format % _commajoin(components), readable, recursive
    291
--> 292     rep = repr(object)
    293     return rep, (rep and not rep.startswith('<')), False
    294

/usr/local/lib/python2.5/site-packages/django/db/models/query.py in __repr__(self)
    106
    107     def __repr__(self):
--> 108         return repr(self._get_data())
    109
    110     def __len__(self):

/usr/local/lib/python2.5/site-packages/django/db/models/query.py in _get_data(self)
    484     def _get_data(self):
    485         if self._result_cache is None:
--> 486             self._result_cache = list(self.iterator())
    487         return self._result_cache
    488
/usr/local/lib/python2.5/site-packages/django/db/models/query.py in iterator(self)
    187
    188         cursor = connection.cursor()
--> 189         cursor.execute("SELECT " + (self._distinct and "DISTINCT " or "") + ",".join(select) + sql, params)
    190
    191         fill_cache = self._select_related

/usr/local/lib/python2.5/site-packages/django/db/backends/util.py in execute(self, sql, params)
     16         start = time()
     17         try:
---> 18             return self.cursor.execute(sql, params)
     19         finally:
     20             stop = time()

/home/lybp/dev/lybp/build/bdist.linux-i686/egg/MySQLdb/cursors.py in execute(self, query, args)

/home/lybp/dev/lybp/build/bdist.linux-i686/egg/MySQLdb/cursors.py in _warning_check(self)

/usr/local/lib/python2.5/warnings.py in warn(message, category, stacklevel)
     60     registry = globals.setdefault("__warningregistry__", {})
     61     warn_explicit(message, category, filename, lineno, module, registry,
---> 62                   globals)
     63
     64 def warn_explicit(message, category, filename, lineno,

/usr/local/lib/python2.5/warnings.py in warn_explicit(message, category, filename, lineno, module, registry, module_globals)
    100
    101     if action == "error":
--> 102         raise message
    103     # Other actions
    104     if action == "once":

Warning: Truncated incorrect DOUBLE value: '2008-04-23 14:39:36.133203'

In [30]: Keyword.objects.filter(keyword__search=term, keyworddata__updated=datetime.datetime.now(), keyworddata__source="1"
).select_related()
Out[31]: []
In [12]: Keyword.objects.filter(keyword=term, keyworddata__updated__gt=datetime.datetime.now(), keyworddata__source="1"
).select_related()
Out[13]: []

If I change keyword__search= to keyword=, I don't get the error. Also if I use keyworddata__updated= instead of keyworddata__updated__gt I don't get the error.

KeywordData is a model with a foreign key pointing to keyword - so I guess it's a reverse lookup.

I'm using MySQL 5.0.45.

Change History (23)

comment:1 by Erik Wickstrom <erik@…>, 16 years ago

It looks to me like the date isn't being quoted - or at least that's what connection.queries says...

comment:2 by Malcolm Tredinnick, 16 years ago

Description: modified (diff)

Fixed description.

comment:3 by George Vilches, 16 years ago

Keywords: qsrf-cleanup added

comment:4 by Jacob, 16 years ago

milestone: 1.0

comment:5 by Malcolm Tredinnick, 16 years ago

This seems to be trying to report two bugs in the one report, which is a no-no. Let's have one ticket about the __gt comparison and one about the full text searching, each with a small model (as small as possible) demonstrating the problem.

Note that Erik's comment isn't the source of the problem, since connection.queries shows the query prior to it being passed to the Python DB wrapper, which is where any quoting takes place.

This needs confirmation and a simpler example and I'm in a triage mode at the moment, so I'm not going to do that right now. If somebody else has time and inclination to reproduce this, it would be appreciated.

comment:6 by Karen Tracey <kmtracey@…>, 16 years ago

I think it's only one (MySQL) bug, some interaction with fulltext search and greater than (or gte) comparison involving dates. I can recreate in just plain mysql, but if I change the gte to lte or the fulltext search to a like search, the warning goes away.

mysql> select count(Clue) from Clues natural join Puzzles where match (Clue) against ('Marceau') and `Date` >= '2008-06-26'; 
+-------------+
| count(Clue) |
+-------------+
|           1 |
+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+------------------------------------------------+
| Level   | Code | Message                                        |
+---------+------+------------------------------------------------+
| Warning | 1292 | Truncated incorrect DOUBLE value: '2008-06-26' |
+---------+------+------------------------------------------------+
1 row in set (0.00 sec)

mysql> select count(Clue) from Clues natural join Puzzles where match (Clue) against ('Marceau') and `Date` <= '2008-06-26';
+-------------+
| count(Clue) |
+-------------+
|          23 |
+-------------+
1 row in set (0.01 sec)

mysql> select count(Clue) from Clues natural join Puzzles where Clue like '%Marceau%' and `Date` >= '2008-06-26';   
+-------------+
| count(Clue) |
+-------------+
|           1 |
+-------------+
1 row in set (0.00 sec)

That's just bizarre, so I searched for MySQL bugs involving this message and found this:

http://bugs.mysql.com/bug.php?id=34374

which looks like a pretty good match. They've verified it's a bug but have no fix yet. They list a workaround to use CAST(date_val AS DATE)...somehow I'm thinking that wouldn't be trivial for the ORM to start doing?

comment:7 by Malcolm Tredinnick, 16 years ago

Owner: changed from nobody to Malcolm Tredinnick
Triage Stage: UnreviewedAccepted

Impressive debugging (and searching), again, Karen. Thanks. But... good grief! :-(

Since we've also got integers needing to be cast to text for PostgreSQL 8.3 in another ticket, it looks like it's time to crank out some general code to explicitly cast to the right type in a bunch of cases. Maybe it's not too horrible, since we really only convert Python values to SQL lookup values in one place. I'll look into it.

comment:8 by Malcolm Tredinnick, 16 years ago

Owner: Malcolm Tredinnick removed

There's no point this staying on my list at the moment.. I have a bunch of other stuff I'm doing first. So reassigning back to the pool in case anybody else wants to pick it up. I'm hoping it's possible to do something similar to [8242] in this case.

comment:9 by Karen Tracey <kmtracey@…>, 16 years ago

I took a look at this. The cast needs to be added on the right-hand-side, not the left as is done by the use of lookup_cast in [8242]. There doesn't seem to be any general backend hook for adding casting sql to the right-hand-side.

What there is is datetime_cast_sql (currently only implemented by the Oracle backend) that is called from source:django/trunk/django/db/models/sql/where.py in make_atom in the case where value_annot is datetime.datetime:

        if value_annot is datetime.datetime:
            cast_sql = connection.ops.datetime_cast_sql()
        else:
            cast_sql = '%s'

Implementing this for mysql as:

    def datetime_cast_sql(self):
        return 'CAST(%s AS DATETIME)'

will actually fix the problem exactly as reported, but not the identical problem that exists for just plain DATE fields, nor the case where the values are passed in as strings not date[time] objects. Keeping the above datetime_cast_sql, plus defining a new general right-hand-side casting hook and implementing it for mysql like so:

    def rhs_cast_sql(self, db_type):
        if db_type in ['date', 'datetime']:
            return 'CAST(%%s AS %s)' % db_type
        return '%s'

plus changing the code in where.py make_atom to call it:

        if value_annot is datetime.datetime:
            cast_sql = connection.ops.datetime_cast_sql()
        else 
            cast_sql = connection.ops.rhs_cast_sql(db_type)

works to fix the reported problem for both DATE and DATETIME fields (and covers the case where values are passed as strings, not date[time] objects).

But that puts the mysql fix in two different places and that if value_annot...else is a bit mysterious looking. I'd think it would be better to have a single general hook that covers both what the existing Oracle datetime_cast_sql function does and what's needed for mysql for this bug. Unfortunately I have zero knowledge of Oracle so I'm not sure what the value of db_type is for which the current datetime_cast_sql function is being called. (And value_annot is apparently not useful for datetime.date values -- it's set to 'True' for them? So value_annot can't be used in the general hook case.)

So, in summary I think a fix for this would be to implement a general right-hand-side casting hook for the database backends but doing that correctly requires some Oracle knowledge I don't have. I'll investigate a little more if anyone cares to provides clues for me and thinks this approach is worthwhile.

[Also, I don't know how to write a test for this. The MySQL bug only manifests itself when you've got a full-text index involved in the WHERE clause. So it would have to be a test that ran only when the backend was MySQL with the MyISAM storage engine, that created a full-text index (via custom sql?) and used it in a lookup in combination with a date/datetime gt/gte lookup. Not sure it's worth figuring out how to do all that in a test? Actually I'm not sure this problem is worth fixing in Django since it's really a MySQL bug, but as of today the MySQL bug is still open with no fix in sight other than the casting workaround, so it's still there for Django users to hit.]

comment:10 by Dan Watson, 16 years ago

In my patch for [8242], I initially looked into creating a cast function for values, in addition to the field cast function there currently. A couple notes:

  • The db_type values are defined per Field type in the backend's creation module (DATE and TIMESTAMP for Oracle).
  • I tried to implement a generic value casting method, whilst doing away with datetime_cast_sql. The problem is then you need at least the value's type to be passed through (as value_annot does now) to the cast method.
  • It seems like value_annot should be split to avoid it's current dual function of knowing when a value evaluates to True/False, and specifying the value's type for datetimes. However, that tuple that gets passed around is also used by GIS, so it's not completely internal.

So a generic rhs_cast_sql function for Oracle may look something like this:

def rhs_cast_sql(self, db_type, value_type):
    if db_type == 'TIMESTAMP' and value_type is datetime.datetime:
        return "TO_TIMESTAMP(%s, 'YYYY-MM-DD HH24:MI:SS.FF')"
    return "%s"

comment:11 by Jacob, 16 years ago

milestone: 1.0post-1.0

This needs to wait on a proper rhs-cast solution, which won't be in 1.0. So, for anyone reading this far down using 1.0: sorry -- try not to do this kind of lookup, and we'll fix this bug in the next release!

in reply to:  description comment:12 by pixelcort, 15 years ago

Cc: cortland@… added

I'm also expreiencing this issue when using djangosearch r21 with a filter on a resulting search; the filter has a gte on a datetime.now.

comment:13 by reun, 15 years ago

You can get around the bug by using .extra(). Instead of

task_set.filter(field__gt=time)

use

task_set.extra(where=['field > DATE(%s)'], params=[time]).

comment:14 by (none), 15 years ago

milestone: post-1.0

Milestone post-1.0 deleted

comment:15 by bluszcz, 14 years ago

I have just also experienced this problem, I am using software with below versions:

  • Django 1.2.1
  • MySQL 5.1.41
  • python-mysqldb 1.2.2

Full traceback below:

In [20]: MessageSearch.objects.using('search').all().count()
Out[20]: 826

In [21]: MessageSearch.objects.using('search').filter(search_full__search='newsy').filter(created__gt='2010-10-05').count()
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (215, 0))

ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (21, 0))

ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (70, 0))

---------------------------------------------------------------------------
Warning                                   Traceback (most recent call last)

/home/bluszcz/projekty/property/rynek_pierwotny/<ipython console> in <module>()

/home/bluszcz/projekty/property/rynek_pierwotny/django/db/models/query.pyc in count(self)
    324             return len(self._result_cache)
    325
--> 326         return self.query.get_count(using=self.db)
    327
    328     def get(self, *args, **kwargs):

/home/bluszcz/projekty/property/rynek_pierwotny/django/db/models/sql/query.pyc in get_count(self, using)
    392
    393         obj.add_count_column()
--> 394         number = obj.get_aggregation(using=using)[None]
    395
    396         # Apply offset and limit constraints manually, since using LIMIT/OFFSET


/home/bluszcz/projekty/property/rynek_pierwotny/django/db/models/sql/query.pyc in get_aggregation(self, using)
    364         query.related_select_fields = []
    365
--> 366         result = query.get_compiler(using).execute_sql(SINGLE)
    367         if result is None:
    368             result = [None for q in query.aggregate_select.items()]

/home/bluszcz/projekty/property/rynek_pierwotny/django/db/models/sql/compiler.pyc in execute_sql(self, result_type)
    725
    726         cursor = self.connection.cursor()
--> 727         cursor.execute(sql, params)
    728
    729         if not result_type:

/home/bluszcz/projekty/property/rynek_pierwotny/django/db/backends/util.pyc in execute(self, sql, params)
     13         start = time()
     14         try:
---> 15             return self.cursor.execute(sql, params)
     16         finally:
     17             stop = time()

/home/bluszcz/projekty/property/rynek_pierwotny/django/db/backends/mysql/base.pyc in execute(self, query, args)
     84     def execute(self, query, args=None):
     85         try:
---> 86             return self.cursor.execute(query, args)
     87         except Database.IntegrityError, e:
     88             raise utils.IntegrityError, utils.IntegrityError(*tuple(e)), sys.exc_info()[2]

/usr/lib/pymodules/python2.6/MySQLdb/cursors.pyc in execute(self, query, args)
    166             self.errorhandler(self, exc, value)
    167         self._executed = query
--> 168         if not self._defer_warnings: self._warning_check()
    169         return r
    170

/usr/lib/pymodules/python2.6/MySQLdb/cursors.pyc in _warning_check(self)
     80                     self.messages.append((self.Warning, w))
     81                 for w in warnings:
---> 82                     warn(w[-1], self.Warning, 3)
     83             elif self._info:
     84                 self.messages.append((self.Warning, self._info))

Warning: Truncated incorrect DOUBLE value: '2010-10-05 00:00:00'

In [22]:

Someone mentioned above, that it could related with MySQL bug: http://bugs.mysql.com/bug.php?id=34374

and looks like that it has been fixed there for versions 5.5+. Anyone maybe checked this?

comment:16 by Luke Plant, 13 years ago

Severity: Normal
Type: Bug

comment:17 by Aymeric Augustin, 12 years ago

UI/UX: unset

Change UI/UX from NULL to False.

comment:18 by Aymeric Augustin, 12 years ago

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:19 by Aymeric Augustin, 11 years ago

Component: Core (Other)Database layer (models, ORM)

comment:20 by ANUBHAV JOSHI, 10 years ago

Owner: set to ANUBHAV JOSHI
Status: newassigned

Working on this in my GSoC project.

comment:21 by ANUBHAV JOSHI, 10 years ago

Can someone help me in reproducing this now. Because __gt, __gte works for both DateTimeField as well as DateField. If that's not the only issue here, kindly point me to what I have been missing.

I did:

s = "2014-06-12 11:57"
Comment.objects.filter(article__pub_date__gte=s)
Last edited 10 years ago by ANUBHAV JOSHI (previous) (diff)

comment:22 by ANUBHAV JOSHI, 10 years ago

models.py

from django.db import models

class Test(models.Model):
	name = models.CharField(max_length=40)
	created = models.DateField()
	created2 = models.DateTimeField()

shell:

In [1]: from t19508.models import *

In [2]: import datetime

In [3]: d = datetime.datetime.now()

In [4]: d2 = d.date()

In [5]: t=Test.objects.using('mysql').create(name='anubhav', created=d.date(), created2=d)

In [6]: t=Test.objects.using('mysql').filter(created2__gte="2008-04-23 14:39:36.133203")

In [7]: t[0].created
Out[7]: datetime.date(2014, 7, 28)

In [8]: t[0].created2
Out[8]: datetime.datetime(2014, 7, 28, 23, 36, 13, tzinfo=<UTC>)

In [9]: t=Test.objects.using('mysql').filter(name__search="anubhav").filter(created2__gte="2008-04-23 14:39:36.133203")

In [10]: t[0].created
Out[10]: datetime.date(2014, 7, 28)

In [11]: t[0].created2
Out[11]: datetime.datetime(2014, 7, 28, 23, 36, 13, tzinfo=<UTC>)

I think that the problem now doesn't exists.

comment:23 by ANUBHAV JOSHI, 10 years ago

Cc: anubhav9042@… added
Resolution: worksforme
Status: assignedclosed
Note: See TracTickets for help on using tickets.
Back to Top