Opened 3 years ago

Last modified 3 years ago

#32411 closed Cleanup/optimization

JSONField filter icontains is case sensitive in mysql — at Version 1

Reported by: elonzh Owned by: nobody
Component: Database layer (models, ORM) Version: 3.1
Severity: Normal Keywords: JSONField mysql
Cc: Mariusz Felisiak, Sage Abdullah Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by elonzh)

given model

class Paper(models.Model):
    ...
    authors = models.JSONField(default=list)
    ...
In [53]: p = Paper.objects.get(id=1)

In [54]: p.__dict__
Out[54]: 
{'_state': <django.db.models.base.ModelState at 0x7f2aa74b3790>,
 'id': 1,
...
 'authors': [{'name': 'Rehmann Kim-Thomas',
   'workplace': 'SAP SE, Walldorf, Germany'},
  {'name': 'Folkerts Enno', 'workplace': 'SAP SE, Walldorf, Germany'}],
...
}

In [55]: print(Paper.objects.filter(authors__icontains="rehmann").count())
0

In [56]: print(Paper.objects.filter(authors__icontains="Rehmann").count())
1

In [57]: print(Paper.objects.filter(authors__icontains="Rehmann").query)
SELECT `scholardata_paper`.`id`, `scholardata_paper`.`created`, `scholardata_paper`.`modified`, `scholardata_paper`.`doi`, `scholardata_paper`.`s2_id`, `scholardata_paper`.`wanfang_id`, `scholardata_paper`.`corpus_id`, `scholardata_paper`.`pm_id`, `scholardata_paper`.`arxiv_id`, `scholardata_paper`.`acl_id`, `scholardata_paper`.`mag_id`, `scholardata_paper`.`language`, `scholardata_paper`.`url`, `scholardata_paper`.`abstract`, `scholardata_paper`.`title`, `scholardata_paper`.`authors`, `scholardata_paper`.`keywords`, `scholardata_paper`.`issued`, `scholardata_paper`.`venue`, `scholardata_paper`.`journal_id`, `scholardata_paper`.`volume`, `scholardata_paper`.`issue`, `scholardata_paper`.`page`, `scholardata_paper`.`references_count`, `scholardata_paper`.`citations_count`, `scholardata_paper`.`refreshed`, `scholardata_paper`.`references_refreshed`, `scholardata_paper`.`citations_refreshed` FROM `scholardata_paper` WHERE JSON_UNQUOTE(`scholardata_paper`.`authors`) LIKE %Rehmann% ORDER BY `scholardata_paper`.`id` ASC

In [58]: print(Paper.objects.filter(authors__iregex="rehmann").count())
1

In [59]: print(Paper.objects.filter(authors__iregex="Rehmann").count())
1

In [60]: print(Paper.objects.filter(authors__iregex="Rehmann").query)
SELECT `scholardata_paper`.`id`, `scholardata_paper`.`created`, `scholardata_paper`.`modified`, `scholardata_paper`.`doi`, `scholardata_paper`.`s2_id`, `scholardata_paper`.`wanfang_id`, `scholardata_paper`.`corpus_id`, `scholardata_paper`.`pm_id`, `scholardata_paper`.`arxiv_id`, `scholardata_paper`.`acl_id`, `scholardata_paper`.`mag_id`, `scholardata_paper`.`language`, `scholardata_paper`.`url`, `scholardata_paper`.`abstract`, `scholardata_paper`.`title`, `scholardata_paper`.`authors`, `scholardata_paper`.`keywords`, `scholardata_paper`.`issued`, `scholardata_paper`.`venue`, `scholardata_paper`.`journal_id`, `scholardata_paper`.`volume`, `scholardata_paper`.`issue`, `scholardata_paper`.`page`, `scholardata_paper`.`references_count`, `scholardata_paper`.`citations_count`, `scholardata_paper`.`refreshed`, `scholardata_paper`.`references_refreshed`, `scholardata_paper`.`citations_refreshed` FROM `scholardata_paper` WHERE REGEXP_LIKE(JSON_UNQUOTE(`scholardata_paper`.`authors`), Rehmann, 'i') ORDER BY `scholardata_paper`.`id` ASC

In [63]: print(Paper.objects.filter(authors__0__name__icontains="rehmann").count())
1

In [64]: print(Paper.objects.filter(authors__0__name__icontains="rehmann").query)
SELECT `scholardata_paper`.`id`, `scholardata_paper`.`created`, `scholardata_paper`.`modified`, `scholardata_paper`.`doi`, `scholardata_paper`.`s2_id`, `scholardata_paper`.`wanfang_id`, `scholardata_paper`.`corpus_id`, `scholardata_paper`.`pm_id`, `scholardata_paper`.`arxiv_id`, `scholardata_paper`.`acl_id`, `scholardata_paper`.`mag_id`, `scholardata_paper`.`language`, `scholardata_paper`.`url`, `scholardata_paper`.`abstract`, `scholardata_paper`.`title`, `scholardata_paper`.`authors`, `scholardata_paper`.`keywords`, `scholardata_paper`.`issued`, `scholardata_paper`.`venue`, `scholardata_paper`.`journal_id`, `scholardata_paper`.`volume`, `scholardata_paper`.`issue`, `scholardata_paper`.`page`, `scholardata_paper`.`references_count`, `scholardata_paper`.`citations_count`, `scholardata_paper`.`refreshed`, `scholardata_paper`.`references_refreshed`, `scholardata_paper`.`citations_refreshed` FROM `scholardata_paper` WHERE LOWER(JSON_UNQUOTE(JSON_EXTRACT(`scholardata_paper`.`authors`, $[0]."name"))) LIKE LOWER(%rehmann%) ORDER BY `scholardata_paper`.`id` ASC

Change History (1)

comment:1 by elonzh, 3 years ago

Description: modified (diff)
Note: See TracTickets for help on using tickets.
Back to Top