#2634 closed defect (wontfix)

[search-api] Lucene issues with UTF

Reported by:	Scater	Owned by:	nobody
Component:	Contrib apps	Version:	other branch
Severity:	normal	Keywords:	search-api; beck; lucene; index;
Cc:		Triage Stage:	Accepted
Has patch:	no	Needs documentation:	no
Needs tests:	no	Patch needs improvement:	no
Easy pickings:	no	UI/UX:	no

Description

Changeset http://code.djangoproject.com/changeset/3632 added the search-api branch for Brian Beck's SoC project.

I have found two major bugs in the Lucene backend (lucene.py):

Indexing fails for non-UTF encodings. My database and Django site use cp1251 (Russian/Ukrainian), so I get an 'InvalidArgsError' when I call the indexer.update() method.

Exception Value: (, 'init', ('annotation', '\xe0\xe0\xe0\xe0\xe0\xe0\xe0\xe0\xe0', , ))
Here \xe0..\xe0 is the Russian text in my Annotation field.
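
A minimal sketch of one way to avoid handing raw cp1251 bytes to the indexer, assuming the field values arrive as byte strings: decode them to Unicode text first. The helper name `to_text` and its default encoding are illustrative assumptions, not part of the search-api branch, and the sketch uses Python 3 syntax.

```python
def to_text(value, encoding="cp1251"):
    """Return value as Unicode text, decoding byte strings with `encoding`.

    Hypothetical helper for illustration; it is not part of lucene.py.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


# The nine 0xE0 bytes from the exception above decode to nine Cyrillic letters.
annotation = to_text(b"\xe0" * 9)
print(annotation == "\u0430" * 9)  # True
```

Whether the backend then accepts the decoded text is not confirmed in this ticket; the sketch only shows the decoding step.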

After I rebuild the index with UTF data and run a search like this:

    for hit in News_Indexer.search(result_query):
        result.append(hit.instance)

and then update the index once again, I get a Java exception: 'Cant delete file' or 'cant create file'.

Change History (7)

comment:1 by anonymous, 19 years ago

milestone:	Version 0.93
Version:	0.95

comment:2 by anonymous, 19 years ago

Owner:	Brian Beck removed

comment:3 by Adrian Holovaty, 19 years ago

priority:	high → normal
Severity:	major → normal

comment:4 by Michael Radziej <mir@…>, 18 years ago

Version:	→ other branch

comment:5 by Simon G. <dev@…>, 18 years ago

Summary:	Bugs in Brian Beck's SoC project. Search-API → [search-api] Lucene issues with UTF
Triage Stage:	Unreviewed → Accepted

comment:6 by jelle, 18 years ago

I agree that this is a major show stopper for this otherwise very nice addition.
It's really a pity that Unicode isn't supported; that's such a severe error that it renders this effort pretty useless, I'm afraid...
Which is really too bad, since it sure has huge potential!

Here's a traceback:

    In [4]: models.indexer.update()
    <type 'exceptions.UnicodeEncodeError'>    Traceback (most recent call last)

    c:\Jelle_prive\Jelle_dev\workspace\JakobMacfarlane\src\jm_book_site\<ipython console> in <module>()

    c:\Python25\lib\site-packages\django\contrib\search\lucene.py in update(self, documents)
         53         for document in update_queue:
         54             self.delete(document)
    ---> 55             self.index(document)
         56
         57         if close:

    c:\Python25\lib\site-packages\django\contrib\search\lucene.py in index(self, row)
        102         # newlines solves this.
        103         contents = '\n'.join([str(getattr(row, field.name)) for field in \
    --> 104                               self.text_fields])
        105         # FIXME: Hardcoded 'contents' field.
        106         document.add(PyLucene.Field('contents', contents,

    <type 'exceptions.UnicodeEncodeError'>: 'ascii' codec can't encode character u'\u2019' in position 224: ordinal not in range(128)

    In [5]:
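
The traceback points at the `str(...)` call in `index()`, which in Python 2 implicitly encodes Unicode values as ASCII and so fails on u'\u2019'. Below is a minimal sketch of joining field values as text without an ASCII round trip, written in Python 3 syntax. The function `join_text_fields`, the `Row` class, and the UTF-8 assumption for byte strings are hypothetical stand-ins, not the branch's code.

```python
def join_text_fields(row, field_names):
    """Join the named attributes of `row` with newlines, kept as Unicode text."""
    parts = []
    for name in field_names:
        value = getattr(row, name)
        # Decode byte strings explicitly instead of relying on a default codec.
        # UTF-8 is an assumption here; use the encoding your data really has.
        if isinstance(value, bytes):
            value = value.decode("utf-8")
        parts.append(str(value))
    return "\n".join(parts)


class Row:
    """Hypothetical stand-in for a model instance with two text fields."""
    title = "It\u2019s a title"
    body = b"plain bytes"


contents = join_text_fields(Row(), ["title", "body"])

# Forcing the text through ASCII reproduces the error class from the traceback.
try:
    contents.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

The design choice is to keep everything as Unicode until the point where the backend needs it, so no implicit ASCII encoding ever runs.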

comment:7 by Simon G. <dev@…>, 18 years ago

Resolution:	→ wontfix
Status:	new → closed

Currently the search-api is dead in the water, so I'm marking this as wontfix unless someone takes over the search-api.
