Opened 19 years ago
Closed 18 years ago
#2634 closed defect (wontfix)
[search-api] Lucene issues with UTF
| Reported by: | Scater | Owned by: | nobody |
|---|---|---|---|
| Component: | Contrib apps | Version: | other branch |
| Severity: | normal | Keywords: | search-api; beck; lucene; index; |
| Cc: | | Triage Stage: | Accepted |
| Has patch: | no | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
The search-api branch for Brian Beck's SoC project was added in http://code.djangoproject.com/changeset/3632.
I have found 2 major bugs in the Lucene backend (lucene.py):
- Indexing fails with non-UTF encodings. My database and Django site use cp-1251 (Russian/Ukrainian), so I get an 'InvalidArgsError' when I call the indexer.update() method.
Exception Value: (, 'init', ('annotation', '\xe0\xe0\xe0\xe0\xe0\xe0\xe0\xe0\xe0', , ))
\xe0..\xe0 is my Russian text in the Annotation field.
- When the index update completes with UTF data and I then run a search like this:
    for hit in News_Indexer.search(result_query):
        result.append(hit.instance)
and then update the index once again, I get a Java exception: 'Cant delete file' or 'cant create file'.
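For the first bug, a possible workaround is to decode the cp-1251 byte strings to Unicode text before they reach the indexer. The sketch below is only an illustration under that assumption: `to_text` is a hypothetical helper, not part of the search-api branch, and it is written for current Python rather than the Python 2 of the original report.

```python
# Hypothetical helper (not part of lucene.py): turn byte strings into text
# before passing field values to the indexer.
def to_text(value, encoding="cp1251"):
    """Return value as text, decoding byte strings with the given encoding."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# The nine 0xE0 bytes from the exception above, as stored in a cp-1251 database.
raw_annotation = b"\xe0" * 9

# In cp-1251, byte 0xE0 is the Cyrillic small letter "a" (U+0430).
annotation = to_text(raw_annotation)
```

Values that are already text pass through unchanged, so such a helper could be applied to every field without checking its type first.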
Change History (7)
comment:1 by , 19 years ago
| milestone: | Version 0.93 |
|---|---|
| Version: | 0.95 |
comment:2 by , 19 years ago
| Owner: | removed |
|---|
comment:3 by , 19 years ago
| priority: | high → normal |
|---|---|
| Severity: | major → normal |
comment:4 by , 19 years ago
| Version: | → other branch |
|---|
comment:5 by , 19 years ago
| Summary: | Bugs in Brian Beck's SoC project. Search-API → [search-api] Lucene issues with UTF |
|---|---|
| Triage Stage: | Unreviewed → Accepted |
comment:6 by , 18 years ago
comment:7 by , 18 years ago
| Resolution: | → wontfix |
|---|---|
| Status: | new → closed |
Currently the search-api is dead in the water, so I'm marking this as wontfix unless someone takes over the search-api.
I agree that this is a major showstopper for this otherwise very nice addition.
It's really a pity that Unicode isn't supported; that's such a severe error that it renders this effort pretty useless, I'm afraid...
Which is really too bad, since it sure has a huge potential!
Here's a traceback:
In [4]: models.indexer.update()
<type 'exceptions.UnicodeEncodeError'> Traceback (most recent call last)
c:\Jelle_prive\Jelle_dev\workspace\JakobMacfarlane\src\jm_book_site\<ipython console> in <module>()
c:\Python25\lib\site-packages\django\contrib\search\lucene.py in update(self, documents)
---> 55 self.index(document)
c:\Python25\lib\site-packages\django\contrib\search\lucene.py in index(self, row)
--> 104 self.text_fields])
<type 'exceptions.UnicodeEncodeError'>: 'ascii' codec can't encode character u'\u2019' in position 224: ordinal not in range(128)
In [5]:
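The error in the traceback above can be reproduced outside the indexer. This is a minimal sketch, assuming the field text contains a right single quotation mark (u'\u2019') and is being encoded with the ascii codec somewhere along the way; the sample string is made up, and the code targets current Python rather than Python 2.5.

```python
# Made-up sample text containing U+2019 (right single quotation mark).
text = "the author\u2019s note"

# Encoding with the ascii codec fails on any character above code point 127.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    # exc.object is the original string, exc.start the failing position.
    failed_char = exc.object[exc.start]

# Encoding explicitly as UTF-8 succeeds and round-trips without loss.
utf8_bytes = text.encode("utf-8")
```

Choosing an explicit encoding such as UTF-8 at the point where text is converted to bytes avoids relying on the ascii default.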