lucene-dev mailing list archives

From "Andrzej Bialecki (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1813) Add option to ReverseStringFilter to mark reversed tokens
Date Mon, 17 Aug 2009 07:22:14 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743983#action_12743983 ]

Andrzej Bialecki  commented on LUCENE-1813:
-------------------------------------------

+1. One comment, perhaps stating the obvious: I picked char U+0001 for two reasons. It is
unlikely to appear in regular text, and its UTF-8 encoding takes a single byte. In its
intended use case, this filter creates roughly as many tokens as there were in the original
token stream, doubling the size of the term dictionary. One byte here, one byte there, and
suddenly it matters whether we use U+0001 (one byte in UTF-8) or U+FFFF (three bytes) ...
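
To make the size argument concrete, here is a minimal sketch that measures how many bytes
each candidate marker takes in standard UTF-8. The class and method names (MarkerSize,
utf8Length) are made up for illustration and are not part of the patch; the on-disk cost
in an actual index may also depend on how the term dictionary is encoded.

```java
import java.nio.charset.StandardCharsets;

public class MarkerSize {
    // Hypothetical helper: number of bytes a single char occupies in standard UTF-8.
    static int utf8Length(char c) {
        return String.valueOf(c).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // U+0001 is in the one-byte (ASCII) range of UTF-8.
        System.out.println("U+0001: " + utf8Length('\u0001') + " byte(s)");
        // U+FFFF is in the three-byte range of UTF-8.
        System.out.println("U+FFFF: " + utf8Length('\uFFFF') + " byte(s)");
    }
}
```

With one marker per reversed term, the difference is two extra bytes for every reversed
entry in the term dictionary.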

> Add option to ReverseStringFilter to mark reversed tokens
> ---------------------------------------------------------
>
>                 Key: LUCENE-1813
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1813
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>    Affects Versions: 2.9
>            Reporter: Andrzej Bialecki 
>            Assignee: Robert Muir
>             Fix For: 2.9
>
>         Attachments: LUCENE-1813.patch, reverseMark-2.patch, reverseMark.patch
>
>
> This patch implements additional functionality in the filter to "mark" reversed tokens
> with a special marker character (Unicode 0001). This is useful when indexing both straight
> and reversed tokens (e.g. to implement efficient leading-wildcard search).
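
The idea in the description can be sketched in plain Java, without any Lucene classes.
This is an illustration under assumptions, not the code from the attached patches: the
names ReverseMarkSketch, reverseAndMark and matchesLeadingWildcard are hypothetical, and
the marker is assumed to end up at the front of the reversed token so that reversed terms
cannot collide with straight ones.

```java
public class ReverseMarkSketch {
    // Marker character named in the issue description (Unicode 0001).
    static final char MARKER = '\u0001';

    // Hypothetical helper: reverse a token and prefix it with the marker.
    static String reverseAndMark(String token) {
        return MARKER + new StringBuilder(token).reverse().toString();
    }

    // A leading-wildcard pattern "*suffix" turns into a prefix test
    // against the marked, reversed form of the indexed term.
    static boolean matchesLeadingWildcard(String indexedReversedTerm, String suffix) {
        return indexedReversedTerm.startsWith(reverseAndMark(suffix));
    }

    public static void main(String[] args) {
        String straight = "indexing";
        String reversed = reverseAndMark(straight); // marker followed by "gnixedni"

        // "*ing" matches the reversed term via a cheap prefix comparison.
        System.out.println(matchesLeadingWildcard(reversed, "ing"));
        // The straight term never matches, because it does not start with the marker.
        System.out.println(matchesLeadingWildcard(straight, "ing"));
    }
}
```

Because both the straight and the marked reversed form are indexed, a query for "*ing"
can be rewritten as a prefix lookup on the reversed terms instead of a scan over the
whole term dictionary.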

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

