lucene-dev mailing list archives

From "Paul Cowan (JIRA)" <>
Subject [jira] Commented: (LUCENE-1813) Add option to ReverseStringFilter to mark reversed tokens
Date Mon, 17 Aug 2009 00:04:14 GMT


Paul Cowan commented on LUCENE-1813:

Yeah, ok, makes sense.

I'd suggest choosing a range of characters from the Private Use Area of the BMP (U+E000 to
U+F8FF) then; that's what it's for. It doesn't really matter which: we can pick a block of (say)
256 code points and use the first one for this, and the others can be used for other purposes
later if required. U+ECxx, maybe, because that's got 3 letters out of 'lucene' in it. So U+EC00
means 'reversed', and people who need other similar filters can organise amongst themselves.
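
To make the idea concrete, here is a rough sketch in plain Java of what marking with U+EC00 could look like. It is not the code from the attached patch and uses no Lucene classes; the class name, the constant, and both helper methods are made up for illustration, and putting the marker at the front of the reversed token is my assumption.

```java
public class ReverseMarkerSketch {
    // Hypothetical marker: first code point of an assumed U+EC00..U+ECFF block
    // taken from the BMP Private Use Area.
    static final char REVERSED_MARKER = '\uEC00';

    // Reverse the token and put the marker in front of it (assumed placement).
    static String reverseAndMark(String token) {
        return REVERSED_MARKER + new StringBuilder(token).reverse().toString();
    }

    // A token counts as "reversed" if it starts with the marker.
    static boolean isMarkedReversed(String token) {
        return !token.isEmpty() && token.charAt(0) == REVERSED_MARKER;
    }

    public static void main(String[] args) {
        String marked = reverseAndMark("lucene");
        System.out.println(isMarkedReversed(marked));    // true
        System.out.println(isMarkedReversed("lucene"));  // false
        System.out.println(marked.substring(1));         // enecul
    }
}
```

Since nothing in ordinary text should use a private-use code point, a straight token is unlikely to be mistaken for a reversed one.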

> Add option to ReverseStringFilter to mark reversed tokens
> ---------------------------------------------------------
>                 Key: LUCENE-1813
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>    Affects Versions: 2.9
>            Reporter: Andrzej Bialecki 
>            Assignee: Robert Muir
>             Fix For: 2.9
>         Attachments: reverseMark-2.patch, reverseMark.patch
> This patch implements additional functionality in the filter to "mark" reversed tokens
> with a special marker character (Unicode 0001). This is useful when indexing both straight
> and reversed tokens (e.g. to implement efficient leading wildcard search).
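
As an illustration of the leading wildcard use case described in the issue, the following is a minimal sketch that uses a sorted set in place of a real term index. It assumes the marker (U+0001, as in the issue description) sits at the front of each reversed token; the class and method names are hypothetical and none of this is Lucene API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class LeadingWildcardSketch {
    // Marker character named in the issue description (Unicode 0001).
    static final char MARKER = '\u0001';

    static String reverseAndMark(String s) {
        return MARKER + new StringBuilder(s).reverse().toString();
    }

    // Index every word twice: straight, and reversed with the marker in front.
    static SortedSet<String> buildTerms(String... words) {
        SortedSet<String> terms = new TreeSet<>();
        for (String w : words) {
            terms.add(w);
            terms.add(reverseAndMark(w));
        }
        return terms;
    }

    // A query like "*ene" becomes a prefix scan for MARKER + "ene".
    static List<String> matchSuffix(SortedSet<String> terms, String suffix) {
        String prefix = reverseAndMark(suffix);
        List<String> hits = new ArrayList<>();
        for (String t : terms.tailSet(prefix)) {
            if (!t.startsWith(prefix)) {
                break; // sorted order: no later term can share the prefix
            }
            // Drop the marker and undo the reversal to recover the word.
            hits.add(new StringBuilder(t.substring(1)).reverse().toString());
        }
        return hits;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = buildTerms("lucene", "scene", "search");
        System.out.println(matchSuffix(terms, "ene")); // [scene, lucene]
    }
}
```

The marker keeps the reversed terms in their own contiguous range of the sorted term list, so the prefix scan never has to look at straight tokens.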
