lucene-dev mailing list archives

From Steven A Rowe <sar...@syr.edu>
Subject RE: Problem sort on TermEnum
Date Tue, 07 Apr 2009 17:56:11 GMT
On 4/7/2009 at 1:19 PM, Michael McCandless wrote:
> I think the new contrib/collation package may address this use case?
> It converts each term to its CollationKey, outside of Lucene.

Since AFAIK CollationKey creation is a one-way process, CollationKeyFilter may not be useful
for Federica.

Federica, what use do you make of the terms returned by reader.terms()?  I ask because the
new CollationKeyFilter would produce terms that would not be suitable for human consumption,
but might be useful for other purposes.
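
To make the "one-way" point concrete, here is a minimal sketch using only java.text.Collator from the JDK. It is not the contrib/collation code: the class name CollationSketch, its helper methods, and the choice of Locale.ITALIAN are all assumptions made for illustration. It sorts the four example terms from the original question with a Collator, then shows that at PRIMARY strength two different strings yield identical key bytes, so the original term cannot be recovered from the key alone.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class for illustration; not part of Lucene or contrib/collation.
final class CollationSketch {

    // Build a Collator for the given locale and strength (the locale is an assumption).
    static Collator collatorFor(Locale locale, int strength) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(strength);
        // Decompose accented characters so U+00E8 is compared as 'e' plus a grave accent.
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return collator;
    }

    // Return a copy of the terms sorted by locale-aware collation rather than by code unit.
    static List<String> sortWithCollator(List<String> terms, Locale locale) {
        List<String> sorted = new ArrayList<String>(terms);
        Collections.sort(sorted, collatorFor(locale, Collator.TERTIARY));
        return sorted;
    }

    // True when both strings produce the same key bytes at PRIMARY strength,
    // i.e. the key no longer distinguishes them.
    static boolean sameKeyBytes(String a, String b, Locale locale) {
        Collator primary = collatorFor(locale, Collator.PRIMARY);
        CollationKey ka = primary.getCollationKey(a);
        CollationKey kb = primary.getCollationKey(b);
        return Arrays.equals(ka.toByteArray(), kb.toByteArray());
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("Annales", "Caf\u00e8", "Zucche", "cafe");
        System.out.println(sortWithCollator(terms, Locale.ITALIAN));
        System.out.println(sameKeyBytes("cafe", "Caf\u00e8", Locale.ITALIAN));
    }
}
```

With the JDK's default collation rules the sorted list should come out as Annales, cafe, Cafè, Zucche, which is the order asked for below. The key bytes are sort weights rather than characters, which is why terms stored as collation keys would not be readable when enumerated.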

Steve

> On Tue, Apr 7, 2009 at 7:36 AM, Federica Falini Data Management S.p.A
> <ffalini@datamanagement.it> wrote:
> > Good morning,
> > In Lucene 2.2 I modified Term.java and TermBuffer.java
> > (see below) so that term enumerations are sorted case-insensitively
> > (when a field is not tokenized):
> > TermEnum terms = reader.terms(new Term("myFieldNotTokenized", ""));
> > while ("myFieldNotTokenized".equals(terms.term().field())) {
> >   System.out.println("     " + terms.term());
> >   if (!terms.next()) break;
> > }
> >
> > For example, instead of getting this order from TermEnum:
> >
> > Annales
> > Cafè
> > Zucche
> > cafe
> >
> > I need to get this:
> >
> > Annales
> > cafe
> > Cafè
> > Zucche
> >
> > Now in Lucene 2.4 I find this difficult because the "index" package
> > has changed a lot; could I have some pointers on how to keep my sort order?
> > Thanks in advance,
> > Federica
