lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Will Johnson" <wjohn...@GETCONNECTED.COM>
Subject RE: TermEnum - previous() method ?
Date Fri, 20 Jul 2007 14:11:10 GMT
One other possibility I've used in the past:

Store 2 fields, one with the normal characters, one with each character
inversed, A->Z, Y->B, and so on.  As long as you have the function to go
from the normal->reversed and reversed->normal you can iterate over both
fields in a forward only mode and just reverse out the strings on the
display side.  

This method makes a number of assumptions about index size constraints,
character sets; ie ymmv.

- will  

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Friday, July 20, 2007 10:05 AM
To: java-user@lucene.apache.org
Subject: Re: TermEnum - previous() method ?

There's nothing that I know of built in to Lucene that allows you
to do anything like termenum.previous(), I'm pretty sure you
have to roll your own.

Some possibilities:

1> Depending upon how many terms in your index, you could
just read all the authors into a Java collection of your choice
at startup time and reference that list.

2> You could just iterate from the beginning of the list every
time, storing the N elements somewhere and stop when
you get to the ones you need. Do you have any evidence
that this wouldn't work for you? I was surprised how quickly this
worked.

3> A variant of <2> would be to choose some arbitrary term that's
less than your target, skip to that one, and iterate forward. For
instance,
if you were looking for the terms previous to "Erickson", you could
skip to "Eg", and iterate forward. You'd have to do some heuristics
in case there was, say, only one author between "Eg" and "Erickson".

4> You could assemble a list of, say, every 100th term, either at
startup time or as a permanent part of your index. When you had
to do a previous, skip to the appropriate term and iterate forward.
Remember that not all documents need to have the same fields,
so you can have a "super special document" that contains some
list like this.

I'd recommend that you take some simple timings of iterating
through the entire list to see if you need to get fancy or not. The
critical piece of information you've omitted is any indication of
how many terms you expect to have to iterate over. If it's
10,000 terms, the simple solution will almost certainly work for
you. If it's 10,000,000,000 then you have to do some fancier
dancing.

Best
Erick

On 7/20/07, muraalee <muraalee@gmail.com> wrote:
>
>
> Hi Mark,
> Thanks for your inputs.
>
> >> I do wonder why you want a previous though? It sounds like you
might be
> >> better off heading down a different path...
>
> In our content, we have indexed Author as separate field. We want to
> expose
> a feature to  'browse' the Author list. They can type any author name
and
> do
> a locate from there onwards.. While doing that they need to navigate
> 'previous' & 'next' author list.
>
> e.g AU:Rowling
>
> sample code snippet:
> TermEnum browseTermEnum = indexReader.terms(new Term("AU",
"rowling"));
> while(browseTermEnum.next()){
>   System.out.println( browseTermEnum.term().text());
> }
>
> Navigating the next 10 authors is straight foward becuase we do that
by
> calling browseTermEnum.next(), but we couldn't do previous. I did
consider
> the approach of all terms while navigation. But the problem is, when
we do
> a
> lookup of term like 'rowling', we didn't iterate from the start of the
> list,
> hence we may not have the previous terms..
>
> Any thoughts ?
>
>
> markrmiller wrote:
> >
> > I am not very familiar with the Lucene file formats, but I think
that
> > there is a lot of "this number tells you how far ahead to read" when
> > enumerating terms. As you might guess, I think this lends toward
reading
> > the terms file forward. Not that an index couldn't point you into
the
> > terms index somehow (a meta index?). It would make just as much
sense
> (or
> > more) to buffer all of the terms to allow a previous though.
Depending
> on
> > your RAM and index size, this could be an option.
> >
> > I do wonder why you want a previous though? It sounds like you might
be
> > better off heading down a different path...
> >
> > - Mark
> >
> >
> >
> > muraalee wrote:
> >>
> >> Hi All,
> >> I searched in this forum for anybody looking for need for
previous()
> >> method in TermEnum. I found only this link
> >>
>
http://www.nabble.com/How-to-navigate-through-indexed-terms-tf28148.html
#a189225
> >>
> >> Would it be possible to implement previous() method ? I know i am
> asking
> >> for quick solution here ;) Just i want to ensure if it not
implemented,
> >> there might be a reason. So i can consider alternates approaches to
> >> implement similar feature..
> >>
> >> appreciate your thoughts...
> >>
> >> Thanks
> >> Murali V
> >>
> >
> >
>
> --
> View this message in context:
>
http://www.nabble.com/TermEnum----previous%28%29-method---tf4107296.html
#a11707160
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message