lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: TermEnum - previous() method ?
Date Fri, 20 Jul 2007 14:04:35 GMT
As far as I know, nothing built into Lucene lets you do anything
like TermEnum.previous(). I'm pretty sure you have to roll your
own.

Some possibilities:

1> Depending upon how many terms are in your index, you could
just read all the authors into a Java collection of your choice
at startup time and reference that list.

2> You could just iterate from the beginning of the list every
time, keeping the last N terms somewhere, and stop when
you get to the ones you need. Do you have any evidence
that this wouldn't work for you? I was surprised how quickly this
worked.

3> A variant of <2> would be to choose some arbitrary term that's
less than your target, skip to that one, and iterate forward. For instance,
if you were looking for the terms previous to "Erickson", you could
skip to "Eg" and iterate forward. You'd need some heuristics to back up
further in case there was, say, only one author between "Eg" and "Erickson".

4> You could assemble a list of, say, every 100th term, either at
startup time or as a permanent part of your index. When you have
to do a previous, skip to the nearest stored term before your target
and iterate forward. Remember that not all documents need to have the
same fields, so you can have a "super special document" that contains
some list like this.
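
As a rough illustration of option <1>, here is a minimal sketch that holds
the author terms in a sorted in-memory set and answers both "previous" and
"next" from it. The class name AuthorBrowser and its methods are hypothetical,
not part of Lucene. The sketch assumes the terms have already been read out of
the index (for example by walking a TermEnum once at startup) and that Java's
natural String ordering is acceptable for browsing.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical helper (not a Lucene class): all author terms, sorted, in memory. */
final class AuthorBrowser {
    private final TreeSet<String> authors = new TreeSet<>();

    /** Assumption: the caller supplies the terms, e.g. read once from the index at startup. */
    AuthorBrowser(Iterable<String> terms) {
        for (String term : terms) {
            authors.add(term);
        }
    }

    /** Returns up to n terms strictly before target, in ascending order. */
    List<String> previous(String target, int n) {
        List<String> page = new ArrayList<>();
        // Walk backwards from the term just below target.
        Iterator<String> it = authors.headSet(target, false).descendingIterator();
        while (it.hasNext() && page.size() < n) {
            page.add(0, it.next()); // prepend so the page stays ascending
        }
        return page;
    }

    /** Returns up to n terms at or after target, in ascending order. */
    List<String> next(String target, int n) {
        List<String> page = new ArrayList<>();
        Iterator<String> it = authors.tailSet(target, true).iterator();
        while (it.hasNext() && page.size() < n) {
            page.add(it.next());
        }
        return page;
    }
}
```

The target does not have to be an existing term: previous("f", 2) returns
the two terms that sort immediately before "f". Memory use grows with the
number of distinct authors, so this only suits indexes where that list fits
comfortably in RAM.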

I'd recommend that you take some simple timings of iterating
through the entire list to see if you need to get fancy or not. The
critical piece of information you've omitted is any indication of
how many terms you expect to have to iterate over. If it's
10,000 terms, the simple solution will almost certainly work for
you. If it's 10,000,000,000 then you have to do some fancier
dancing.
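
To make the "simple timings" suggestion concrete, here is a minimal sketch of
option <2>: scan forward from the start, keep only the last N terms seen, and
stop at the target. The class ForwardScan and its methods are hypothetical. A
plain Iterator<String> stands in for the TermEnum so the sketch runs without
an index, which means the numbers it produces say nothing about real index
I/O; to get a meaningful timing, the same System.nanoTime() bracket would go
around a loop over the actual TermEnum.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch (not a Lucene class): find the N terms before a target by scanning forward. */
final class ForwardScan {

    /**
     * Walks sortedTerms from the beginning and returns up to n terms that
     * precede the first term >= target, in ascending order.
     * Assumption: the iterator yields terms in ascending String order.
     */
    static List<String> previous(Iterator<String> sortedTerms, String target, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        Deque<String> window = new ArrayDeque<>();
        while (sortedTerms.hasNext()) {
            String term = sortedTerms.next();
            if (term.compareTo(target) >= 0) {
                break; // reached the target (or the first term after it)
            }
            if (window.size() == n) {
                window.removeFirst(); // drop the oldest term to keep only the last n
            }
            window.addLast(term);
        }
        return new ArrayList<>(window);
    }

    /** Times one full scan in nanoseconds; the result depends entirely on the machine and data. */
    static long timeScanNanos(List<String> sortedTerms, String target, int n) {
        long start = System.nanoTime();
        previous(sortedTerms.iterator(), target, n);
        return System.nanoTime() - start;
    }
}
```

Running timeScanNanos with a target near the end of the list gives the worst
case for this approach, which is the number to compare against your
responsiveness budget before building anything fancier.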

Best
Erick

On 7/20/07, muraalee <muraalee@gmail.com> wrote:
>
>
> Hi Mark,
> Thanks for your inputs.
>
> >> I do wonder why you want a previous though? It sounds like you might be
> >> better off heading down a different path...
>
> In our content, we have indexed Author as a separate field. We want to
> expose a feature to 'browse' the Author list. Users can type any author
> name and locate from there onwards. While doing that, they need to
> navigate to the 'previous' and 'next' authors in the list.
>
> e.g. AU:Rowling
>
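> sample code snippet:
> TermEnum browseTermEnum = indexReader.terms(new Term("AU", "rowling"));
> try {
>   do { // terms(Term) is already positioned on the first term >= "rowling"
>     Term t = browseTermEnum.term();
>     if (t == null || !"AU".equals(t.field())) break; // ran past the AU field
>     System.out.println(t.text());
>   } while (browseTermEnum.next());
> } finally { browseTermEnum.close(); }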
>
> Navigating the next 10 authors is straightforward because we do that by
> calling browseTermEnum.next(), but we couldn't do previous. I did consider
> the approach of keeping all the terms seen while navigating. But the
> problem is, when we do a lookup of a term like 'rowling', we don't iterate
> from the start of the list, hence we may not have the previous terms.
>
> Any thoughts?
>
>
> markrmiller wrote:
> >
> > I am not very familiar with the Lucene file formats, but I think that
> > there is a lot of "this number tells you how far ahead to read" when
> > enumerating terms. As you might guess, I think this lends itself to
> > reading the terms file forward. Not that an index couldn't point you
> > into the terms index somehow (a meta index?). It would make just as much
> > sense (or more) to buffer all of the terms to allow a previous though.
> > Depending on your RAM and index size, this could be an option.
> >
> > I do wonder why you want a previous though? It sounds like you might be
> > better off heading down a different path...
> >
> > - Mark
> >
> >
> >
> > muraalee wrote:
> >>
> >> Hi All,
> >> I searched this forum for anyone else needing a previous()
> >> method in TermEnum. I found only this link:
> >>
> >> http://www.nabble.com/How-to-navigate-through-indexed-terms-tf28148.html#a189225
> >>
> >> Would it be possible to implement a previous() method? I know I am
> >> asking for a quick solution here ;) I just want to check whether there
> >> is a reason it is not implemented, so I can consider alternative
> >> approaches to implement a similar feature.
> >>
> >> Appreciate your thoughts...
> >>
> >> Thanks
> >> Murali V
> >>
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/TermEnum----previous%28%29-method---tf4107296.html#a11707160
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
