lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Max Lynch <ihas...@gmail.com>
Subject Re: Continuously iterate over documents in index
Date Thu, 15 Jul 2010 15:39:42 GMT
Erick,
This is what I ended up doing.  I initially avoided it because I was storing
dates using Solr's date type which AFAIK aren't usable in Lucene, but I
ended up using DateTools to store a lucene readable version that seems to
work well.

Thanks!

On Wed, Jul 14, 2010 at 7:59 PM, Erick Erickson <erickerickson@gmail.com>wrote:

> Hmmmm, if you somehow know the last date you processed, why wouldn't using
> a
> range query work for you? I.e.
> date:[<recorded last date> TO <new date to record (NOW?)>]?
>
> Best
> Erick
>
> On Wed, Jul 14, 2010 at 10:37 AM, Max Lynch <ihasmax@gmail.com> wrote:
>
> > You could have a field within each doc say "Processed" and store a
> >
> > > value Yes/No, next run a searcher query which should give you the
> > > collection of unprocessed ones.
> > >
> >
> > That sounds like a reasonable idea, and I just realized that I could have
> > done that in a way specific to my application.  However, I already tried
> > doing something with a MatchAllDocsQuery with a custom collector and sort
> > by
> > date.  I store the last date and time of a doc I processed and process
> only
> > newer ones.
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message