lucene-dev mailing list archives

From John Wang <john.w...@gmail.com>
Subject Re: addIndexesNoOptimize
Date Mon, 06 Jul 2009 06:18:57 GMT
Hi Mark and Michael:

     Thanks for your replies.

     Currently, addIndexesNoOptimize(Directory[] dir) is really really
really fast! (I duplicated my index of 15k docs 200 times and created a 3M
doc index in less than a minute) Perhaps we should handle duplicate
directory names more gracefully? e.g. append a numeral after the segment
name or something? (I'd be happy to work on a patch for it)
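
     To make the collision concrete, here is a rough, self-contained sketch.
It does not use Lucene's classes: SegmentKey is a hypothetical stand-in for
the "Directory + segment name" identity Mike describes, the path "/idx/small"
is made up, and uniquify() is only one possible reading of the "append a
numeral" idea, not how IndexWriter actually names segments.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class SegmentKeySketch {

    /** Hypothetical stand-in for "Directory + segment name"; NOT Lucene's SegmentInfo. */
    static final class SegmentKey {
        final String directory;   // stands in for a Directory instance (assumption)
        final String segmentName; // e.g. "_0"

        SegmentKey(String directory, String segmentName) {
            this.directory = directory;
            this.segmentName = segmentName;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof SegmentKey)) {
                return false;
            }
            SegmentKey that = (SegmentKey) other;
            return directory.equals(that.directory)
                && segmentName.equals(that.segmentName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(directory, segmentName);
        }

        @Override
        public String toString() {
            return directory + ":" + segmentName;
        }
    }

    /** Appends a numeral to the segment name until the key is unused. */
    static List<SegmentKey> uniquify(List<SegmentKey> incoming) {
        Set<SegmentKey> seen = new HashSet<>();
        List<SegmentKey> result = new ArrayList<>();
        for (SegmentKey key : incoming) {
            SegmentKey candidate = key;
            int suffix = 1;
            // Set.add returns false when an equal key is already present.
            while (!seen.add(candidate)) {
                candidate = new SegmentKey(key.directory, key.segmentName + "_" + suffix++);
            }
            result.add(candidate);
        }
        return result;
    }

    public static void main(String[] args) {
        // The same directory (and therefore the same segment) passed three times.
        List<SegmentKey> incoming = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            incoming.add(new SegmentKey("/idx/small", "_0"));
        }

        // Any set keyed on directory + name sees only one distinct entry.
        Set<SegmentKey> naive = new HashSet<>(incoming);
        System.out.println("distinct keys without renaming: " + naive.size());

        // With a numeral appended, each copy gets its own identity.
        System.out.println("keys after renaming: " + uniquify(incoming));
    }
}
```

     The point of the sketch is only that a set keyed this way collapses the
duplicates into one entry, which is the kind of bookkeeping (e.g. the
"runningMerges" set) that could get confused; a real patch would have to
rename the copied segments consistently everywhere they are referenced.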

     For what I need now, I think addIndexesNoOptimize(IndexReader[])
would work as well (though I don't know how its performance would
compare).

Thanks

-John

On Sun, Jul 5, 2009 at 6:10 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> This was added defensively a while back (can't find the issue right
> now), because internally IndexWriter now identifies each SegmentInfo
> as its Directory + segment name.
>
> EG the "runningMerges" set makes use of this.
>
> If you comment the check out, and pass duplicate segments in, I think
> at least IndexWriter would falsely delay certain merges (ie, gain less
> concurrency from CMS) because of the dups.
>
> But offhand I'm not sure where else we key on a SegmentInfo and what
> else might go wrong if dups enter IndexWriter's segmentInfos but it'd
> make me somewhat nervous removing that defensive check.
>
> Maybe instead we can add an addIndexesNoOptimize(IndexReader[]) (and
> deprecate addIndexes(IndexReader[]))?  Would that work?
>
> Mike
>
> On Sun, Jul 5, 2009 at 1:40 PM, John Wang<john.wang@gmail.com> wrote:
> > Guys:
> >
> >        Any thoughts? Forwarding the question from the users list after not
> > hearing back.
> >
> > Thanks
> >
> > -John
> >
> > ---------- Forwarded message ----------
> > From: John Wang <john.wang@gmail.com>
> > Date: Fri, Jul 3, 2009 at 3:49 PM
> > Subject: addIndexesNoOptimize
> > To: java-user@lucene.apache.org
> >
> >
> > Hi guys:
> >
> >     Running into a question with IndexWriter.addIndexesNoOptimize:
> >
> >     I am trying to expand a smaller index by replicating it into a larger
> > index. So I am adding the same directory N times.
> >
> >     I get an exception because noDupDirs(dirs) fails. For this call, is this
> > check necessary?
> >
> >     I temporarily commented it out and the resulting index seems to be fine.
> >
> > Thanks
> >
> > -John
> >
> >
>
