lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Re: BlockGroupingCollector, not always getting first document
Date Fri, 09 Mar 2012 11:06:18 GMT
On Thu, Mar 8, 2012 at 7:22 AM, Grzegorz Tańczyk
<grzegorz.tanczyk@polskastrefa.eu> wrote:
> Hello,
>
> Thanks for reply, I can find first document from group using non grouping
> search.

OK, so the index seems ok.

> To be sure about this I deleted the index and indexed only the first 100
> groups, which gives around 2300 documents, and I see the problem on at least
> half of the groups.  No problem in finding the first documents normally.
> I first noticed this problem when I had indexed a few thousand groups.

Hmm.

> When I index everything (15k groups, which means around 200k documents,
> commit every 500 groups) the problem is gone, or at least I can't find any
> group with a non-first document in scoreDocs[0]. I've been reindexing since
> morning, and I will reindex once again to be sure about this one.

Weird that the full index doesn't show the issue but the partial index does.

> I'm not Lucene internals expert, but maybe this problem is somehow connected
> to segment merging?

Well, a simple way to test this is to set NoMergePolicy on the
IndexWriterConfig.
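
Something like the following is a minimal sketch of that setup, assuming
Lucene 3.5. The class name, the RAMDirectory and the StandardAnalyzer are
placeholders for illustration, not taken from your code; swap in your own
Directory and Analyzer.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.NoMergePolicy;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

// Hypothetical class name; only the setMergePolicy call matters here.
public class NoMergeIndexing {
    public static void main(String[] args) throws Exception {
        // Assumption: an in-memory directory, just to keep the sketch standalone.
        Directory dir = new RAMDirectory();

        IndexWriterConfig iwc = new IndexWriterConfig(
                Version.LUCENE_35, new StandardAnalyzer(Version.LUCENE_35));

        // Disable merging entirely: each flushed segment stays as written.
        iwc.setMergePolicy(NoMergePolicy.COMPOUND_FILES);

        IndexWriter writer = new IndexWriter(dir, iwc);
        try {
            // Index each group here as one block, e.g. writer.addDocuments(groupDocs),
            // exactly as in the existing indexing code.
        } finally {
            writer.close();
        }
    }
}
```

If the wrong-first-document problem disappears with merging disabled, that
points at merging; if it is still there, merging is probably not the cause.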

> Some additional info:
>
> I'm using Lucene 3.5.0.
>
> Sort:
> public final static Sort SORT_ID = new Sort(new SortField("id_n",
> SortField.INT));
>
> Adding field to document:
> doc.add(new NumericField("id_n", Store.NO, true).setIntValue(rs.getInt(1)));
>
> (I checked how it works with Store.YES, it didn't change anything.)
>
> I also call searcher.setDefaultFieldSortScoring(true, true) before grouping
> search.

If you don't call this, is the issue still there?

> Calling optimize() also didn't help (but anyway I wouldn't use this method
> even if it was the solution for this problem ;-) )

OK.  Did calling optimize() change which docs were missing...?

> Index writer config has default settings.

Are you doing any deleteDocuments or updateDocument calls?

> For now I'm using a workaround, but I'm looking forward to finding a
> solution to this problem.

Wait, what's the workaround?

I noticed you pass maxDocsPerGroup=1; if you increase that (e.g. to 10),
does the bug change...?

Is it possible to boil this down to a small test case?

Mike McCandless

http://blog.mikemccandless.com
