lucene-java-user mailing list archives

From Olivier Binda
Subject Re: Fields, Index segments and docIds
Date Tue, 29 Apr 2014 09:00:09 GMT
On 04/29/2014 08:46 AM, Uwe Schindler wrote:
> Hi Oliver,
> To me it looks like you want to do it much too complicated. It also seems that you misunderstood
> join queries, which seems to be your problem. Comments inside:
>> My lucene Index is built and stored in a zip file (uncompressed) which is used
>> as a read-only Directory.
>> 1) At Lucene indexing time, is it possible to rewrite the index so that some
>> fields are only found in some segments? Say:
>> EnglishWords, EnglishVerbs go to Segment 1
>> GermanWords, GermanSentences go to Segment 2
>> French, frenchWines go to Segment 3 ...
> You can create the 100% same index structure manually without dealing with Lucene internals.
> Just index every language into a separate index with a separate IndexWriter. As those segments
> are read-only, you can call forceMerge(1) after indexing, so those indexes have exactly 1
> segment -> every language has one single segment.
Say, to implement what you just suggested, I will need several IndexWriters
(one per index) and will have to do something like this:

for common stuff:
Document document
document.add(term("stuff", "common things", stored))
document.add(term("id", someCommonId))

for German:
Document document1
document1.add(term("de", "GutenTag"))
document1.add(term("id", someCommonId))

for English:
Document document2
document2.add(term("en", "Hello"))
document2.add(term("id", someCommonId))

The docIds from the 3 writers have nothing in common, right?
Won't this be problematic? Especially since I allow QueryParser queries
like "stuff:fff de:Guten", which span fields stored in different indexes.

Is there a way to share a document between segments written with
different writers, so that it keeps the same docId in each of them?
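
The docId concern above can be made concrete with a toy model in plain Java. This is not Lucene code: ToyIndex, add and docIdFor are hypothetical names invented for illustration, and the model simply assumes that each index numbers its documents in the order they are added. Under that assumption, the same logical entry ends up with unrelated docIds in two separately written indexes, so only the shared "id" field value can tie the documents together.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DocIdSketch {

    /** Toy stand-in for one index (hypothetical, not a Lucene class). */
    public static final class ToyIndex {
        // Position in this list plays the role of the index-local docId.
        private final List<String> idFieldValues = new ArrayList<>();
        private final Map<String, Integer> docIdByIdField = new HashMap<>();

        /** Adds a document carrying the given "id" field value; returns its local docId. */
        public int add(String idFieldValue) {
            int docId = idFieldValues.size();
            idFieldValues.add(idFieldValue);
            docIdByIdField.put(idFieldValue, docId);
            return docId;
        }

        /** Resolves the shared "id" field value to this index's local docId, or -1 if absent. */
        public int docIdFor(String idFieldValue) {
            Integer docId = docIdByIdField.get(idFieldValue);
            return docId == null ? -1 : docId;
        }
    }

    public static void main(String[] args) {
        ToyIndex common = new ToyIndex();
        ToyIndex german = new ToyIndex();

        // The common index holds every entry.
        common.add("entry-1");
        common.add("entry-2");
        common.add("entry-3");

        // Only entry-3 has German content, so it is the first German document.
        german.add("entry-3");

        System.out.println(common.docIdFor("entry-3")); // prints 2
        System.out.println(german.docIdFor("entry-3")); // prints 0
    }
}
```

In this model the local docIds agree only if every index receives exactly the same documents in exactly the same order. Otherwise a query that mixes fields from different indexes has to be resolved through the shared "id" value rather than through docIds.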
