lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Renaud Delbru <renaud.del...@deri.org>
Subject Re: IndexingChain and TermHash
Date Fri, 06 Nov 2009 18:34:01 GMT
Hi Michael,

Thanks for the quick fix. I have tested it (indexing multiple documents 
+ searching), and it seems to work.

On 06/11/09 18:09, Michael McCandless wrote:
> To be honest, you are sort of forging new territory here :)
>    
I think so too, not an easy task ;o). I have seen that you have tried to 
make modular the indexing chain of Lucene (DocumentsWriter). I still try 
to have a good understanding of the default indexing, but I would like 
to see how it is easy (or difficult) to modify the format of the 
postings. From my current understanding, it seems that only the consumer 
at the end of this chain (FreqProxTermsWriter and its consumer 
FormatPostingsFieldsWriter) has to be changed to a certain extend.

Thanks for the help.
-- 
Renaud Delbru
> The intention of TermsHash was to allow a null nextTermsHash.
> Hopefully fixing a few places will in fact make it work that way.
>
> EG for this, maybe change (in TermsHashPerThread.java) this:
>
>      if (nextTermsHash != null) {
>        // We are primary
>        charPool = new CharBlockPool(termsHash.docWriter);
>        primary = true;
>      } else {
>        charPool = primaryPerThread.charPool;
>        primary = false;
>      }
>
> to:
>
>      if (nextTermsHash != null || primaryPerThread == null) {
>        // We are primary
>        charPool = new CharBlockPool(termsHash.docWriter);
>        primary = true;
>      } else {
>        charPool = primaryPerThread.charPool;
>        primary = false;
>      }
>
> ?
>
> Mike
>
> On Fri, Nov 6, 2009 at 12:29 PM, Renaud Delbru<renaud.delbru@deri.org>  wrote:
>    
>> Hi,
>>
>> I am trying to modify the indexing chain of Lucene. To start, I have
>> extracted and modified the default indexing chain. I have just removed the
>> TermVectorsTermsWriter from the chain, i.e., I instantiate a TermHash with a
>> null 'nextTermsHash'. So, in the chain, my inverted doc consumer looks like:
>>
>>   final InvertedDocConsumer termsHash = new TermsHash(documentsWriter, true,
>> freqProxWriter, null);
>>
>>  From looking at the code of TermsHash, it looks like you can pass a null
>> value as 'nextTermsHash' parameter. However, I got a NPE during the
>> TermsHashPerThread initialisation (line 48, primaryPerThread is null).
>>
>> Is it a normal behavior ? Do TermHash is always waiting for a non null
>> 'nextTermsHash' by design ?
>>
>> Thanks
>> --
>> Renaud Delbru
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>      
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>    


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message