lucene-java-user mailing list archives

From Preetham Kajekar <preet...@cisco.com>
Subject Re: Combining results of multiple indexes
Date Wed, 17 Dec 2008 14:40:11 GMT
Hi Grant,
 Thanks for your response. Replies inline.

Grant Ingersoll wrote:
>
> On Dec 17, 2008, at 12:57 AM, Preetham Kajekar wrote:
>
>> Hi,
>> I am new to Lucene. I am not using it as a pure text indexer.
>>
>> I am trying to index a Java object which has about 10 fields (like 
>> id, time, srcIp, dstIp) - most of them being numerical values.
>> In order to speed up indexing, I figured that having two separate 
>> indexers, each indexing a different set of fields, works great. 
>> So I have the first 5 fields in index1 and the remaining in index2.
>
> Can you explain this a bit more?  Are those two fields really large 
> or something?  How are you obtaining them?  How are you correlating 
> the documents between the two indexes?  Did you actually try a single 
> index and it was too slow?
I have a Java object which has about 10 fields. However, the fields are 
not fixed. The Java object is essentially a representation of syslogs 
from network devices, so different syslogs have different fields. Each 
field has a unique id and a value (mostly numeric types, so I convert it 
to a string). There are some fixed fields. So the object is a list of 
fields which is produced by a parser.
I am trying to index using two indexers in two separate threads, one for 
the fixed fields and another for the non-fixed fields. Except for a 
unique id, I do not store the fields in Lucene; I just index them. From 
the index, I get the unique id, which is all I care about (the objects 
are stored elsewhere and can be looked up based on this unique id).
I did try using a single indexer, but things were quite slow. Getting 
high throughput is crucial, and having two indexers seemed to do very 
well (more than twice as fast).
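
To make the cross-index AND concrete: since each index only hands back 
the unique id, one way to combine the two is to run one clause against 
each index and intersect the resulting id sets outside Lucene. Below is 
a minimal sketch in plain Java. It does not use any Lucene API; the 
class name, method name, and sample ids are made up for illustration, 
and it assumes both indexes record the same unique id for the same 
object.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class CrossIndexAnd {
    // Returns the ids present in both result sets, i.e. the AND of two
    // per-index queries. The inputs are left unmodified.
    public static Set<Long> intersect(Set<Long> idsFromIndex1, Set<Long> idsFromIndex2) {
        // Copy the smaller set, then keep only ids that the larger set also contains.
        boolean firstIsSmaller = idsFromIndex1.size() <= idsFromIndex2.size();
        Set<Long> smaller = firstIsSmaller ? idsFromIndex1 : idsFromIndex2;
        Set<Long> larger = firstIsSmaller ? idsFromIndex2 : idsFromIndex1;
        Set<Long> result = new TreeSet<>(smaller); // sorted copy, for stable output
        result.retainAll(larger);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical unique ids matching f1=1234 in index1 and f7=ABCD in index2.
        Set<Long> f1Hits = new HashSet<>(Arrays.asList(10L, 11L, 42L, 77L));
        Set<Long> f7Hits = new HashSet<>(Arrays.asList(42L, 77L, 99L));
        System.out.println(intersect(f1Hits, f7Hits)); // prints [42, 77]
    }
}
```

The cost of this approach grows with the number of hits per clause, so 
it is only a fallback if the readers cannot do the AND themselves.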

Further, the index will never be modified and I can have just one thread 
writing to the index. Any other performance tips would be very helpful. 
I have already looked at the wiki link regarding performance and am 
using some of the tips.

Thanks,
 ~preetham
>
>>
>>
>> Now, I want to have boolean AND queries looking for values in both 
>> indexes, like f1=1234 AND f7=ABCD. f1 and f7 are present in two 
>> separate indexes. Would using the MultiIndexReader help? Since I am 
>> doing an AND, I don't expect that it would work.
>>
>> Thanks,
>> ~preetham
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
> --------------------------
> Grant Ingersoll
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ


