lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chew Yee Chuang" <yeechu...@tecforte.com>
Subject RE: High CPU usage duing index and search
Date Mon, 13 Aug 2007 01:26:43 GMT
Hi testn,

I have tested Filter, it is pretty fast, but still take a lot of CPU resource, Maybe it could
due to the number of filter I run.

Thank you
eChuang, Chew


-----Original Message-----
From: testn [mailto:test1@doramail.com] 
Sent: Tuesday, August 07, 2007 10:37 PM
To: java-user@lucene.apache.org
Subject: RE: High CPU usage duing index and search


Check out Filter class. You can create a separate filter for each field and
then chain them together using ChainFilter. If you cache the filter, it will
be pretty fast. 


Chew Yee Chuang wrote:
> 
> Greetings,
> 
> Yes, process a little bit and stop for a while really reduce the CPU
> usage,
> but I need to find out a balance so that the indexing or searching will
> not
> have so much delay.
> 
> Execute 20,000 queries at a time is because the process is generating the
> aggregation data for reporting,
> E.g Gender (M,F), Department (Accounting, R&D, Financial,...etc), 
> 1Q - Gender:M AND Department: Accounting
> 2Q - Gender:M AND Department: R&D
> 3Q - Gender:M AND Department: Financial
> 4Q - Gender:F AND Department: Accounting
> 5Q - ....
> Thus, the more combination, the more query need to run. For now, I still
> can't get any idea on how to reduce it, just thinking maybe there is a
> different way to index it so that I can get It easily.
> 
> Any help would be appreciated.
> 
> Thanks
> eChuang, Chew
> 
> -----Original Message-----
> From: karl wettin [mailto:karl.wettin@gmail.com] 
> Sent: Thursday, August 02, 2007 7:11 AM
> To: java-user@lucene.apache.org
> Subject: Re: High CPU usage duing index and search
> 
> It sounds like you have a fairly busy system, perhaps 100% load on the
> process is not that strange, at least not during short periods of time.
> 
> A simpler solution would be to nice the process a little bit in order to
> give your background jobs some more time to think.
> 
> Running a profiler is still the best advice I can think of. It should
> clearly show you what is going on when you run out of CPU.
> 
> --  
> karl
> 
> 1 aug 2007 kl. 04.29 skrev Chew Yee Chuang:
> 
>> Hi,
>>
>> Thanks for the link provided, actually I've go through those  
>> article when I
>> developing the index and search function for my application. I  
>> haven’t try
>> profiler yet, but I monitor the CPU usage and notice that whatever  
>> index or
>> search performing, the CPU usage raise to 100%. Below I will try to
>> elaborate more on what my application is doing and how I index and  
>> search.
>>
>> There are many concurrent process running, first, the application  
>> will write
>> records that received into a text file with tab separated each  
>> different
>> field. Application will point to a new file every 10mins and start  
>> writing
>> to it. So every file will contains only 10mins record, approximate  
>> 600,000
>> records per file. Then, the indexing process will check whether  
>> there is a
>> text file to be index, if it is, the thread will wake up and start  
>> perform
>> indexing.
>>
>> The indexing process will first add documents to RAMDir, Then  
>> later, add
>> RAMDir into FSDir by calling addIndexNoOptimize() when there is  
>> 100,000
>> documents(32 fields per doc) in RAMDir. There is only 1 IndexWriter 
>> (FSDir)
>> was created but a few IndexWriter(RAMDir) was created during the whole
>> process. Below are some configuration for IndexWriters that I  
>> mentioned:-
>>
>> IndexWriter (RAMDir)
>> - SimpleAnalyzer
>> - setMaxBufferedDocs(10000)
>> - Filed.Store.YES
>> - Field.Index.NO_NORMS
>>
>> IndexWriter (FSDir)
>> - SimpleAnalyzer
>> - setMergeFactor(20)
>> - addIndexesNoOptimize()
>>
>> For the searching, because there are many queries(20,000) run  
>> continuously
>> to generate the aggregate table for reporting purpose. All this  
>> queries is
>> run in nested loop, and there is only 1 Searcher created, I try  
>> searcher and
>> filter as well, filter give me better result, but both also utilize  
>> lots of
>> CPU resources.
>>
>> Hope this info will help and sorry for my bad English.
>>
>> Thanks
>> eChuang, Chew
>>
>> -----Original Message-----
>> From: karl wettin [mailto:karl.wettin@gmail.com]
>> Sent: Tuesday, July 31, 2007 5:54 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: High CPU usage duing index and search
>>
>>
>> 31 jul 2007 kl. 05.25 skrev Chew Yee Chuang:
>>> But just notice that when Lucene performing search or index,
>>> the CPU usage on my machine raise to 100%, because of this issue,
>>> some of my
>>> others backend process will slow down eventually. Just want to know
>>> does
>>> anyone face this problem before ? and is it any idea on how to
>>> overcome this
>>> problem ?
>>
>> Did you run a profiler to see what it is that consume all the  
>> resources?
>> It is very hard to guess based on the information you supplied. Start
>> here:
>>
>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>>
>>
>> -- 
>> karl
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>> No virus found in this incoming message.
>> Checked by AVG Free Edition.
>> Version: 7.5.476 / Virus Database: 269.11.0/927 - Release Date:  
>> 7/30/2007
>> 5:02 PM
>>
>>
>> No virus found in this outgoing message.
>> Checked by AVG Free Edition.
>> Version: 7.5.476 / Virus Database: 269.11.0/929 - Release Date:  
>> 7/31/2007
>> 5:26 PM
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> No virus found in this incoming message.
> Checked by AVG Free Edition. 
> Version: 7.5.476 / Virus Database: 269.11.2/933 - Release Date: 8/2/2007
> 2:22 PM
>  
> 
> No virus found in this outgoing message.
> Checked by AVG Free Edition. 
> Version: 7.5.476 / Virus Database: 269.11.8/940 - Release Date: 8/6/2007
> 4:53 PM
>  
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/High-CPU-usage-duing-index-and-search-tf4190756.html#a12035676
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


No virus found in this incoming message.
Checked by AVG Free Edition. 
Version: 7.5.476 / Virus Database: 269.11.8/940 - Release Date: 8/6/2007 4:53 PM
 

No virus found in this outgoing message.
Checked by AVG Free Edition. 
Version: 7.5.476 / Virus Database: 269.11.15/949 - Release Date: 8/12/2007 11:03 AM
 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message