lucene-java-user mailing list archives

From testn <te...@doramail.com>
Subject RE: High CPU usage during index and search
Date Mon, 13 Aug 2007 08:34:08 GMT

To me, it looks like what you are trying to achieve is better suited to a
database, which can do the grouping and sorting for you. But if you still
want to do it with Lucene, you might want to post some code so that I can
go through it and see why it uses so many resources.
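
To illustrate the grouping idea: a database would answer this kind of report
with something like SELECT gender, department, COUNT(*) ... GROUP BY gender,
department, reading each record once instead of running one query per
combination. The plain-Java sketch below mimics that single pass. The class
name GroupCountSketch, the method name, and the two-column record layout are
assumptions made for illustration; nothing here is Lucene API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a database GROUP BY: one pass over the records.
final class GroupCountSketch {

    // Each record is assumed to be {gender, department}.
    // Returns a count per "gender|department" key, in first-seen order.
    static Map<String, Integer> countByGenderAndDepartment(List<String[]> records) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String[] record : records) {
            String key = record[0] + "|" + record[1];
            counts.merge(key, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> records = Arrays.asList(
            new String[] {"M", "Accounting"},
            new String[] {"M", "R&D"},
            new String[] {"F", "Accounting"},
            new String[] {"M", "Accounting"});
        // Every combination that occurs is counted in a single scan.
        System.out.println(countByGenderAndDepartment(records));
    }
}
```

With this shape, the cost grows with the number of records rather than with
the number of field-value combinations.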


Chew Yee Chuang wrote:
> 
> Hi testn,
> 
> I have tested Filter. It is pretty fast, but it still takes a lot of CPU
> resources. Maybe that is due to the number of filters I run.
> 
> Thank you
> eChuang, Chew
> 
> 
> -----Original Message-----
> From: testn [mailto:test1@doramail.com] 
> Sent: Tuesday, August 07, 2007 10:37 PM
> To: java-user@lucene.apache.org
> Subject: RE: High CPU usage during index and search
> 
> 
> Check out the Filter class. You can create a separate filter for each field
> and then chain them together using ChainedFilter. If you cache the filters,
> it will be pretty fast.
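
A minimal sketch of why cached filters help, assuming (as in Lucene 2.x) that
a filter's matches boil down to a java.util.BitSet with one bit per document.
The code below is plain Java, not Lucene API: the class CachedBitSetSketch,
its in-memory docs array, and the scans counter are hypothetical stand-ins.
Each field value is scanned once, cached, and every combination is then
answered by ANDing two cached bit sets.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of "one cached filter per field value".
final class CachedBitSetSketch {
    private final String[][] docs;                 // docs[i] = {gender, department}
    private final Map<String, BitSet> cache = new HashMap<>();
    int scans = 0;                                 // number of full scans performed

    CachedBitSetSketch(String[][] docs) {
        this.docs = docs;
    }

    // Bit i is set when docs[i][field] equals value. Computed once, then reused.
    BitSet bitsFor(int field, String value) {
        String key = field + ":" + value;
        BitSet bits = cache.get(key);
        if (bits == null) {
            bits = new BitSet(docs.length);
            for (int i = 0; i < docs.length; i++) {
                if (docs[i][field].equals(value)) {
                    bits.set(i);
                }
            }
            scans++;
            cache.put(key, bits);
        }
        return bits;
    }

    // Count of docs matching both values: AND of the two cached bit sets.
    int countBoth(String gender, String department) {
        BitSet result = (BitSet) bitsFor(0, gender).clone(); // clone so the cache stays intact
        result.and(bitsFor(1, department));
        return result.cardinality();
    }
}
```

With G gender values and D department values, this needs G + D scans rather
than G x D searches; the remaining per-combination work is a bitwise AND.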
> 
> 
> Chew Yee Chuang wrote:
>> 
>> Greetings,
>> 
>> Yes, processing a little and then pausing for a while really does reduce
>> the CPU usage, but I need to find a balance so that indexing and searching
>> are not delayed too much.
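
The "process a little, then pause" approach described here can be sketched as
follows. This is an illustrative helper only: ThrottledBatchSketch,
runThrottled, batchSize, and pauseMillis are hypothetical names, and the
right values depend on how much delay the reporting job can tolerate.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical throttle: run tasks in batches and sleep between batches.
final class ThrottledBatchSketch {

    // Runs every task in order, sleeping pauseMillis after each full batch
    // (but not after the last task). Returns the number of pauses taken.
    // If the thread is interrupted while sleeping, it stops early.
    static <T> int runThrottled(List<T> tasks, int batchSize, long pauseMillis,
                                Consumer<T> worker) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int pauses = 0;
        for (int i = 0; i < tasks.size(); i++) {
            worker.accept(tasks.get(i));
            boolean endOfBatch = (i + 1) % batchSize == 0;
            boolean moreWork = i + 1 < tasks.size();
            if (endOfBatch && moreWork) {
                try {
                    Thread.sleep(pauseMillis);   // yield the CPU to other processes
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return pauses;
                }
                pauses++;
            }
        }
        return pauses;
    }
}
```

A larger batchSize or a smaller pauseMillis shortens the total run time at the
cost of higher sustained CPU load, which is the balance being discussed.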
>> 
>> I execute 20,000 queries at a time because the process is generating the
>> aggregate data for reporting,
>> e.g. Gender (M, F), Department (Accounting, R&D, Financial, etc.):
>> 1Q - Gender:M AND Department: Accounting
>> 2Q - Gender:M AND Department: R&D
>> 3Q - Gender:M AND Department: Financial
>> 4Q - Gender:F AND Department: Accounting
>> 5Q - ....
>> Thus, the more combinations there are, the more queries need to run. For
>> now, I still have no idea how to reduce the number; I am just wondering
>> whether there is a different way to index the data so that I can get the
>> aggregates easily.
>> 
>> Any help would be appreciated.
>> 
>> Thanks
>> eChuang, Chew
>> 
>> -----Original Message-----
>> From: karl wettin [mailto:karl.wettin@gmail.com] 
>> Sent: Thursday, August 02, 2007 7:11 AM
>> To: java-user@lucene.apache.org
>> Subject: Re: High CPU usage during index and search
>> 
>> It sounds like you have a fairly busy system, perhaps 100% load on the
>> process is not that strange, at least not during short periods of time.
>> 
>> A simpler solution would be to nice the process a little bit in order to
>> give your background jobs some more time to think.
>> 
>> Running a profiler is still the best advice I can think of. It should
>> clearly show you what is going on when you run out of CPU.
>> 
>> --  
>> karl
>> 
>> On 1 Aug 2007, at 04:29, Chew Yee Chuang wrote:
>> 
>>> Hi,
>>>
>>> Thanks for the links. I actually went through those articles when I was
>>> developing the index and search functions for my application. I haven't
>>> tried a profiler yet, but I monitored the CPU usage and noticed that
>>> whenever indexing or searching is running, the CPU usage rises to 100%.
>>> Below I will try to elaborate on what my application is doing and how I
>>> index and search.
>>>
>>> There are many concurrent processes running. First, the application
>>> writes the records it receives into a text file, with a tab separating
>>> the fields. The application switches to a new file every 10 minutes and
>>> starts writing to it, so every file contains only 10 minutes of records,
>>> approximately 600,000 records per file. Then the indexing process checks
>>> whether there is a text file to be indexed; if there is, the thread wakes
>>> up and starts indexing.
>>>
>>> The indexing process first adds documents to a RAMDir and later, when
>>> there are 100,000 documents (32 fields per doc) in the RAMDir, adds the
>>> RAMDir to the FSDir by calling addIndexesNoOptimize(). Only one
>>> IndexWriter (FSDir) is created, but several IndexWriters (RAMDir) are
>>> created during the whole process. Below is the configuration of the
>>> IndexWriters I mentioned:
>>>
>>> IndexWriter (RAMDir)
>>> - SimpleAnalyzer
>>> - setMaxBufferedDocs(10000)
>>> - Field.Store.YES
>>> - Field.Index.NO_NORMS
>>>
>>> IndexWriter (FSDir)
>>> - SimpleAnalyzer
>>> - setMergeFactor(20)
>>> - addIndexesNoOptimize()
>>>
>>> As for searching, many queries (20,000) are run continuously to generate
>>> the aggregate table for reporting purposes. All these queries are run in
>>> a nested loop, and only one Searcher is created. I tried searching with
>>> a query and with a filter; the filter gives better results, but both use
>>> a lot of CPU resources.
>>>
>>> Hope this info will help and sorry for my bad English.
>>>
>>> Thanks
>>> eChuang, Chew
>>>
>>> -----Original Message-----
>>> From: karl wettin [mailto:karl.wettin@gmail.com]
>>> Sent: Tuesday, July 31, 2007 5:54 PM
>>> To: java-user@lucene.apache.org
>>> Subject: Re: High CPU usage during index and search
>>>
>>>
>>> On 31 Jul 2007, at 05:25, Chew Yee Chuang wrote:
>>>> But I just noticed that when Lucene is searching or indexing, the CPU
>>>> usage on my machine rises to 100%. Because of this, some of my other
>>>> backend processes eventually slow down. Has anyone faced this problem
>>>> before, and is there any idea on how to overcome it?
>>>
>>> Did you run a profiler to see what it is that consumes all the resources?
>>> It is very hard to guess based on the information you supplied. Start
>>> here:
>>>
>>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>>> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
>>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>>>
>>>
>>> -- 
>>> karl
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/High-CPU-usage-duing-index-and-search-tf4190756.html#a12035676
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> 
> 

-- 
View this message in context: http://www.nabble.com/High-CPU-usage-duing-index-and-search-tf4190756.html#a12122714
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

