lucene-java-user mailing list archives

From Ian Lea <ian....@gmail.com>
Subject Re: should I import the XML file into a mysql dataset ?
Date Tue, 29 Mar 2011 09:59:22 GMT
> 1 - I'm using Commons Digester as the XML parser. How can I find the
> bottleneck? Should I run the code with the Lucene queries part commented
> out and just leave the XML parsing?

That is what I was suggesting.
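
A minimal sketch of that kind of parse-only run, in case it helps. It uses the JDK's built-in SAX parser as a stand-in for Digester, and the `query` element name, the `ParseOnlyTimer` class and the inline sample XML are all assumptions made up for illustration, not anything from your code. The point is only to time the XML reading with no Lucene calls at all.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class ParseOnlyTimer {

    /** Counts <query> elements (hypothetical element name); does no searching. */
    static final class QueryCounter extends DefaultHandler {
        int count;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            if ("query".equals(qName)) {
                count++;
            }
        }
    }

    /** Parses the stream without touching Lucene; returns the number of queries seen. */
    public static int parseOnly(InputStream in) throws Exception {
        QueryCounter handler = new QueryCounter();
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return handler.count;
    }

    public static void main(String[] args) throws Exception {
        // Tiny inline sample; swap in a stream over the real queries file.
        String xml = "<queries><query>foo</query><query>bar</query></queries>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));

        long start = System.nanoTime();
        int n = parseOnly(in);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("parsed " + n + " queries in " + elapsedMs + " ms");
    }
}
```

If the parse-only time on the real file is a small fraction of the 2+ hours, the XML side is not the bottleneck and moving the queries into a database is unlikely to help much.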

> 2 - I actually also wanted to know the following: how long does it take to
> run a 100MB queries text file against each single document of a 100MB
> collection, on an Intel Dual Duo Core with 4GB RAM? Are we talking about
> a few hours? Can I have an estimate?

How many queries are there in the file?
How many documents are there in the lucene index?
How big is the lucene index?
How long does a typical single query take?

What do you mean by "run ... against each single document"?
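
Once you have answers to the first and last of those, a rough estimate is just the number of queries multiplied by the typical time per query. A sketch of the arithmetic follows; the class name and the numbers in `main` are hypothetical placeholders, not measurements, so plug in your own.

```java
public class RunTimeEstimate {

    /** Total time in seconds, assuming every query costs roughly avgQueryMillis. */
    public static double estimateSeconds(long queryCount, double avgQueryMillis) {
        return queryCount * avgQueryMillis / 1000.0;
    }

    public static void main(String[] args) {
        // Hypothetical figures for illustration only; measure your own.
        long queryCount = 100_000;
        double avgQueryMillis = 50.0;

        double seconds = estimateSeconds(queryCount, avgQueryMillis);
        System.out.printf("estimated total: %.1f minutes%n", seconds / 60.0);
    }
}
```

This ignores XML parsing time and any warm-up effects, so treat the result as an order-of-magnitude figure only.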


--
Ian.


> On 29 March 2011 11:43, Ian Lea <ian.lea@gmail.com> wrote:
>
>> You need to figure out what is taking the time, for example by reading
>> the XML file without making any lucene queries.  What XML parsing
>> process are you using?  Some are faster than others.  A google search
>> should find loads of info.
>>
>> If it turns out that it is lucene searching taking most of the time,
>> see http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>>
>>
>> But do the figuring out first - there is little point in speeding up
>> the bit that is already quick.
>>
>>
>> --
>> Ian.
>>
>>
>> On Tue, Mar 29, 2011 at 10:22 AM, Patrick Diviacco
>> <patrick.diviacco@gmail.com> wrote:
>> > hi,
>> >
>> > I am performing multiple queries (stored in a 100MB XML file) against a
>> > collection (indexed with Lucene; it was previously stored in a 100MB XML
>> > file).
>> >
>> > The process seems pretty long on my machine (more than 2 hours), so I was
>> > wondering if importing the 100MB queries XML file into a mysql dataset and
>> > extracting them with Java would dramatically improve the performance
>> > (rather than working with Java + an XML text file).
>> >
>> > thanks
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
