lucene-solr-user mailing list archives

From Bernd Fehling <bernd.fehl...@uni-bielefeld.de>
Subject Re: Dataimport performance
Date Wed, 15 Dec 2010 14:39:20 GMT
We are currently running Solr 4.x from trunk.

-d64 -Xms10240M -Xmx10240M

Total Rows Fetched: 24935988
Total Documents Skipped: 0
Total Documents Processed: 24568997
Time Taken: 5:55:19.104

24.5 million docs as XML from the filesystem in less than 6 hours.

Maybe your MySQL is the bottleneck?
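
For a rough comparison of the two imports, documents per second is probably the easiest number to look at. A minimal Python sketch, assuming the "~8 million rows" and "around 3 hours" from the quoted message are close enough (the function name docs_per_second is only for illustration):

```python
from datetime import timedelta


def docs_per_second(docs: int, elapsed: timedelta) -> float:
    """Average import throughput over the whole run."""
    return docs / elapsed.total_seconds()


# Figures from the DataImportHandler status output above.
xml_import = docs_per_second(
    24_568_997, timedelta(hours=5, minutes=55, seconds=19.104)
)

# Approximate figures from the quoted message: ~8 million rows in ~3 hours.
mysql_import = docs_per_second(8_000_000, timedelta(hours=3))

print(f"XML from filesystem: {xml_import:.0f} docs/s")
print(f"MySQL full import:   {mysql_import:.0f} docs/s")
```

That comes to roughly 1,150 docs/s for the XML import against about 740 docs/s for the MySQL import. The documents, schema and hardware differ, so this is only a ballpark figure.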

Regards
Bernd


On 15.12.2010 14:40, Robert Gründler wrote:
> Hi,
> 
> we're looking for some comparison benchmarks for importing large tables from a MySQL database (full import).
> 
> Currently, a full import of ~8 million rows from a MySQL database takes around 3 hours on a quad-core machine with 16 GB of
> RAM and a RAID 10 storage setup. Solr is running on an Apache Tomcat instance, where it is the only app. The Tomcat instance
> has the following memory-related JAVA_OPTS:
> 
> -Xms4096M -Xmx5120M
> 
> 
> The data-config.xml looks like this (only 1 entity):
> 
>       <entity name="track" query="select t.id as id, t.title as title, l.title as label from track t left join label l on (l.id = t.label_id) where t.deleted = 0" transformer="TemplateTransformer">
>         <field column="title" name="title_t" />
>         <field column="label" name="label_t" />
>         <field column="id" name="sf_meta_id" />
>         <field column="metaclass" template="Track" name="sf_meta_class"/>
>         <field column="metaid" template="${track.id}" name="sf_meta_id"/>
>         <field column="uniqueid" template="Track_${track.id}" name="sf_unique_id"/>
>         
>         <entity name="artists" query="select a.name as artist from artist a left join track_artist ta on (ta.artist_id = a.id) where ta.track_id=${track.id}">
>           <field column="artist" name="artists_t" />
>         </entity>
>         
>       </entity>
> 
> 
> We have the feeling that 3 hours for this import is quite long, given the performance of the server running Solr/MySQL.
> 
> Are we wrong with that assumption, or do people experience similar import times with this amount of data?
> 
> 
> thanks!
> 
> 
> -robert
> 
> 
> 

-- 
*************************************************************
Bernd Fehling                Universitätsbibliothek Bielefeld
Dipl.-Inform. (FH)                        Universitätsstr. 25
Tel. +49 521 106-4060                   Fax. +49 521 106-4052
bernd.fehling@uni-bielefeld.de                33615 Bielefeld

BASE - Bielefeld Academic Search Engine - www.base-search.net
*************************************************************
