lucene-solr-user mailing list archives

From Troy Edwards <tedwards415...@gmail.com>
Subject Re: Data Import Handler takes different time on different machines
Date Wed, 03 Feb 2016 13:28:42 GMT
While checking disk space on the servers, I found that log files from
Sept 2015 are still there. These are named solr_gc_log_datetime and
solr_log_datetime.

Is the default logging for Solr ok for production systems or does it need
to be changed/tuned?

Thanks,

On Tue, Feb 2, 2016 at 2:04 PM, Troy Edwards <tedwards415107@gmail.com>
wrote:

> That is helpful!
>
> Thank you for the thoughts.
>
>
> On Tue, Feb 2, 2016 at 12:17 PM, Erick Erickson <erickerickson@gmail.com>
> wrote:
>
>> Scratch that installation and start over?
>>
>> Really, it sounds like something is fundamentally messed up with the
>> Linux install. Perhaps something as simple as file paths, or you have
>> old jars hanging around that are mis-matched. Or someone manually
>> deleted files from the Solr install. Or your disk filled up. Or....
>>
>> How sure are you that the linux setup was done properly?
>>
>> Not much help I know,
>> Erick
>>
>> On Tue, Feb 2, 2016 at 10:11 AM, Troy Edwards <tedwards415107@gmail.com>
>> wrote:
>> > Rerunning the Data Import Handler on the linux machine has
>> > started producing some errors and warnings:
>> >
>> > On the node on which DIH was started:
>> >
>> > WARN SolrWriter Error creating document : SolrInputDocument
>> >
>> > org.apache.solr.common.SolrException: No registered leader was found
>> > after waiting for 4000ms , collection: collectionmain slice: shard1
>> >
>> >
>> >
>> > On the second node:
>> >
>> > WARN ReplicationHandler Exception while writing response for params:
>> > command=filecontent&checksum=true&generation=1047&qt=/replication&wt=filestream&file=_1oo_Lucene50_0.tip
>> >
>> > java.nio.file.NoSuchFileException:
>> > /var/solr/data/collectionmain_shard2_replica1/data/index/_1oo_Lucene50_0.tip
>> >
>> >
>> > ERROR
>> >
>> > Index fetch failed :org.apache.solr.common.SolrException: Unable to
>> > download _169.si completely. Downloaded 0!=466
>> >
>> >
>> > ReplicationHandler Index fetch failed
>> > :org.apache.solr.common.SolrException: Unable to download _169.si
>> > completely. Downloaded 0!=466
>> >
>> > WARN
>> > IndexFetcher File _1pd_Lucene50_0.tim did not match. expected checksum is
>> > 3549855722 and actual is checksum 2062372352. expected length is 72522 and
>> > actual length is 39227
>> >
>> > WARN UpdateLog Log replay finished. recoveryInfo=RecoveryInfo{adds=840638
>> > deletes=0 deleteByQuery=0 errors=0 positionOfStart=554264}
>> >
>> >
>> > Any suggestions about this?
>> >
>> > Thanks
>> >
>> > On Mon, Feb 1, 2016 at 10:03 PM, Erick Erickson <erickerickson@gmail.com>
>> > wrote:
>> >
>> >> The first thing I'd be looking at is how the JDBC batch size compares
>> >> between the two machines.....
>> >>
>> >> AFAIK, Solr shouldn't notice the difference, and since a large majority
>> >> of the development is done on Linux-based systems, I'd be surprised if
>> >> this was worse than Windows, which would lead me to the one thing that
>> >> is definitely different between the two: Your JDBC driver and its settings.
>> >> At least that's where I'd look first.
>> >>
>> >> If nothing immediate pops up, I'd probably write a small driver program to
>> >> just access the database from the two machines and process your 10M
>> >> records _without_ sending them to Solr and see what the comparison is.
>> >>
>> >> You can also forgo DIH and do a simple import program via SolrJ. The
>> >> advantage here is that the comparison I'm talking about above is
>> >> really simple, just comment out the call that sends data to Solr. Here's an
>> >> example...
>> >>
>> >> https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Mon, Feb 1, 2016 at 7:34 PM, Troy Edwards <tedwards415107@gmail.com>
>> >> wrote:
>> >> > Sorry, I should explain further. The Data Import Handler had been running
>> >> > for a while retrieving only about 150000 records from the database. Both in
>> >> > the development env (windows) and on the linux machine it took about 3 mins.
>> >> >
>> >> > The query has been changed and we are now trying to retrieve about 10
>> >> > million records. We do expect the time to increase.
>> >> >
>> >> > With the new query the time taken on the windows machine is consistently around
>> >> > 40 mins. While the DIH is running, queries slow down, i.e. a query that
>> >> > typically took 60 msec takes 100 msec.
>> >> >
>> >> > The time taken on the linux machine is consistently around 2.5 hours. While the
>> >> > DIH is running, queries take about 200 to 400 msec.
>> >> >
>> >> > Thanks!
>> >> >
>> >> > On Mon, Feb 1, 2016 at 8:45 PM, Erick Erickson <erickerickson@gmail.com>
>> >> > wrote:
>> >> >
>> >> >> What happens if you run just the SQL query from the
>> >> >> windows box and from the linux box? Is there any chance
>> >> >> that somehow the connection from the linux box is
>> >> >> just slower?
>> >> >>
>> >> >> Best,
>> >> >> Erick
>> >> >>
>> >> >> On Mon, Feb 1, 2016 at 6:36 PM, Alexandre Rafalovitch
>> >> >> <arafalov@gmail.com> wrote:
>> >> >> > What are you importing from? Are the source and Solr machine collocated
>> >> >> > in the same fashion on dev and prod?
>> >> >> >
>> >> >> > Have you tried running this on a Linux dev machine? Perhaps your prod
>> >> >> > machine is loaded much more than dev.
>> >> >> >
>> >> >> > Regards,
>> >> >> >    Alex.
>> >> >> > ----
>> >> >> > Newsletter and resources for Solr beginners and intermediates:
>> >> >> > http://www.solr-start.com/
>> >> >> >
>> >> >> >
>> >> >> > On 2 February 2016 at 13:21, Troy Edwards <tedwards415107@gmail.com>
>> >> >> > wrote:
>> >> >> >> We have a windows development machine on which the Data Import Handler
>> >> >> >> consistently takes about 40 mins to finish. Queries run fine. JVM memory is
>> >> >> >> 2 GB per node.
>> >> >> >>
>> >> >> >> But on a linux machine it consistently takes about 2.5 hours. The queries
>> >> >> >> also run slower. JVM memory here is also 2 GB per node.
>> >> >> >>
>> >> >> >> How should I go about analyzing and tuning the linux machine?
>> >> >> >>
>> >> >> >> Thanks
>> >> >>
>> >>
>>
>
>
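
The "small driver program" suggested in the quoted thread (read the rows from the database on each machine without sending anything to Solr) might look roughly like the sketch below. It is only an illustration under assumptions: the class name JdbcReadTimer, the command-line arguments, and the choice to pass the fetch size as a parameter are all hypothetical and not taken from the thread. The JDBC driver jar for the database in question would have to be on the classpath, and some drivers treat setFetchSize() only as a hint.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper: reads every row of a query and reports throughput.
// Nothing is sent to Solr, so only database and driver time is measured.
class JdbcReadTimer {

    // Rows per second for a given row count and elapsed time in nanoseconds.
    static double rowsPerSecond(long rows, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            return 0.0;
        }
        return rows * 1_000_000_000.0 / elapsedNanos;
    }

    // Iterates the whole result set, touching every column so the driver
    // actually transfers the data, and returns the number of rows read.
    static long drain(ResultSet rs) throws SQLException {
        int columns = rs.getMetaData().getColumnCount();
        long rows = 0;
        while (rs.next()) {
            for (int i = 1; i <= columns; i++) {
                rs.getObject(i);
            }
            rows++;
        }
        return rows;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 5) {
            System.err.println(
                "usage: JdbcReadTimer <jdbcUrl> <user> <password> <fetchSize> <sql>");
            return;
        }
        int fetchSize = Integer.parseInt(args[3]);
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2]);
             Statement stmt = conn.createStatement(
                 ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            // Assumption: the fetch size is the setting being compared.
            // Drivers may ignore or reinterpret this hint.
            stmt.setFetchSize(fetchSize);
            long start = System.nanoTime();
            long rows;
            try (ResultSet rs = stmt.executeQuery(args[4])) {
                rows = drain(rs);
            }
            long elapsed = System.nanoTime() - start;
            System.out.printf("rows=%d seconds=%.1f rows/sec=%.0f%n",
                rows, elapsed / 1e9, rowsPerSecond(rows, elapsed));
        }
    }
}
```

Running it with the same query and fetch size on the windows and linux machines would show whether the gap is already present before Solr is involved. If the two timings are close, the database connection and driver are less likely to be the cause.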
