lucene-dev mailing list archives

From "Shawn Heisey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3954) Option to have updateHandler and DIH skip updateLog
Date Tue, 16 Oct 2012 20:41:03 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477326#comment-13477326 ]

Shawn Heisey commented on SOLR-3954:
------------------------------------

A completed import with updateLog turned off:

{code}
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">0</int>
</lst>
<lst name="initArgs">
  <lst name="defaults">
    <str name="config">dih-config.xml</str>
  </lst>
</lst>
<str name="status">idle</str>
<str name="importResponse"/>
<lst name="statusMessages">
  <str name="Total Requests made to DataSource">1</str>
  <str name="Total Rows Fetched">12947488</str>
  <str name="Total Documents Skipped">0</str>
  <str name="Full Dump Started">2012-10-16 07:46:01</str>
  <str name="">Indexing completed. Added/Updated: 12947488 documents. Deleted 0 documents.</str>
  <str name="Committed">2012-10-16 11:17:48</str>
  <str name="Total Documents Processed">12947488</str>
  <str name="Time taken">3:31:47.508</str>
</lst>
<str name="WARNING">This response format is experimental.  It is likely to change in the future.</str>
</response>
{code}
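For anyone comparing runs, the figures in the status response above can be turned into a rough throughput number. The following is only a sketch, not part of DIH or SolrJ: the helper names (parse_duration, import_rate) are hypothetical, the XML is a trimmed copy of the response above, and it assumes "Time taken" always has the H:MM:SS.mmm form shown here.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the statusMessages block from the response above.
STATUS_XML = """<response>
<lst name="statusMessages">
  <str name="Total Rows Fetched">12947488</str>
  <str name="Total Documents Processed">12947488</str>
  <str name="Time taken">3:31:47.508</str>
</lst>
</response>"""


def parse_duration(text):
    """Convert an H:MM:SS.mmm string into seconds (format assumed from the sample)."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def import_rate(xml_text):
    """Return (documents, seconds, documents per second) for a status response."""
    root = ET.fromstring(xml_text)
    # Collect the name/value pairs under <lst name="statusMessages">.
    messages = {
        node.get("name"): node.text
        for node in root.findall("./lst[@name='statusMessages']/str")
    }
    docs = int(messages["Total Documents Processed"])
    seconds = parse_duration(messages["Time taken"])
    return docs, seconds, docs / seconds


if __name__ == "__main__":
    docs, seconds, rate = import_rate(STATUS_XML)
    print(f"{docs} documents in {seconds:.0f} s, about {rate:.0f} docs/s")
```

Running the same calculation against a status response from an import with updateLog enabled would give a like-for-like comparison of indexing rates.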

                
> Option to have updateHandler and DIH skip updateLog
> ---------------------------------------------------
>
>                 Key: SOLR-3954
>                 URL: https://issues.apache.org/jira/browse/SOLR-3954
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>    Affects Versions: 4.0
>            Reporter: Shawn Heisey
>             Fix For: 4.1
>
>
> The updateLog feature makes updates take longer, likely because of the I/O
> time required to write the additional information to disk.  It may take as
> much as three times as long for the indexing portion of the process.  I'm
> not sure whether it affects the time to commit, but I would imagine that the
> difference there is small or zero.  When doing incremental updates/deletes
> on an existing index, the time lag is probably very small and unimportant.
>
> When doing a full reindex (which may happen via DIH), especially if this is
> done in a build core that is then swapped with a live core, this performance
> hit is unacceptable.  It seems to make the import take about three times as
> long.
>
> An option to have an update skip the updateLog would be very useful for
> these situations.  It should have a method in SolrJ and be exposed in DIH as
> well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

