lucene-dev mailing list archives

From "Ramkumar Aiyengar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-7820) IndexFetcher should delete the current index directory before downloading the new index when isFullCopyNeeded==true
Date Sat, 25 Jul 2015 11:14:04 GMT

    [ https://issues.apache.org/jira/browse/SOLR-7820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641543#comment-14641543
] 

Ramkumar Aiyengar commented on SOLR-7820:
-----------------------------------------

I agree there are a few issues here, but deleting the current index just brushes
them all under the carpet and adds risk.

 - The current default of 100 updates for {{UpdateLog}} is insufficient in many cases.
I made that number configurable, so if a replica is only a few thousand updates behind, just raising it might work.
But I think {{UpdateLog}} has scaling limitations, so YMMV. I thought {{CdcrUpdateLog}} came
about to overcome this scaling limitation -- but I haven't looked at it enough to know if
it can replace {{UpdateLog}}, perhaps [~erickerickson] or [~yseeley@gmail.com] know.
 - The other thing which could vastly improve this situation, even if a full recovery is
needed, is synchronizing commits across replicas, since recovery skips segments already present
in the current index. I believe [~varunthacker] was looking at this, but I can't find the
issue now.
 - Regardless, I agree that it would be a good enhancement to calculate ahead of time how
much space is needed for recovery and cleanly abort instead of trying and running out of space.
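
To make the last point concrete, here is a minimal sketch of such a pre-flight check. It is not code from {{IndexFetcher}}: the class {{RecoverySpaceCheck}}, its method names, the name-plus-length comparison used to decide that a file is already present, and the safety margin are all assumptions made for illustration. It only shows the idea of summing the bytes that still need to be fetched and aborting before the download starts if they will not fit.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Solr's IndexFetcher. */
public class RecoverySpaceCheck {

    /**
     * Sums the sizes of the remote files that would actually have to be downloaded.
     * A file is skipped when a local file with the same name and length exists;
     * matching on name and length alone is a simplification made for this sketch.
     */
    public static long bytesToDownload(Map<String, Long> remoteFiles, Path localIndexDir)
            throws IOException {
        long needed = 0L;
        for (Map.Entry<String, Long> e : remoteFiles.entrySet()) {
            Path local = localIndexDir.resolve(e.getKey());
            boolean alreadyPresent =
                    Files.isRegularFile(local) && Files.size(local) == e.getValue();
            if (!alreadyPresent) {
                needed += e.getValue();
            }
        }
        return needed;
    }

    /** True when the required bytes plus a safety margin fit in the usable space. */
    public static boolean fits(long neededBytes, long usableBytes, double safetyMargin) {
        return neededBytes * (1.0 + safetyMargin) <= usableBytes;
    }

    /** Throws before any download starts if the file store holding the index is too small. */
    public static void checkOrAbort(Map<String, Long> remoteFiles, Path localIndexDir,
                                    double safetyMargin) throws IOException {
        long needed = bytesToDownload(remoteFiles, localIndexDir);
        FileStore store = Files.getFileStore(localIndexDir);
        long usable = store.getUsableSpace();
        if (!fits(needed, usable, safetyMargin)) {
            throw new IOException("Aborting recovery: need " + needed
                    + " bytes but only " + usable + " are usable");
        }
    }
}
```

A caller would pass the file list reported by the peer (name to size) and the local index directory, e.g. {{RecoverySpaceCheck.checkOrAbort(remoteFiles, indexDir, 0.1)}}, and treat the exception as a clean abort. How the real file list is obtained and how files are matched is left open here.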


> IndexFetcher should delete the current index directory before downloading the new index when isFullCopyNeeded==true
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-7820
>                 URL: https://issues.apache.org/jira/browse/SOLR-7820
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication (java)
>            Reporter: Timothy Potter
>
> When a replica is trying to recover and its IndexFetcher decides it needs to pull the
full index from a peer (isFullCopyNeeded == true), then the existing index directory should
be deleted before the full copy is started to free up disk to pull a fresh index, otherwise
the server will potentially need 2x the disk space (old + incoming new). Currently, the IndexFetcher
removes the index directory after the new one is downloaded; however, once the fetcher decides
a full copy is needed, what is the value of the existing index? It's clearly out-of-date and
should not serve queries. Since we're deleting data preemptively, maybe this should be an
advanced configuration property, only to be used by those that are disk-space constrained
(which I'm seeing more and more with people deploying high-end SSDs - they typically don't
have 2x the disk capacity required by an index).




