lucene-issues mailing list archives

From "ASF subversion and git services (Jira)" <>
Subject [jira] [Commented] (SOLR-13101) Shared storage support in SolrCloud
Date Tue, 26 Nov 2019 18:59:00 GMT


ASF subversion and git services commented on SOLR-13101:

Commit db72b8b2061875bd421e1a657caecbca0c922817 in lucene-solr's branch refs/heads/jira/SOLR-13101
from Andy Vuong

SOLR-13101: Address flakiness of tests using async pulls and handle interrupt properly (#1029)

> Shared storage support in SolrCloud
> -----------------------------------
>                 Key: SOLR-13101
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>            Reporter: Yonik Seeley
>            Priority: Major
>          Time Spent: 5h 50m
>  Remaining Estimate: 0h
> Solr should have first-class support for shared storage (blob/object stores like S3,
> Google Cloud Storage, etc., and shared filesystems like HDFS, NFS, etc.).
> The key component will likely be a new replica type for shared storage.  It would have
> many of the benefits of the current "pull" replicas (not indexing on all replicas, all shards
> identical with no shards getting out-of-sync, etc.), but would have additional benefits:
>  - Any shard could become leader (the blob store always has the index)
>  - Better elasticity scaling down
>    - durability not linked to the number of replicas; a single replica could be common for
>      write workloads
>    - could drop to 0 replicas for a shard when not needed (blob store always has index)
>  - Allow for higher performance write workloads by skipping the transaction log
>    - don't pay for what you don't need
>    - a commit will be necessary to flush to stable storage (blob store)
>  - A lot of the complexity and failure modes go away
> An additional component is a Directory implementation that will work well with blob stores.
> We probably want one that treats local disk as a cache, since the latency to remote storage
> is so large.  I think there are still some "locking" issues to be solved here (ensuring that
> more than one writer to the same index won't corrupt it).  This should probably be pulled
> out into a different JIRA issue.
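
The "local disk as a cache" idea in the description can be illustrated with a
minimal read-through sketch. This is not code from the SOLR-13101 branch and it
is not a Lucene Directory implementation: BlobStore, InMemoryBlobStore, and
LocalDiskCache are hypothetical names invented for this example. The sketch
assumes flat file names, a single JVM, and files that never change once
written. It does not address the multi-writer "locking" issue mentioned above.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a remote blob store; not a Solr or Lucene API. */
interface BlobStore {
    byte[] fetch(String name) throws IOException;
}

/** In-memory BlobStore that counts remote fetches, for illustration only. */
final class InMemoryBlobStore implements BlobStore {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();
    final AtomicInteger fetchCount = new AtomicInteger();

    void put(String name, byte[] data) {
        blobs.put(name, data.clone());
    }

    @Override
    public byte[] fetch(String name) throws IOException {
        byte[] data = blobs.get(name);
        if (data == null) {
            throw new FileNotFoundException(name);
        }
        fetchCount.incrementAndGet();
        return data.clone();
    }
}

/** Read-through cache: serve files from local disk, pulling from the blob store on a miss. */
final class LocalDiskCache {
    private final Path cacheDir;
    private final BlobStore remote;

    LocalDiskCache(Path cacheDir, BlobStore remote) {
        this.cacheDir = cacheDir;
        this.remote = remote;
    }

    /** Assumes {@code name} is a flat file name with no path separators. */
    synchronized byte[] read(String name) throws IOException {
        Path local = cacheDir.resolve(name);
        if (Files.exists(local)) {
            // Cache hit: no remote round trip.
            return Files.readAllBytes(local);
        }
        // Cache miss: pay the remote latency once.
        byte[] data = remote.fetch(name);
        // Write to a temp file, then move it into place, so a partially
        // written file is never visible under the final name.
        Path tmp = Files.createTempFile(cacheDir, "pull-", ".tmp");
        Files.write(tmp, data);
        Files.move(tmp, local, StandardCopyOption.ATOMIC_MOVE);
        return data;
    }
}
```

In this sketch, reading the same file twice through LocalDiskCache.read hits the
blob store only once; the second read is served from the cache directory. A
real implementation would also need eviction and a way to coordinate writers.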
