lucene-solr-user mailing list archives

From "juergen.wagner@devoteam.com" <juergen.wag...@devoteam.com>
Subject Re: Replication for SolrCloud
Date Sun, 19 Apr 2015 11:46:10 GMT
In simple words:

HDFS is good for file-oriented replication. Solr is good for index replication.

Consequently, if the update operations of an application (like Solr) are not atomic at the
file level, HDFS is not adequate, as is the case for Solr with live index updates. Running Solr
on HDFS (as a file system) will impose limitations due to HDFS properties. Indexing, however,
still won't use Hadoop.

If you produce indexes and distribute them as finalized, read-only structures (e.g., through
Hadoop jobs), HDFS is fine. In that case, Solr hardly needs to be aware of HDFS.
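
To illustrate the finalized, read-only case, here is a minimal Python sketch of the
build-then-publish pattern. It assumes a local file system as a stand-in for HDFS;
publish_index and build_segment_files are hypothetical names for illustration, not Solr or
Hadoop APIs. The index is written to a staging directory and only becomes visible under its
final name through a single rename, so readers never see a half-written index.

```python
import os
import tempfile


def publish_index(build_segment_files, final_dir):
    """Build an index in a staging directory, then publish it with one rename.

    build_segment_files is a hypothetical callable that writes all index
    files into the directory it is given.
    """
    parent = os.path.dirname(os.path.abspath(final_dir))
    # Stage next to the target so the rename stays on the same file system.
    staging = tempfile.mkdtemp(prefix=".staging-", dir=parent)
    build_segment_files(staging)
    # Single rename: the finished index appears under its final name at once.
    # Raises OSError if final_dir already exists and is not empty.
    os.rename(staging, final_dir)
    return final_dir
```

Live index updates do not fit this pattern, because they keep modifying files after readers
have started using them.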

The third one in the picture is record-based replication, to be handled by HBase, Cassandra,
or ZooKeeper, depending on requirements.

Cheers,
Jürgen