lucene-solr-user mailing list archives

From Jan Høydahl <jan....@cominvent.com>
Subject Re: SolrCloud Feedback
Date Mon, 14 Feb 2011 15:40:39 GMT
Some more comments:

f) For consistency, the Java options should all be prefixed with solr.*, even if they relate
   to embedded ZK:
   -Dsolr.hostPort=8900 -Dsolr.zkRun -Dsolr.zkHost=localhost:9900 -Dsolr.zkBootstrap_confdir=./solr/conf

g) I often share parts of my config between cores, e.g. a common schema.xml or synonyms.xml.
   In the file-based mode I can thus use ../../common_conf/synonyms.xml or similar.
   I have not tried to bootstrap such a config into ZK, but I assume it will not work.
   ZK mode should support such a use case, either by supporting notations like ".."
   or by allowing an explicit ZK name space: zk://configs/common-cfg/synonyms.xml
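
   To illustrate both notations, here is a minimal sketch of how a resource name could be
   resolved to a ZK node path. Everything in it is an assumption made for illustration: the
   function name, the "/configs" root and the "/configs/web" node are hypothetical, and this
   is not an existing Solr or ZooKeeper API.

```python
import posixpath


def resolve_zk_resource(config_root, conf_node, resource):
    """Resolve a resource name to an absolute ZK node path.

    Hypothetical helper, not part of Solr: 'resource' is either relative
    to 'conf_node' (and may contain "..") or an explicit zk:// reference.
    """
    if resource.startswith("zk://"):
        # Explicit name space, e.g. zk://configs/common-cfg/synonyms.xml
        path = "/" + resource[len("zk://"):]
    else:
        # Relative notation, e.g. ../common-cfg/synonyms.xml
        path = posixpath.join(conf_node, resource)

    # Collapse "." and ".." segments the same way a file path would be.
    path = posixpath.normpath(path)

    # Refuse anything that ends up outside the shared config root.
    root = config_root.rstrip("/")
    if not path.startswith(root + "/"):
        raise ValueError("resource escapes config root: " + path)
    return path
```

   With these assumed names, both "../common-cfg/synonyms.xml" (relative to /configs/web) and
   "zk://configs/common-cfg/synonyms.xml" resolve to /configs/common-cfg/synonyms.xml.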

h) Support for dev / test / prod environments
   In real life you want to develop in one environment, test in another and run
   production in a third.
   Thus, the ZK data structure should have a clear separation between logical
   feature configuration and physical deployment config.

   Perhaps a new level above /COLLECTIONS could be used to model this, e.g.
   /ENV/PROD/COLLECTIONS/WEB/SHARDS/shardA/prod01.server.com:8080
   /ENV/PROD/COLLECTIONS/WEB/SHARDS/shardB/prod02.server.com:8080
   /ENV/PROD/COLLECTIONS/FILES/SHARDS/shardA/prod03.server.com:8080
   /ENV/TEST/COLLECTIONS/WEB/SHARDS/shardA/test01.server.com:8080
   /ENV/TEST/COLLECTIONS/WEB/SHARDS/shardB/test01.server.com:9090
   /ENV/TEST/COLLECTIONS/FILES[@configName=TESTFILES]/SHARDS/shardA/test01.server.com:7070
   
   When starting Solr we may specify the environment: -Dsolr.env=TEST (or configure a default).
   The main benefit is that we can maintain and store one single ZK config in our SCM,
   distribute the same configs to all servers and, if we like, point all envs to the
   same ZK ensemble.

   In the future, we can use this for automatic install of a new node as well:
   By simply adding a ZK entry in the right place, the node can discover "who it is"
   from ZK.
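
   As a rough sketch of that discovery step, the code below scans a list of node paths
   laid out as proposed above and returns the collections and shards assigned to one node
   in one environment. The function name and the in-memory path list are assumptions for
   illustration; a real implementation would read the children from ZK instead.

```python
# Paths copied from the proposed layout above; a plain list stands in for ZK.
EXAMPLE_PATHS = [
    "/ENV/PROD/COLLECTIONS/WEB/SHARDS/shardA/prod01.server.com:8080",
    "/ENV/PROD/COLLECTIONS/WEB/SHARDS/shardB/prod02.server.com:8080",
    "/ENV/PROD/COLLECTIONS/FILES/SHARDS/shardA/prod03.server.com:8080",
    "/ENV/TEST/COLLECTIONS/WEB/SHARDS/shardA/test01.server.com:8080",
    "/ENV/TEST/COLLECTIONS/WEB/SHARDS/shardB/test01.server.com:9090",
]


def find_assignments(paths, env, host_port):
    """Return (collection, shard) pairs served by host_port in env.

    Hypothetical helper: 'env' plays the role of -Dsolr.env and
    'host_port' is the node's own address, e.g. "test01.server.com:8080".
    """
    assignments = []
    for path in paths:
        parts = path.strip("/").split("/")
        # Expected shape: ENV/<env>/COLLECTIONS/<collection>/SHARDS/<shard>/<host:port>
        if len(parts) != 7:
            continue
        if (parts[0], parts[2], parts[4]) != ("ENV", "COLLECTIONS", "SHARDS"):
            continue
        if parts[1] == env and parts[6] == host_port:
            assignments.append((parts[3], parts[5]))
    return assignments
```

   For example, find_assignments(EXAMPLE_PATHS, "TEST", "test01.server.com:8080") returns
   [("WEB", "shardA")], so the same machine name can play different roles per environment.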

i) Ideally, no config inside conf should contain host names.
   My DIH config will most likely include server names, which will differ between
   TEST and PROD.
   This could be solved as above, by letting the collection in TEST use another
   configName than PROD, but for some use cases it might be more elegant to swap out
   a hardcoded string with a ZK node in a generic way, such as changing
   jdbcString="my-hardcoded-string" to jdbcString="${zk://ENV/PROD/jdbcstrA}"
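
   A minimal sketch of such a generic substitution follows. It is only an illustration of
   the idea: the function name is hypothetical, the lookup callable stands in for a real ZK
   read, and the JDBC URL in the usage note is made up.

```python
import re

# Matches ${zk://some/node/path} and captures the node path without the scheme.
ZK_REF = re.compile(r"\$\{zk://([^}]+)\}")


def expand_zk_refs(text, lookup):
    """Replace every ${zk://...} reference in text with the node's value.

    Hypothetical helper: 'lookup' is any callable that maps an absolute
    node path such as "/ENV/PROD/jdbcstrA" to its string value.
    """
    def replace(match):
        node_path = "/" + match.group(1)
        return lookup(node_path)

    return ZK_REF.sub(replace, text)
```

   Given a dict such as {"/ENV/PROD/jdbcstrA": "jdbc:mysql://prod-db/solr"} as a stand-in for
   ZK, calling expand_zk_refs('jdbcString="${zk://ENV/PROD/jdbcstrA}"', store.__getitem__)
   yields jdbcString="jdbc:mysql://prod-db/solr". A missing node raises KeyError here, which
   seems preferable to silently leaving a placeholder in the config.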

j) Question: Is ReplicationHandler ZK-aware yet?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 10. feb. 2011, at 16.10, Jan Høydahl wrote:

> Hi,
> 
> I have so far just tested the examples and got an N by M cluster running. My feedback:
> 
> a) First of all, a major update of the SolrCloud Wiki is needed, to clearly state what
> is in which version, what the current improvement plans are, and to get rid of outdated
> stuff. That said, I think there are many good ideas there.
> 
> b) The "collection" terminology is too easily confused with "core", and should probably
> be made more distinct. I just tried to configure two cores on the same Solr instance into
> the same collection, and that worked fine, both as distinct shards and as the same shard
> (replica). The wiki examples give the impression that "collection1" in
> localhost:8983/solr/collection1/select?distrib=true is some magic collection identifier,
> but what it really does is run the query on the *core* named "collection1", look up what
> collection that core is part of and distribute the query to all shards in that collection.
> 
> c) ZK is not designed to store large files. While the files in conf are normally well
> below the 1M limit ZK imposes, we should perhaps consider using a lightweight distributed
> object or k/v store for holding the /CONFIGS and let ZK store a reference only.
> 
> d) How are admins supposed to update configs in ZK? Install their favourite ZK editor?
> 
> e) We should perhaps not be so afraid to make ZK a requirement for Solr in v4. Ideally
> you should interact with a 1-node Solr in the same manner as you do with a 100-node Solr.
> An example is the Admin GUI, where the "schema" and "solrconfig" links assume a local file.
> This requires decent tool support to make ZK interaction intuitive, such as "import" and
> "export" commands.
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> 
> On 19. jan. 2011, at 21.07, Mark Miller wrote:
> 
>> Hello Users,
>> 
>> A little over a year ago, a few of us started working on what we called SolrCloud.
>> 
>> This initial bit of work was really a combination of laying some base work - figuring
>> out how to integrate ZooKeeper with Solr in a limited way, dealing with some infrastructure
>> - and picking off some low hanging search side fruit.
>> 
>> The next step is the indexing side. And we plan on starting to tackle that sometime
>> soon.
>> 
>> But first - could you help with some feedback? Some people are using our SolrCloud
>> start - I have seen evidence of it ;) Some, even in production.
>> 
>> I would love to have your help in targeting what we now try and improve. Any suggestions
>> or feedback? If you have sent this before, I/others likely missed it - send it again!
>> 
>> I know anyone that has used SolrCloud has some feedback. I know it because I've used
>> it too ;) It's still too complicated to set up. There are still plenty of pain points. We
>> accepted some compromise trying to fit into what Solr was, and not wanting to dig in too
>> far before feeling things out and letting users try things out a bit. Thinking that we
>> might be able to adjust Solr to be more in favor of SolrCloud as we go, what is the ideal
>> state of the work we have currently done?
>> 
>> If anyone using SolrCloud helps with the feedback, I'll help with the coding effort.
>> 
>> - Mark Miller
>> -- lucidimagination.com
> 

