lucene-dev mailing list archives

From "Sami Siren (JIRA)" <>
Subject [jira] [Commented] (SOLR-3488) Create a Collections API for SolrCloud
Date Wed, 20 Jun 2012 08:48:43 GMT


Sami Siren commented on SOLR-3488:

Mark, nice work.

bq. I'm still somewhat unsure about handling failures though...

IMO, fail fast: at minimum an error should be reported back (via the completed queue Yonik mentions?).
It seems that in the latest patch, even in case of failure, the job is just removed from the queue.
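To illustrate the fail-fast idea, here is a minimal in-memory sketch (class and method names are hypothetical; the real Overseer queue is a ZooKeeper-backed distributed queue): on failure the job is still removed from the work queue, but the outcome is recorded in a completed queue so the error is reported back instead of vanishing:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical in-memory stand-in for the Overseer work queue; the real
// implementation would use the ZooKeeper-backed distributed queue.
class CollectionWorkQueue {
    private final Queue<String> work = new ArrayDeque<>();
    // "Completed" queue holding the outcome of each job, success or failure.
    private final Map<String, String> completed = new HashMap<>();

    public void offer(String jobId) {
        work.add(jobId);
    }

    public void finish(String jobId, boolean succeeded, String error) {
        work.remove(jobId);
        // Fail fast: even on failure, record the result instead of
        // silently dropping the job, so an error is reported back.
        completed.put(jobId, succeeded ? "OK" : "FAILED: " + error);
    }

    public String result(String jobId) {
        return completed.get(jobId);
    }
}
```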

bq. I also have not switched to requiring or respecting a replication factor - I was thinking
perhaps specifying nothing or -1 would give you what you have now? An infinite rep factor?
And we would enforce a lower rep factor if requested?

Sounds good to me.
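One way to read the proposed semantics as code (a sketch; the -1 sentinel and "unlimited" mapping are taken from the quote above, the method name is an assumption):

```java
// Sketch of the proposed replicationFactor semantics: absent or -1 means
// "unbounded" (current behavior: every node may host a replica), while a
// positive value caps the number of replicas per shard.
class RepFactor {
    public static final int UNLIMITED = Integer.MAX_VALUE;

    public static int effective(Integer requested) {
        if (requested == null || requested == -1) {
            return UNLIMITED; // nothing specified or -1: infinite rep factor
        }
        if (requested < 1) {
            throw new IllegalArgumentException("replicationFactor must be >= 1 or -1");
        }
        return requested; // enforce the lower rep factor that was requested
    }
}
```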

bq. I'm not sure how replication factor would be enforced though? The Overseer just periodically
prunes and adds given what it sees and what the rep factor is? Is that how failures should
be handled? Don't re-add to the queue, just let the periodic job attempt to fix things later?

I would implement the simplest case first: if not enough nodes are available to meet #shards
and/or the replication factor, report an error to the user and do not try to create the
collection. Or did you mean at runtime, after the collection has been created?

I have one question about the patch, specifically in the OverseerCollectionProcessor where
you create the collection: why do you need the collection param?
In the context of creating an N * R cluster: why not just go through the live nodes to find
available nodes and then, based on some "strategy" class, create specific shards (with shard
ids) on specific nodes? The rest of the Overseer would have to respect that same strategy
(instead of the dummy AssignShard that is now used) so that things would not break when new
nodes are attached to the collection. Perhaps this "strategy" could also handle things like
time-based sharding and whatnot?
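The "strategy" idea could look roughly like this (the interface is hypothetical; a round-robin implementation stands in for the existing AssignShard logic it would replace):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a pluggable shard-assignment "strategy": the Overseer would
// consult the same strategy both at collection creation and when new
// nodes attach, so the two code paths cannot disagree.
interface ShardAssignmentStrategy {
    // Returns, for each shard id, the list of node names hosting it.
    List<List<String>> assign(List<String> liveNodes, int numShards, int repFactor);
}

// Simple round-robin implementation standing in for the dummy AssignShard.
class RoundRobinStrategy implements ShardAssignmentStrategy {
    public List<List<String>> assign(List<String> liveNodes, int numShards, int repFactor) {
        List<List<String>> shards = new ArrayList<>();
        int next = 0;
        for (int s = 0; s < numShards; s++) {
            List<String> replicas = new ArrayList<>();
            for (int r = 0; r < repFactor; r++) {
                // Cycle through the live nodes so replicas spread evenly.
                replicas.add(liveNodes.get(next++ % liveNodes.size()));
            }
            shards.add(replicas);
        }
        return shards;
    }
}
```

A time-based sharding strategy would implement the same interface, keeping the Overseer's other code paths unchanged.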

bq. it should be easy to merge but I think that it'd be also good to start committing your
patch and improve things on SVN from now on to ease code review (no patch merging) and concurrent

+1 for committing this as is. There are some minor weak spots in the current patch, like checking
the input for the Collections API requests (nonexistent params cause OverseerCollectionProcessor
to die with an NPE), reporting back input errors, etc., but let's just put this in and open more
JIRA issues to cover the improvement tasks and bugs?

One more thing: I am seeing BasicDistributedZkTest failing (not just sporadically), not sure
if it is related, with the following error:

 [junit4] ERROR   0.00s J1 | BasicDistributedZkTest (suite)
   [junit4]    > Throwable #1: java.lang.AssertionError: ERROR: SolrIndexSearcher opens=496
   [junit4]    > 	at __randomizedtesting.SeedInfo.seed([F1C0A91EB78BAB39]:0)
   [junit4]    > 	at
   [junit4]    > 	at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(

> Create a Collections API for SolrCloud
> --------------------------------------
>                 Key: SOLR-3488
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>         Attachments: SOLR-3488.patch, SOLR-3488.patch, SOLR-3488.patch, SOLR-3488_2.patch

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.

