lucene-solr-user mailing list archives

From Upayavira ...@odoko.co.uk>
Subject Re: Migrating from cores to collections
Date Mon, 30 Nov 2015 12:36:08 GMT


On Sun, Nov 29, 2015, at 07:38 PM, William Bell wrote:
> OK. Been using Cores for 4 years. Want to migrate to collections / Cloud.
> 
> Do we have to change our queries?
> 
> http://loadbalancer:8983/solr/corename/select?q=*:*
> 
> What does this become once we have the collection sharded? Do we need a
> Load Balancer or just point to one box and run the new query? Or would it
> be better to hit the LB in case one machine is no longer good to go?
> 
> http://loadbalancer:8983/solr/collectionname/select?q=*:*
> 
> What features would not yet be ready for sharded setups with SolrCloud?
> In the past, facet counts were an issue, grouping? stats? as well as IDF
> for sorting by scores. i.e. facet.field=specialties. We want the
> Cardiologist specialty to have unique numbers across shards. So if shard1
> has 4 people with Cardiology, and shard2 has 2 people with Cardiology, we
> would want the number to be 6. We would want facet.sort to work on
> counts... I guess we could index another collection for facets and just
> use 1 machine for that? But doesn't that defeat the purpose?
> 
> What is the best walk thru for SOLR 5.3.1 ?
> 
> Looking at https://wiki.apache.org/solr/SolrCloud

1. Your queries should stay (more or less) the same.
2. If you give the collection the same name as your current core, your
base URL will remain the same.
3. If you use SolrJ, you would switch to CloudSolrClient. Setting it up
feels quite different, but the SolrQuery objects should be
interchangeable.
4. If you use SolrJ, you don't need a load balancer: SolrJ will do
round robin against the Solr nodes for that collection. It responds
to failures far faster than an LB ever could (I've seen downed machines
pulled out of rotation in <200ms).
5. Regarding sharded setups, there are two scenarios to consider:
distributed search in general, and SolrCloud in particular. Every search
component you rely on (faceting, highlighting, grouping, etc.) must
support distributed search. Some of the newer ones may not have had
distributed support implemented yet. Others, such as joins, need
particular care and work only under a subset of conditions.
6. IDF mostly balances itself out across the shards. If it doesn't,
distributed IDF is available, but it comes at the cost of additional
network traffic.
7. Faceting should work just fine (as you describe) across shards. I
would check specifically on newer faceting features though before
assuming anything.
8. As for facet.sort on counts: have you tried it?
9. I would consider this to be a more up-to-date place to go:
https://cwiki.apache.org/confluence/display/solr/SolrCloud
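
To make the arithmetic behind point 7 concrete, here is a minimal Python
sketch of what "facet counts across shards" amounts to: each shard reports
its own counts and they are summed per value before sorting. This only
illustrates the idea and is not how Solr implements distributed faceting
internally. The function names are hypothetical, and the Dermatology and
Neurology numbers are invented for illustration; only the Cardiology
counts (4 and 2) come from the question.

```python
from collections import Counter


def merge_facet_counts(per_shard_counts):
    """Sum per-shard facet counts into one collection-wide tally."""
    merged = Counter()
    for shard_counts in per_shard_counts:
        # Counter.update adds counts rather than overwriting them
        merged.update(shard_counts)
    return merged


def sort_by_count(merged):
    """Order facet values by count (descending), then by name for ties."""
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical per-shard results for facet.field=specialties
shard1 = {"Cardiology": 4, "Dermatology": 1}
shard2 = {"Cardiology": 2, "Neurology": 3}

merged = merge_facet_counts([shard1, shard2])
ranked = sort_by_count(merged)
print(ranked)
```

With these inputs Cardiology comes out at 6 and sorts first, which is the
behaviour asked for in the question.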

Upayavira
