lucene-solr-user mailing list archives

From Susheel Kumar <susheel2...@gmail.com>
Subject Re: Multi tenant setup
Date Tue, 13 Jun 2017 17:12:47 GMT
A single cluster with multiple collections (one per client) is
what I would try.  How many clients do you have? If 10K, that means 10K
collections, and you will then need to work out how many documents there
are, their size, etc. to nail down the number of machines and their
memory/CPU requirements. A single collection is not really a multi-tenant
setup, and it fits even less when the clients have different schemas.

Thanks,
Susheel


On Tue, Jun 13, 2017 at 12:35 PM, Zisis T. <zistach@runbox.com> wrote:

> I'm trying to set up a multi-tenant Solr cluster (v6.5.1) which must meet the
> following requirements. The tenants are different customers with similar
> types of data.
>
> * Ability to query per client but also across all clients
> * Don't want to hit all shards for every type of request (per client, across
> clients)
> * Don't want to have everything under a single multi-sharded collection, to
> avoid a SPOF and maintenance headaches
>    (e.g. a schema change will force an all-client reindexing; a single huge
> backup/restore)
> * Ability to semi-support different schemas.
>
> Based on the above I ruled out the following setups
> * Single multi-sharded collection for all clients and all its variations
> (e.g. multiple clients in a single shard)
> * One collection per client
>
> My preference lies in a setup like the following
> * Create a limited # of collections
> * Split the clients in the collections created above based on some criteria
> (size, content-type)
> * Client-specific requests will be limited to a single collection
> * Cross-client requests will target a limited # of collections (using
> &collection=col_1,col_2,col_3)
>
> The approach above meets the requirements, but the issue that is
> blocking me is that Distributed IDF does not work properly across collections.
> (Check comment#3, bullet#2 of
> http://lucene.472066.n3.nabble.com/Distributed-IDF-in-inter-collections-distributed-queries-td4317519.html)
>
>
> -> Do you see anything wrong with my assumptions/approach above? Are there
> any alternatives besides having separate clusters for the search across
> clients and the individual clients?
> -> Is it safe to go with a single collection? If it is, I still need to
> handle the possible different schemas per client somehow.
> -> Is there a way to enforce local stats when querying a single collection
> and use global stats only when querying across collections? (see link
> above)
>
> Thanks
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Multi-tenant-setup-tp4340377.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
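
For illustration, the request routing described in the quoted message (client-specific requests against one collection, cross-client requests using &collection=col_1,col_2,col_3) might be sketched roughly as below. This is only a sketch under assumptions: the client-to-collection mapping, the `client_id` field used to filter within a shared collection, and the host/port are all hypothetical and do not come from the thread. Only the `collection` request parameter is taken from the original message.

```python
from urllib.parse import urlencode

# Hypothetical mapping of clients to a limited number of shared collections.
CLIENT_TO_COLLECTION = {
    "client_a": "col_1",
    "client_b": "col_1",
    "client_c": "col_2",
    "client_d": "col_3",
}

SOLR_BASE = "http://localhost:8983/solr"  # assumed host and port


def per_client_url(client, query):
    """Build a request that stays inside the client's own collection."""
    collection = CLIENT_TO_COLLECTION[client]
    # "client_id" is an assumed field separating tenants that share a collection.
    params = {"q": query, "fq": f"client_id:{client}"}
    return f"{SOLR_BASE}/{collection}/select?{urlencode(params)}"


def cross_client_url(clients, query):
    """Build a request that targets only the collections holding these clients."""
    collections = sorted({CLIENT_TO_COLLECTION[c] for c in clients})
    params = {"q": query, "collection": ",".join(collections)}
    # The request is sent to the first collection; the "collection" parameter
    # lists every collection the distributed query should cover.
    return f"{SOLR_BASE}/{collections[0]}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(per_client_url("client_c", "title:solr"))
    print(cross_client_url(["client_a", "client_b", "client_c"], "title:solr"))
```

Because clients a and b share col_1 in this made-up mapping, a query across clients a, b and c touches two collections rather than three. The sketch does nothing about the Distributed IDF problem raised in the message.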
