From: Upayavira
To: solr-user@lucene.apache.org
Subject: Re: Migrating from cores to collections
Date: Mon, 30 Nov 2015 12:36:08 +0000

On Sun, Nov 29, 2015, at 07:38 PM, William Bell wrote:
> OK. Been using Cores for 4 years. Want to migrate to collections / Cloud.
>
> Do we have to change our queries?
>
> http://loadbalancer:8983/solr/corename/select?q=*:*
>
> What does this become once we have the collection sharded? Do we need a
> load balancer or just point to one box and run the new query? Or would it
> be better to hit the LB in case one machine is no longer good to go?
>
> http://loadbalancer:8983/solr/collectionname/select?q=*:*
>
> What features would not yet be ready for sharded setups with SolrCloud?
> In the past, facet counts were an issue. Grouping? Stats? As well as IDF
> for sorting by scores. i.e. facet.field=specialties.
> We want the Cardiologist specialty to have unique numbers across shards.
> So if shard1 has 4 people with Cardiology, and shard2 has 2 people with
> Cardiology, we would want the number to be 6. We would want facet.sort to
> work on counts... I guess we could index another collection for facets
> and just use 1 machine for that? But doesn't that defeat the purpose?
>
> What is the best walkthrough for Solr 5.3.1?
>
> Looking at https://wiki.apache.org/solr/SolrCloud

1. Your queries should stay (more or less) the same.

2. If you name a collection the same as the core you are using now, your
base URL will remain the same.

3. If you use SolrJ, you would change to CloudSolrClient, which feels
quite different, but the SolrQuery objects should be interchangeable.

4. If you use SolrJ, you don't need a load balancer: SolrJ does round
robin across the Solr nodes hosting that collection, and it responds to
failures far faster than an LB ever could (I've seen downed machines
pulled in under 200ms).

5. Regarding sharded setups, there are two scenarios to consider:
distributed search in general, and SolrCloud in particular. Every search
component must be enabled for distributed search (faceting, highlighting,
grouping, etc.). Some of the newer components may not have distributed
support implemented yet. Others, such as joining, require particular
care and work in only a subset of conditions.

6. For IDF: mostly, IDF balances itself out across the shards. If it
doesn't, distributed IDF is available, but that has a cost in terms of
additional network traffic.

7. Faceting should work just fine across shards, as you describe. I would
check specifically on the newer faceting features, though, before
assuming anything.

8. facet.sort on counts: have you tried it?

9. I would consider this a more up-to-date place to go:
https://cwiki.apache.org/confluence/display/solr/SolrCloud

Upayavira
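To make points 1 and 2 concrete: the request path is /solr/<name>/select in both standalone and SolrCloud mode, so if the collection takes the old core's name, client URLs don't change. A minimal sketch of that (the select_url helper is hypothetical, just illustrating the URL shape, not a Solr API):

```python
# Hypothetical helper illustrating that the select URL has the same shape
# whether <name> is a standalone core or a SolrCloud collection.

def select_url(host: str, name: str, q: str) -> str:
    """Build a Solr select URL; 'name' may be a core OR a collection."""
    return f"http://{host}:8983/solr/{name}/select?q={q}"

# Old standalone-core query:
old = select_url("loadbalancer", "corename", "*:*")
# After migration, with the collection named the same as the core:
new = select_url("loadbalancer", "corename", "*:*")

assert old == new  # the base URL and the query are unchanged
print(old)
```

So migrating becomes invisible to HTTP clients as long as the collection name matches the old core name.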
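The client-side balancing in point 4 amounts to cycling through the live nodes for the collection and skipping any marked down. A toy sketch of that behaviour (this is an illustration of the idea, not CloudSolrClient's actual code):

```python
from itertools import cycle

class RoundRobin:
    """Toy round-robin over cluster nodes, skipping nodes marked down --
    a sketch of the client-side balancing described, not SolrJ's code."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.down = set()       # nodes the client believes are dead
        self._it = cycle(nodes)

    def next_node(self):
        # Try each node at most once per call; skip downed ones.
        for _ in range(len(self.nodes)):
            node = next(self._it)
            if node not in self.down:
                return node
        raise RuntimeError("no live nodes")

rr = RoundRobin(["node1:8983", "node2:8983", "node3:8983"])
rr.down.add("node2:8983")  # client notices node2 is gone
picks = [rr.next_node() for _ in range(4)]
print(picks)  # node2 never receives a request
```

Because the client holds the down-list itself, a dead node is skipped on the very next request; an external load balancer has to wait for its own health checks to catch up.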
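The distributed facet counts the question asks about (4 Cardiology docs on shard1 plus 2 on shard2 should report 6) come down to summing per-shard counts before sorting. A toy sketch of that merge, using the question's own numbers; this illustrates the semantics, not Solr's internal implementation:

```python
from collections import Counter

def merge_facets(per_shard_counts):
    """Sum facet counts across shards, then sort by count descending --
    the merged result facet.sort=count is expected to give."""
    total = Counter()
    for counts in per_shard_counts:
        total.update(counts)
    return total.most_common()

# The example from the question: 4 Cardiology people on shard1, 2 on shard2.
shard1 = {"Cardiology": 4, "Dermatology": 1}
shard2 = {"Cardiology": 2, "Dermatology": 3}

merged = merge_facets([shard1, shard2])
print(merged)  # Cardiology totals 6 across the whole collection
```

In other words, a separate single-machine collection just for facets isn't needed: the merged counts are already collection-wide.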