From: Lorenzo Fundaró <lorenzo.fundaro@dawandamail.com>
Date: Thu, 7 Jul 2016 10:12:31 +0200
To: solr-user@lucene.apache.org
Subject: Re: deploy solr on cloud providers

Thank you Tomás, I'll take a thorough look at the JIRA ticket you're
pointing out.

On 6 July 2016 at 20:49, Tomás Fernández Löbbe wrote:

> On Wed, Jul 6, 2016 at 2:30 AM, Lorenzo Fundaró <
> lorenzo.fundaro@dawandamail.com> wrote:
>
> > On 6 July 2016 at 00:00, Tomás Fernández Löbbe wrote:
> >
> > > The leader will do the replication before responding to the client, so
> > > let's say the leader gets to update its local copy but is terminated
> > > before sending the request to the replicas; the client should get
> > > either an HTTP 500 or no HTTP response. From the client code you can
> > > take action (log, retry, etc.).
> >
> > If this is true, then whenever I ask for min_rf having three nodes
> > (1 leader + 2 replicas) I should get rf = 3, but in reality I don't.
> >
> > > The "min_rf" is useful for the case where replicas may be down or not
> > > accessible. Again, you can use this for retrying or take any necessary
> > > action on the client side if the desired rf is not achieved.
> >
> > I think both paragraphs are contradictory. If the leader does the
> > replication before responding to the client, then why is there a need
> > to use min_rf? I don't think it's true that you get a 200 only when the
> > update has been passed to all replicas.
>
> The reason why "min_rf" is there is because:
> * If there are no replicas at the time of the request (e.g.
> if replicas are unreachable and disconnected from ZK)
> * Replicas could fail to ACK the update request from the leader; in that
> case the leader will mark them as unhealthy but would still HTTP 200 to
> the client.
>
> So it could happen that you think your data is being replicated to 3
> replicas, but 2 of them are currently out of service. This means that
> your doc is on a single host, and if that one dies, then you lose that
> data. In order to prevent this, you can ask Solr to tell you how many
> replicas succeeded on that update request. You can read more about this
> in https://issues.apache.org/jira/browse/SOLR-5468
>
> > The thing is that, when you have persistent storage, you shouldn't
> > worry about this, because you know that when the node comes back the
> > rest of the index will be synced. The problem is when you don't have
> > persistent storage. For my particular case I have to be extra careful
> > and always make sure that all my replicas have all the data I sent.
>
> In any case you should assume that storage on a host can be completely
> lost, no matter if you are deploying on premises or in the cloud.
> Consider that once that host comes back (could be hours later) it could
> already be out of date, and will replicate from the current leader,
> possibly dropping parts or all of its current index.
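The check Tomás describes — ask Solr for the achieved replication factor and act when it falls short — can be sketched client-side. A minimal plain-Python sketch: the `rf`/`min_rf` fields in the `responseHeader` follow my reading of SOLR-5468, and the helper names are illustrative, not part of any Solr client library:

```python
# Sketch of client-side handling of Solr's achieved-replication-factor
# response, per SOLR-5468. The response dict mirrors the JSON
# responseHeader Solr returns when min_rf is passed with an update;
# helper names here are illustrative, not a Solr client API.

def achieved_rf(response: dict) -> int:
    """Extract the achieved replication factor from an update response."""
    return int(response.get("responseHeader", {}).get("rf", 0))

def should_retry(response: dict, min_rf: int) -> bool:
    """True if fewer replicas acknowledged the update than we required."""
    return achieved_rf(response) < min_rf

# Example: leader succeeded but both replicas were down, so rf == 1
# even though the HTTP status was 200.
resp = {"responseHeader": {"status": 0, "rf": 1, "min_rf": 3}}
print(should_retry(resp, min_rf=3))  # → True: queue the doc for re-send
```

The point of the sketch: Solr does not fail the request for you when `min_rf` is not met, so the retry (or re-queue) decision has to live in the client.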
>
> Tomás
>
> > > Tomás
> > >
> > > On Tue, Jul 5, 2016 at 11:39 AM, Lorenzo Fundaró <
> > > lorenzo.fundaro@dawandamail.com> wrote:
> > >
> > > > @Tomás and @Steven
> > > >
> > > > I am a bit skeptical about these two statements:
> > > >
> > > > > If a node just disappears you should be fine in terms of data
> > > > > availability, since Solr in "SolrCloud" replicates the data as it
> > > > > comes in (before sending the HTTP response)
> > > >
> > > > and
> > > >
> > > > > You shouldn't "need" to move the storage, as SolrCloud will
> > > > > replicate all data to the new node, and anything in the
> > > > > transaction log will already be distributed through the rest of
> > > > > the machines.
> > > >
> > > > because according to the official documentation here
> > > > <https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance>
> > > > (Write Side Fault Tolerance -> recovery):
> > > >
> > > > > If a leader goes down, it may have sent requests to some replicas
> > > > > and not others. So when a new potential leader is identified, it
> > > > > runs a sync process against the other replicas. If this is
> > > > > successful, everything should be consistent, the leader registers
> > > > > as active, and normal actions proceed.
> > > >
> > > > I think there is a possibility that an update is not sent by the
> > > > leader but is kept on the local disk, and after the leader comes up
> > > > again it can sync the unsent data.
> > > >
> > > > Furthermore:
> > > >
> > > > > Achieved Replication Factor
> > > > > When using a replication factor greater than one, an update
> > > > > request may succeed on the shard leader but fail on one or more
> > > > > of the replicas. For instance, consider a collection with one
> > > > > shard and a replication factor of three.
> > > > > In this case, you have a shard leader and two additional
> > > > > replicas. If an update request succeeds on the leader but fails
> > > > > on both replicas, for whatever reason, the update request is
> > > > > still considered successful from the perspective of the client.
> > > > > The replicas that missed the update will sync with the leader
> > > > > when they recover.
> > > >
> > > > They have implemented this parameter called *min_rf* that you can
> > > > use (client-side) to make sure that your update was replicated to
> > > > at least one replica (e.g.: min_rf > 1).
> > > >
> > > > This is the reason for my concern about moving storage around:
> > > > because then I know that when the shard leader comes back,
> > > > SolrCloud will run the sync process for those documents that
> > > > couldn't be sent to the replicas.
> > > >
> > > > Am I missing something, or have I misunderstood the documentation?
> > > >
> > > > Cheers!
> > > >
> > > > On 5 July 2016 at 19:49, Davis, Daniel (NIH/NLM) [C] <
> > > > daniel.davis@nih.gov> wrote:
> > > >
> > > > > Lorenzo, this probably comes late, but my systems guys just don't
> > > > > want to give me real disk. Although RAID-5 or LVM on top of JBOD
> > > > > may be better than Amazon EBS, Amazon EBS is still much closer to
> > > > > real disk in terms of IOPS and latency than NFS ;) I even ran a
> > > > > mini test (not an official benchmark), and found the response
> > > > > time for random reads to be better.
> > > > >
> > > > > If you are a young/smallish company, this may all be in the
> > > > > cloud, but if you are in a large organization like mine, you may
> > > > > also need to allow for other architectures, such as a "virtual"
> > > > > NetApp in the cloud that communicates with a physical NetApp
> > > > > on-premises, and the throughput/latency of that.
> > > > > The most important thing is to actually measure the numbers you
> > > > > are getting, both for search and for simply raw I/O, or to get
> > > > > your systems/storage guys to measure those numbers. If you get
> > > > > your systems/storage guys to just measure storage, you will want
> > > > > to care about three things for indexing primarily:
> > > > >
> > > > > Sequential Write Throughput
> > > > > Random Read Throughput
> > > > > Random Read Response Time/Latency
> > > > >
> > > > > Hope this helps,
> > > > >
> > > > > Dan Davis, Systems/Applications Architect (Contractor),
> > > > > Office of Computer and Communications Systems,
> > > > > National Library of Medicine, NIH
> > > > >
> > > > > -----Original Message-----
> > > > > From: Lorenzo Fundaró [mailto:lorenzo.fundaro@dawandamail.com]
> > > > > Sent: Tuesday, July 05, 2016 3:20 AM
> > > > > To: solr-user@lucene.apache.org
> > > > > Subject: Re: deploy solr on cloud providers
> > > > >
> > > > > Hi Shawn. Actually what I'm trying to find out is whether this is
> > > > > the best approach for deploying Solr in the cloud. I believe
> > > > > SolrCloud solves a lot of problems in terms of high availability,
> > > > > but when it comes to storage there seems to be a limitation that
> > > > > can be worked around, of course, but it's a bit cumbersome, and I
> > > > > was wondering if there is a better option for this or if I'm
> > > > > missing something with the way I'm doing it. I wonder if there is
> > > > > some proven experience about how to solve the storage problem
> > > > > when deploying in the cloud. Any advice or pointer to some
> > > > > enlightening documentation will be appreciated. Thanks.
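The three storage metrics Dan lists above can be sanity-checked before reaching for a full benchmark rig. A rough, illustrative Python probe — not a substitute for a dedicated tool such as fio, and note the page-cache caveat in the comments:

```python
# Rough single-file I/O probe for the metrics Dan lists: sequential
# write throughput and random 4 KiB read latency. Illustrative only;
# a real measurement should use a dedicated tool (e.g. fio) and
# defeat the OS page cache, which this sketch does not.
import os
import random
import tempfile
import time

def probe(path: str, size_mb: int = 64, reads: int = 200, block: int = 4096):
    chunk = os.urandom(1024 * 1024)
    t0 = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(chunk)           # sequential 1 MiB writes
        f.flush()
        os.fsync(f.fileno())         # include the flush in the timing
    write_mb_s = size_mb / (time.perf_counter() - t0)

    latencies = []
    with open(path, "rb") as f:
        for _ in range(reads):
            off = random.randrange(0, size_mb * 1024 * 1024 - block)
            t = time.perf_counter()
            f.seek(off)
            f.read(block)            # random 4 KiB read
            latencies.append(time.perf_counter() - t)
    latencies.sort()
    return write_mb_s, latencies[len(latencies) // 2]  # median latency

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        path = tmp.name
    try:
        mb_s, lat = probe(path)
        print(f"seq write: {mb_s:.0f} MB/s, median 4K read: {lat * 1e6:.0f} us")
    finally:
        os.unlink(path)
    # Caveat: reads of a just-written file mostly hit the page cache,
    # so treat the read-latency number as a best-case lower bound.
```

Run against the actual volume Solr would index onto (EBS, NFS mount, local disk) to get a first-order comparison of the storage options being discussed.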
> > > > > On Jul 4, 2016 18:27, "Shawn Heisey" wrote:
> > > > >
> > > > > > On 7/4/2016 10:18 AM, Lorenzo Fundaró wrote:
> > > > > > > When deploying Solr (in SolrCloud mode) in the cloud, one has
> > > > > > > to take care of storage, and as far as I understand it can be
> > > > > > > a problem because the storage should go wherever the node is
> > > > > > > created. If we have, for example, a node on EC2 with its own
> > > > > > > persistent disk, and this node happens to be the leader and
> > > > > > > at some point crashes but couldn't replicate the data it has
> > > > > > > in the transaction log, what do we do in that case? Ideally
> > > > > > > the new node should use the leftover data that the dead node
> > > > > > > left, but this is a bit cumbersome in my opinion. What are
> > > > > > > the best practices for this?
> > > > > >
> > > > > > I can't make any sense of this. What is the *exact* problem you
> > > > > > need to solve? The details can be very important.
> > > > > >
> > > > > > We might be dealing with this:
> > > > > >
> > > > > > http://people.apache.org/~hossman/#xyproblem
> > > > > >
> > > > > > Thanks,
> > > > > > Shawn

--
Lorenzo Fundaro
Backend Engineer
E-Mail: lorenzo.fundaro@dawandamail.com

Fax + 49 - (0)30 - 25 76 08 52
Tel + 49 - (0)179 - 51 10 982

DaWanda GmbH
Windscheidstraße 18
10627 Berlin

Geschäftsführer: Claudia Helming und Niels Nüssler
AG Charlottenburg HRB 104695 B http://www.dawanda.com