Subject: Re: RPC - Queue Time when handlers are all waiting
From: Federico Gaule
To: user@hbase.apache.org
Date: Mon, 9 Dec 2013 13:07:53 -0200

Here is a thread saying what I think it should be
(http://grokbase.com/t/hbase/user/13bmndq53k/average-rpc-queue-time):

"The RpcQueueTime metrics are a measurement of how long individual calls
stay in this queued state. If your handlers were never 100% occupied, this
value would be 0. An average of 3 hours is concerning, it basically means
that when a call comes into the RegionServer it takes on average 3 hours to
start processing, because handlers are all occupied for that amount of
time."

Is that correct?


2013/12/9 Federico Gaule

> Correct me if I'm wrong, but queues should be used only when handlers are
> all busy, shouldn't they?
> If that's true, I don't get why there is activity related to queues.
>
> Maybe I'm missing some piece of knowledge about when HBase is using queues
> :)
>
> Thanks
>
>
> 2013/12/9 Jean-Marc Spaggiari
>
>> There might be something I'm missing ;)
>>
>> On cluster B, as you said, never more than 50% of your handlers are used.
>> Your Ganglia metrics are showing that there is activity (num ops is
>> increasing), which is correct.
>>
>> Can you please confirm what you think is wrong from your charts?
>>
>> Thanks,
>>
>> JM
>>
>>
>> 2013/12/9 Federico Gaule
>>
>> > Hi JM,
>> > Cluster B is only receiving replication data (writes), but handlers are
>> > waiting most of the time (never more than 50% of them are used). As I
>> > have read, RPC queue is only used when handlers are all waiting, does it
>> > count for replication as well?
>> >
>> > Thanks!
>> >
>> >
>> > 2013/12/9 Jean-Marc Spaggiari
>> >
>> > > Hi,
>> > >
>> > > When you say that B doesn't get any read/write operation, does it
>> > > mean you stopped the replication? Or is B still getting the write
>> > > operations from A because of the replication?
>> > > If so, that's why your RPC queue is used...
>> > >
>> > > JM
>> > >
>> > >
>> > > 2013/12/9 Federico Gaule
>> > >
>> > > > Not much information in RS logs (DEBUG level set to
>> > > > org.apache.hadoop.hbase). Here is a sample of one regionserver
>> > > > showing increasing rpc.metrics.RpcQueueTime_num_ops and
>> > > > rpc.metrics.RpcQueueTime_avg_time activity:
>> > > >
>> > > > 2013-12-09 08:09:10,699 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=23.14 MB, free=2.73 GB, max=2.75 GB, blocks=0, accesses=122442151, hits=122168501, hitRatio=99.77%, , cachingAccesses=122192927, cachingHits=122162378, cachingHitsRatio=99.97%, , evictions=0, evicted=6768, evictedPerRun=Infinity
>> > > > 2013-12-09 08:09:11,396 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
>> > > > 2013-12-09 08:09:14,979 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 2
>> > > > 2013-12-09 08:09:16,016 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
>> > > > ...
>> > > > 2013-12-09 08:14:07,659 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
>> > > > 2013-12-09 08:14:08,713 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 3
>> > > > 2013-12-09 08:14:10,699 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=23.14 MB, free=2.73 GB, max=2.75 GB, blocks=0, accesses=122442151, hits=122168501, hitRatio=99.77%, , cachingAccesses=122192927, cachingHits=122162378, cachingHitsRatio=99.97%, , evictions=0, evicted=6768, evictedPerRun=Infinity
>> > > > 2013-12-09 08:14:12,711 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
>> > > > 2013-12-09 08:14:14,778 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 3
>> > > > ...
>> > > > 2013-12-09 08:15:09,199 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 3
>> > > > 2013-12-09 08:15:12,243 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 2
>> > > > 2013-12-09 08:15:22,086 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 2
>> > > >
>> > > > Thanks
>> > > >
>> > > >
>> > > > 2013/12/7 Bharath Vissapragada
>> > > >
>> > > > > I'd look into the RS logs to see what's happening there.
>> > > > > Difficult to guess from the given information!
>> > > > >
>> > > > >
>> > > > > On Sat, Dec 7, 2013 at 8:52 PM, Federico Gaule
>> > > > > <fgaule@despegar.com> wrote:
>> > > > >
>> > > > > > Any clue?
>> > > > > > On Dec 5, 2013 9:49 a.m., "Federico Gaule" wrote:
>> > > > > >
>> > > > > > > Hi,
>> > > > > > >
>> > > > > > > I have 2 clusters, Master (A) - Slave (B) replication.
>> > > > > > > B has no client reads or writes, and all handlers (100) are
>> > > > > > > waiting, but rpc.metrics.RpcQueueTime_num_ops and
>> > > > > > > rpc.metrics.RpcQueueTime_avg_time report that RPC calls are
>> > > > > > > being queued.
>> > > > > > > There are some screenshots below to show Ganglia metrics.
>> > > > > > > How is this behaviour explained? I have looked for metrics
>> > > > > > > specifications but can't find much information.
>> > > > > > >
>> > > > > > > Handlers
>> > > > > > > http://i42.tinypic.com/242ssoz.png
>> > > > > > >
>> > > > > > > NumOps
>> > > > > > > http://tinypic.com/r/of2c8k/5
>> > > > > > >
>> > > > > > > AvgTime
>> > > > > > > http://tinypic.com/r/2lsvg5w/5
>> > > > > > >
>> > > > > > > Cheers
>> > > > >
>> > > > > --
>> > > > > Bharath Vissapragada

-- 
Ing. Federico Gaule
Technical Lead - PAM
Despegar.com, Inc.
Av. Corrientes 746 - Piso 9 - C.A.B.A. (C1043AAU)
tel. +54 (11) 4894-3500
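
One way to reason about the question in this thread is a toy model, sketched below under stated assumptions rather than taken from HBase's code. It assumes that the server records one queue-time sample for every call at the moment a handler picks it up, whether or not the call actually had to wait. Under that assumption, `num_ops` grows with every replicated batch even when most handlers are idle, while `avg_time` stays at or near zero; only a saturated handler pool pushes the average up. The names `QueueTimeMetric` and `simulate`, and the workloads, are hypothetical.

```python
class QueueTimeMetric:
    """Toy counterpart of a num_ops / avg_time metric pair (hypothetical)."""

    def __init__(self):
        self.num_ops = 0
        self.total_ms = 0

    def record(self, queued_ms):
        # One sample per call, even if the call waited 0 ms.
        self.num_ops += 1
        self.total_ms += queued_ms

    @property
    def avg_time(self):
        return self.total_ms / self.num_ops if self.num_ops else 0.0


def simulate(arrivals_ms, service_ms, handlers):
    """Run calls through a FIFO queue served by `handlers` workers.

    arrivals_ms: sorted arrival times of the calls (made-up workload).
    service_ms:  time a handler spends processing each call.
    Returns the metric and the peak number of handlers busy at once.
    """
    metric = QueueTimeMetric()
    free_at = [0] * handlers  # time at which each handler becomes idle
    peak_busy = 0
    for arrival in arrivals_ms:
        # The call goes to whichever handler becomes free first.
        idx = min(range(handlers), key=lambda i: free_at[i])
        start = max(arrival, free_at[idx])
        metric.record(start - arrival)  # time spent waiting in the queue
        free_at[idx] = start + service_ms
        busy = sum(1 for t in free_at if t > arrival)
        peak_busy = max(peak_busy, busy)
    return metric, peak_busy


# Light, replication-like load: 100 handlers, one short call per second.
light, peak = simulate([0, 1000, 2000], service_ms=5, handlers=100)
print(light.num_ops, light.avg_time, peak)

# Saturated case: a single handler and three simultaneous calls.
heavy, _ = simulate([0, 0, 0], service_ms=10, handlers=1)
print(heavy.num_ops, heavy.avg_time)
```

In the light case the counter increases with every call although no call ever waits, which would match an increasing NumOps chart alongside mostly idle handlers. Whether HBase records samples this way should be confirmed against the metrics code of the version in use.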