Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 075F7200B70 for ; Sat, 27 Aug 2016 13:46:41 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 05E86160AB2; Sat, 27 Aug 2016 11:46:41 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id C8588160AA6 for ; Sat, 27 Aug 2016 13:46:39 +0200 (CEST) Received: (qmail 6723 invoked by uid 500); 27 Aug 2016 11:46:38 -0000 Mailing-List: contact user-help@accumulo.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@accumulo.apache.org Delivered-To: mailing list user@accumulo.apache.org Received: (qmail 6713 invoked by uid 99); 27 Aug 2016 11:46:38 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Sat, 27 Aug 2016 11:46:38 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 7B89AC1365 for ; Sat, 27 Aug 2016 11:46:38 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.199 X-Spam-Level: * X-Spam-Status: No, score=1.199 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (1024-bit key) header.d=teralytics.ch Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id mnuoZk4DvOVf for ; Sat, 27 Aug 2016 11:46:34 +0000 (UTC) Received: 
from mail-ua0-f171.google.com (mail-ua0-f171.google.com [209.85.217.171]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 3D73B5F2F0 for ; Sat, 27 Aug 2016 11:46:33 +0000 (UTC) Received: by mail-ua0-f171.google.com with SMTP id l94so115196128ual.0 for ; Sat, 27 Aug 2016 04:46:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=teralytics.ch; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=lh0h0GihH9yG8ckLyoJjQQGg3XichOi8X9Jh1jp/ERw=; b=ZSkvbZElOTIYnGsqdoNHR0ZF68vCBjoA+TpJJ/QsRECoWEzDi5ZNGnUNua4d9mXsqs xwZy4G6DF2qQWWGirEWC38kKfUA0OVM6/mYqNGxSa0jsIET5gcUwlmXw/ywiIPJVOQW5 eEYG2YJsxYxEcMnjOOi/ARlEVB4DTgEpHYmTY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=lh0h0GihH9yG8ckLyoJjQQGg3XichOi8X9Jh1jp/ERw=; b=PF6U0op7xvxmjrBYnOPjEmpaG8R+LWudmILQuSwhvhopEXRStctZxsdm+tsY5EIKoo UIS3bzQ+VuinZBVbdy46Zwl7yKI87bLdRgfJkmDvoNhyj1ulu91qTnSO0loCMoawHohw ea0y0slss087PV0LYem0jdGCN51lgzCc9X8UJx8KiAW5aXkACUnAmyA8RY0APB2/uKXi Ld21YnBkr8+MMmpGrt7suhRNa1pZQVauMw2pJ4vDBKjXWhQdZ9KU75TupAN1NJPyFuAK 8YM+y+0J6EShtBgNHTNhJBrwmyzOyb9lWpDtSD30Zp3xO1EfbG91XIxggD1ZH+bbGtj3 rXtw== X-Gm-Message-State: AE9vXwNc/o9v/rdNjGjrQKwE4A2gUtYsMVHcnESblqjWzJWGjLi8AmuRjLMP7idiPsyhh+HTS3D2GG8RUwettw== X-Received: by 10.31.205.66 with SMTP id d63mr4355530vkg.80.1472298385778; Sat, 27 Aug 2016 04:46:25 -0700 (PDT) MIME-Version: 1.0 Received: by 10.103.50.138 with HTTP; Sat, 27 Aug 2016 04:46:25 -0700 (PDT) In-Reply-To: References: <57C0C32C.3050402@bbn.com> From: Mario Pastorelli Date: Sat, 27 Aug 2016 13:46:25 +0200 Message-ID: Subject: Re: Profile a (batch) scan To: user@accumulo.apache.org Content-Type: multipart/alternative; boundary=001a114dd88c49c1fe053b0c2e6b archived-at: Sat, 27 Aug 2016 11:46:41 -0000 --001a114dd88c49c1fe053b0c2e6b Content-Type: text/plain; charset=UTF-8 That StopWatch 
class is a great idea, thanks! That's exactly what I need to check the
performance of the iterators.

On Sat, Aug 27, 2016 at 2:21 AM, Dylan Hutchison wrote:

> If Zipkin isn't providing fine-grained enough information, one approach
> you could take is to add an iterator to your scan that tracks performance.
> Log the start times, seek ranges, and count throughput. Each tablet will
> do this independently in separate threads.
>
> You might base this approach on Accumulo's StopWatch
> utility class. I adapted this class when I wanted to time particular code
> segments in the Graphulo Watch
> class. You can see the Watch in action in the commented segments of this
> old branch.
> It captures the number of times a code segment is called, the average
> runtime in that section, and the max and min runtime in that section. It's
> a kind of brute force approach, one that would have been better to
> integrate with HTrace, but it worked well enough for my purpose.
>
> In your iterator, you would count and measure the time statistics for the
> next and seek methods.
>
> On Fri, Aug 26, 2016 at 3:31 PM, Jonathan Lasko wrote:
>
>> I haven't had time to dig into it yet but am hoping that Zipkin will help
>> with some of these insights. (Unless that is the distributed trace you were
>> referring to?)
>>
>> -Jonathan
>>
>>
>> On 08/26/2016 04:54 PM, Mario Pastorelli wrote:
>>
>> I would like to understand the performance of a batch scan and I would
>> like to have some hints on how to proceed. I have enabled the distributed
>> trace, and it tells me that some batch scanner threads take much more time
>> than others to complete but this is not helpful enough because it's not
>> telling me why some threads take longer. My gut feeling is that one batch
>> thread is scanning more data than the others, which means that the data is
>> not well distributed for a query, but I use a random shard byte as prefix
>> of the keys which should guarantee that data of the same range is almost
>> equally distributed among the tservers. I enabled JMX on the tservers and
>> attached jvisualvm to get an idea of the state of each tserver but I
>> couldn't find anything meaningful. I would like to know if there is a way
>> to profile what's going on on a single tserver for a single scan thread and
>> by this I mean:
>>
>> 1. where are the tablets required by a scan? Which tablet server?
>> 2. how fast were the lookups on the index for that scan?
>> 3. how many bytes/records were read for that scan without the
>> iterators
>> 4. how many seeks are done by the scan and possibly why
>>
>> The main Accumulo UI is fine to get an overview of Accumulo but doesn't
>> really give you any information about the performance of a single query and
>> it seems to me that query performance is heavily affected by what iterators do.
>> Profiling a single scan is much more interesting. Is there a way to profile
>> a single (batch) scan in Accumulo such that I have a complete overview of
>> the entire process of reading and sending back records to the driver?
>>
>> Thanks,
>> Mario
>>
>> --
>> Mario Pastorelli | TERALYTICS
>>
>> *software engineer*
>>
>> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
>> phone: +41794381682
>> email: mario.pastorelli@teralytics.ch
>> www.teralytics.net
>>
>> Company registration number: CH-020.3.037.709-7 | Trade register Canton
>> Zurich
>> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz,
>> Yann de Vries
>>
>> This e-mail message contains confidential information which is for the
>> sole attention and use of the intended recipient. Please notify us at once
>> if you think that it may not be intended for you and delete it immediately.
>>
>>
>>
>

--
Mario Pastorelli | TERALYTICS

*software engineer*

Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
phone: +41794381682
email: mario.pastorelli@teralytics.ch
www.teralytics.net

Company registration number: CH-020.3.037.709-7 | Trade register Canton Zurich
Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann de Vries

This e-mail message contains confidential information which is for the sole attention and use of the intended recipient. Please notify us at once if you think that it may not be intended for you and delete it immediately.

--001a114dd88c49c1fe053b0c2e6b
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
That StopWatch class is a great idea, thanks! That's exactly what I need to check the performance of the iterators.

On Sat, Aug 27, 2016 at 2:21 AM, Dylan Hutchison <dhutchis@cs.washington.edu> wrote:
If Zipkin isn't providing fine-grained enough information, one approach you could take is to add an iterator to your scan that tracks performance. Log the start times, seek ranges, and count throughput. Each tablet will do this independently in separate threads.

You might base this approach on Accumulo's StopWatch utility class. I adapted this class when I wanted to time particular code segments in the Graphulo Watch class. You can see the Watch in action in the commented segments of this old branch. It captures the number of times a code segment is called, the average runtime in that section, and the max and min runtime in that section. It's a kind of brute force approach, one that would have been better to integrate with HTrace, but it worked well enough for my purpose.
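For readers who want something concrete, here is a minimal sketch of the kind of per-section statistics described above: call count, average, min, and max runtime for a named code segment. The class name `SectionWatch` and its `start`/`stop`/`record` methods are hypothetical and made up for illustration; this is not the API of Accumulo's StopWatch or of Graphulo's Watch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-section stopwatch; not the Accumulo or Graphulo API. */
class SectionWatch {
    /** Running statistics for one named code section, in nanoseconds. */
    static final class Stats {
        long count;
        long totalNanos;
        long minNanos = Long.MAX_VALUE;
        long maxNanos = Long.MIN_VALUE;

        void record(long nanos) {
            count++;
            totalNanos += nanos;
            minNanos = Math.min(minNanos, nanos);
            maxNanos = Math.max(maxNanos, nanos);
        }

        double averageNanos() {
            return count == 0 ? 0.0 : (double) totalNanos / count;
        }
    }

    private final Map<String, Stats> stats = new HashMap<>();
    private final Map<String, Long> startTimes = new HashMap<>();

    /** Marks the beginning of a timed section. */
    void start(String section) {
        startTimes.put(section, System.nanoTime());
    }

    /** Marks the end of a timed section and folds the elapsed time into its stats. */
    void stop(String section) {
        Long started = startTimes.remove(section);
        if (started == null) {
            throw new IllegalStateException("stop() without start() for " + section);
        }
        record(section, System.nanoTime() - started);
    }

    /** Adds one already-measured duration to a section. */
    void record(String section, long nanos) {
        stats.computeIfAbsent(section, k -> new Stats()).record(nanos);
    }

    /** Returns the stats for a section, or null if it was never recorded. */
    Stats get(String section) {
        return stats.get(section);
    }

    public static void main(String[] args) {
        SectionWatch watch = new SectionWatch();
        for (int i = 0; i < 3; i++) {
            watch.start("work");
            Math.sqrt(i); // stand-in for the code segment being measured
            watch.stop("work");
        }
        Stats s = watch.get("work");
        System.out.printf("calls=%d avg=%.0fns min=%dns max=%dns%n",
                s.count, s.averageNanos(), s.minNanos, s.maxNanos);
    }
}
```

The sketch is not thread-safe. Since each tablet runs its iterators in separate threads, one instance per iterator instance is probably the simplest arrangement.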

In your iterator, you would count and measure the time statistics for the next and seek methods.
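A rough sketch of that idea follows: a wrapper that delegates to a source iterator and times every `seek` and `next` call. The `SimpleIterator` interface here is a hypothetical, stripped-down stand-in so the example compiles on its own. In Accumulo the same pattern would presumably go into a class implementing `SortedKeyValueIterator`, whose real `seek` takes a range, column families, and an inclusive flag rather than a single string.

```java
import java.util.LongSummaryStatistics;

/** Hypothetical stand-in for the iterator methods being timed; not an Accumulo interface. */
interface SimpleIterator {
    void seek(String startRow);
    void next();
}

/** Wraps a source iterator and records how long each seek and next call takes. */
class TimingIterator implements SimpleIterator {
    private final SimpleIterator source;
    final LongSummaryStatistics seekNanos = new LongSummaryStatistics();
    final LongSummaryStatistics nextNanos = new LongSummaryStatistics();

    TimingIterator(SimpleIterator source) {
        this.source = source;
    }

    @Override
    public void seek(String startRow) {
        long begin = System.nanoTime();
        try {
            source.seek(startRow);
        } finally {
            // Record the duration even if the source throws.
            seekNanos.accept(System.nanoTime() - begin);
        }
    }

    @Override
    public void next() {
        long begin = System.nanoTime();
        try {
            source.next();
        } finally {
            nextNanos.accept(System.nanoTime() - begin);
        }
    }

    /** One-line summary suitable for a log statement. */
    String report() {
        return String.format(
                "seek: n=%d avg=%.0fns max=%dns | next: n=%d avg=%.0fns max=%dns",
                seekNanos.getCount(), seekNanos.getAverage(), seekNanos.getMax(),
                nextNanos.getCount(), nextNanos.getAverage(), nextNanos.getMax());
    }

    public static void main(String[] args) {
        // A do-nothing source, only here to make the sketch runnable.
        SimpleIterator noop = new SimpleIterator() {
            @Override public void seek(String startRow) { }
            @Override public void next() { }
        };
        TimingIterator timed = new TimingIterator(noop);
        timed.seek("row_000");
        for (int i = 0; i < 5; i++) {
            timed.next();
        }
        System.out.println(timed.report());
    }
}
```

Comparing the `seek` counts across tablets may also help with the question of how many seeks a scan performs, though it would not say why they happen.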

On Fri, Aug 26, 2016 at 3:31 PM, Jonathan Lasko <jlasko@bbn.com> wrote:

I haven't had time to dig into it yet but am hoping that Zipkin will help with some of these insights. (Unless that is the distributed trace you were referring to?)

-Jonathan


On 08/26/2016 04:54 PM, Mario Pastorelli wrote:
I would like to understand the performance of a batch scan and I would like to have some hints on how to proceed. I have enabled the distributed trace, and it tells me that some batch scanner threads take much more time than others to complete but this is not helpful enough because it's not telling me why some threads take longer. My gut feeling is that one batch thread is scanning more data than the others, which means that the data is not well distributed for a query, but I use a random shard byte as prefix of the keys which should guarantee that data of the same range is almost equally distributed among the tservers. I enabled JMX on the tservers and attached jvisualvm to get an idea of the state of each tserver but I couldn't find anything meaningful. I would like to know if there is a way to profile what's going on on a single tserver for a single scan thread and by this I mean:
  1. where are the tablets required by a scan? Which tablet server?
  2. how fast were the lookups on the index for that scan?
  3. how many bytes/records were read for that scan without the iterators
  4. how many seeks are done by the scan and possibly why
The main Accumulo UI is fine to get an overview of Accumulo but doesn't really give you any information about the performance of a single query and it seems to me that query performance is heavily affected by what iterators do. Profiling a single scan is much more interesting. Is there a way to profile a single (batch) scan in Accumulo such that I have a complete overview of the entire process of reading and sending back records to the driver?

Thanks,
Mario

--
Mario Pastorelli | TERALYTICS

software engineer

Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
phone: +41794381682
email: mario.pastorelli@teralytics.ch

www.teralytics.net

Company registration number: CH-020.3.037.709-7 | Trade register Canton Zurich
Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann de Vries

This e-mail message contains confidential information which is for the sole attention and use of the intended recipient. Please notify us at once if you think that it may not be intended for you and delete it immediately.






--
Mario Pastorelli | TERALYTICS

software engineer

Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
phone: +41794381682
email: mario.pastorelli@teralytics.ch

www.teralytics.net

Company registration number: CH-020.3.037.709-7 | Trade register Canton Zurich
Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann de Vries

This e-mail message contains confidential information which is for the sole attention and use of the intended recipient. Please notify us at once if you think that it may not be intended for you and delete it immediately.

--001a114dd88c49c1fe053b0c2e6b--