From: Daniel Lee
Date: Mon, 27 Jul 2015 14:32:30 -0700
Subject: Re: HTRACE-215 Simplify the Sampler type - discussion
To: dev@htrace.incubator.apache.org

Hi Colin,

I'm not sure how Hadoop tracing is set up, but I also enable tracing via a config setting. I'm not sure I agree that creating multiple new Tracer objects, each with its own ProbabilitySampler, is an acceptable solution from a usability standpoint.
Consider an application that receives messages from clients and wants to trace different message and client types with different probabilities. Now, for every (message, client) type tuple, a new Tracer and Sampler has to be created, so this gets ugly quickly. It also sounds like having multiple tracers could get confusing quickly under this scenario.

I'm just going to wrap everything in a custom class that includes the logic I used to have in the Sampler.

Thanks,
Daniel

On Mon, Jul 27, 2015 at 12:00 PM, Colin P. McCabe wrote:
> Hi Daniel,
>
> The problem with the "T" in Sampler is that it's
> application-specific. The code for each application needs to be
> modified specifically to make use of a different T. Ideally, Samplers
> should be pluggable, so that you can use any sampler with any HTraced
> code. For example, I might run a test application with sampling set
> to "always" but in production, I would run with a probability sampler
> with some specific sampling rate. But you can't do that when your
> sampler depends on being passed some application-specific data.
> You're stuck with only samplers that can work with that specific T.
>
> Consider a specific example: tracing Hadoop. I'd like to be able to
> turn on tracing in Hadoop just by changing a config key. But if I'm
> using a Sampler with a non-trivial T, I can't do that. I have to
> tell the customer, "first apply this patch to your Hadoop code to add
> the Ts, do a full build, and then put it into live production"... The
> customer won't even follow me to step #1, let alone deploying the
> patched code in production. It totally wrecks the usefulness of
> HTrace if you need to rebuild your code to use it.
>
> Another thing to think about is that we'd like to reduce the
> "boilerplate code" needed to add HTrace to an application. Ideally
> the system would create the samplers you need from your
> HTraceConfiguration, rather than requiring the application to create
> and manage them manually. Of course, applications should be able to
> programmatically add and remove Samplers as well, but only if they
> have a specific need to do that.
>
> I think that tracing different events with different probabilities is
> a nice feature. There is a way to do that through the new API that I
> think is cleaner. You would create multiple Tracer objects (Tracer
> will no longer be a singleton). Each tracer would be configured with
> ProbabilitySampler, but they would have a different sampling rate set.
> For the Foo code, you would call fooTracer.newTopLevelSpan(...), for
> the Bar code, you would call barTracer.newTopLevelSpan(...), and so
> forth. In the new API, spans are always created from a specific
> Tracer and use the Samplers associated with that Tracer.
>
> This is similar to having different Log objects in log4j. Perhaps you
> think the Foo system is not that interesting most of the time, so its
> log level defaults to WARN. But if you think you're having a problem
> in the Foo system, you can set its log level to TRACE and then you see
> all the log messages that the Foo system has. Same thing here, except
> that instead of Log objects, we have Tracer objects. Instead of log
> messages, we have trace spans. But we still have a lot of flexibility
> at runtime as a result of this. And we don't need to recompile to
> trace.
>
> regards,
> Colin
>
> On Mon, Jul 27, 2015 at 11:33 AM, Daniel Lee wrote:
>> RE: https://issues.apache.org/jira/browse/HTRACE-215
>>
>> I was previously making use of this feature. I was using it to trace
>> different types of inputs with different probabilities. It looks like
>> now I'll have to move all tracing logic completely outside of
>> htrace related classes and only use the Always and Never samplers,
>> which seems weird? Why even bother with providing ProbabilitySampler
>> when (rand.nextDouble() < X ? AlwaysSampler.INSTANCE :
>> NeverSampler.INSTANCE) is available?
>>
>> Daniel
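
For illustration only, here is a minimal sketch of the kind of wrapper class Daniel mentions at the top of the thread: it keeps a sampling rate per (message type, client type) pair and makes the trace/no-trace decision outside of HTrace. The names PerTypeSampler, setRate, and shouldTrace are hypothetical and are not part of the HTrace API; the injectable random source is an assumption added so the behavior can be checked deterministically. The caller would presumably pick an always-on or never-on sampler based on the returned boolean, as in the (rand.nextDouble() < X ? ... : ...) expression quoted above.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.DoubleSupplier;

/**
 * Hypothetical application-side wrapper (not an HTrace class) that decides
 * whether to trace based on the message type and the client type.
 */
final class PerTypeSampler {
    // Sampling rate in [0.0, 1.0] for each (messageType, clientType) pair.
    private final Map<String, Double> rates = new ConcurrentHashMap<>();
    // Rate used for pairs that have no explicit entry.
    private final double defaultRate;
    // Source of uniform random values in [0.0, 1.0).
    private final DoubleSupplier random;

    PerTypeSampler(double defaultRate) {
        this(defaultRate, () -> ThreadLocalRandom.current().nextDouble());
    }

    // The random source is injectable so tests can make decisions deterministic.
    PerTypeSampler(double defaultRate, DoubleSupplier random) {
        this.defaultRate = checkRate(defaultRate);
        this.random = Objects.requireNonNull(random, "random");
    }

    /** Sets the sampling rate for one (messageType, clientType) pair. */
    void setRate(String messageType, String clientType, double rate) {
        rates.put(key(messageType, clientType), checkRate(rate));
    }

    /** Returns true if this particular message should be traced. */
    boolean shouldTrace(String messageType, String clientType) {
        double rate = rates.getOrDefault(key(messageType, clientType), defaultRate);
        // A rate of 0.0 never traces; a rate of 1.0 always traces.
        return random.getAsDouble() < rate;
    }

    // Joins the two type names with a separator unlikely to occur in either.
    private static String key(String messageType, String clientType) {
        return messageType + '\u0000' + clientType;
    }

    // Rejects rates outside [0.0, 1.0], including NaN.
    private static double checkRate(double rate) {
        if (!(rate >= 0.0 && rate <= 1.0)) {
            throw new IllegalArgumentException("rate must be in [0.0, 1.0]: " + rate);
        }
        return rate;
    }
}
```

With this shape, the per-type rates live in one map rather than in one Tracer per tuple, which is the usability concern raised above. The trade-off Colin points out still applies: because the decision depends on application-specific inputs, it cannot be swapped out purely through configuration.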