htrace-dev mailing list archives

From "Colin P. McCabe" <cmcc...@apache.org>
Subject Re: Tracing a chain of iterators -- htrace 2.04
Date Tue, 25 Aug 2015 21:21:04 GMT
On Tue, Aug 25, 2015 at 11:34 AM, Andrew Mains <andrew.mains@upsight.com> wrote:
> Thanks for the response!
>
>> Is the concern just that there would be too many spans if each call to
>> next() created a span?  Does sampling help address this concern?
>
> That's part of it, though sampling would indeed help there. I think my
> concern is that with sampling on the trace calls in next, but not on the
> overall request trace, we wouldn't get a full picture of how much time was
> spent in the iterators through the life of the request. That is, if the
> next traces only happen 1% of the time, our timeline would look like one
> long request, with small blips for each call to next (each of which is
> presumably fast individually). Is my understanding there correct? I'll
> give it a try either way, to see what it looks like.

If you only want one long span, just create one span at the top level
before calling next().

>
>> You said, "what we'd really want, is to aggregate the time spent in
>> each call to next for each iterator, and then send the spans at the
>> end."  But HTrace already does this, right?  Most span receivers will
>> batch up the spans they receive and send them all in one big batch,
>> probably "at the end."  What am I missing?
> Regarding batched span sending: that's definitely good from a performance
> perspective. The spans themselves would still be considered separate,
> though, correct? What I was trying to get at was the creation of a single,
> aggregate span composed of the time taken in each call to next() for a
> particular iterator, which could then be sent after exhaustion of the
> iterator. Does that make sense? I don't think such a span necessarily fits
> into htrace's model (nor into zipkin's, which we're currently using for
> visualization), so I've moved away from that a bit.

It can be tough to decide how many spans to create.  It's best to
experiment until you get data out you feel is useful.
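
As one way to run that kind of experiment, here is a minimal sketch in
plain Java of the aggregation described above: a wrapper that sums the
time spent in hasNext() and next() for one iterator, so a single total
per iterator is available once the request finishes. It makes no HTrace
calls. TimedIterator, its accessors, and the demo class are hypothetical
names invented for illustration, not part of HTrace or zipkin.

```java
import java.util.Arrays;
import java.util.Iterator;

/** Hypothetical helper (not an HTrace class): totals time spent in one iterator. */
final class TimedIterator<T> implements Iterator<T> {
    private final String name;
    private final Iterator<T> delegate;
    private long totalNanos = 0L; // accumulated time inside hasNext() and next()
    private long calls = 0L;      // number of successful next() calls

    TimedIterator(String name, Iterator<T> delegate) {
        this.name = name;
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        long start = System.nanoTime();
        try {
            return delegate.hasNext();
        } finally {
            totalNanos += System.nanoTime() - start;
        }
    }

    @Override
    public T next() {
        long start = System.nanoTime();
        try {
            T value = delegate.next();
            calls++;
            return value;
        } finally {
            totalNanos += System.nanoTime() - start;
        }
    }

    String name() { return name; }
    long totalNanos() { return totalNanos; }
    long calls() { return calls; }
}

class TimedIteratorDemo {
    public static void main(String[] args) {
        // iter_1 wraps the raw source of rows.
        final TimedIterator<Integer> iter1 =
            new TimedIterator<>("iter_1", Arrays.asList(1, 2, 3, 4).iterator());

        // Second stage: doubles each row it pulls from iter_1.
        Iterator<Integer> doubling = new Iterator<Integer>() {
            @Override public boolean hasNext() { return iter1.hasNext(); }
            @Override public Integer next() { return iter1.next() * 2; }
        };
        TimedIterator<Integer> iter2 = new TimedIterator<>("iter_2", doubling);

        // Drain the chain, as the request handler would.
        while (iter2.hasNext()) {
            iter2.next();
        }

        // One total per iterator, reported once at the end of the request.
        System.out.println(iter1.name() + ": " + iter1.calls() + " rows, "
            + iter1.totalNanos() + " ns");
        System.out.println(iter2.name() + ": " + iter2.calls() + " rows, "
            + iter2.totalNanos() + " ns");
    }
}
```

Note that the totals are inclusive: iter_2's time contains the time it
spent waiting on iter_1, so subtract the inner total from the outer one
to get the time attributable to a single step. The totals could then be
reported alongside the one top-level span for the request, in whatever
form your HTrace version and receiver support.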

A new version of htrace will come out soon with much better
visualization capabilities.  There were some advances in 3.3 as well,
but in 4.0 the new web UI will really shine, I think.

best,
Colin

>
> Thanks again for all the help!
>
> Andrew
>
>
>
>
>> On Aug 18, 2015, at 11:39 PM, Colin P. McCabe <cmccabe@apache.org> wrote:
>>
>> Hi Andrew,
>>
>> Thanks for posting!
>>
>> Is the concern just that there would be too many spans if each call to
>> next() created a span?  Does sampling help address this concern?
>>
>> You said, "what we'd really want, is to aggregate the time spent in
>> each call to next for each iterator, and then send the spans at the
>> end."  But HTrace already does this, right?  Most span receivers will
>> batch up the spans they receive and send them all in one big batch,
>> probably "at the end."  What am I missing?
>>
>> cheers,
>> Colin
>>
>>
>> On Tue, Aug 18, 2015 at 3:03 PM, Andrew Mains
>> <andrew.mains@kontagent.com> wrote:
>>> Hi all,
>>>
>>> This is really more of a "user" question than a "dev" question, but I'm
>>> posting here since I was unable to find a user list for the project; hope
>>> that's alright.
>>>
>>> I was hoping to get some input on the best way to trace execution through a
>>> chain of iterators. Specifically, we have a database-like application which
>>> pipes data through multiple iterators, performing some transformation at
>>> each step. We'd like insight into how long each step is taking in total for
>>> a particular request. That is, for a chain of iterators iter_1... iter_i, we
>>> want the total time spent in each iter_i for that request.
>>>
>>> The naive implementation would be to start a span in each call to next, but
>>> that's far too fine grained, given that we'd be starting a new span for each
>>> row. What we'd really want, is to aggregate the time spent in each call to
>>> next for each iterator, and then send the spans at the end. This would
>>> require implementing a new span subclass, which is a bit tricky to integrate
>>> at the moment (since it prevents us from using the static helpers in Trace).
>>>
>>> Any thoughts on the best way to approach this issue? Is there something I'm
>>> missing, or some way that we can reframe the problem such that it makes
>>> sense with what's currently in htrace?
>>>
>>> Let me know if there's anything that's unclear, or any further info I can
>>> provide about our use case.
>>>
>>> Thanks for the help!
>>>
>>> Andrew
