chukwa-dev mailing list archives

From Ariel Rabkin
Subject Re: ChukwaAgent and MD5 computation
Date Tue, 23 Jun 2009 23:19:44 GMT
I believe this fix needs to happen in only one or two places. How many
places actually specify a non-zero offset at adaptor creation time?


On Tue, Jun 23, 2009 at 4:03 PM, Ariel Rabkin wrote:
> A much better option would be for whoever starts the adaptor to
> specify a name.  The control protocol already supports this.
> You just say:
>  add name = ...., and then the adaptor will be called "name".
> So if you want to take an MD5 of some params but not others, that's possible.
> --Ari
> On Tue, Jun 23, 2009 at 3:55 PM, Jerome Boulon wrote:
>> Param actually contains an offset and the fileName, and assuming that we could have
>> more parameters inside the param string, there's no way for
>> the agent to build the correct MD5.
>> So, given that, if we add a method to the adaptor, the adaptor will then be able
>> to give you the correct MD5.
>> /Jerome.
>> On 6/23/09 3:40 PM, "Ariel Rabkin" wrote:
>> In the current codebase, adaptor names are unique, and an attempt to
>> create a duplicate will just return the previous adaptor.  By default,
>> the adaptor name is the MD5 hash taken over the adaptor name, data
>> type, and params.  This means you can have two different adaptors look
>> at a file, or two adaptors with different datatype tags, but not two
>> instances of the same adaptor.
>> Offset should NOT be included in that hash. If it is, it's a bug. And
>> a fairly subtle one, because the code doesn't, on its face, do any
>> such thing.  If you have a test case showing misbehavior, can you post
>> it?
>> Note, by the way, that anybody who creates an adaptor can specify any
>> name they like -- including the file name, or a hash thereof.  So
>> there's a really easy workaround, in the client library.
>> On Tue, Jun 23, 2009 at 3:29 PM, Jerome Boulon wrote:
>>> Hi,
>>> I have some questions on the synthesizeAdaptorID method from ChukwaAgent.
>>> In the previous version we had a check on fileName to avoid adding the
>>> same adaptor for the same file twice.
>>> This code is no longer there. Is this what we really want?
>>> Also, the current MD5 cannot be used to replace that functionality, since the
>>> offset is included in the MD5 computation. Is there any plan to fix this?
>>> Thanks,
>>>  Jerome.
>> --
>> Ari Rabkin
>> UC Berkeley Computer Science Department
> --
> Ari Rabkin
> UC Berkeley Computer Science Department

Ari Rabkin
UC Berkeley Computer Science Department
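
For readers of the archive, here is a minimal sketch of the naming behaviour discussed in this thread. It assumes the default ID is an MD5 over the adaptor class name, data type, and params, with the offset left out, and that a caller-supplied name overrides the hash. The class AdaptorIdSketch, its methods, and the sample values are all hypothetical; this is not the actual ChukwaAgent code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; names and structure are assumptions,
// not taken from the Chukwa source.
final class AdaptorIdSketch {
    // Registered adaptors keyed by ID; the value is the offset given at creation.
    private final Map<String, Long> adaptors = new LinkedHashMap<>();

    // Hash the adaptor class name, data type, and params.
    // The offset is deliberately NOT part of the hashed input.
    static String synthesizeId(String adaptorClass, String dataType, String params) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            // NUL separators keep ("ab", "c") distinct from ("a", "bc").
            String key = adaptorClass + '\0' + dataType + '\0' + params;
            byte[] digest = md.digest(key.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Registers an adaptor and returns its ID. An explicit name, if given,
    // is used instead of the hash. A duplicate ID keeps the earlier entry.
    String add(String explicitName, String adaptorClass, String dataType,
               String params, long offset) {
        String id = (explicitName != null)
                ? explicitName
                : synthesizeId(adaptorClass, dataType, params);
        adaptors.putIfAbsent(id, offset);
        return id;
    }

    int size() {
        return adaptors.size();
    }

    long offsetOf(String id) {
        return adaptors.get(id);
    }
}
```

Under these assumptions, adding the same class, data type, and params twice with different offsets yields one adaptor, while supplying an explicit name lets the caller decide what counts as a duplicate.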
