chukwa-dev mailing list archives

From Jerome Boulon <>
Subject Re: ChukwaAgent and MD5 computation
Date Tue, 23 Jun 2009 22:55:33 GMT
Param actually contains both an offset and the fileName, and assuming that we could
have more parameters inside the param string, there's no way for
the agent to build the correct MD5.

So, given that, if we add a method to the adaptor, the adaptor will then be able to give you
the correct MD5.


On 6/23/09 3:40 PM, "Ariel Rabkin" <> wrote:

In the current codebase, adaptor names are unique, and an attempt to
create a duplicate will just return the previous adaptor.  By default,
the adaptor name is an MD5 hash taken over the adaptor name, data
type, and params.  This means you can have two different adaptors look
at a file, or two adaptors with different datatype tags, but not two
instances of the same adaptor.
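
To make the default naming concrete, here is a minimal sketch of an ID derived from an MD5 hash over the adaptor name, data type, and params. It is not the actual ChukwaAgent code: the class AdaptorIdSketch, the method synthesizeId, the hex encoding, and the order of the hashed fields are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; not taken from the Chukwa codebase.
final class AdaptorIdSketch {

    // Hashes the three fields in a fixed (assumed) order and returns
    // the 16-byte MD5 digest as 32 lowercase hex characters.
    static String synthesizeId(String adaptorName, String dataType, String params) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            md.update(adaptorName.getBytes(StandardCharsets.UTF_8));
            md.update(dataType.getBytes(StandardCharsets.UTF_8));
            md.update(params.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

With a scheme like this, identical inputs always map to the same ID, so a duplicate add can be detected. If the params string also carries an offset, though, the same file at two different offsets hashes to two different IDs.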

Offset should NOT be included in that hash. If it is, it's a bug. And
a fairly subtle one, because the code doesn't, on its face, do any
such thing.  If you have a test case showing misbehavior, can you post it?

Note, by the way, that anybody who creates an adaptor can specify any
name they like -- including the file name, or a hash thereof.  So
there's a really easy workaround, in the client library.
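
A rough sketch of that workaround, assuming the client derives the adaptor name from the file name alone: FileKeyedRegistrySketch, nameForFile, and add are hypothetical names, and the in-memory map only stands in for whatever the agent does when it sees a duplicate name.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical client-side helper; not part of the Chukwa API.
final class FileKeyedRegistrySketch {

    // Maps adaptor name -> params of the first adaptor registered under it.
    private final Map<String, String> paramsByName = new HashMap<>();

    // The name depends only on the file name, never on the offset.
    static String nameForFile(String fileName) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(fileName.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Registers the file once; a second add for the same file keeps
    // the previous entry, whatever offset is passed.
    String add(String fileName, long offset) {
        String name = nameForFile(fileName);
        paramsByName.putIfAbsent(name, offset + " " + fileName);
        return name;
    }

    String paramsFor(String name) {
        return paramsByName.get(name);
    }

    int size() {
        return paramsByName.size();
    }
}
```

Because the offset never enters the name, adding the same file a second time at a different offset resolves to the adaptor that is already registered.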

On Tue, Jun 23, 2009 at 3:29 PM, Jerome Boulon<> wrote:
> Hi,
> I have some questions on the synthesizeAdaptorID method from ChukwaAgent.
> In previous versions we used to have a check on fileName to avoid adding the
> same adaptor for the same file twice.
> This code is no longer there. Is this what we really want?
> Also, the current MD5 cannot be used to replace that functionality since the
> offset is included in the MD5 computation. Is there any plan to fix this?
> Thanks,
>  Jerome.

Ari Rabkin
UC Berkeley Computer Science Department
