chukwa-dev mailing list archives

From Jiaqi Tan <tanji...@gmail.com>
Subject Re: units in MDL and HICC
Date Fri, 22 May 2009 18:10:01 GMT
On Fri, May 22, 2009 at 11:06 AM, Eric Yang <eyang@yahoo-inc.com> wrote:
> On 5/22/09 10:35 AM, "Ariel Rabkin" <asrabkin@gmail.com> wrote:
>
>> I'm okay with doing this in the parser.  Particularly if it's
>> declaratively driven with something like that conversion syntax,
>> embedded in demux-conf.   But we should probably be clear to ourselves
>> what constitutes "truth" when it comes to units and such.
>
> I'd like to use base units of bytes and raw time in millis in HDFS, if possible.
> The current base units are bytes, and per minute.  All data are in bytes at
> yahoo except HDFS capacity in GB from hadoop metrics.

Sure, but I think Ari was asking about the case where the sar-generated
numbers are not in the base units. In that case, shouldn't the Demux
processor look at the column labels to determine the source unit and
convert the value to the base unit?
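
For illustration only, here is a minimal Python sketch of that idea. A real
Demux processor would be written in Java; the function name, the prefix
table, and the 1024 multiplier for sar's "kb" columns are all assumptions
of mine, not existing Chukwa code.

```python
# Hypothetical sketch: none of these names are Chukwa APIs.
# Maps a unit prefix found on a sar column label to the multiplier
# needed to reach the base unit (bytes).
# Assumption: sar's "kb" columns are in units of 1024 bytes.
UNIT_PREFIXES = {"kb": 1024}


def to_base_units(record):
    """Return a copy of record with unit-prefixed columns converted to bytes.

    A column such as "kbmemused" is renamed to "memused" and its value is
    multiplied by the matching factor; other columns pass through unchanged.
    """
    converted = {}
    for label, value in record.items():
        for prefix, factor in UNIT_PREFIXES.items():
            if label.startswith(prefix):
                converted[label[len(prefix):]] = value * factor
                break
        else:
            # No known unit prefix: keep the column as it is.
            converted[label] = value
    return converted
```

With a table like this, to_base_units({"kbmemused": 2}) would give
{"memused": 2048}, so the HDFS copy and mdl.xml could both use the
unprefixed name. Matching on a bare prefix is crude and would need a
stricter rule for labels that merely happen to start with "kb".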

- Jiaqi

>>
>> I found the conversion code you mention.  Right now, it lives in MDL.
>> Eric, are you saying that it should move to demux, so that the HDFS
>> copy of data is accurate?  If so, I think I agree.
>
> Yes, we have an agreement.
>
>>
>> On Fri, May 22, 2009 at 10:28 AM, Eric Yang <eyang@yahoo-inc.com> wrote:
>>> There is a special keyword called conversion.<record_type>.<key>.
>>> By putting 1000, it will apply column_value*1000 before inserting into
>>> the database.  I did not test much of this pre chukwa days.  I am not sure if it
>>> still works.
>>>
>>> The right thing to do is to improve the parser so that a cleansed and
>>> unified truth is stored in HDFS.
>>>
>>> Regards,
>>> Eric
>>>
>>> On 5/21/09 10:06 PM, "Ariel Rabkin" <asrabkin@gmail.com> wrote:
>>>
>>>> Howdy all.
>>>>
>>>> So I've noticed something.  The default mdl.xml has entries for
>>>> memused, and kbkached.
>>>> My version of sar outputs kbcached, and *kb*memused.    So memused
>>>> doesn't display right.
>>>>
>>>> In general though, I've gotten worried about units.
>>>> if I stick 1000 * kbmemused in mdl.xml, will that get pasted into a
>>>> SQL command and will the right thing happen?
>>>> Is there a better way to do unit conversion, other than hacking the Java?
>>>>
>>>> Is there any way to know what the right units are, actually?
>>>>
>>>> --Ari
>>>
>>>
>>
>>
>
>
