From: Jiaqi Tan
To: Eric Yang
Cc: chukwa-dev@hadoop.apache.org
Date: Fri, 22 May 2009 11:01:05 -0700
Subject: Re: units in MDL and HICC

I think it's fine to have the semantics remain in the Demux; in that
case, perhaps the Demux processors can look at the sar-generated column
labels to determine the units, and standardize the units of the output?
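Something along these lines, maybe -- just a sketch, where the label and
scale tables are guesses about one sar variant rather than a worked-out
mapping, and none of the names below are existing Chukwa classes:

import java.util.HashMap;
import java.util.Map;

/**
 * Sketch of a helper a Demux processor could use to turn sar column
 * labels into one canonical set of field names and units. The tables
 * below are illustrative only.
 */
public class SarUnitNormalizer {
  private static final Map<String, String> NAME = new HashMap<String, String>();
  private static final Map<String, Double> SCALE = new HashMap<String, Double>();
  static {
    NAME.put("kbmemused", "mem_used_bytes");  SCALE.put("kbmemused", 1024.0);
    NAME.put("kbmemfree", "mem_free_bytes");  SCALE.put("kbmemfree", 1024.0);
    NAME.put("%memused",  "mem_used_pct");    SCALE.put("%memused",  1.0);
  }

  /** Label/value pairs parsed from one sar row -> canonical names and units. */
  public static Map<String, Double> normalize(Map<String, String> sarRow) {
    Map<String, Double> out = new HashMap<String, Double>();
    for (Map.Entry<String, String> e : sarRow.entrySet()) {
      String canonical = NAME.get(e.getKey());
      if (canonical != null) {
        out.put(canonical,
                Double.parseDouble(e.getValue()) * SCALE.get(e.getKey()));
      }
      // Unknown labels are dropped here; a real processor would probably
      // pass them through untouched instead.
    }
    return out;
  }
}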
Jiaqi

On Fri, May 22, 2009 at 10:56 AM, Eric Yang wrote:
> Many solutions have been suggested in the past year, but there isn't
> one that fits all. Most of the promising libraries are in the GPL
> camp. Unfortunately, we can't use those. The closest thing in the
> Apache camp is the ganglia metrics library. There are two bugs that
> they need to fix in the metrics library. First, it uses float to
> store all values, hence the accuracy becomes somewhat questionable
> for large values. Second, one of the metrics only includes the value
> from the first device. I forget whether it's the network device or
> the disk. I dropped the integration of the ganglia metrics library
> after discovering those bugs. However, we might want to revisit this
> if it has been improved. For the Windows camp, we may need a
> completely different solution for measuring system metrics.
>
> I believe all parsing logic and data schematics should happen in the
> demux parser rather than MDL. Personally, I believe MDL should have
> zero configuration. MDL's purpose is to load data into the database
> by knowing RecordType=Table, Key=Column, Value=Value. This will
> definitely reduce the places where we maintain data transformation.
> The data schematics should happen in the demux parser and
> database_create_table.sql only. What do you guys think?
>
> Regards,
> Eric
>
> On 5/21/09 11:01 PM, "Ariel Rabkin" wrote:
>
>> Howdy.
>>
>> I agree with your diagnosis -- this is the peril of external
>> dependencies. There was discussion, back in the day, about doing
>> something better. Poking at /proc is certainly one option. Another
>> would be finding some Apache-licensed library that does this. Sigar
>> would fit the bill, but it's GPLed and so we can't link against it.
>> Though there was discussion under HADOOP-4959 about a license
>> exemption. That might solve our problem.
>>
>> There's a Java standard approach that does some subset of what we
>> want --
>> http://java.sun.com/javase/6/docs/jre/api/management/extension/com/sun/management/UnixOperatingSystemMXBean.html
>>
>> What's peculiar about this issue is that right now, the actual Demux
>> processors are largely independent of the versions -- those
>> processors make assumptions about the syntax of the input, but
>> almost none about the semantics. If the data comes in columns with
>> headers, they do basically the right thing. However, when it comes
>> time to do the database insert, the column names don't match the
>> ones in mdl.xml, and so things start to fail.
>>
>> It seems a pity to dirty up the currently clean Java code with lots
>> of special cases for canonicalizing data formats. I'm okay doing
>> some sort of parameterization, but I think in a lot of cases we can
>> do something very simpleminded and still be okay. Perhaps as simple
>> as "if you see field x in a SystemMetrics record, output field y as
>> follows."
>>
>> On Thu, May 21, 2009 at 10:27 PM, Jiaqi Tan wrote:
>>> Hi Ari,
>>>
>>> I think the real problem here is that sar metrics are being picked
>>> up by an Exec adaptor which calls sar, and there's no control over
>>> which sar gets called (or at least not right now), and sar is
>>> ultimately an external dependency which is currently just assumed
>>> to be sitting there.
>>>
>>> Also, sar just directly emits unstructured plain text, so there's
>>> no self-describing data format a la some XML which says what the
>>> units are, so if sar changes output units and such, then the parser
>>> in the Demux needs to take care of that too. Even more generally,
>>> any change at all to sar's output would require an update of the
>>> Demux.
>>>
>>> I think the fundamental problem is that having an Exec adaptor
>>> which pulls the unstructured output of an external program, and a
>>> Demux processor that makes assumptions about what that output looks
>>> like and what it means, makes the whole workflow dependent on
>>> something not under the control of Chukwa.
>>>
>>> I can imagine one way of working around that: don't use sar, and
>>> write custom parsers for /proc, so that Chukwa is itself aware of
>>> what the proc data actually means without having to make
>>> assumptions about the output of an external program; it's
>>> reinventing the wheel somewhat, but it gives a cleaner end-to-end
>>> solution.
>>>
>>> The other answer would perhaps be the "web services" answer of
>>> having a whole standardized way of passing data around in a
>>> structured form, but then that starts to look like a generalized
>>> pub/sub system.
>>>
>>> But in the meantime, maybe the sar version on the system being
>>> monitored could be picked up in some way (metadata in the Chunk?),
>>> and the various Demux processors dependent on such external
>>> programs, e.g. IoStat, Df, etc., could be parameterized to handle
>>> output from different versions/variants of the source program. Or,
>>> to be even more general, the Exec adaptor could send along an MD5
>>> hash of the program it's calling, and then you'd have a whole bunch
>>> of processors for every possible variant of the program you want to
>>> support; that sounds terribly hackish to me, but at least that way
>>> the identity of the external dependency is captured.
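To make a few of the ideas in this thread concrete, here are some rough
sketches; none of this is real Chukwa, MDL, or adaptor code, and all the
class and method names are made up.

First, Eric's zero-configuration MDL model (RecordType=Table,
Key=Column, Value=Value) amounts to roughly this; a real loader would
use PreparedStatement rather than string concatenation:

import java.util.Map;

public class ZeroConfLoader {
  /** Build an INSERT for one record; the record type names the table. */
  public static String toInsert(String recordType, Map<String, String> record) {
    StringBuilder cols = new StringBuilder();
    StringBuilder vals = new StringBuilder();
    for (Map.Entry<String, String> e : record.entrySet()) {
      if (cols.length() > 0) {
        cols.append(", ");
        vals.append(", ");
      }
      cols.append(e.getKey());                            // Key = Column
      vals.append("'").append(e.getValue()).append("'");  // Value = Value
    }
    return "INSERT INTO " + recordType                    // RecordType = Table
        + " (" + cols + ") VALUES (" + vals + ");";
  }
}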
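Second, the MXBean route Ari links to covers a few basics with no
external process at all, though the com.sun.management cast needs a
Sun JRE:

import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class MxBeanMetrics {
  public static void main(String[] args) {
    OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
    // 1-minute load average; returns a negative value if unavailable.
    System.out.println("load average: " + os.getSystemLoadAverage());

    if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
      com.sun.management.UnixOperatingSystemMXBean unix =
          (com.sun.management.UnixOperatingSystemMXBean) os;
      System.out.println("open fds: " + unix.getOpenFileDescriptorCount());
      System.out.println("max fds:  " + unix.getMaxFileDescriptorCount());
    }
  }
}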
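Third, a hand-rolled /proc parser is not much code for the simple
cases. A sketch for /proc/meminfo (Linux-specific; most values are
reported in kB and converted to bytes here, which is just one possible
canonical unit):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class ProcMeminfo {
  /** Each /proc/meminfo line looks like "MemTotal:  2074904 kB". */
  public static Map<String, Long> read() throws IOException {
    Map<String, Long> out = new HashMap<String, Long>();
    BufferedReader in = new BufferedReader(new FileReader("/proc/meminfo"));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        String[] parts = line.split("\\s+");
        if (parts.length >= 2) {
          String key = parts[0].replace(":", "");
          long value = Long.parseLong(parts[1]);
          // Most lines carry a "kB" unit; convert those to bytes and
          // leave unitless counters (e.g. HugePages_Total) as-is.
          if (parts.length >= 3 && "kB".equals(parts[2])) {
            value *= 1024L;
          }
          out.put(key, value);
        }
      }
    } finally {
      in.close();
    }
    return out;
  }
}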
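Finally, the MD5-tagging idea is only a few lines with
java.security.MessageDigest; how the digest would actually be attached
to the Chunk as metadata is left out here:

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ProgramFingerprint {
  /** Hex MD5 of an external program, e.g. the sar binary being exec'd. */
  public static String md5Hex(String path)
      throws IOException, NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("MD5");
    InputStream in = new FileInputStream(path);
    try {
      byte[] buf = new byte[8192];
      int n;
      while ((n = in.read(buf)) != -1) {
        md.update(buf, 0, n);
      }
    } finally {
      in.close();
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : md.digest()) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString();
  }
}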