incubator-chukwa-user mailing list archives

From Preetam Patil <pvpatil.i...@gmail.com>
Subject Re: Pig scripts for Chukwa 0.5
Date Wed, 06 Jul 2011 16:41:13 GMT
Thanks Eric.
I was mainly interested in the per-job/task info, similar to that provided
by UserDailySummary.pig, because per-job/task info seems to be missing from
the data stored in HBase.
Any pointers on how to get a table of jobs into HBase are welcome :-)
-preetam

On Wed, Jul 6, 2011 at 9:12 PM, Eric Yang <eric818@gmail.com> wrote:

> Hi Preetam,
>
> ClusterSummary.pig is the only one that works with HBase.  Other pig
> scripts are designed to work on sequence files for Chukwa 0.4.  The
> scripts were thrown together at crunch time, and there is no
> documentation for them.  The ChukwaLoader/ChukwaStore functions need to
> be revised to use HBase to bring the scripts up to date with Chukwa 0.5.
> The Hadoop_*.pig scripts down-sample data from raw resolution into a
> specified time resolution, e.g. 30-minute or 180-minute averages.
> UserDailySummary.pig is designed to aggregate data from JobHistory log
> files to generate a user usage report.  However, it was designed to work
> on JobHistory files for Hadoop 0.18.  I don't think it works with
> Hadoop 0.20+ because the JobHistory format changed in Hadoop 0.20.
>
> regards,
> Eric
>
> On Wed, Jul 6, 2011 at 5:04 AM, Preetam Patil <pvpatil.iitb@gmail.com> wrote:
> > Hi,
> > I notice that there are a bunch of Pig scripts in the scripts/pig directory,
> > but only ClusterSummary.pig seems to be mentioned in the documentation. The
> > other scripts also seem to be based on a storage model other than HBase, but
> > provide more info (e.g., per-job/task stats) than what is stored in HBase.
> > Are these compatible with 0.5? If not, what needs to be done to get them
> > working, and where can I find any API info for them?
> > Thanks,
> > -preetam
> >
>
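
For readers unfamiliar with the down-sampling that the Hadoop_*.pig scripts are described as doing (averaging raw-resolution data into 30-minute or 180-minute buckets), here is a minimal Python sketch of the general idea. It is not taken from the Chukwa scripts: the function name `downsample`, the `(timestamp_seconds, value)` sample layout, and the choice to key each bucket by its start time are all assumptions made for illustration.

```python
from collections import defaultdict


def downsample(samples, resolution_minutes=30):
    """Average (timestamp_seconds, value) samples into fixed-width time buckets.

    Hypothetical helper for illustration only; the real Pig scripts may
    bucket and aggregate differently.
    """
    width = resolution_minutes * 60  # bucket width in seconds
    sums = defaultdict(lambda: [0.0, 0])  # bucket start -> [running total, count]
    for ts, value in samples:
        bucket = ts - (ts % width)  # round the timestamp down to its bucket start
        acc = sums[bucket]
        acc[0] += value
        acc[1] += 1
    # Return the mean of each bucket, ordered by bucket start time.
    return {bucket: total / count for bucket, (total, count) in sorted(sums.items())}


if __name__ == "__main__":
    raw = [(0, 1.0), (600, 3.0), (1800, 10.0), (3599, 20.0)]
    print(downsample(raw, 30))   # two 30-minute buckets
    print(downsample(raw, 180))  # one 180-minute bucket
```

In Pig the same shape would typically be a GROUP on the rounded timestamp followed by an average over each group; the sketch above only shows the bucketing arithmetic.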
