chukwa-dev mailing list archives

From "Ari Rabkin (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CHUKWA-575) Cluster Summarization script
Date Tue, 04 Jan 2011 02:53:45 GMT

    [ https://issues.apache.org/jira/browse/CHUKWA-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977087#action_12977087 ]

Ari Rabkin commented on CHUKWA-575:
-----------------------------------

Tried this, got errors.  

I started with a clean HBase and let it collect metrics from the default adaptors for a bit,
then ran the script manually. The Pig-spawned tasks all fail. I got the following on the
reduce side:

java.io.IOException: java.lang.IllegalArgumentException: Row key is invalid
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:438)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:401)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:381)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:251)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.IllegalArgumentException: Row key is invalid
	at org.apache.hadoop.hbase.client.Put.<init>(Put.java:79)
	at org.apache.hadoop.hbase.client.Put.<init>(Put.java:69)
	at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:355)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
	at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:508)
	at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:436)
	... 7 more
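
For reference, a minimal sketch of the kind of validation that could raise this
exception. It assumes (the trace does not confirm this) that the HBase client rejects
a null or over-long row key inside the Put constructor. The class RowKeyCheck, its
methods, the MAX_ROW_LENGTH value, and the sample key are hypothetical stand-ins for
illustration, not the HBase API.

```java
import java.nio.charset.StandardCharsets;

public class RowKeyCheck {
    // Hypothetical limit standing in for the client's maximum row key length.
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

    // True when the key is non-null and no longer than the assumed limit.
    static boolean isValidRowKey(byte[] row) {
        return row != null && row.length <= MAX_ROW_LENGTH;
    }

    // Mirrors the failure in the trace: reject the key before building a write.
    static byte[] requireValidRowKey(byte[] row) {
        if (!isValidRowKey(row)) {
            throw new IllegalArgumentException("Row key is invalid");
        }
        return row;
    }

    public static void main(String[] args) {
        // Made-up sample key; real keys depend on the script's output schema.
        byte[] sample = "host1-1294109625".getBytes(StandardCharsets.UTF_8);
        System.out.println("sample key valid: " + isValidRowKey(sample));
        System.out.println("null key valid: " + isValidRowKey(null));
    }
}
```

If that assumption holds, a null value in the field that the script stores as the
row key would be one way to reach this path; checking the relation for null keys
before the store step would confirm or rule that out.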


------

Is it possible to run the scripts in local mode for debugging, and have them still pull data
from HBase? How do I configure that? I tried a bunch of things and got nowhere.

> Cluster Summarization script
> ----------------------------
>
>                 Key: CHUKWA-575
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-575
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: scripts
>         Environment: Java 6, Mac OS X 10.6
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>             Fix For: 0.5.0
>
>         Attachments: CHUKWA-575.patch
>
>
>  Chukwa records metrics from the name node, data node, job tracker, task tracker, etc., but
>  the raw metrics do not help determine all aspects of cluster health.  For now, we have
>  the following metrics in HBase:
>  * System
>  *   Disk
>  *   Memory
>  *   Network
>  * HDFS
>  *   Name Node
>  *   Data Node
>  * Map Reduce
>  *   Job Tracker
>  *   Task Tracker
> We can further analyze the data to provide a summary of the cluster in these categories:
>  * System - Performance profile of how busy the nodes are in the cluster
>  * HDFS - Capacity of the disk storage, and health of the data in the file system
>  * MapReduce - Capacity of the processing pipeline, and health of the processing system

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

