chukwa-user mailing list archives

From Eric Yang <ey...@yahoo-inc.com>
Subject Re: Continuing Chukwa Installation Problems
Date Tue, 01 Jun 2010 21:41:24 GMT
Hi Alan,

The instructions are correct.

> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-namenode-hadoop1.out
> Error initializing ChukwaClient with list of currently registered
> adaptors, clearing our local list of adaptors
> log4j:ERROR cleanUpRegex == null || !cleanUpRegex.contains("$fileName")

This message comes from the Chukwa log4j appender, which uses a regex to
match file names when it expires old log files.  The regex is configured in
hadoop-metrics-log4j.properties, which is bundled in the hadoop-client-*.jar
file.  There should be no need to change hadoop-metrics-log4j.properties,
because it is preprocessed at build time.
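
To make the condition concrete, here is a minimal sketch of the check that
the error text quoted above spells out.  The class name, method name, and
sample regex values are hypothetical and are not taken from the Chukwa
source; only the boolean condition itself comes from the log4j:ERROR line.

```java
public class CleanUpRegexCheck {
    // Placeholder token that, per the error text, the regex must contain.
    static final String FILE_NAME_TOKEN = "$fileName";

    // Mirrors the condition printed in the log4j:ERROR line:
    //   cleanUpRegex == null || !cleanUpRegex.contains("$fileName")
    // Hypothetical helper; the real appender code is not shown in this thread.
    static boolean isRejected(String cleanUpRegex) {
        return cleanUpRegex == null || !cleanUpRegex.contains(FILE_NAME_TOKEN);
    }

    public static void main(String[] args) {
        // Sample values are assumptions chosen only to exercise both branches.
        String[] samples = { null, ".*\\.log", "$fileName\\.[0-9]+" };
        for (String sample : samples) {
            String verdict = isRejected(sample) ? "rejected" : "accepted";
            System.out.println(sample + " -> " + verdict);
        }
    }
}
```

Read this way, the error suggests that the cleanup regex seen by the
appender was either unset or did not contain the $fileName placeholder.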

I am unfamiliar with the other error, the authentication socket one.  It
does not appear to be generated by Chukwa.

Regards,
Eric


On 6/1/10 5:25 AM, "Ratner, Alan S (IS)" <Alan.Ratner@ngc.com> wrote:

> I'm now following the latest instructions on installing Chukwa.  When I
> launch Hadoop I get various Chukwa-related errors although they do not
> seem to interfere with my running Hadoop.
> 
> My collectors file looks like this:
> http://localhost:8080
> 
> My agents file currently looks like this:
> 10.64.147.3
> 10.64.147.4
> ...
> 10.64.147.12
> 10.64.147.13
> 
> My initial_adaptors file is the default:
> add org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Iostat
> 60 /usr/bin/iostat -x -k 55 2 0
> add org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Df 60
> /bin/df -l 0
> add org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Sar 60
> /usr/bin/sar -q -r -n ALL 55 0
> add org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Top 60
> /usr/bin/top -b -n 1 -c 0
> 
> Here's what happens when I launch Hadoop.  (I am assuming the
> initialization sequence is a) format namenode, b) launch Hadoop, c)
> start Chukwa agents, d) start Chukwa collector.)  It looks like I have 2
> sets of problems, presumably related:
> 1. bad adaptor file
> 2. some sort of password/authentication problem (Note that the agents
> file currently contains a subset of nodes, with all nodes giving me
> authentication socket errors and agent nodes additionally giving me
> password prompts.)  These errors surprise me since I can ssh between
> any 2 servers in the cluster.
> 
> ngc@hadoop1:~/hadoop-0.20.2$ bin/start-all.sh
> starting namenode, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-namenode-hadoop1.out
> Error initializing ChukwaClient with list of currently registered
> adaptors, clearing our local list of adaptors
> log4j:ERROR cleanUpRegex == null || !cleanUpRegex.contains("$fileName")
> 10.64.147.7: starting datanode, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-datanode-hadoop6.out
> 10.64.147.3: starting datanode, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-datanode-hadoop2.out
> ...
> 10.64.147.30: starting datanode, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-datanode-hadoop29.out
> 10.64.147.21: starting datanode, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-datanode-hadoop20.out
> 10.64.147.2: starting secondarynamenode, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-secondarynamenode-hadoop1.out
> 10.64.147.2: Error initializing ChukwaClient with list of currently
> registered adaptors, clearing our local list of adaptors
> 10.64.147.2: log4j:ERROR cleanUpRegex == null ||
> !cleanUpRegex.contains("$fileName")
> starting jobtracker, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-jobtracker-hadoop1.out
> Error initializing ChukwaClient with list of currently registered
> adaptors, clearing our local list of adaptors
> log4j:ERROR cleanUpRegex == null || !cleanUpRegex.contains("$fileName")
> ngc@10.64.147.7's password: 10.64.147.35: Error reading response length
> from authentication socket.
> ngc@10.64.147.8's password: 10.64.147.30: Error reading response length
> from authentication socket.
> 10.64.147.9: Error reading response length from authentication socket.
> 10.64.147.7: Error reading response length from authentication socket.
> ngc@10.64.147.5's password: 10.64.147.18: Error reading response length
> from authentication socket.
> 10.64.147.20: Error reading response length from authentication socket.
> 10.64.147.5: Error reading response length from authentication socket.
> 10.64.147.21: Error reading response length from authentication socket.
> ngc@10.64.147.10's password: 10.64.147.10: Error reading response length
> from authentication socket.
> 10.64.147.17: Error reading response length from authentication socket.
> 10.64.147.26: Error reading response length from authentication socket.
> 10.64.147.33: Error reading response length from authentication socket.
> 10.64.147.25: Error reading response length from authentication socket.
> ...
> 10.64.147.3: starting tasktracker, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-tasktracker-hadoop2.out
> 10.64.147.29: Error reading response length from authentication socket.
> 10.64.147.39: Error reading response length from authentication socket.
> 10.64.147.40: Error reading response length from authentication socket.
> 10.64.147.37: Error reading response length from authentication socket.
> 10.64.147.38: Error reading response length from authentication socket.
> ngc@10.64.147.4's password: 10.64.147.4: Error reading response length
> from authentication socket.
> 10.64.147.42: Error reading response length from authentication socket.
> 10.64.147.41: Error reading response length from authentication socket.
> 10.64.147.36: Error reading response length from authentication socket.
> 10.64.147.31: Error reading response length from authentication socket.
> 10.64.147.27: Error reading response length from authentication socket.
> 10.64.147.34: Error reading response length from authentication socket.
> 10.64.147.13: starting tasktracker, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-tasktracker-hadoop12.out
> 10.64.147.11: starting tasktracker, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-tasktracker-hadoop10.out
> 10.64.147.21: starting tasktracker, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-tasktracker-hadoop20.out
> ...
> 10.64.147.15: starting tasktracker, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-tasktracker-hadoop14.out
> 10.64.147.38: starting tasktracker, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-tasktracker-hadoop37.out
> 10.64.147.37: starting tasktracker, logging to
> /home/ngc/hadoop-0.20.2/bin/../logs/hadoop-ngc-tasktracker-hadoop36.out
> 10.64.147.3: Error initializing ChukwaClient with list of currently
> registered adaptors, clearing our local list of adaptors
> 10.64.147.3: log4j:ERROR cleanUpRegex == null ||
> !cleanUpRegex.contains("$fileName")
> ...

