incubator-chukwa-user mailing list archives

From Ariel Rabkin <asrab...@gmail.com>
Subject Re: start-data-processors.sh
Date Thu, 28 Jan 2010 19:58:14 GMT
We don't use demux at my site, so I'd love to have Eric or Jerome jump
in here.  But that said:

I believe the typical way to set this up is to have conf/chukwa-env.sh
define HADOOP_CONF_DIR; the filesystem is then specified via the
Hadoop configuration (fs.default.name).  You shouldn't need to change
chukwa-demux-conf.xml.
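
As a rough sketch of that setup: the path below is a placeholder, not a
real default, so substitute the directory that holds your cluster's
Hadoop configuration files.  The Hadoop configuration found there
should set fs.default.name to an hdfs:// URI for your namenode.  If
that property is not picked up, Hadoop falls back to the local
filesystem, which would account for the file:/ path in the exception.

```shell
# conf/chukwa-env.sh (fragment)
# Placeholder path (an assumption): use the directory containing your
# cluster's Hadoop configuration files.
export HADOOP_CONF_DIR="/path/to/hadoop/conf"
```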

In re processSinkFiles -- What version of Chukwa are you using?  In
Chukwa 0.3, the only formal release we've done so far, there's no
processSinkFiles.sh, and the line in start-data-processors that
references it has been commented out.  You don't need it; references
to it are a historical artifact that should go away in the next
release.

--Ari

On Thu, Jan 28, 2010 at 11:15 AM, Corbin Hoenes <corbin@tynt.com> wrote:
> I'm having some difficulty with the demux part of setting up chukwa.  I assume I am
> supposed to run the start-data-processors.sh script to start up all the map reduce jobs that
> handle demux and archiving.
>
> My goal is to pull the logs we are collecting out of the sink files and into something
> we can start to run our pig scripts on.
>
> When I run start-data-processors it gives me this though:
>
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/chukwa/demuxProcessing/mrInput
>        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
>        at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:44)
>        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
>        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:851)
>        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:822)
>        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:771)
>        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1290)
>        at org.apache.hadoop.chukwa.extraction.demux.Demux.run(Demux.java:192)
>
> This suggests I need to configure it to connect to HDFS rather than file:/
>
> Only docs I've found are here: http://hadoop.apache.org/chukwa/docs/current/admin.html
> Is there a guide to configuring chukwa-demux-conf.xml?
>
> I also noticed start-data-processors.sh tries to start processSinkFiles.sh which doesn't
> exist for me--do I need to get this script?
>
>
>



-- 
Ari Rabkin asrabkin@gmail.com
UC Berkeley Computer Science Department
