incubator-chukwa-user mailing list archives

From Jerome Boulon <jbou...@netflix.com>
Subject Re: Chukwa can't find Demux class
Date Wed, 28 Apr 2010 23:00:01 GMT
Are you using the same version of Java for your jar and Hadoop?
/Jerome.
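
One way to answer this question is to read the class-file version stamped into the compiled class and compare it with what the task JVM supports. The following is a minimal sketch, not part of Chukwa or Hadoop: `ClassVersionCheck` and its `majorVersion` helper are hypothetical names, and the jar path and entry name are supplied as arguments. It relies only on the standard class-file layout (magic number `0xCAFEBABE`, then minor and major version; major version 50 corresponds to Java 6).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;

// Hypothetical helper, not part of Chukwa or Hadoop.
public class ClassVersionCheck {

    /** Returns the class-file major version from the first 8 bytes of a .class file. */
    static int majorVersion(byte[] header) {
        // ByteBuffer is big-endian by default, which matches the class-file format.
        if (header.length < 8 || ByteBuffer.wrap(header).getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        ByteBuffer buf = ByteBuffer.wrap(header);
        buf.getInt();                   // magic number
        buf.getShort();                 // minor version (ignored here)
        return buf.getShort() & 0xFFFF; // major version, read as unsigned
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the jar, args[1]: entry name of the .class file inside it
        byte[] header = new byte[8];
        try (JarFile jar = new JarFile(args[0])) {
            ZipEntry entry = jar.getEntry(args[1]);
            if (entry == null) {
                throw new IOException("entry not found: " + args[1]);
            }
            try (DataInputStream in = new DataInputStream(jar.getInputStream(entry))) {
                in.readFully(header);
            }
        }
        System.out.println("class major version: " + majorVersion(header));
        System.out.println("running JVM class version: "
                + System.getProperty("java.class.version"));
    }
}
```

Running it against data-collection-demux-0.1.jar and the XmlBasedDemux.class entry, with the same JVM that runs the TaskTracker, would show whether the two agree. A class compiled for a newer JVM typically fails with UnsupportedClassVersionError rather than ClassNotFoundException, so this check may only rule the cause out.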

On 4/28/10 3:33 PM, "Kirk True" <kirk@mustardgrain.com> wrote:

> Hi Eric,
>  
> I added these to Hadoop's mapred-site.xml:
>  
>  
>   <property>
>         <name>keep.failed.task.files</name>
>         <value>true</value>
>   </property>
>   <property>
>         <name>mapred.job.tracker.persist.jobstatus.active</name>
>         <value>true</value>
>   </property>
>  
>  
> This seems to have caused the task tracker directory to stick around after the
> job is complete. So, for example, I have this directory:
>  
>  
> /tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281519_0001
>  
>  
> Under this directory I have the following files:
>  
>  
> jars/
> job.jar
> org/ . . .
> job.xml
>  
> My Demux (XmlBasedDemux) doesn't appear in the job.jar or the (apparently
> exploded job.jar) jars/org/... directory. However, my demux JAR appears in
> three places in the job.xml:
>  
>  
> <property>
>     <name>mapred.job.classpath.files</name>
>     <value>hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar</value>
> </property>
> <property>
>     <name>mapred.jar</name>
>     <value>/tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281519_0001/jars/job.jar</value>
> </property>
> <property>
>     <name>mapred.cache.files</name>
>     <value>hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar</value>
> </property>
>  
>  
> So it looks like when Demux.addParsers calls
> DistributedCache.addFileToClassPath it's working as the above job conf
> properties include my JAR.
>  
> Here's my JAR contents:
>  
>  
> [kirk@skinner data-collection]$ unzip -l data-collection-demux/target/data-collection-demux-0.1.jar
> Archive:  data-collection-demux/target/data-collection-demux-0.1.jar
>   Length     Date   Time    Name
>  --------    ----   ----    ----
>         0  04-28-10 15:19   META-INF/
>       123  04-28-10 15:19   META-INF/MANIFEST.MF
>         0  04-28-10 15:19   org/
>         0  04-28-10 15:19   org/apache/
>         0  04-28-10 15:19   org/apache/hadoop/
>         0  04-28-10 15:19   org/apache/hadoop/chukwa/
>         0  04-28-10 15:19   org/apache/hadoop/chukwa/extraction/
>         0  04-28-10 15:19   org/apache/hadoop/chukwa/extraction/demux/
>         0  04-28-10 15:19   org/apache/hadoop/chukwa/extraction/demux/processor/
>         0  04-28-10 15:19   org/apache/hadoop/chukwa/extraction/demux/processor/mapper/
>      1697  04-28-10 15:19   org/apache/hadoop/chukwa/extraction/demux/processor/mapper/XmlBasedDemux.class
>         0  04-28-10 15:19   META-INF/maven/
>         0  04-28-10 15:19   META-INF/maven/com.cisco.flip.datacollection/
>         0  04-28-10 15:19   META-INF/maven/com.cisco.flip.datacollection/data-collection-demux/
>      1448  04-28-10 00:23   META-INF/maven/com.cisco.flip.datacollection/data-collection-demux/pom.xml
>       133  04-28-10 15:19   META-INF/maven/com.cisco.flip.datacollection/data-collection-demux/pom.properties
>  --------                   -------
>      3401                   16 files
>  
>  
> Here's how I'm copying the JAR into HDFS:
>  
>  
> hadoop fs -mkdir /chukwa/demux
> hadoop fs -copyFromLocal /path/to/data-collection-demux-0.1.jar /chukwa/demux
>  
> Any ideas of more things to try?
>  
> Thanks,
> Kirk
>  
>  
> On Wed, 28 Apr 2010 14:48 -0700, "Eric Yang" <eyang@yahoo-inc.com> wrote:
>> > Kirk,
>> >
>> > The shell script and job-related information are stored temporarily in
>> > file:/tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281320_0xxx/
>> > while the job is running.
>> >
>> > You should go into the jars directory and find out if the compressed jar
>> > contains your class file.
>> >
>> > Regards,
>> > Eric
>> >
>> > On 4/28/10 1:57 PM, "Kirk True" <kirk@mustardgrain.com> wrote:
>> >
>>> > > Hi Eric,
>>> > >
>>> > > I updated MapProcessorFactory.getProcessor to dump the URLs from the
>>> > > URLClassLoader from the MapProcessorFactory.class. This is what I see:
>>> > >
>>> > >
>>> > > file:/home/kirk/bin/hadoop-0.20.2/conf/
>>> > > file:/home/kirk/bin/jdk1.6.0_18/lib/tools.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/
>>> > > file:/home/kirk/bin/hadoop-0.20.2/hadoop-0.20.2-core.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/commons-cli-1.2.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/commons-codec-1.3.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/commons-el-1.0.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/commons-httpclient-3.0.1.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/commons-logging-1.0.4.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/commons-logging-api-1.0.4.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/commons-net-1.4.1.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/core-3.1.1.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/hsqldb-1.8.0.10.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/jasper-compiler-5.5.12.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/jasper-runtime-5.5.12.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/jets3t-0.6.1.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/jetty-6.1.14.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/jetty-util-6.1.14.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/junit-3.8.1.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/kfs-0.2.2.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/log4j-1.2.15.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/mockito-all-1.8.0.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/oro-2.0.8.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/servlet-api-2.5-6.1.14.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/slf4j-api-1.4.3.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/slf4j-log4j12-1.4.3.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/xmlenc-0.52.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/jsp-2.1/jsp-2.1.jar
>>> > > file:/home/kirk/bin/hadoop-0.20.2/lib/jsp-2.1/jsp-api-2.1.jar
>>> > > file:/tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281320_0001/attempt_201004281320_0001_m_000000_0/work/
>>> > > file:/tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281320_0001/jars/classes
>>> > > file:/tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281320_0001/jars/
>>> > > file:/tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281320_0001/attempt_201004281320_0001_m_000000_0/work/
>>> > >
>>> > >
>>> > > Is that the expected classpath? I don't see any reference to my JAR or the
>>> > > Chukwa JARs.
>>> > >
>>> > > Also, when I try to view the contents of my "job_<timestamp>_0001" directory,
>>> > > it's automatically removed, so I can't really do any forensics after the fact.
>>> > > I know this is probably a Hadoop question, but is it possible to prevent that
>>> > > auto-removal from occurring?
>>> > >
>>> > > Thanks,
>>> > > Kirk
>>> > >
>>> > > On Wed, 28 Apr 2010 13:16 -0700, "Kirk True" <kirk@mustardgrain.com> wrote:
>>>> > >> Hi Eric,
>>>> > >>
>>>> > >> On 4/28/10 10:23 AM, Eric Yang wrote:
>>>>> > >>> Hi Kirk,
>>>>> > >>>
>>>>> > >>> Is the ownership of the jar file set up correctly as the user that runs
>>>>> > >>> demux?
>>>> > >>
>>>> > >> When browsing via the NameNode web UI, it lists permissions of
>>>> > >> "rw-r--r--" and "kirk" as the owner (which is also the user ID running
>>>> > >> the Hadoop and Chukwa processes).
>>>> > >>
>>>>> > >>>    You may find more information by looking at a running mapper task or
>>>>> > >>> reducer task, and trying to find the task attempt shell script.
>>>> > >>
>>>> > >> Where is the task attempt shell script located?
>>>> > >>
>>>>> > >>>    Make sure
>>>>> > >>> the files are downloaded correctly from distributed cache, and referenced in
>>>>> > >>> the locally generated jar file.  Hope this helps.
>>>>> > >>>
>>>> > >>
>>>> > >> Sorry for asking such basic questions, but where is the locally
>>>> > >> generated JAR file found? I'm assuming under /tmp/hadoop-<user>, by
>>>> > >> default? I saw one file named job_<timestamp>.jar but it appeared to be a
>>>> > >> byte-for-byte copy of chukwa-core-0.4.0.jar, i.e. my "XmlBasedDemux"
>>>> > >> class was nowhere to be found.
>>>> > >>
>>>> > >> Thanks,
>>>> > >> Kirk
>>>> > >>
>>>>> > >>> Regards,
>>>>> > >>> Eric
>>>>> > >>>
>>>>> > >>> On 4/28/10 9:37 AM, "Kirk True" <kirk@mustardgrain.com> wrote:
>>>>> > >>>
>>>>> > >>>
>>>>>> > >>>> Hi guys,
>>>>>> > >>>>
>>>>>> > >>>> I have a custom Demux that I need to run to process my input, but I'm
>>>>>> > >>>> getting a ClassNotFoundException when running in Hadoop. This is with the
>>>>>> > >>>> released 0.4.0 build.
>>>>>> > >>>>
>>>>>> > >>>> I've done the following:
>>>>>> > >>>>
>>>>>> > >>>> 1. I put my Demux class in the correct package
>>>>>> > >>>> (org.apache.hadoop.chukwa.extraction.demux.processor.mapper)
>>>>>> > >>>> 2. I've added the JAR containing the Demux implementation to HDFS at
>>>>>> > >>>> /chukwa/demux
>>>>>> > >>>> 3. I've added an alias to it in chukwa-demux-conf.xml
>>>>>> > >>>>
>>>>>> > >>>> The map/reduce job is picking up on the fact that I have a custom Demux and
>>>>>> > >>>> is trying to load it, but I get a ClassNotFoundException. The HDFS-based URL
>>>>>> > >>>> to the JAR is showing up in the job configuration in Hadoop, which is further
>>>>>> > >>>> evidence that Chukwa and Hadoop know where the JAR lives and that it's part
>>>>>> > >>>> of the Chukwa-initiated job.
>>>>>> > >>>>
>>>>>> > >>>> My Demux is very simple. I've stripped it down to a System.out.println with
>>>>>> > >>>> dependencies on no classes/JARs other than Chukwa, Hadoop, and the core
>>>>>> > >>>> JDK. I've double-checked that my JAR is being built correctly. I'm
>>>>>> > >>>> completely flummoxed as to what I'm doing wrong.
>>>>>> > >>>>
>>>>>> > >>>> Any ideas what I'm missing? What other information can I provide?
>>>>>> > >>>>
>>>>>> > >>>> Thanks!
>>>>>> > >>>> Kirk
>>>>>> > >>>>
>>>>>> > >>>>
>>>>> > >>>
>>>> > >>
>>> > >
>>> > >
>> >
>> >
>  
> 
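
The classpath dump described in the thread (printing the URLs of the URLClassLoader that loaded MapProcessorFactory) could be sketched roughly as follows. This is an assumption about the approach, not the code that was actually added to Chukwa; `ClassLoaderDump` and `urlsOf` are hypothetical names. The sketch walks up the parent chain and collects URLs from every loader that is a `URLClassLoader`. On Java 9 and later the application class loader is not a `URLClassLoader`, so the list may be empty there.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper, not part of Chukwa or Hadoop.
public class ClassLoaderDump {

    /** Collects the search URLs of the given loader and all of its parents. */
    static List<URL> urlsOf(ClassLoader loader) {
        List<URL> urls = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            // Only URLClassLoader exposes its search path; other loaders are skipped.
            if (cl instanceof URLClassLoader) {
                Collections.addAll(urls, ((URLClassLoader) cl).getURLs());
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // Inside a task, the class passed here would be the one doing the lookup.
        for (URL url : urlsOf(ClassLoaderDump.class.getClassLoader())) {
            System.out.println(url);
        }
    }
}
```

If the custom demux JAR is missing from this list inside the task JVM, the class cannot be resolved by that loader, whatever the job configuration says.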

