incubator-chukwa-user mailing list archives

From Kirk True <k...@mustardgrain.com>
Subject Re: Chukwa can't find Demux class - POSSIBLE FIX
Date Thu, 20 May 2010 15:13:56 GMT
Hi Eric,

I've added CHUKWA-488 to track this issue. I hope to have a patch by EOD 
that fixes it (for me). If needed, it can be cleaned up before committing.
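
For anyone following along, the change I'm planning is (roughly) to add
each parser JAR to the job classpath using only the path component of the
URI, so a fully-qualified hdfs://host:9000/... path from listStatus doesn't
leak into mapred.job.classpath.files. A minimal sketch against the Hadoop
0.20 API; the surrounding Demux code is paraphrased, and "conf" stands in
for the job's Configuration:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    FileSystem fs = FileSystem.get(conf);
    for (FileStatus status : fs.listStatus(new Path("/chukwa/demux"))) {
      // toUri().getPath() keeps only the path component, e.g.
      // hdfs://host:9000/chukwa/demux/mydemux.jar -> /chukwa/demux/mydemux.jar
      Path jarPath = new Path(status.getPath().toUri().getPath());
      DistributedCache.addFileToClassPath(jarPath, conf);
    }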

Thanks,
Kirk

On 5/19/10 6:51 PM, Eric Yang wrote:
> Kirk,
>
> Yes, it should be trivial to filter fs.default.name in Demux.java.  Please
> file a jira.  Thanks
>
> Regards,
> Eric
>
> On 5/19/10 6:01 PM, "Kirk True"<kirk@mustardgrain.com>  wrote:
>
>    
>> Hi Eric,
>>
>> On 4/29/10 9:55 AM, Eric Yang wrote:
>>      
>>> Kirk,
>>>
>>> Is your tasktracker node on the same machine?  If it's referring to
>>> hdfs://localhost:9000, it means that your tasktracker will attempt to
>>> contact localhost as the namenode.  Make sure your fs.default.name is
>>> configured with your real hostname instead of localhost, to prevent
>>> unexpected corner cases similar to this one.
>>>
>>>        
>> I grabbed the latest from SVN and still see this problem :( I'm no
>> longer specifying the HDFS URL in Chukwa, as CHUKWA-460 no longer
>> requires it. I updated $HADOOP_HOME/conf/core-site.xml to specify the
>> actual host name (both full and short forms), and it still leaves the
>> "hdfs://host.example.com" prefix in the classpath properties that Hadoop
>> is using. According to "Pro Hadoop" (as mentioned previously in this
>> email thread), the DistributedCache API wants the Path object to be
>> "/chukwa/demux/mydemux.jar", not
>> "hdfs://host.example.com:9000/chukwa/demux/mydemux.jar".
>>
>> Would it be possible to (somehow) grab the value of the
>> "fs.default.name" property in Demux.java and strip it off the path
>> before calling the DistributedCache API?
>>
>> Thanks,
>> Kirk
>>
>>      
>>> Regards,
>>> Eric
>>>
>>> On 4/29/10 9:46 AM, "Eric Yang"<eyang@yahoo-inc.com>   wrote:
>>>
>>>
>>>        
>>>> Kirk,
>>>>
>>>> On my system, it is returning /chukwa/demux/parsers.jar as the URL.  I
>>>> think it's best to fix this at the code level.  Please file a jira, and
>>>> I will take care of this.  Thanks.
>>>>
>>>> Regards,
>>>> Eric
>>>>
>>>> On 4/28/10 6:50 PM, "Kirk True"<kirk@mustardgrain.com>   wrote:
>>>>
>>>>
>>>>          
>>>>> Hi Eric,
>>>>>
>>>>> If I grep "hdfs://" in $CHUKWA_HOME/conf, the string shows up in two
>>>>> places: one is in the README, and the other is in
>>>>> chukwa-collector-conf.xml for the writer.hdfs.filesystem property. I
>>>>> didn't change this file, so that should be the default.
>>>>> chukwa-common.xml's chukwa.data.dir is still just "/chukwa".
>>>>>
>>>>> Thanks,
>>>>> Kirk
>>>>>
>>>>> On 4/28/10 6:34 PM, Eric Yang wrote:
>>>>>
>>>>>            
>>>>>> Hi Kirk,
>>>>>>
>>>>>> Check chukwa-common.xml and make sure that chukwa.data.dir does not
>>>>>> have hdfs://localhost:9000 prepended to it.  It's best to leave the
>>>>>> namenode address out of this path for portability.
>>>>>>
>>>>>> Regards,
>>>>>> Eric
>>>>>>
>>>>>>
>>>>>> On 4/28/10 6:19 PM, "Kirk True"<kirk@mustardgrain.com>   wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>              
>>>>>>> Hi all,
>>>>>>>
>>>>>>> The problem seems to stem from the fact that the call to
>>>>>>> DistributedCache.addFileToClassPath is passing in a Path that is in
>>>>>>> URI form, i.e. hdfs://localhost:9000/chukwa/demux/mydemux.jar,
>>>>>>> whereas the DistributedCache API expects it to be a filesystem-based
>>>>>>> path (i.e. /chukwa/demux/mydemux.jar). I'm not sure why, but the
>>>>>>> FileStatus object returned by FileSystem.listStatus is returning a
>>>>>>> URL-based path instead of a filesystem-based path.
>>>>>>>
>>>>>>> I kludged the Demux class's addParsers to strip the
>>>>>>> "hdfs://localhost:9000" portion of the string, and now my class is
>>>>>>> found.
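
Roughly, the kludge amounts to something like the following; this is a
paraphrase rather than the actual patch, with "status" and "conf" standing
in for the FileStatus and job Configuration in scope:

    // Strip the hardcoded scheme/authority so DistributedCache sees a
    // filesystem-based path rather than a full hdfs:// URI.
    String jar = status.getPath().toString();
    String prefix = "hdfs://localhost:9000";
    if (jar.startsWith(prefix)) {
      jar = jar.substring(prefix.length());
    }
    DistributedCache.addFileToClassPath(new Path(jar), conf);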
>>>>>>>
>>>>>>> It's frustrating when stuff silently fails :) I even turned up the
>>>>>>> logging in Hadoop and Chukwa to TRACE, and nothing was reported.
>>>>>>>
>>>>>>> So, my question is: do I have something misconfigured that causes
>>>>>>> FileSystem.listStatus to return a URL-based path? Or does the code
>>>>>>> need to be changed?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Kirk
>>>>>>>
>>>>>>> On 4/28/10 5:41 PM, Kirk True wrote:
>>>>>>>
>>>>>>>
>>>>>>>                
>>>>>>>>    Hi all,
>>>>>>>>
>>>>>>>> Just for grins, I copied the Java source byte-for-byte to the
>>>>>>>> Chukwa source folder and then ran:
>>>>>>>>
>>>>>>>>
>>>>>>>>> ant clean main && cp build/*.jar .
>>>>>>>>
>>>>>>>> And it worked, as expected.
>>>>>>>>
>>>>>>>> When one adds custom demux classes to a JAR and sticks it in
>>>>>>>> hdfs://localhost:9000/chukwa/demux/mydemux.jar, is that JAR somehow
>>>>>>>> magically merged with chukwa-core-0.4.0.jar to produce "job.jar",
>>>>>>>> or do they remain separate?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Kirk
>>>>>>>>
>>>>>>>> On 4/28/10 5:09 PM, Kirk True wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>                  
>>>>>>>>>     Hi Jerome,
>>>>>>>>>
>>>>>>>>> Yes, they're all using $JAVA_HOME which is 1.6.0_18.
>>>>>>>>>
>>>>>>>>> I did notice that the JAVA_PLATFORM environment variable in
>>>>>>>>> chukwa-env.sh was set to 32-bit while Hadoop was defaulting to
>>>>>>>>> 64-bit (this is a 64-bit machine), but setting it to
>>>>>>>>> Linux-amd64-64 didn't make any difference.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Kirk
>>>>>>>>>
>>>>>>>>> On 4/28/10 4:00 PM, Jerome Boulon wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                    
>>>>>>>>>> Are you using the same version of Java for your jar and for
>>>>>>>>>> Hadoop?
>>>>>>>>>> /Jerome.
>>>>>>>>>>
>>>>>>>>>> On 4/28/10 3:33 PM, "Kirk True" <kirk@mustardgrain.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Eric,
>>>>>>>>>>>
>>>>>>>>>>> I added these to Hadoop's mapred-site.xml:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>     <property>
>>>>>>>>>>>           <name>keep.failed.task.files</name>
>>>>>>>>>>>           <value>true</value>
>>>>>>>>>>>     </property>
>>>>>>>>>>>     <property>
>>>>>>>>>>>           <name>mapred.job.tracker.persist.jobstatus.active</name>
>>>>>>>>>>>           <value>true</value>
>>>>>>>>>>>     </property>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This seems to have caused the task tracker directory to stick
>>>>>>>>>>> around after the job is complete. So, for example, I have this
>>>>>>>>>>> directory:
>>>>>>>>>>>
>>>>>>>>>>> /tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281519_0001
>>>>>>>>>>>
>>>>>>>>>>> Under this directory I have the following files:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> jars/
>>>>>>>>>>> job.jar
>>>>>>>>>>> org/ . . .
>>>>>>>>>>> job.xml
>>>>>>>>>>>
>>>>>>>>>>> My Demux (XmlBasedDemux) doesn't appear in the job.jar or in the
>>>>>>>>>>> (apparently exploded job.jar) jars/org/... directory. However,
>>>>>>>>>>> my demux JAR appears in three places in the job.xml:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    <property>
>>>>>>>>>>>       <name>mapred.job.classpath.files</name>
>>>>>>>>>>>       <value>hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar</value>
>>>>>>>>>>>    </property>
>>>>>>>>>>>    <property>
>>>>>>>>>>>       <name>mapred.jar</name>
>>>>>>>>>>>       <value>/tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281519_0001/jars/job.jar</value>
>>>>>>>>>>>    </property>
>>>>>>>>>>>    <property>
>>>>>>>>>>>       <name>mapred.cache.files</name>
>>>>>>>>>>>       <value>hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar</value>
>>>>>>>>>>>    </property>
>>>>>>>>>>>
>>>>>>>>>>> So it looks like the call from Demux.addParsers to
>>>>>>>>>>> DistributedCache.addFileToClassPath is working, as the above job
>>>>>>>>>>> conf properties include my JAR.
>>>>>>>>>>>
>>>>>>>>>>> Here's my JAR contents:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    [kirk@skinner data-collection]$ unzip -l data-collection-demux/target/data-collection-demux-0.1.jar
>>>>>>>>>>> Archive:  data-collection-demux/target/data-collection-demux-0.1.jar
>>>>>>>>>>>   Length     Date   Time    Name
>>>>>>>>>>>  --------    ----   ----    ----
>>>>>>>>>>>         0  04-28-10 15:19   META-INF/
>>>>>>>>>>>       123  04-28-10 15:19   META-INF/MANIFEST.MF
>>>>>>>>>>>         0  04-28-10 15:19   org/
>>>>>>>>>>>         0  04-28-10 15:19   org/apache/
>>>>>>>>>>>         0  04-28-10 15:19   org/apache/hadoop/
>>>>>>>>>>>         0  04-28-10 15:19   org/apache/hadoop/chukwa/
>>>>>>>>>>>         0  04-28-10 15:19   org/apache/hadoop/chukwa/extraction/
>>>>>>>>>>>         0  04-28-10 15:19   org/apache/hadoop/chukwa/extraction/demux/
>>>>>>>>>>>         0  04-28-10 15:19   org/apache/hadoop/chukwa/extraction/demux/processor/
>>>>>>>>>>>         0  04-28-10 15:19   org/apache/hadoop/chukwa/extraction/demux/processor/mapper/
>>>>>>>>>>>      1697  04-28-10 15:19   org/apache/hadoop/chukwa/extraction/demux/processor/mapper/XmlBasedDemux.class
>>>>>>>>>>>         0  04-28-10 15:19   META-INF/maven/
>>>>>>>>>>>         0  04-28-10 15:19   META-INF/maven/com.cisco.flip.datacollection/
>>>>>>>>>>>         0  04-28-10 15:19   META-INF/maven/com.cisco.flip.datacollection/data-collection-demux/
>>>>>>>>>>>      1448  04-28-10 00:23   META-INF/maven/com.cisco.flip.datacollection/data-collection-demux/pom.xml
>>>>>>>>>>>       133  04-28-10 15:19   META-INF/maven/com.cisco.flip.datacollection/data-collection-demux/pom.properties
>>>>>>>>>>>  --------                   -------
>>>>>>>>>>>      3401                   16 files
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Here's how I'm copying the JAR into HDFS:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    hadoop fs -mkdir /chukwa/demux
>>>>>>>>>>> hadoop fs -copyFromLocal /path/to/data-collection-demux-0.1.jar
>>>>>>>>>>> /chukwa/demux
>>>>>>>>>>>
>>>>>>>>>>> Any ideas of more things to try?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Kirk
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 28 Apr 2010 14:48 -0700, "Eric Yang" <eyang@yahoo-inc.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>                        
>>>>>>>>>>>> Kirk,
>>>>>>>>>>>>
>>>>>>>>>>>> The shell script and job-related information are stored temporarily in
>>>>>>>>>>>> file:/tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281320_0xxx/
>>>>>>>>>>>> while the job is running.
>>>>>>>>>>>>
>>>>>>>>>>>> You should go into the jars directory and find out if the
>>>>>>>>>>>> compressed jar contains your class file.
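
For example, something along these lines (using the job directory from
earlier in this thread; the job ID will differ on each run):

    jar tf /tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281320_0001/jars/job.jar | grep XmlBasedDemux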
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Eric
>>>>>>>>>>>>
>>>>>>>>>>>> On 4/28/10 1:57 PM, "Kirk True" <kirk@mustardgrain.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>                          
>>>>>>>>>>>>> Hi Eric,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I updated MapProcessorFactory.getProcessor to dump the URLs
>>>>>>>>>>>>> from the URLClassLoader obtained from
>>>>>>>>>>>>> MapProcessorFactory.class. This is what I see:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/conf/
>>>>>>>>>>>>> file:/home/kirk/bin/jdk1.6.0_18/lib/tools.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/hadoop-0.20.2-core.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/commons-cli-1.2.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/commons-codec-1.3.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/commons-el-1.0.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/commons-httpclient-3.0.1.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/commons-logging-1.0.4.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/commons-logging-api-1.0.4.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/commons-net-1.4.1.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/core-3.1.1.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/hsqldb-1.8.0.10.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/jasper-compiler-5.5.12.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/jasper-runtime-5.5.12.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/jets3t-0.6.1.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/jetty-6.1.14.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/jetty-util-6.1.14.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/junit-3.8.1.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/kfs-0.2.2.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/log4j-1.2.15.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/mockito-all-1.8.0.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/oro-2.0.8.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/servlet-api-2.5-6.1.14.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/slf4j-api-1.4.3.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/slf4j-log4j12-1.4.3.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/xmlenc-0.52.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/jsp-2.1/jsp-2.1.jar
>>>>>>>>>>>>> file:/home/kirk/bin/hadoop-0.20.2/lib/jsp-2.1/jsp-api-2.1.jar
>>>>>>>>>>>>>
>>>>>>>>>>>>> file:/tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281320_0001/attempt_201004281320_0001_m_000000_0/work/
>>>>>>>>>>>>> file:/tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281320_0001/jars/classes
>>>>>>>>>>>>> file:/tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281320_0001/jars/
>>>>>>>>>>>>> file:/tmp/hadoop-kirk/mapred/local/taskTracker/jobcache/job_201004281320_0001/attempt_201004281320_0001_m_000000_0/work/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is that the expected classpath? I don't see any reference to
>>>>>>>>>>>>> my JAR or to the Chukwa JARs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, when I try to view the contents of my
>>>>>>>>>>>>> "job_<timestamp>_0001" directory, it's automatically removed,
>>>>>>>>>>>>> so I can't really do any forensics after the fact. I know this
>>>>>>>>>>>>> is probably a Hadoop question, but is it possible to prevent
>>>>>>>>>>>>> that auto-removal from occurring?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Kirk
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, 28 Apr 2010 13:16 -0700, "Kirk True" <kirk@mustardgrain.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Eric,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4/28/10 10:23 AM, Eric Yang wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Kirk,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is the ownership of the jar file set up correctly as the user
>>>>>>>>>>>>> that runs demux?
>>>>>>>>>>>>>
>>>>>>>>>>>>> When browsing via the NameNode web UI, it lists permissions of
>>>>>>>>>>>>> "rw-r--r--" and "kirk" as the owner (which is also the user ID
>>>>>>>>>>>>> running the Hadoop and Chukwa processes).
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> You may find more information by looking at a running mapper
>>>>>>>>>>>>> or reducer task, and trying to find the task attempt shell
>>>>>>>>>>>>> script.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Where is the task attempt shell script located?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Make sure the files are downloaded correctly from the
>>>>>>>>>>>>> distributed cache and are referenced in the locally generated
>>>>>>>>>>>>> jar file.  Hope this helps.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry for asking such basic questions, but where is the
>>>>>>>>>>>>> locally generated JAR file found? I'm assuming under
>>>>>>>>>>>>> /tmp/hadoop-<user>, by default? I saw one file named
>>>>>>>>>>>>> job_<timestamp>.jar, but it appeared to be a byte-for-byte
>>>>>>>>>>>>> copy of chukwa-core-0.4.0.jar, i.e. my "XmlBasedDemux" class
>>>>>>>>>>>>> was nowhere to be found.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Kirk
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Eric
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4/28/10 9:37 AM, "Kirk True" <kirk@mustardgrain.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi guys,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have a custom Demux that I need to run to process my input,
>>>>>>>>>>>>> but I'm getting a ClassNotFoundException when running in
>>>>>>>>>>>>> Hadoop. This is with the released 0.4.0 build.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've done the following:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. I put my Demux class in the correct package
>>>>>>>>>>>>>    (org.apache.hadoop.chukwa.extraction.demux.processor.mapper)
>>>>>>>>>>>>> 2. I've added the JAR containing the Demux implementation to
>>>>>>>>>>>>>    HDFS at /chukwa/demux
>>>>>>>>>>>>> 3. I've added an alias to it in chukwa-demux-conf.xml (a
>>>>>>>>>>>>>    sketch of such an entry follows below)
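
For readers hitting the same problem: the alias is a property in
chukwa-demux-conf.xml that maps a data type name to the processor class. A
sketch, using "XmlData" as a made-up data type name (check your own conf
for the exact form):

    <property>
      <name>XmlData</name>
      <value>org.apache.hadoop.chukwa.extraction.demux.processor.mapper.XmlBasedDemux</value>
      <description>Hypothetical alias: route XmlData chunks to XmlBasedDemux</description>
    </property>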
>>>>>>>>>>>>>
>>>>>>>>>>>>> The map/reduce job is picking up on the fact that I have a
>>>>>>>>>>>>> custom Demux and is trying to load it, but I get a
>>>>>>>>>>>>> ClassNotFoundException. The HDFS-based URL to the JAR is
>>>>>>>>>>>>> showing up in the job configuration in Hadoop, which is
>>>>>>>>>>>>> further evidence that Chukwa and Hadoop know where the JAR
>>>>>>>>>>>>> lives and that it's part of the Chukwa-initiated job.
>>>>>>>>>>>>>
>>>>>>>>>>>>> My Demux is very simple. I've stripped it down to a
>>>>>>>>>>>>> System.out.println, with no dependencies on classes/JARs other
>>>>>>>>>>>>> than Chukwa, Hadoop, and the core JDK. I've double-checked
>>>>>>>>>>>>> that my JAR is being built correctly. I'm completely flummoxed
>>>>>>>>>>>>> as to what I'm doing wrong.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Any ideas what I'm missing? What other information can I provide?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>> Kirk
