accumulo-user mailing list archives

From David Medinets <david.medin...@gmail.com>
Subject Re: importdirectory in accumulo
Date Fri, 05 Apr 2013 22:01:47 GMT
I ran into this issue. Look in your log files for a directory-not-found
exception that is not bubbled up to the bash shell.
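For anyone hitting the same thing, a minimal sketch of that log check. The /tmp path and the log line below are fabricated for illustration; against a real install you would point the grep at your Accumulo logs directory (e.g. /opt/accumulo/logs):

```shell
# Illustrative only: fabricate a log containing the kind of swallowed error
# David describes, then grep for it the way you would against the real logs.
mkdir -p /tmp/accumulo-logs-demo
printf 'ERROR java.io.FileNotFoundException: bulk import directory does not exist\n' \
  > /tmp/accumulo-logs-demo/master_node.debug.log
grep -r "FileNotFoundException" /tmp/accumulo-logs-demo
```

The same `grep -r` against the real logs directory should surface the exception the shell never showed.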
On Apr 5, 2013 11:37 AM, "Aji Janis" <aji1705@gmail.com> wrote:

> I agree with you that changing HADOOP_CLASSPATH as you described should be
> done. I just couldn't do that yet (people have jobs running and I don't
> want to risk it).
>
> However, I found a workaround. I am going off the theory that my
> HADOOP_CLASSPATH is bad and can't pick up all the libraries I am passing
> to it, so I decided to package all the libraries I needed into one jar
> (following
> http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/).
> I downloaded the source code and made a shaded (uber) jar to include all
> the libraries I needed. Then I submitted the hadoop job with my uber jar
> like any other map reduce job. My mappers and reducers finish the job but I
> got an exception for waitForTableOperation. I think this proves my theory
> of bad classpath but clearly I have more issues to deal with. If you have
> any suggestions on how to even debug that would be awesome!
>
> My console output (with a lot of server-specific detail removed for
> security) is below. I modified BulkIngestExample.java to add some print
> statements; the modified lines are shown below as well.
>
>
> [user@nodebulk]$ /opt/hadoop/bin/hadoop jar uber-BulkIngestExample.jar
> instance zookeepers user password table inputdir tmp/bulk
>
> 13/04/05 11:20:52 INFO input.FileInputFormat: Total input paths to process
> : 1
> 13/04/05 11:20:53 INFO mapred.JobClient: Running job: job_201304021611_0045
> 13/04/05 11:20:54 INFO mapred.JobClient:  map 0% reduce 0%
> 13/04/05 11:21:10 INFO mapred.JobClient:  map 100% reduce 0%
> 13/04/05 11:21:25 INFO mapred.JobClient:  map 100% reduce 50%
> 13/04/05 11:21:26 INFO mapred.JobClient:  map 100% reduce 100%
> 13/04/05 11:21:31 INFO mapred.JobClient: Job complete:
> job_201304021611_0045
> 13/04/05 11:21:31 INFO mapred.JobClient: Counters: 25
> 13/04/05 11:21:31 INFO mapred.JobClient:   Job Counters
> 13/04/05 11:21:31 INFO mapred.JobClient:     Launched reduce tasks=2
> 13/04/05 11:21:31 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=15842
> 13/04/05 11:21:31 INFO mapred.JobClient:     Total time spent by all
> reduces waiting after reserving slots (ms)=0
> 13/04/05 11:21:31 INFO mapred.JobClient:     Total time spent by all maps
> waiting after reserving slots (ms)=0
> 13/04/05 11:21:31 INFO mapred.JobClient:     Rack-local map tasks=1
> 13/04/05 11:21:31 INFO mapred.JobClient:     Launched map tasks=1
> 13/04/05 11:21:31 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=25891
> 13/04/05 11:21:31 INFO mapred.JobClient:   File Output Format Counters
> 13/04/05 11:21:31 INFO mapred.JobClient:     Bytes Written=496
> 13/04/05 11:21:31 INFO mapred.JobClient:   FileSystemCounters
> 13/04/05 11:21:31 INFO mapred.JobClient:     FILE_BYTES_READ=312
> 13/04/05 11:21:31 INFO mapred.JobClient:     HDFS_BYTES_READ=421
> 13/04/05 11:21:31 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=68990
> 13/04/05 11:21:31 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=496
> 13/04/05 11:21:31 INFO mapred.JobClient:   File Input Format Counters
> 13/04/05 11:21:31 INFO mapred.JobClient:     Bytes Read=280
> 13/04/05 11:21:31 INFO mapred.JobClient:   Map-Reduce Framework
> 13/04/05 11:21:31 INFO mapred.JobClient:     Reduce input groups=10
> 13/04/05 11:21:31 INFO mapred.JobClient:     Map output materialized
> bytes=312
> 13/04/05 11:21:31 INFO mapred.JobClient:     Combine output records=0
> 13/04/05 11:21:31 INFO mapred.JobClient:     Map input records=10
> 13/04/05 11:21:31 INFO mapred.JobClient:     Reduce shuffle bytes=186
> 13/04/05 11:21:31 INFO mapred.JobClient:     Reduce output records=10
> 13/04/05 11:21:31 INFO mapred.JobClient:     Spilled Records=20
> 13/04/05 11:21:31 INFO mapred.JobClient:     Map output bytes=280
> 13/04/05 11:21:31 INFO mapred.JobClient:     Combine input records=0
> 13/04/05 11:21:31 INFO mapred.JobClient:     Map output records=10
> 13/04/05 11:21:31 INFO mapred.JobClient:     SPLIT_RAW_BYTES=141
> 13/04/05 11:21:31 INFO mapred.JobClient:     Reduce input records=10
>
> Here is the exception caught:
> org.apache.accumulo.core.client.AccumuloException: Internal error
> processing waitForTableOperation
>
> E.getMessage returns:
> Internal error processing waitForTableOperation
> Exception in thread "main" java.lang.RuntimeException:
> org.apache.accumulo.core.client.AccumuloException: Internal error
> processing waitForTableOperation
>         at
> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.run(BulkIngestExample.java:151)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at
> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.main(BulkIngestExample.java:166)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: org.apache.accumulo.core.client.AccumuloException: Internal
> error processing waitForTableOperation
>         at
> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:290)
>         at
> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:258)
>         at
> org.apache.accumulo.core.client.admin.TableOperationsImpl.importDirectory(TableOperationsImpl.java:945)
>         at
> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.run(BulkIngestExample.java:146)
>         ... 7 more
> Caused by: org.apache.thrift.TApplicationException: Internal error
> processing waitForTableOperation
>         at
> org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
>         at
> org.apache.accumulo.core.master.thrift.MasterClientService$Client.recv_waitForTableOperation(MasterClientService.java:684)
>         at
> org.apache.accumulo.core.master.thrift.MasterClientService$Client.waitForTableOperation(MasterClientService.java:665)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at
> org.apache.accumulo.cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:84)
>         at $Proxy5.waitForTableOperation(Unknown Source)
>         at
> org.apache.accumulo.core.client.admin.TableOperationsImpl.waitForTableOperation(TableOperationsImpl.java:230)
>         at
> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:272)
>         ... 10 more
> [user@nodebulk]$
>
>
> Modifications in BulkIngestExample.java:
>
> line 146:    connector.tableOperations().importDirectory(tableName,
>                  workDir + "/files", workDir + "/failures", false);
>     } catch (Exception e) {
>       System.out.println("\nHere is the exception caught:\n" + e);
>       System.out.println("\nE.getMessage returns:\n" + e.getMessage());
> line 151:    throw new RuntimeException(e);
>     } finally {
>       if (out != null)
>         out.close();
>
> line 166:    int res = ToolRunner.run(CachedConfiguration.getInstance(),
>                  new BulkIngestExample(), args);
>
>
> On Thu, Apr 4, 2013 at 3:51 PM, Billie Rinaldi <billie@apache.org> wrote:
>
>> On Thu, Apr 4, 2013 at 12:26 PM, Aji Janis <aji1705@gmail.com> wrote:
>>
>>> I haven't tried the classpath option yet, but I executed the command
>>> below as the hadoop user ... this seemed to be the command that accumulo
>>> was trying to execute anyway, and I would think it should have avoided
>>> the custom classpath issue... Right/Wrong?
>>>
>>
>> No, the jar needs to be both in the libjars and on the classpath.  There
>> are classes that need to be accessed on the local machine in the process of
>> submitting the MapReduce job, and that process can only see the classpath,
>> not the libjars.
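[Editor's sketch of the pattern Billie describes; the jar paths are illustrative, not a verified command line. The same jar list has to reach two places: colon-separated on HADOOP_CLASSPATH so the local submission JVM can load the classes, and comma-separated in -libjars so the jars ship with the tasks.]

```shell
# Same jars, two delimiters: classpath uses ':', -libjars uses ','.
JARS="/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/accumulo/lib/libthrift-0.6.1.jar"
export HADOOP_CLASSPATH="$JARS:$HADOOP_CLASSPATH"   # visible at submission time
LIBJARS=$(printf '%s' "$JARS" | tr ':' ',')         # shipped to the tasks
echo "$LIBJARS"
# → /opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/accumulo/lib/libthrift-0.6.1.jar
# then: hadoop jar myjob.jar com.example.MyTool -libjars "$LIBJARS" <args>
```

This is essentially what tool.sh automates when it builds both lists from the lib directory.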
>>
>> The HADOOP_CLASSPATH you have is unusual.  More often, HADOOP_CLASSPATH
>> is not set at all in hadoop-env.sh, but if it is it should generally be of
>> the form newstuff:$HADOOP_CLASSPATH to avoid this issue.
>>
>> You will have to restart Hadoop after making the change to hadoop-env.sh.
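[Editor's note: a quick demonstration of why the newstuff:$HADOOP_CLASSPATH form matters. The pre-existing jar value here is made up; prepending preserves whatever was already exported, while a plain assignment, as in the problem hadoop-env.sh, silently discards it.]

```shell
# Simulate a value that was already exported before hadoop-env.sh runs.
HADOOP_CLASSPATH="/opt/accumulo/lib/accumulo-core-1.4.2.jar"
# Recommended form: new entries in front, old value preserved at the end.
export HADOOP_CLASSPATH="./:/conf:/build/*:$HADOOP_CLASSPATH"
echo "$HADOOP_CLASSPATH"
# → ./:/conf:/build/*:/opt/accumulo/lib/accumulo-core-1.4.2.jar
```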
>>
>> Billie
>>
>>
>>
>>>
>>>
>>> Got the same error:
>>> *[hadoop@node]$ /opt/hadoop/bin/hadoop jar
>>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>> -libjars
>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>> *
>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>> org/apache/accumulo/core/client/Instance
>>>         at java.lang.Class.forName0(Native Method)
>>>         at java.lang.Class.forName(Class.java:264)
>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>> Caused by: java.lang.ClassNotFoundException:
>>> org.apache.accumulo.core.client.Instance
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>         ... 3 more
>>>
>>>
>>>
>>> On Thu, Apr 4, 2013 at 2:51 PM, Billie Rinaldi <billie@apache.org>wrote:
>>>
>>>> On Thu, Apr 4, 2013 at 11:41 AM, Aji Janis <aji1705@gmail.com> wrote:
>>>>
>>>>> *[accumulo@node accumulo]$ cat /opt/hadoop/conf/hadoop-env.sh | grep
>>>>> HADOOP_CLASSPATH*
>>>>> export HADOOP_CLASSPATH=./:/conf:/build/*:
>>>>>
>>>>
>>>> To preserve custom HADOOP_CLASSPATHs, this line should be:
>>>> export HADOOP_CLASSPATH=./:/conf:/build/*:$HADOOP_CLASSPATH
>>>>
>>>> Billie
>>>>
>>>>
>>>>
>>>>>
>>>>> looks like it is overwriting everything. Isn't this the default
>>>>> behavior? Is your hadoop-env.sh missing that line?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Apr 4, 2013 at 2:25 PM, Billie Rinaldi <billie@apache.org>wrote:
>>>>>
>>>>>> On Thu, Apr 4, 2013 at 10:27 AM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>
>>>>>>> I thought about the permissions issue too. All the accumulo stuff is
>>>>>>> under accumulo user so I started running the commands as accumulo ... only
>>>>>>> to get the same result.
>>>>>>> -The errors happen right away
>>>>>>> -the box has both accumulo and hadoop on it
>>>>>>> -the jar contains the Instance class. But note that the Instance
>>>>>>> class is part of accumulo-core and not examples-simple-1.4.2.jar .... (can
>>>>>>> this be the issue?)
>>>>>>>
>>>>>>
>>>>>> No, that isn't the issue.  tool.sh is finding the accumulo-core jar
>>>>>> and putting it on the HADOOP_CLASSPATH and in the libjars.
>>>>>>
>>>>>> I wonder if your hadoop environment is set up to override the
>>>>>> HADOOP_CLASSPATH.  Check in your hadoop-env.sh to see if HADOOP_CLASSPATH
>>>>>> is set there.
>>>>>>
>>>>>> The reason your commands of the form "tool.sh lib/*jar" aren't
>>>>>> working is that the shell wildcard matches multiple jars and puts them
>>>>>> all on the command line.  tool.sh expects at most one jar followed by a
>>>>>> class name, so whatever jar comes second when the wildcard is expanded
>>>>>> is being interpreted as a class name.
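[Editor's sketch of the wildcard problem Billie describes, using a throwaway directory and made-up jar names: the shell expands the glob before tool.sh ever runs, so the second matching jar lands in the argument slot tool.sh reads as the class name.]

```shell
# Recreate a lib/ directory with two jars matching the same glob.
mkdir -p /tmp/glob-demo/lib
touch /tmp/glob-demo/lib/a-core.jar /tmp/glob-demo/lib/b-extra.jar
cd /tmp/glob-demo
set -- lib/*.jar               # the same expansion tool.sh receives
echo "arg1=$1 arg2=$2"         # $2 is a jar, not a class name
# → arg1=lib/a-core.jar arg2=lib/b-extra.jar
```

This is exactly why the failed runs above show errors like "ClassNotFoundException: lib.accumulo-core-1.4.2-sources.jar": a jar filename ended up where a class name was expected.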
>>>>>>
>>>>>> Billie
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Commands I ran:
>>>>>>>
>>>>>>> *[accumulo@node accumulo]$ whoami*
>>>>>>> accumulo
>>>>>>> *[accumulo@node accumulo]$ ls -l*
>>>>>>> total 184
>>>>>>> drwxr-xr-x 2 accumulo accumulo  4096 Apr  4 10:25 bin
>>>>>>> -rwxr-xr-x 1 accumulo accumulo 24263 Oct 22 15:30 CHANGES
>>>>>>> drwxr-xr-x 3 accumulo accumulo  4096 Apr  3 10:17 conf
>>>>>>> drwxr-xr-x 2 accumulo accumulo  4096 Jan 15 13:35 contrib
>>>>>>> -rwxr-xr-x 1 accumulo accumulo   695 Nov 18  2011 DISCLAIMER
>>>>>>> drwxr-xr-x 5 accumulo accumulo  4096 Jan 15 13:35 docs
>>>>>>> drwxr-xr-x 4 accumulo accumulo  4096 Jan 15 13:35 lib
>>>>>>> -rwxr-xr-x 1 accumulo accumulo 56494 Mar 21  2012 LICENSE
>>>>>>> drwxr-xr-x 2 accumulo accumulo 12288 Apr  3 14:43 logs
>>>>>>> -rwxr-xr-x 1 accumulo accumulo  2085 Mar 21  2012 NOTICE
>>>>>>> -rwxr-xr-x 1 accumulo accumulo 27814 Oct 17 08:32 pom.xml
>>>>>>> -rwxr-xr-x 1 accumulo accumulo 12449 Oct 17 08:32 README
>>>>>>> drwxr-xr-x 9 accumulo accumulo  4096 Nov  8 13:40 src
>>>>>>> drwxr-xr-x 5 accumulo accumulo  4096 Nov  8 13:40 test
>>>>>>> drwxr-xr-x 2 accumulo accumulo  4096 Apr  4 09:09 walogs
>>>>>>> *[accumulo@node accumulo]$ ls bin/*
>>>>>>> accumulo           check-slaves  etc_initd_accumulo  start-all.sh
>>>>>>> start-server.sh  stop-here.sh    tdown.sh  tup.sh
>>>>>>> catapultsetup.acc  config.sh     LogForwarder.sh     start-here.sh
>>>>>>>  stop-all.sh      stop-server.sh  tool.sh   upgrade.sh
>>>>>>> *[accumulo@node accumulo]$ ls lib/*
>>>>>>> accumulo-core-1.4.2.jar            accumulo-start-1.4.2.jar
>>>>>>>  commons-collections-3.2.jar    commons-logging-1.0.4.jar
>>>>>>>  jline-0.9.94.jar
>>>>>>> accumulo-core-1.4.2-javadoc.jar    accumulo-start-1.4.2-javadoc.jar
>>>>>>>  commons-configuration-1.5.jar  commons-logging-api-1.0.4.jar
>>>>>>>  libthrift-0.6.1.jar
>>>>>>> accumulo-core-1.4.2-sources.jar    accumulo-start-1.4.2-sources.jar
>>>>>>>  commons-io-1.4.jar             examples-simple-1.4.2.jar
>>>>>>>  log4j-1.2.16.jar
>>>>>>> accumulo-server-1.4.2.jar          cloudtrace-1.4.2.jar
>>>>>>>  commons-jci-core-1.0.jar       examples-simple-1.4.2-javadoc.jar  native
>>>>>>> accumulo-server-1.4.2-javadoc.jar  cloudtrace-1.4.2-javadoc.jar
>>>>>>>  commons-jci-fam-1.0.jar        examples-simple-1.4.2-sources.jar
>>>>>>>  wikisearch-ingest-1.4.2-javadoc.jar
>>>>>>> accumulo-server-1.4.2-sources.jar  cloudtrace-1.4.2-sources.jar
>>>>>>>  commons-lang-2.4.jar           ext
>>>>>>>  wikisearch-query-1.4.2-javadoc.jar
>>>>>>>
>>>>>>> *[accumulo@node accumulo]$ jar -tf
>>>>>>> /opt/accumulo/lib/accumulo-core-1.4.2.jar | grep
>>>>>>> org/apache/accumulo/core/client/Instance*
>>>>>>> org/apache/accumulo/core/client/Instance.class
>>>>>>>
>>>>>>> *[accumulo@node accumulo]$ jar -tf
>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar | grep
>>>>>>> org/apache/accumulo/core/client/Instance*
>>>>>>> *
>>>>>>> *
>>>>>>> *[accumulo@node accumulo]$ ./bin/tool.sh lib/*[^cs].jar
>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>>> USERJARS=
>>>>>>> CLASSNAME=lib/accumulo-server-1.4.2.jar
>>>>>>>
>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar
>>>>>>> lib/accumulo-server-1.4.2.jar -libjars
>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>> lib.accumulo-server-1.4.2.jar
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>
>>>>>>> *[accumulo@node accumulo]$ ./bin/tool.sh lib/*.jar
>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>>> USERJARS=
>>>>>>> CLASSNAME=lib/accumulo-core-1.4.2-javadoc.jar
>>>>>>>
>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar
>>>>>>> lib/accumulo-core-1.4.2-javadoc.jar -libjars
>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>> lib.accumulo-core-1.4.2-javadoc.jar
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>
>>>>>>> *[accumulo@node accumulo]$ ./bin/tool.sh lib/*[^c].jar
>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>>>  USERJARS=
>>>>>>> CLASSNAME=lib/accumulo-core-1.4.2-sources.jar
>>>>>>>
>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar
>>>>>>> lib/accumulo-core-1.4.2-sources.jar -libjars
>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>> lib.accumulo-core-1.4.2-sources.jar
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>
>>>>>>> *[accumulo@node accumulo]$ ./bin/tool.sh
>>>>>>> lib/examples-simple-*[^c].jar
>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>> default node14.catapult.dev.boozallenet.com:2181 root password
>>>>>>> test_aj /user/559599/input tmp/ajbulktest*
>>>>>>> USERJARS=
>>>>>>> CLASSNAME=lib/examples-simple-1.4.2-sources.jar
>>>>>>>
>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/examples-simple-1.4.2.jar
>>>>>>> lib/examples-simple-1.4.2-sources.jar -libjars
>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>> lib.examples-simple-1.4.2-sources.jar
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>> *[accumulo@node accumulo]$*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Apr 4, 2013 at 11:55 AM, Billie Rinaldi <billie@apache.org>wrote:
>>>>>>>
>>>>>>>> On Thu, Apr 4, 2013 at 7:46 AM, Aji Janis <aji1705@gmail.com>wrote:
>>>>>>>>
>>>>>>>>> *Billie, I checked the values in tool.sh they match. I
>>>>>>>>> uncommented the echo statements and reran the cmd here is what I have:
>>>>>>>>> *
>>>>>>>>> *
>>>>>>>>> *
>>>>>>>>> *$ ./bin/tool.sh ./lib/examples-simple-1.4.2.jar
>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>> instance zookeeper usr pswd table inputdir tmp/bulk*
>>>>>>>>>
>>>>>>>>> USERJARS=
>>>>>>>>>
>>>>>>>>> CLASSNAME=org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>
>>>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>>>> exec /opt/hadoop/bin/hadoop jar ./lib/examples-simple-1.4.2.jar
>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>> -libjars
>>>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>>>>  Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>> org.apache.accumulo.core.client.Instance
>>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>>> Method)
>>>>>>>>>         at
>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>>         ... 3 more
>>>>>>>>>
>>>>>>>>>
>>>>>>>> The command looks right.  Instance should be packaged in the
>>>>>>>> accumulo core jar.  To verify that, you could run:
>>>>>>>> jar tf /opt/accumulo/lib/accumulo-core-1.4.2.jar | grep
>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>
>>>>>>>> I'm not sure what's going on here.  If that error is happening
>>>>>>>> right away, it seems like it can't load the jar on the local machine.  If
>>>>>>>> you're running multiple machines, and if the error were happening later
>>>>>>>> during the MapReduce, I would suggest that you make sure accumulo is
>>>>>>>> present on all the machines.
>>>>>>>>
>>>>>>>> You asked about the user; is the owner of the jars different than
>>>>>>>> the user you're running as?  In that case, it could be a permissions
>>>>>>>> issue.  Could the permissions be set so that you can list that directory
>>>>>>>> but not read the jar?
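[Editor's sketch of the permissions check Billie suggests; the /tmp path stands in for /opt/accumulo/lib. The point is to confirm the current user can read the jar itself, not merely list the directory it sits in.]

```shell
# Stand-in for /opt/accumulo/lib with one jar to check.
mkdir -p /tmp/perm-demo/lib
touch /tmp/perm-demo/lib/accumulo-core-1.4.2.jar
ls /tmp/perm-demo/lib                                 # listing works
if [ -r /tmp/perm-demo/lib/accumulo-core-1.4.2.jar ]; then
  echo "jar is readable"
else
  echo "jar is NOT readable"                          # would need chmod a+r
fi
```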
>>>>>>>>
>>>>>>>> Billie
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> *org/apache/accumulo/core/client/Instance is located in the
>>>>>>>>> src/... folder, which I am not sure is what is packaged in the
>>>>>>>>> examples-simple-[^c].jar? *
>>>>>>>>> *Sorry folks for the constant emails... just trying to get this
>>>>>>>>> to work but I really appreciate the help.*
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Apr 4, 2013 at 10:18 AM, John Vines <vines@apache.org>wrote:
>>>>>>>>>
>>>>>>>>>> If you run tool.sh with sh -x, it will step through the script so
>>>>>>>>>> you can see what jars it is picking up and perhaps why it's missing them
>>>>>>>>>> for you.
>>>>>>>>>>
>>>>>>>>>> Sent from my phone, please pardon the typos and brevity.
>>>>>>>>>> On Apr 4, 2013 10:15 AM, "Aji Janis" <aji1705@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> What user are you running the commands as ?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Apr 4, 2013 at 9:59 AM, Aji Janis <aji1705@gmail.com>wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Where did you put all your java files?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Apr 4, 2013 at 9:55 AM, Eric Newton <
>>>>>>>>>>>> eric.newton@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I was able to run the example, as written in
>>>>>>>>>>>>> docs/examples/README.bulkIngest substituting my
>>>>>>>>>>>>> instance/zookeeper/user/password information:
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ pwd
>>>>>>>>>>>>> /home/ecn/workspace/1.4.3
>>>>>>>>>>>>> $ ls
>>>>>>>>>>>>> bin      conf     docs  LICENSE  NOTICE   README  src     test
>>>>>>>>>>>>> CHANGES  contrib  lib   logs     pom.xml  target  walogs
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ ./bin/accumulo
>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.SetupTable test
>>>>>>>>>>>>> localhost root secret test_bulk row_00000333 row_00000666
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ ./bin/accumulo
>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.GenerateTestData 0 1000
>>>>>>>>>>>>> bulk/test_1.txt
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ ./bin/tool.sh lib/examples-simple-*[^cs].jar
>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample test
>>>>>>>>>>>>> localhost root secret test_bulk bulk tmp/bulkWork
>>>>>>>>>>>>>
>>>>>>>>>>>>> $./bin/accumulo
>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.VerifyIngest test
>>>>>>>>>>>>> localhost root secret test_bulk 0 1000
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Eric
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Apr 4, 2013 at 9:33 AM, Aji Janis <aji1705@gmail.com>wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am not sure it's just a regular expression issue. Below is
>>>>>>>>>>>>>> my console output. Not sure why this NoClassDefFoundError occurs. Has
>>>>>>>>>>>>>> anyone tried to do this successfully? If you did, can you please tell
>>>>>>>>>>>>>> me your environment setup?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [user@mynode bulk]$ pwd
>>>>>>>>>>>>>> /home/user/bulk
>>>>>>>>>>>>>> [user@mynode bulk]$ ls
>>>>>>>>>>>>>> BulkIngestExample.java  GenerateTestData.java
>>>>>>>>>>>>>>  SetupTable.java  test_1.txt  VerifyIngest.java
>>>>>>>>>>>>>> [user@mynode bulk]$
>>>>>>>>>>>>>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>>>>>>> *
>>>>>>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>>>>>>> org.apache.accumulo.core.client.Instance
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>>>>>>>> Method)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>>>>>>>         ... 3 more
>>>>>>>>>>>>>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>>>>>>> *
>>>>>>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>>>>>>> org.apache.accumulo.core.client.Instance
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>>>>>>>> Method)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>>>>>>>         ... 3 more
>>>>>>>>>>>>>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>>>>>>> *
>>>>>>>>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>>>>>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>>>>>> [user@mynode bulk]$
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 4:57 PM, Billie Rinaldi <
>>>>>>>>>>>>>> billie@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 1:16 PM, Christopher <
>>>>>>>>>>>>>>> ctubbsii@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Try with -libjars:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> tool.sh automatically adds libjars.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The problem is the regular expression for the
>>>>>>>>>>>>>>> examples-simple jar.  It's trying to exclude the javadoc jar with ^c, but
>>>>>>>>>>>>>>> it isn't excluding the sources jar.
>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar may work, or you can just
>>>>>>>>>>>>>>> specify the jar exactly, /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> */opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>>>>>>>> *
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Billie
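A quick local demonstration of the glob behavior Billie describes (a sketch using throwaway files in a temp directory; the filenames mirror the listing later in this thread): `[^c]` only excludes names whose last character before `.jar` is `c`, so the `-sources` jar still matches. The extra match then lands where `RunJar` expects the main class name, which is why the error shows `examples-simple-1/4/2-sources/jar` with the dots turned into slashes.

```shell
# Reproduce the glob matching with dummy files (illustrative names only).
cd "$(mktemp -d)"
touch examples-simple-1.4.2.jar \
      examples-simple-1.4.2-javadoc.jar \
      examples-simple-1.4.2-sources.jar

# [^c] excludes only the javadoc jar; the sources jar still matches:
set -- examples-simple-*[^c].jar
echo "matched $# jar(s): $*"

# [^cs] excludes both the javadoc and sources jars:
set -- examples-simple-*[^cs].jar
echo "matched $# jar(s): $*"
```

Specifying the jar exactly, as Billie suggests, sidesteps the glob entirely and is the least fragile option.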
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>>>>>> -libjars  /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir
>>>>>>>>>>>>>>>> tmp/bulkWork
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Christopher L Tubbs II
>>>>>>>>>>>>>>>> http://gravatar.com/ctubbsii
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 4:11 PM, Aji Janis <
>>>>>>>>>>>>>>>> aji1705@gmail.com> wrote:
>>>>>>>>>>>>>>>> > I am trying to run the BulkIngest example (on Accumulo
>>>>>>>>>>>>>>>> 1.4.2) and I am not
>>>>>>>>>>>>>>>> > able to run the following steps. Here is the error I get:
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>>>> > /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>>>> > myinstance zookeepers user pswd tableName inputDir
>>>>>>>>>>>>>>>> tmp/bulkWork
>>>>>>>>>>>>>>>> > Exception in thread "main"
>>>>>>>>>>>>>>>> java.lang.ClassNotFoundException:
>>>>>>>>>>>>>>>> > /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>>>>>>>>>>>>>>> >         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>>>>>>> >         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>>>>>>> >         at
>>>>>>>>>>>>>>>> org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>>>>>> > [user@mynode bulk]$ ls /opt/accumulo/lib/
>>>>>>>>>>>>>>>> > accumulo-core-1.4.2.jar
>>>>>>>>>>>>>>>> > accumulo-start-1.4.2.jar
>>>>>>>>>>>>>>>> > commons-collections-3.2.jar
>>>>>>>>>>>>>>>> > commons-logging-1.0.4.jar
>>>>>>>>>>>>>>>> > jline-0.9.94.jar
>>>>>>>>>>>>>>>> > accumulo-core-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> > accumulo-start-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> > commons-configuration-1.5.jar
>>>>>>>>>>>>>>>> > commons-logging-api-1.0.4.jar
>>>>>>>>>>>>>>>> > libthrift-0.6.1.jar
>>>>>>>>>>>>>>>> > accumulo-core-1.4.2-sources.jar
>>>>>>>>>>>>>>>> > accumulo-start-1.4.2-sources.jar
>>>>>>>>>>>>>>>> > commons-io-1.4.jar
>>>>>>>>>>>>>>>> > examples-simple-1.4.2.jar
>>>>>>>>>>>>>>>> > log4j-1.2.16.jar
>>>>>>>>>>>>>>>> > accumulo-server-1.4.2.jar
>>>>>>>>>>>>>>>> > cloudtrace-1.4.2.jar
>>>>>>>>>>>>>>>> > commons-jci-core-1.0.jar
>>>>>>>>>>>>>>>> > examples-simple-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> > native
>>>>>>>>>>>>>>>> > accumulo-server-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> > cloudtrace-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> > commons-jci-fam-1.0.jar
>>>>>>>>>>>>>>>> > examples-simple-1.4.2-sources.jar
>>>>>>>>>>>>>>>> > wikisearch-ingest-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> > accumulo-server-1.4.2-sources.jar
>>>>>>>>>>>>>>>> > cloudtrace-1.4.2-sources.jar
>>>>>>>>>>>>>>>> > commons-lang-2.4.jar
>>>>>>>>>>>>>>>> >  ext
>>>>>>>>>>>>>>>> > wikisearch-query-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Clearly, the libraries and source file exist, so I am not
>>>>>>>>>>>>>>>> sure what's going
>>>>>>>>>>>>>>>> > on. I tried putting in
>>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2-sources.jar
>>>>>>>>>>>>>>>> > instead, but then it complains that BulkIngestExample was not
>>>>>>>>>>>>>>>> found (ClassNotFoundException).
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Suggestions?
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > On Wed, Apr 3, 2013 at 2:36 PM, Eric Newton <
>>>>>>>>>>>>>>>> eric.newton@gmail.com> wrote:
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> You will have to write your own InputFormat class which
>>>>>>>>>>>>>>>> will parse your
>>>>>>>>>>>>>>>> >> file and pass records to your reducer.
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> -Eric
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> On Wed, Apr 3, 2013 at 2:29 PM, Aji Janis <
>>>>>>>>>>>>>>>> aji1705@gmail.com> wrote:
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> Looking at the BulkIngestExample, it uses
>>>>>>>>>>>>>>>> GenerateTestData to create a
>>>>>>>>>>>>>>>> >>> .txt file which contains key/value pairs, and correct me
>>>>>>>>>>>>>>>> if I am wrong, but
>>>>>>>>>>>>>>>> >>> each new line is a new row, right?
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> I need to know how to have family and qualifiers also.
>>>>>>>>>>>>>>>> In other words,
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> 1) Do I set up a .txt file that can be converted into
>>>>>>>>>>>>>>>> an Accumulo RF File
>>>>>>>>>>>>>>>> >>> using AccumuloFileOutputFormat  which can then be
>>>>>>>>>>>>>>>> imported into my table?
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> 2) if yes, what is the format of the .txt file.
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> On Wed, Apr 3, 2013 at 2:19 PM, Eric Newton <
>>>>>>>>>>>>>>>> eric.newton@gmail.com>
>>>>>>>>>>>>>>>> >>> wrote:
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>> Your data needs to be in the RFile format, and more
>>>>>>>>>>>>>>>> importantly it needs
>>>>>>>>>>>>>>>> >>>> to be sorted.
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>> It's handy to use a Map/Reduce job to convert/sort
>>>>>>>>>>>>>>>> your data.  See the
>>>>>>>>>>>>>>>> >>>> BulkIngestExample.
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>> -Eric
>>>>>>>>>>>>>>>> >>>>
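A minimal local illustration of the sort requirement Eric mentions, using made-up rows in the whitespace-separated rowid/family/qualifier/value format discussed in this thread (a sketch only; a real bulk load sorts binary keys inside the M/R job, not with a shell sort):

```shell
# Unsorted rows in the text format used in this thread (illustrative data):
printf '%s\n' \
  'rowid3 columnFamily1 colQualifier1 value' \
  'rowid1 columnFamily1 colQualifier1 value' \
  'rowid2 columnFamily1 colQualifier1 value' > data.txt

# Bulk-loaded files must be sorted by key; a byte-order sort shows the
# ordering the RFiles ultimately need:
LC_ALL=C sort data.txt
```

This is only to make the requirement concrete; in practice the reduce phase of the BulkIngestExample job produces the sorted output.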
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>> On Wed, Apr 3, 2013 at 2:15 PM, Aji Janis <
>>>>>>>>>>>>>>>> aji1705@gmail.com> wrote:
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> I have some data in a text file in the following
>>>>>>>>>>>>>>>> format.
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> rowid1 columnFamily1 colQualifier1 value
>>>>>>>>>>>>>>>> >>>>> rowid1 columnFamily1 colQualifier2 value
>>>>>>>>>>>>>>>> >>>>> rowid1 columnFamily2 colQualifier1 value
>>>>>>>>>>>>>>>> >>>>> rowid2 columnFamily1 colQualifier1 value
>>>>>>>>>>>>>>>> >>>>> rowid3 columnFamily1 colQualifier1 value
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> I want to import this data into a table in accumulo.
>>>>>>>>>>>>>>>> My end goal is to
>>>>>>>>>>>>>>>> >>>>> understand how to use the BulkImport feature in
>>>>>>>>>>>>>>>> accumulo. I tried to login
>>>>>>>>>>>>>>>> >>>>> to the accumulo shell as root and then run:
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> #table mytable
>>>>>>>>>>>>>>>> >>>>> #importdirectory /home/inputDir /home/failureDir true
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> but it didn't work. My data file was saved as
>>>>>>>>>>>>>>>> data.txt in
>>>>>>>>>>>>>>>> >>>>> /home/inputDir. I tried to create the dir/file
>>>>>>>>>>>>>>>> structure in hdfs and linux
>>>>>>>>>>>>>>>> >>>>> but neither worked. When trying locally, it keeps
>>>>>>>>>>>>>>>> complaining about
>>>>>>>>>>>>>>>> >>>>> failureDir not existing.
>>>>>>>>>>>>>>>> >>>>> ...
>>>>>>>>>>>>>>>> >>>>> java.io.FileNotFoundException: File does not exist:
>>>>>>>>>>>>>>>> failures
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> When trying with files on hdfs, I get no error on the
>>>>>>>>>>>>>>>> console but the
>>>>>>>>>>>>>>>> >>>>> logger had the following messages:
>>>>>>>>>>>>>>>> >>>>> ...
>>>>>>>>>>>>>>>> >>>>> [tableOps.BulkImport] WARN :
>>>>>>>>>>>>>>>> hdfs://node....//inputDir/data.txt does
>>>>>>>>>>>>>>>> >>>>> not have a valid extension, ignoring
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> or,
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> [tableOps.BulkImport] WARN :
>>>>>>>>>>>>>>>> hdfs://node....//inputDir/data.txt is not
>>>>>>>>>>>>>>>> >>>>> a map file, ignoring
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> Suggestions? Am I not setting up the job right? Thank
>>>>>>>>>>>>>>>> you in
>>>>>>>>>>>>>>>> >>>>> advance for your help.
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>>
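A local sketch of the layout `importdirectory` expects (hedged: the directory names are illustrative, and on a real cluster these would be `hadoop fs -mkdir` calls against HDFS). The warnings above are consistent with this: the failures directory must already exist, and the input directory must hold sorted RFiles (`.rf` extension in 1.4) produced by something like the BulkIngestExample job, not raw `.txt` files:

```shell
# Illustrative local stand-in for the HDFS layout (paths are made up):
work=$(mktemp -d)
mkdir -p "$work/files"       # input dir: must contain sorted RFiles
mkdir -p "$work/failures"    # failures dir: must exist (and be empty) beforehand
touch "$work/files/part-r-00000.rf"   # RFiles end in .rf; .txt files are ignored

ls "$work/files" "$work/failures"
# Then, in the Accumulo shell (not runnable here):
#   table mytable
#   importdirectory /tmp/bulk/files /tmp/bulk/failures true
```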
>>>>>>>>>>>>>>>> >>>>> On Wed, Apr 3, 2013 at 2:04 PM, Aji Janis <
>>>>>>>>>>>>>>>> aji1705@gmail.com> wrote:
>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>> >>>>>> I have some data in a text file in the following
>>>>>>>>>>>>>>>> format:
>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
