accumulo-user mailing list archives

From Billie Rinaldi <bil...@apache.org>
Subject Re: importdirectory in accumulo
Date Fri, 05 Apr 2013 20:06:58 GMT
Sometimes a thrift error can indicate that the accumulo-core jar you're
using isn't the same version as the accumulo server that is running.
However, I haven't seen this particular error before, so that might not be
the case here.  If it is, there are many ways it could happen.  You could
have the wrong jar, or multiple jars, in your uber jar.
There could be another version of the accumulo-core jar on the hadoop
classpath (either directly, or packaged in someone else's uber jar -- which
is a good possibility if anyone else has gone through what you are doing
now).  Based on the HADOOP_CLASSPATH you have set, you'd have to check ./
(not sure what that is relative to, it might be the hadoop conf dir?),
/conf, /build/*, and the standard hadoop lib directory.
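One way to hunt for the duplicate-jar case described above is to walk each candidate classpath directory and list anything matching accumulo-core*.jar. This is only a sketch: the `scan_for_core_jars` helper and the throwaway demo layout are invented for illustration; in practice you would point it at your HADOOP_CLASSPATH entries and hadoop's lib directory.

```shell
# Sketch: list accumulo-core jars under each given directory so that
# duplicates or version mismatches stand out. Helper name and demo
# layout are hypothetical, not part of Accumulo or Hadoop.
scan_for_core_jars() {
  for d in "$@"; do
    if [ -d "$d" ]; then
      find "$d" -maxdepth 2 -name 'accumulo-core*.jar'
    fi
  done
}

# Demo against a throwaway layout containing a version conflict.
tmp=$(mktemp -d)
mkdir -p "$tmp/lib" "$tmp/conf"
touch "$tmp/lib/accumulo-core-1.4.2.jar" "$tmp/lib/accumulo-core-1.4.0.jar"
found=$(scan_for_core_jars "$tmp/lib" "$tmp/conf")
echo "$found"
rm -rf "$tmp"
```

If the scan turns up more than one distinct accumulo-core version across the directories you feed it, that is the mismatch to resolve first.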

Billie


On Fri, Apr 5, 2013 at 8:36 AM, Aji Janis <aji1705@gmail.com> wrote:

> I agree that HADOOP_CLASSPATH should be changed as you described. I
> couldn't quite do that just yet, though (people have jobs running and I
> don't want to risk it).
>
> However, I did a workaround. (I am going off the theory that my
> HADOOP_CLASSPATH is bad, so it can't accept all the libraries I am passing
> to it, so I decided to package all the libraries I needed into a jar:
> http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/)
> I downloaded the source code and made a shaded (uber) jar to include all
> the libraries I needed. Then I submitted the hadoop job with my uber jar
> like any other map reduce job. My mappers and reducers finish the job but I
> got an exception for waitForTableOperation. I think this supports my theory
> of a bad classpath, but clearly I have more issues to deal with. If you have
> any suggestions on how to debug this, that would be awesome!
>
> My console output (with a lot of server-specific details removed for
> security) is below. I modified BulkIngestExample.java to add some print
> statements; the modified lines are also shown below.
>
>
> [user@nodebulk]$ /opt/hadoop/bin/hadoop jar uber-BulkIngestExample.jar
> instance zookeepers user password table inputdir tmp/bulk
>
> 13/04/05 11:20:52 INFO input.FileInputFormat: Total input paths to process
> : 1
> 13/04/05 11:20:53 INFO mapred.JobClient: Running job: job_201304021611_0045
> 13/04/05 11:20:54 INFO mapred.JobClient:  map 0% reduce 0%
> 13/04/05 11:21:10 INFO mapred.JobClient:  map 100% reduce 0%
> 13/04/05 11:21:25 INFO mapred.JobClient:  map 100% reduce 50%
> 13/04/05 11:21:26 INFO mapred.JobClient:  map 100% reduce 100%
> 13/04/05 11:21:31 INFO mapred.JobClient: Job complete:
> job_201304021611_0045
> 13/04/05 11:21:31 INFO mapred.JobClient: Counters: 25
> 13/04/05 11:21:31 INFO mapred.JobClient:   Job Counters
> 13/04/05 11:21:31 INFO mapred.JobClient:     Launched reduce tasks=2
> 13/04/05 11:21:31 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=15842
> 13/04/05 11:21:31 INFO mapred.JobClient:     Total time spent by all
> reduces waiting after reserving slots (ms)=0
> 13/04/05 11:21:31 INFO mapred.JobClient:     Total time spent by all maps
> waiting after reserving slots (ms)=0
> 13/04/05 11:21:31 INFO mapred.JobClient:     Rack-local map tasks=1
> 13/04/05 11:21:31 INFO mapred.JobClient:     Launched map tasks=1
> 13/04/05 11:21:31 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=25891
> 13/04/05 11:21:31 INFO mapred.JobClient:   File Output Format Counters
> 13/04/05 11:21:31 INFO mapred.JobClient:     Bytes Written=496
> 13/04/05 11:21:31 INFO mapred.JobClient:   FileSystemCounters
> 13/04/05 11:21:31 INFO mapred.JobClient:     FILE_BYTES_READ=312
> 13/04/05 11:21:31 INFO mapred.JobClient:     HDFS_BYTES_READ=421
> 13/04/05 11:21:31 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=68990
> 13/04/05 11:21:31 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=496
> 13/04/05 11:21:31 INFO mapred.JobClient:   File Input Format Counters
> 13/04/05 11:21:31 INFO mapred.JobClient:     Bytes Read=280
> 13/04/05 11:21:31 INFO mapred.JobClient:   Map-Reduce Framework
> 13/04/05 11:21:31 INFO mapred.JobClient:     Reduce input groups=10
> 13/04/05 11:21:31 INFO mapred.JobClient:     Map output materialized
> bytes=312
> 13/04/05 11:21:31 INFO mapred.JobClient:     Combine output records=0
> 13/04/05 11:21:31 INFO mapred.JobClient:     Map input records=10
> 13/04/05 11:21:31 INFO mapred.JobClient:     Reduce shuffle bytes=186
> 13/04/05 11:21:31 INFO mapred.JobClient:     Reduce output records=10
> 13/04/05 11:21:31 INFO mapred.JobClient:     Spilled Records=20
> 13/04/05 11:21:31 INFO mapred.JobClient:     Map output bytes=280
> 13/04/05 11:21:31 INFO mapred.JobClient:     Combine input records=0
> 13/04/05 11:21:31 INFO mapred.JobClient:     Map output records=10
> 13/04/05 11:21:31 INFO mapred.JobClient:     SPLIT_RAW_BYTES=141
> 13/04/05 11:21:31 INFO mapred.JobClient:     Reduce input records=10
>
> Here is the exception caught:
> org.apache.accumulo.core.client.AccumuloException: Internal error
> processing waitForTableOperation
>
> E.getMessage returns:
> Internal error processing waitForTableOperation
> Exception in thread "main" java.lang.RuntimeException:
> org.apache.accumulo.core.client.AccumuloException: Internal error
> processing waitForTableOperation
>         at
> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.run(BulkIngestExample.java:151)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at
> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.main(BulkIngestExample.java:166)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: org.apache.accumulo.core.client.AccumuloException: Internal
> error processing waitForTableOperation
>         at
> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:290)
>         at
> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:258)
>         at
> org.apache.accumulo.core.client.admin.TableOperationsImpl.importDirectory(TableOperationsImpl.java:945)
>         at
> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.run(BulkIngestExample.java:146)
>         ... 7 more
> Caused by: org.apache.thrift.TApplicationException: Internal error
> processing waitForTableOperation
>         at
> org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
>         at
> org.apache.accumulo.core.master.thrift.MasterClientService$Client.recv_waitForTableOperation(MasterClientService.java:684)
>         at
> org.apache.accumulo.core.master.thrift.MasterClientService$Client.waitForTableOperation(MasterClientService.java:665)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at
> org.apache.accumulo.cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:84)
>         at $Proxy5.waitForTableOperation(Unknown Source)
>         at
> org.apache.accumulo.core.client.admin.TableOperationsImpl.waitForTableOperation(TableOperationsImpl.java:230)
>         at
> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:272)
>         ... 10 more
> [user@nodebulk]$
>
>
> Modification in BulkIngestExample
>
> line 146    connector.tableOperations().importDirectory(tableName,
>                 workDir + "/files", workDir + "/failures", false);
>
>     } catch (Exception e) {
>       System.out.println("\nHere is the exception caught:\n" + e);
>       System.out.println("\nE.getMessage returns:\n" + e.getMessage());
> line 151    throw new RuntimeException(e);
>     } finally {
>       if (out != null)
>         out.close();
>
> line 166    int res = ToolRunner.run(CachedConfiguration.getInstance(),
>                 new BulkIngestExample(), args);
>
>
> On Thu, Apr 4, 2013 at 3:51 PM, Billie Rinaldi <billie@apache.org> wrote:
>
>> On Thu, Apr 4, 2013 at 12:26 PM, Aji Janis <aji1705@gmail.com> wrote:
>>
>>> I haven't tried the classpath option yet, but I executed the below
>>> command as the hadoop user. This seemed to be the command that accumulo
>>> was trying to execute anyway, and I would think it should have avoided
>>> the custom classpath issue... right or wrong?
>>>
>>
>> No, the jar needs to be both in the libjars and on the classpath.  There
>> are classes that need to be accessed on the local machine in the process of
>> submitting the MapReduce job, and that process can only see the classpath,
>> not the libjars.
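Concretely, a submission along these lines puts the jar in both places. The job jar name `myjob.jar` and class `com.example.MyTool` are placeholders, and the accumulo paths should be adjusted to the local installation:

```shell
# Hypothetical submission sketch: the same accumulo-core jar is made
# visible to both the client JVM and the task JVMs.

# Client side: the submitting JVM resolves classes from the classpath.
export HADOOP_CLASSPATH=/opt/accumulo/lib/accumulo-core-1.4.2.jar:$HADOOP_CLASSPATH

# Task side: -libjars ships the same jar to the map and reduce JVMs.
/opt/hadoop/bin/hadoop jar myjob.jar com.example.MyTool \
    -libjars /opt/accumulo/lib/accumulo-core-1.4.2.jar \
    instance zookeepers user password table inputdir tmp/bulk
```

This mirrors what tool.sh does automatically: it builds HADOOP_CLASSPATH and the -libjars list from the same set of jars.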
>>
>> The HADOOP_CLASSPATH you have is unusual.  More often, HADOOP_CLASSPATH
>> is not set at all in hadoop-env.sh, but if it is, it should generally be of
>> the form newstuff:$HADOOP_CLASSPATH to avoid this issue.
>>
>> You will have to restart Hadoop after making the change to hadoop-env.sh.
>>
>> Billie
>>
>>
>>
>>>
>>>
>>> Got the same error:
>>> *[hadoop@node]$ /opt/hadoop/bin/hadoop jar
>>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>> -libjars
>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>> *
>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>> org/apache/accumulo/core/client/Instance
>>>         at java.lang.Class.forName0(Native Method)
>>>         at java.lang.Class.forName(Class.java:264)
>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>> Caused by: java.lang.ClassNotFoundException:
>>> org.apache.accumulo.core.client.Instance
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>         ... 3 more
>>>
>>>
>>>
>>> On Thu, Apr 4, 2013 at 2:51 PM, Billie Rinaldi <billie@apache.org>wrote:
>>>
>>>> On Thu, Apr 4, 2013 at 11:41 AM, Aji Janis <aji1705@gmail.com> wrote:
>>>>
>>>>> *[accumulo@node accumulo]$ cat /opt/hadoop/conf/hadoop-env.sh | grep
>>>>> HADOOP_CLASSPATH*
>>>>> export HADOOP_CLASSPATH=./:/conf:/build/*:
>>>>>
>>>>
>>>> To preserve custom HADOOP_CLASSPATHs, this line should be:
>>>> export HADOOP_CLASSPATH=./:/conf:/build/*:$HADOOP_CLASSPATH
>>>>
>>>> Billie
>>>>
>>>>
>>>>
>>>>>
>>>>> looks like it is overwriting everything. Isn't this the default
>>>>> behavior? Is your hadoop-env.sh missing that line?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Apr 4, 2013 at 2:25 PM, Billie Rinaldi <billie@apache.org>wrote:
>>>>>
>>>>>> On Thu, Apr 4, 2013 at 10:27 AM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>
>>>>>>> I thought about the permissions issue too. All the accumulo stuff is
>>>>>>> under accumulo user so I started running the commands as accumulo ... only
>>>>>>> to get the same result.
>>>>>>> - The errors happen right away.
>>>>>>> - The box has both accumulo and hadoop on it.
>>>>>>> - The jar contains the Instance class. But note that the Instance
>>>>>>> class is part of accumulo-core and not examples-simple-1.4.2.jar .... (can
>>>>>>> this be the issue?)
>>>>>>>
>>>>>>
>>>>>> No, that isn't the issue.  tool.sh is finding the accumulo-core jar
>>>>>> and putting it on the HADOOP_CLASSPATH and in the libjars.
>>>>>>
>>>>>> I wonder if your hadoop environment is set up to override the
>>>>>> HADOOP_CLASSPATH.  Check in your hadoop-env.sh to see if HADOOP_CLASSPATH
>>>>>> is set there.
>>>>>>
>>>>>> The reason your commands of the form "tool.sh lib/*jar" aren't
>>>>>> working is that the shell glob is matching multiple jars and putting them
>>>>>> all on the command line.  tool.sh expects at most one jar followed by a
>>>>>> class name, so whatever jar comes second when the glob is expanded is
>>>>>> being interpreted as a class name.
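This expansion is easy to reproduce outside of tool.sh. In the sketch below (throwaway directory; the jar names are invented to mimic the lib/ listing in this thread), `set --` stands in for a script's argument list, showing that the second matched jar lands exactly where tool.sh expects a class name:

```shell
# Throwaway demonstration of shell glob expansion; directory and jar
# names are hypothetical.
tmp=$(mktemp -d)
mkdir "$tmp/lib"
touch "$tmp/lib/accumulo-core-1.4.2.jar" "$tmp/lib/accumulo-server-1.4.2.jar"
cd "$tmp"

# The shell expands lib/*.jar before the script ever runs, so the script
# receives two jar arguments instead of one jar plus a class name:
set -- lib/*.jar
first=$1
second=$2
echo "jar argument:          $first"
echo "treated as class name: $second"

cd / && rm -rf "$tmp"
```

That matches the errors above: `ClassNotFoundException: lib.accumulo-server-1.4.2.jar` is the second glob match being loaded as if it were a class.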
>>>>>>
>>>>>> Billie
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Commands I ran:
>>>>>>>
>>>>>>> *[accumulo@node accumulo]$ whoami*
>>>>>>> accumulo
>>>>>>> *[accumulo@node accumulo]$ ls -l*
>>>>>>> total 184
>>>>>>> drwxr-xr-x 2 accumulo accumulo  4096 Apr  4 10:25 bin
>>>>>>> -rwxr-xr-x 1 accumulo accumulo 24263 Oct 22 15:30 CHANGES
>>>>>>> drwxr-xr-x 3 accumulo accumulo  4096 Apr  3 10:17 conf
>>>>>>> drwxr-xr-x 2 accumulo accumulo  4096 Jan 15 13:35 contrib
>>>>>>> -rwxr-xr-x 1 accumulo accumulo   695 Nov 18  2011 DISCLAIMER
>>>>>>> drwxr-xr-x 5 accumulo accumulo  4096 Jan 15 13:35 docs
>>>>>>> drwxr-xr-x 4 accumulo accumulo  4096 Jan 15 13:35 lib
>>>>>>> -rwxr-xr-x 1 accumulo accumulo 56494 Mar 21  2012 LICENSE
>>>>>>> drwxr-xr-x 2 accumulo accumulo 12288 Apr  3 14:43 logs
>>>>>>> -rwxr-xr-x 1 accumulo accumulo  2085 Mar 21  2012 NOTICE
>>>>>>> -rwxr-xr-x 1 accumulo accumulo 27814 Oct 17 08:32 pom.xml
>>>>>>> -rwxr-xr-x 1 accumulo accumulo 12449 Oct 17 08:32 README
>>>>>>> drwxr-xr-x 9 accumulo accumulo  4096 Nov  8 13:40 src
>>>>>>> drwxr-xr-x 5 accumulo accumulo  4096 Nov  8 13:40 test
>>>>>>> drwxr-xr-x 2 accumulo accumulo  4096 Apr  4 09:09 walogs
>>>>>>> *[accumulo@node accumulo]$ ls bin/*
>>>>>>> accumulo           check-slaves  etc_initd_accumulo  start-all.sh
>>>>>>> start-server.sh  stop-here.sh    tdown.sh  tup.sh
>>>>>>> catapultsetup.acc  config.sh     LogForwarder.sh     start-here.sh
>>>>>>>  stop-all.sh      stop-server.sh  tool.sh   upgrade.sh
>>>>>>> *[accumulo@node accumulo]$ ls lib/*
>>>>>>> accumulo-core-1.4.2.jar            accumulo-start-1.4.2.jar
>>>>>>>  commons-collections-3.2.jar    commons-logging-1.0.4.jar
>>>>>>>  jline-0.9.94.jar
>>>>>>> accumulo-core-1.4.2-javadoc.jar    accumulo-start-1.4.2-javadoc.jar
>>>>>>>  commons-configuration-1.5.jar  commons-logging-api-1.0.4.jar
>>>>>>>  libthrift-0.6.1.jar
>>>>>>> accumulo-core-1.4.2-sources.jar    accumulo-start-1.4.2-sources.jar
>>>>>>>  commons-io-1.4.jar             examples-simple-1.4.2.jar
>>>>>>>  log4j-1.2.16.jar
>>>>>>> accumulo-server-1.4.2.jar          cloudtrace-1.4.2.jar
>>>>>>>  commons-jci-core-1.0.jar       examples-simple-1.4.2-javadoc.jar  native
>>>>>>> accumulo-server-1.4.2-javadoc.jar  cloudtrace-1.4.2-javadoc.jar
>>>>>>>  commons-jci-fam-1.0.jar        examples-simple-1.4.2-sources.jar
>>>>>>>  wikisearch-ingest-1.4.2-javadoc.jar
>>>>>>> accumulo-server-1.4.2-sources.jar  cloudtrace-1.4.2-sources.jar
>>>>>>>  commons-lang-2.4.jar           ext
>>>>>>>  wikisearch-query-1.4.2-javadoc.jar
>>>>>>>
>>>>>>> *[accumulo@node accumulo]$ jar -tf
>>>>>>> /opt/accumulo/lib/accumulo-core-1.4.2.jar | grep
>>>>>>> org/apache/accumulo/core/client/Instance*
>>>>>>> org/apache/accumulo/core/client/Instance.class
>>>>>>>
>>>>>>> *[accumulo@node accumulo]$ jar -tf
>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar | grep
>>>>>>> org/apache/accumulo/core/client/Instance*
>>>>>>> *[accumulo@node accumulo]$ ./bin/tool.sh lib/*[^cs].jar
>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>>> USERJARS=
>>>>>>> CLASSNAME=lib/accumulo-server-1.4.2.jar
>>>>>>>
>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar
>>>>>>> lib/accumulo-server-1.4.2.jar -libjars
>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>> lib.accumulo-server-1.4.2.jar
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>
>>>>>>> *[accumulo@node accumulo]$ ./bin/tool.sh lib/*.jar
>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>>> USERJARS=
>>>>>>> CLASSNAME=lib/accumulo-core-1.4.2-javadoc.jar
>>>>>>>
>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar
>>>>>>> lib/accumulo-core-1.4.2-javadoc.jar -libjars
>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>> lib.accumulo-core-1.4.2-javadoc.jar
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>
>>>>>>> *[accumulo@node accumulo]$ ./bin/tool.sh lib/*[^c].jar
>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>>>  USERJARS=
>>>>>>> CLASSNAME=lib/accumulo-core-1.4.2-sources.jar
>>>>>>>
>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar
>>>>>>> lib/accumulo-core-1.4.2-sources.jar -libjars
>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>> lib.accumulo-core-1.4.2-sources.jar
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>
>>>>>>> *[accumulo@node accumulo]$ ./bin/tool.sh
>>>>>>> lib/examples-simple-*[^c].jar
>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>> default node14.catapult.dev.boozallenet.com:2181 root password
>>>>>>> test_aj /user/559599/input tmp/ajbulktest*
>>>>>>> USERJARS=
>>>>>>> CLASSNAME=lib/examples-simple-1.4.2-sources.jar
>>>>>>>
>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/examples-simple-1.4.2.jar
>>>>>>> lib/examples-simple-1.4.2-sources.jar -libjars
>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>> lib.examples-simple-1.4.2-sources.jar
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>> *[accumulo@node accumulo]$*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Apr 4, 2013 at 11:55 AM, Billie Rinaldi <billie@apache.org>wrote:
>>>>>>>
>>>>>>>> On Thu, Apr 4, 2013 at 7:46 AM, Aji Janis <aji1705@gmail.com>wrote:
>>>>>>>>
>>>>>>>>> *Billie, I checked the values in tool.sh they match. I
>>>>>>>>> uncommented the echo statements and reran the cmd here is what I have:
>>>>>>>>> *
>>>>>>>>> *
>>>>>>>>> *
>>>>>>>>> *$ ./bin/tool.sh ./lib/examples-simple-1.4.2.jar
>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>> instance zookeeper usr pswd table inputdir tmp/bulk*
>>>>>>>>>
>>>>>>>>> USERJARS=
>>>>>>>>>
>>>>>>>>> CLASSNAME=org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>
>>>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>>>> exec /opt/hadoop/bin/hadoop jar ./lib/examples-simple-1.4.2.jar
>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>> -libjars
>>>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>>>>  Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>> org.apache.accumulo.core.client.Instance
>>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>>> Method)
>>>>>>>>>         at
>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>>         ... 3 more
>>>>>>>>>
>>>>>>>>>
>>>>>>>> The command looks right.  Instance should be packaged in the
>>>>>>>> accumulo core jar.  To verify that, you could run:
>>>>>>>> jar tf /opt/accumulo/lib/accumulo-core-1.4.2.jar | grep
>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>
>>>>>>>> I'm not sure what's going on here.  If that error is happening
>>>>>>>> right away, it seems like it can't load the jar on the local machine.  If
>>>>>>>> you're running multiple machines, and if the error were happening later
>>>>>>>> during the MapReduce, I would suggest that you make sure accumulo is
>>>>>>>> present on all the machines.
>>>>>>>>
>>>>>>>> You asked about the user; is the owner of the jars different than
>>>>>>>> the user you're running as?  In that case, it could be a permissions
>>>>>>>> issue.  Could the permissions be set so that you can list that directory
>>>>>>>> but not read the jar?
>>>>>>>>
>>>>>>>> Billie
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> *org/apache/accumulo/core/client/Instance is located in the
>>>>>>>>> src/... folder, which I am not sure is what is packaged in the
>>>>>>>>> examples-simple-[^c].jar? *
>>>>>>>>> *Sorry folks for the constant emails... just trying to get this
>>>>>>>>> to work but I really appreciate the help.*
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Apr 4, 2013 at 10:18 AM, John Vines <vines@apache.org>wrote:
>>>>>>>>>
>>>>>>>>>> If you run tool.sh with sh -x, it will step through the script so
>>>>>>>>>> you can see what jars it is picking up and perhaps why it's missing them
>>>>>>>>>> for you.
>>>>>>>>>>
>>>>>>>>>> Sent from my phone, please pardon the typos and brevity.
>>>>>>>>>> On Apr 4, 2013 10:15 AM, "Aji Janis" <aji1705@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> What user are you running the commands as ?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Apr 4, 2013 at 9:59 AM, Aji Janis <aji1705@gmail.com>wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Where did you put all your java files?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Apr 4, 2013 at 9:55 AM, Eric Newton <
>>>>>>>>>>>> eric.newton@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I was able to run the example as written in
>>>>>>>>>>>>> docs/examples/README.bulkIngest, substituting my
>>>>>>>>>>>>> instance/zookeeper/user/password information:
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ pwd
>>>>>>>>>>>>> /home/ecn/workspace/1.4.3
>>>>>>>>>>>>> $ ls
>>>>>>>>>>>>> bin      conf     docs  LICENSE  NOTICE   README  src     test
>>>>>>>>>>>>> CHANGES  contrib  lib   logs     pom.xml  target  walogs
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ ./bin/accumulo
>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.SetupTable test
>>>>>>>>>>>>> localhost root secret test_bulk row_00000333 row_00000666
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ ./bin/accumulo
>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.GenerateTestData 0 1000
>>>>>>>>>>>>> bulk/test_1.txt
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ ./bin/tool.sh lib/examples-simple-*[^cs].jar
>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample test
>>>>>>>>>>>>> localhost root secret test_bulk bulk tmp/bulkWork
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ ./bin/accumulo
>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.VerifyIngest test
>>>>>>>>>>>>> localhost root secret test_bulk 0 1000
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Eric
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Apr 4, 2013 at 9:33 AM, Aji Janis <aji1705@gmail.com>wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am not sure it's just a regular expression issue. Below is
>>>>>>>>>>>>>> my console output. Not sure why this ClassDefFoundError occurs. Has anyone
>>>>>>>>>>>>>> tried to do this successfully? If you did, can you please tell me your
>>>>>>>>>>>>>> env setup.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [user@mynode bulk]$ pwd
>>>>>>>>>>>>>> /home/user/bulk
>>>>>>>>>>>>>> [user@mynode bulk]$ ls
>>>>>>>>>>>>>> BulkIngestExample.java  GenerateTestData.java
>>>>>>>>>>>>>>  SetupTable.java  test_1.txt  VerifyIngest.java
>>>>>>>>>>>>>> [user@mynode bulk]$
>>>>>>>>>>>>>> [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>>>>>>> org.apache.accumulo.core.client.Instance
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>>>>>>>> Method)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>>>>>>>         ... 3 more
>>>>>>>>>>>>>> [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>>>>>>> org.apache.accumulo.core.client.Instance
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>>>>>>>> Method)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>>>>>>>         ... 3 more
>>>>>>>>>>>>>> [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>>>>>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>>>>>> [user@mynode bulk]$
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 4:57 PM, Billie Rinaldi <
>>>>>>>>>>>>>> billie@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 1:16 PM, Christopher <
>>>>>>>>>>>>>>> ctubbsii@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Try with -libjars:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> tool.sh automatically adds libjars.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The problem is the regular expression for the
>>>>>>>>>>>>>>> examples-simple jar.  It's trying to exclude the javadoc jar with ^c, but
>>>>>>>>>>>>>>> it isn't excluding the sources jar.
>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar may work, or you can just
>>>>>>>>>>>>>>> specify the jar exactly, /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Billie
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
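[Editor's note: the glob behavior Billie describes above can be checked without Accumulo at all. The sketch below uses a hypothetical /tmp/acc-lib directory with empty stand-in files; the point is that `[^c]` excludes only the javadoc jar, so the glob still expands to two paths and the extra one lands in the position where hadoop's RunJar expects the main class name, while `[^cs]` narrows the match to the one real jar.]

```shell
# Stand-in files mimicking the lib/ listing in this thread (hypothetical /tmp path).
mkdir -p /tmp/acc-lib
touch /tmp/acc-lib/examples-simple-1.4.2.jar \
      /tmp/acc-lib/examples-simple-1.4.2-javadoc.jar \
      /tmp/acc-lib/examples-simple-1.4.2-sources.jar

# [^c] excludes only names whose last character before .jar is 'c' (javadoc);
# the sources jar ends in 's', so it still matches and the glob yields TWO paths:
ls /tmp/acc-lib/examples-simple-*[^c].jar | wc -l    # 2

# [^cs] also excludes names ending in 's' (sources), leaving exactly one path:
ls /tmp/acc-lib/examples-simple-*[^cs].jar
```

(Expansion order of a multi-match glob is locale-dependent, which is why the exact jar path reported in the ClassNotFoundException can vary; either way, one surplus path shifts every following argument by one.)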
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>>>>>> -libjars  /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir
>>>>>>>>>>>>>>>> tmp/bulkWork
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Christopher L Tubbs II
>>>>>>>>>>>>>>>> http://gravatar.com/ctubbsii
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 4:11 PM, Aji Janis <
>>>>>>>>>>>>>>>> aji1705@gmail.com> wrote:
>>>>>>>>>>>>>>>> > I am trying to run the BulkIngest example (on 1.4.2
>>>>>>>>>>>>>>>> accumulo) and I am not
>>>>>>>>>>>>>>>> > able to run the following steps. Here is the error I get:
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>>>> > /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>>>> > myinstance zookeepers user pswd tableName inputDir
>>>>>>>>>>>>>>>> tmp/bulkWork
>>>>>>>>>>>>>>>> > Exception in thread "main"
>>>>>>>>>>>>>>>> java.lang.ClassNotFoundException:
>>>>>>>>>>>>>>>> > /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>>>>>>>>>>>>>>> >         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>>>>>>> >         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>>>>>>> >         at
>>>>>>>>>>>>>>>> org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>>>>>> > [user@mynode bulk]$ ls /opt/accumulo/lib/
>>>>>>>>>>>>>>>> > accumulo-core-1.4.2.jar
>>>>>>>>>>>>>>>> > accumulo-start-1.4.2.jar
>>>>>>>>>>>>>>>> > commons-collections-3.2.jar
>>>>>>>>>>>>>>>> > commons-logging-1.0.4.jar
>>>>>>>>>>>>>>>> > jline-0.9.94.jar
>>>>>>>>>>>>>>>> > accumulo-core-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> > accumulo-start-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> > commons-configuration-1.5.jar
>>>>>>>>>>>>>>>> > commons-logging-api-1.0.4.jar
>>>>>>>>>>>>>>>> > libthrift-0.6.1.jar
>>>>>>>>>>>>>>>> > accumulo-core-1.4.2-sources.jar
>>>>>>>>>>>>>>>> > accumulo-start-1.4.2-sources.jar
>>>>>>>>>>>>>>>> > commons-io-1.4.jar
>>>>>>>>>>>>>>>> > examples-simple-1.4.2.jar
>>>>>>>>>>>>>>>> > log4j-1.2.16.jar
>>>>>>>>>>>>>>>> > accumulo-server-1.4.2.jar
>>>>>>>>>>>>>>>> > cloudtrace-1.4.2.jar
>>>>>>>>>>>>>>>> > commons-jci-core-1.0.jar
>>>>>>>>>>>>>>>> > examples-simple-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> > native
>>>>>>>>>>>>>>>> > accumulo-server-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> > cloudtrace-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> > commons-jci-fam-1.0.jar
>>>>>>>>>>>>>>>> > examples-simple-1.4.2-sources.jar
>>>>>>>>>>>>>>>> > wikisearch-ingest-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> > accumulo-server-1.4.2-sources.jar
>>>>>>>>>>>>>>>> > cloudtrace-1.4.2-sources.jar
>>>>>>>>>>>>>>>> > commons-lang-2.4.jar
>>>>>>>>>>>>>>>> >  ext
>>>>>>>>>>>>>>>> > wikisearch-query-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Clearly, the libraries and source file exist, so I am not
>>>>>>>>>>>>>>>> sure what's going
>>>>>>>>>>>>>>>> > on. I tried putting in
>>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2-sources.jar
>>>>>>>>>>>>>>>> > instead, but then it complains that BulkIngestExample was not found.
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Suggestions?
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > On Wed, Apr 3, 2013 at 2:36 PM, Eric Newton <
>>>>>>>>>>>>>>>> eric.newton@gmail.com> wrote:
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> You will have to write your own InputFormat class which
>>>>>>>>>>>>>>>> will parse your
>>>>>>>>>>>>>>>> >> file and pass records to your reducer.
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> -Eric
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> On Wed, Apr 3, 2013 at 2:29 PM, Aji Janis <
>>>>>>>>>>>>>>>> aji1705@gmail.com> wrote:
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> Looking at the BulkIngestExample, it uses
>>>>>>>>>>>>>>>> GenerateTestData and creates a
>>>>>>>>>>>>>>>> >>> .txt file which contains Key: Value pairs; correct me
>>>>>>>>>>>>>>>> if I am wrong but
>>>>>>>>>>>>>>>> >>> each new line is a new row right?
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> I need to know how to have family and qualifiers also.
>>>>>>>>>>>>>>>> In other words,
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> 1) Do I set up a .txt file that can be converted into
>>>>>>>>>>>>>>>> an Accumulo RF File
>>>>>>>>>>>>>>>> >>> using AccumuloFileOutputFormat  which can then be
>>>>>>>>>>>>>>>> imported into my table?
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> 2) if yes, what is the format of the .txt file.
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> On Wed, Apr 3, 2013 at 2:19 PM, Eric Newton <
>>>>>>>>>>>>>>>> eric.newton@gmail.com>
>>>>>>>>>>>>>>>> >>> wrote:
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>> Your data needs to be in the RFile format, and more
>>>>>>>>>>>>>>>> importantly it needs
>>>>>>>>>>>>>>>> >>>> to be sorted.
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>> It's handy to use a Map/Reduce job to convert/sort
>>>>>>>>>>>>>>>> your data.  See the
>>>>>>>>>>>>>>>> >>>> BulkIngestExample.
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>> -Eric
>>>>>>>>>>>>>>>> >>>>
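[Editor's note: Eric's requirement that bulk-ingest input be sorted can be illustrated with plain `sort` on the whitespace-separated format shown in this thread. The real BulkIngestExample sorts Accumulo Key objects in its reduce phase before writing RFiles; this is only a sketch of the ordering requirement, using a hypothetical /tmp path and made-up sample rows.]

```shell
# Hypothetical sample in the "rowid family qualifier value" format from this thread,
# deliberately out of order.
cat > /tmp/bulk-sample.txt <<'EOF'
rowid3 columnFamily1 colQualifier1 value
rowid1 columnFamily1 colQualifier2 value
rowid1 columnFamily1 colQualifier1 value
rowid2 columnFamily1 colQualifier1 value
EOF

# RFiles must be written in sorted key order; a byte-wise (LC_ALL=C) sort groups
# all entries for a row, then its families and qualifiers, together.
LC_ALL=C sort /tmp/bulk-sample.txt
```

This is why dropping a raw .txt file into the importdirectory input dir cannot work: the tablet servers expect pre-sorted RFiles (hence the "not a map file, ignoring" warning), not unsorted text.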
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>> On Wed, Apr 3, 2013 at 2:15 PM, Aji Janis <
>>>>>>>>>>>>>>>> aji1705@gmail.com> wrote:
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> I have some data in a text file in the following
>>>>>>>>>>>>>>>> format.
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> rowid1 columnFamily1 colQualifier1 value
>>>>>>>>>>>>>>>> >>>>> rowid1 columnFamily1 colQualifier2 value
>>>>>>>>>>>>>>>> >>>>> rowid1 columnFamily2 colQualifier1 value
>>>>>>>>>>>>>>>> >>>>> rowid2 columnFamily1 colQualifier1 value
>>>>>>>>>>>>>>>> >>>>> rowid3 columnFamily1 colQualifier1 value
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> I want to import this data into a table in accumulo.
>>>>>>>>>>>>>>>> My end goal is to
>>>>>>>>>>>>>>>> >>>>> understand how to use the BulkImport feature in
>>>>>>>>>>>>>>>> accumulo. I tried to login
>>>>>>>>>>>>>>>> >>>>> to the accumulo shell as root and then run:
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> #table mytable
>>>>>>>>>>>>>>>> >>>>> #importdirectory /home/inputDir /home/failureDir true
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> but it didn't work. My data file was saved as
>>>>>>>>>>>>>>>> data.txt in
>>>>>>>>>>>>>>>> >>>>> /home/inputDir. I tried to create the dir/file
>>>>>>>>>>>>>>>> structure in hdfs and linux
>>>>>>>>>>>>>>>> >>>>> but neither worked. When trying locally, it keeps
>>>>>>>>>>>>>>>> complaining about
>>>>>>>>>>>>>>>> >>>>> failureDir not existing.
>>>>>>>>>>>>>>>> >>>>> ...
>>>>>>>>>>>>>>>> >>>>> java.io.FileNotFoundException: File does not exist:
>>>>>>>>>>>>>>>> failures
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> When trying with files on hdfs, I get no error on the
>>>>>>>>>>>>>>>> console but the
>>>>>>>>>>>>>>>> >>>>> logger had the following messages:
>>>>>>>>>>>>>>>> >>>>> ...
>>>>>>>>>>>>>>>> >>>>> [tableOps.BulkImport] WARN :
>>>>>>>>>>>>>>>> hdfs://node....//inputDir/data.txt does
>>>>>>>>>>>>>>>> >>>>> not have a valid extension, ignoring
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> or,
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> [tableOps.BulkImport] WARN :
>>>>>>>>>>>>>>>> hdfs://node....//inputDir/data.txt is not
>>>>>>>>>>>>>>>> >>>>> a map file, ignoring
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> Suggestions? Am I not setting up the job right? Thank
>>>>>>>>>>>>>>>> you for help in
>>>>>>>>>>>>>>>> >>>>> advance.
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> On Wed, Apr 3, 2013 at 2:04 PM, Aji Janis <
>>>>>>>>>>>>>>>> aji1705@gmail.com> wrote:
>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>> >>>>>> I have some data in a text file in the following
>>>>>>>>>>>>>>>> format:
>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>