accumulo-user mailing list archives

From Keith Turner <ke...@deenlo.com>
Subject Re: importdirectory in accumulo
Date Mon, 08 Apr 2013 18:14:41 GMT
On Fri, Apr 5, 2013 at 6:01 PM, David Medinets <david.medinets@gmail.com> wrote:
> I ran into this issue. Look in your log files for a directory-not-found
> exception which is not bubbled up to the bash shell.

Could the following issue be the problem?

https://issues.apache.org/jira/browse/ACCUMULO-1171

David, regarding the issue you ran into: if you know of a situation where
bulk import errors are not propagated back to the client, can you open a
ticket?
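For anyone debugging the same generic client error, a minimal sketch of where to look next: the real cause of "Internal error processing waitForTableOperation" usually lands in the master or tablet server logs, not the client output. The log file name and the exception text below are fabricated for a reproducible demonstration; on a real node you would point the grep at your Accumulo logs directory (commonly $ACCUMULO_HOME/logs, but that path is an assumption).

```shell
# Fabricate a server-side log so the search itself is runnable here;
# on a real install, grep the actual logs directory instead.
logdir=$(mktemp -d)
cat > "$logdir/master_node.debug.log" <<'EOF'
ERROR: Internal error processing waitForTableOperation
java.io.FileNotFoundException: File does not exist: tmp/bulk/failures
EOF

# The search one would run on the master/tserver node:
grep -ri "waitForTableOperation\|FileNotFoundException" "$logdir"
```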

>
> On Apr 5, 2013 11:37 AM, "Aji Janis" <aji1705@gmail.com> wrote:
>>
>> I agree with you that changing HADOOP_CLASSPATH as you suggested is the
>> right thing to do. I couldn't quite do that just yet (people have jobs
>> running and I don't want to risk it).
>>
>> However, I did a workaround. I am going off the theory that my
>> HADOOP_CLASSPATH is bad and can't pick up all the libraries I am passing
>> to it, so I decided to package all the libraries I needed into a jar (see
>> http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/).
>> I downloaded the source code and made a shaded (uber) jar to include all the
>> libraries I needed. Then I submitted the hadoop job with my uber jar like
>> any other map reduce job. My mappers and reducers finish the job but I got
>> an exception for waitForTableOperation. I think this proves my theory of bad
>> classpath but clearly I have more issues to deal with. If you have any
>> suggestions on how to even debug that would be awesome!
>>
>> My console output(removed a lot of server specific stuff for security) is
>> below. I modified BulkIngestExample.java to add some print statements.
>> Modified lines shown below also.
>>
>>
>> [user@nodebulk]$ /opt/hadoop/bin/hadoop jar uber-BulkIngestExample.jar
>> instance zookeepers user password table inputdir tmp/bulk
>>
>> 13/04/05 11:20:52 INFO input.FileInputFormat: Total input paths to process
>> : 1
>> 13/04/05 11:20:53 INFO mapred.JobClient: Running job:
>> job_201304021611_0045
>> 13/04/05 11:20:54 INFO mapred.JobClient:  map 0% reduce 0%
>> 13/04/05 11:21:10 INFO mapred.JobClient:  map 100% reduce 0%
>> 13/04/05 11:21:25 INFO mapred.JobClient:  map 100% reduce 50%
>> 13/04/05 11:21:26 INFO mapred.JobClient:  map 100% reduce 100%
>> 13/04/05 11:21:31 INFO mapred.JobClient: Job complete:
>> job_201304021611_0045
>> 13/04/05 11:21:31 INFO mapred.JobClient: Counters: 25
>> 13/04/05 11:21:31 INFO mapred.JobClient:   Job Counters
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Launched reduce tasks=2
>> 13/04/05 11:21:31 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=15842
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Total time spent by all
>> reduces waiting after reserving slots (ms)=0
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Total time spent by all maps
>> waiting after reserving slots (ms)=0
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Rack-local map tasks=1
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Launched map tasks=1
>> 13/04/05 11:21:31 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=25891
>> 13/04/05 11:21:31 INFO mapred.JobClient:   File Output Format Counters
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Bytes Written=496
>> 13/04/05 11:21:31 INFO mapred.JobClient:   FileSystemCounters
>> 13/04/05 11:21:31 INFO mapred.JobClient:     FILE_BYTES_READ=312
>> 13/04/05 11:21:31 INFO mapred.JobClient:     HDFS_BYTES_READ=421
>> 13/04/05 11:21:31 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=68990
>> 13/04/05 11:21:31 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=496
>> 13/04/05 11:21:31 INFO mapred.JobClient:   File Input Format Counters
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Bytes Read=280
>> 13/04/05 11:21:31 INFO mapred.JobClient:   Map-Reduce Framework
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Reduce input groups=10
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Map output materialized
>> bytes=312
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Combine output records=0
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Map input records=10
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Reduce shuffle bytes=186
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Reduce output records=10
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Spilled Records=20
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Map output bytes=280
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Combine input records=0
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Map output records=10
>> 13/04/05 11:21:31 INFO mapred.JobClient:     SPLIT_RAW_BYTES=141
>> 13/04/05 11:21:31 INFO mapred.JobClient:     Reduce input records=10
>>
>> Here is the exception caught:
>> org.apache.accumulo.core.client.AccumuloException: Internal error
>> processing waitForTableOperation
>>
>> E.getMessage returns:
>> Internal error processing waitForTableOperation
>> Exception in thread "main" java.lang.RuntimeException:
>> org.apache.accumulo.core.client.AccumuloException: Internal error processing
>> waitForTableOperation
>>         at
>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.run(BulkIngestExample.java:151)
>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>         at
>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.main(BulkIngestExample.java:166)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:601)
>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>> Caused by: org.apache.accumulo.core.client.AccumuloException: Internal
>> error processing waitForTableOperation
>>         at
>> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:290)
>>         at
>> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:258)
>>         at
>> org.apache.accumulo.core.client.admin.TableOperationsImpl.importDirectory(TableOperationsImpl.java:945)
>>         at
>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.run(BulkIngestExample.java:146)
>>         ... 7 more
>> Caused by: org.apache.thrift.TApplicationException: Internal error
>> processing waitForTableOperation
>>         at
>> org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
>>         at
>> org.apache.accumulo.core.master.thrift.MasterClientService$Client.recv_waitForTableOperation(MasterClientService.java:684)
>>         at
>> org.apache.accumulo.core.master.thrift.MasterClientService$Client.waitForTableOperation(MasterClientService.java:665)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:601)
>>         at
>> org.apache.accumulo.cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:84)
>>         at $Proxy5.waitForTableOperation(Unknown Source)
>>         at
>> org.apache.accumulo.core.client.admin.TableOperationsImpl.waitForTableOperation(TableOperationsImpl.java:230)
>>         at
>> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:272)
>>         ... 10 more
>> [user@nodebulk]$
>>
>>
>> Modification in BulkIngestExample
>>
>> line 146     connector.tableOperations().importDirectory(tableName,
>> workDir + "/files", workDir + "/failures", false);
>>
>>     } catch (Exception e) {
>>       System.out.println("\nHere is the exception caught:\n"+ e);
>>       System.out.println("\nE.getMessage returns:\n"+ e.getMessage());
>> line 151      throw new RuntimeException(e);
>>     } finally {
>>       if (out != null)
>>         out.close();
>> line 166     int res = ToolRunner.run(CachedConfiguration.getInstance(),
>> new BulkIngestExample(), args);
>>
>>
>> On Thu, Apr 4, 2013 at 3:51 PM, Billie Rinaldi <billie@apache.org> wrote:
>>>
>>> On Thu, Apr 4, 2013 at 12:26 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>>
>>>> I haven't tried the classpath option yet, but I executed the command
>>>> below as the hadoop user... this seemed to be the command that accumulo
>>>> was trying to execute anyway, and I would think it should have avoided
>>>> the custom classpath issue... Right/Wrong?
>>>
>>>
>>> No, the jar needs to be both in the libjars and on the classpath.  There
>>> are classes that need to be accessed on the local machine in the process
>>> of submitting the MapReduce job, and that process can only see the
>>> classpath, not the libjars.
>>>
>>> The HADOOP_CLASSPATH you have is unusual.  More often, HADOOP_CLASSPATH
>>> is not set at all in hadoop-env.sh, but if it is it should generally be of
>>> the form newstuff:$HADOOP_CLASSPATH to avoid this issue.
>>>
>>> You will have to restart Hadoop after making the change to hadoop-env.sh.
>>>
>>> Billie
>>>
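Billie's point can be seen in a few lines of shell. The jar path below is illustrative (it mirrors what tool.sh exports in this thread); the real change belongs in /opt/hadoop/conf/hadoop-env.sh, followed by a Hadoop restart.

```shell
# tool.sh exports a HADOOP_CLASSPATH full of accumulo jars before
# invoking hadoop; a hadoop-env.sh line that assigns a fixed value
# instead of appending silently throws all of that away.
HADOOP_CLASSPATH=/opt/accumulo/lib/accumulo-core-1.4.2.jar  # set by tool.sh

# Problematic form (what the thread's hadoop-env.sh does):
#   export HADOOP_CLASSPATH=./:/conf:/build/*:
# Recommended form: prepend new entries, keep the inherited value.
export HADOOP_CLASSPATH="./:/conf:/build/*:$HADOOP_CLASSPATH"
echo "$HADOOP_CLASSPATH"
```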
>>>
>>>>
>>>>
>>>>
>>>> Got the same error:
>>>> [hadoop@node]$ /opt/hadoop/bin/hadoop jar
>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>> -libjars
>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>
>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>> org/apache/accumulo/core/client/Instance
>>>>         at java.lang.Class.forName0(Native Method)
>>>>         at java.lang.Class.forName(Class.java:264)
>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>> Caused by: java.lang.ClassNotFoundException:
>>>> org.apache.accumulo.core.client.Instance
>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>         ... 3 more
>>>>
>>>>
>>>>
>>>> On Thu, Apr 4, 2013 at 2:51 PM, Billie Rinaldi <billie@apache.org>
>>>> wrote:
>>>>>
>>>>> On Thu, Apr 4, 2013 at 11:41 AM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>
>>>>>> [accumulo@node accumulo]$ cat /opt/hadoop/conf/hadoop-env.sh | grep
>>>>>> HADOOP_CLASSPATH
>>>>>> export HADOOP_CLASSPATH=./:/conf:/build/*:
>>>>>
>>>>>
>>>>> To preserve custom HADOOP_CLASSPATHs, this line should be:
>>>>> export HADOOP_CLASSPATH=./:/conf:/build/*:$HADOOP_CLASSPATH
>>>>>
>>>>> Billie
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> looks like it is overwriting everything. Isn't this the default
>>>>>> behavior? Is your hadoop-env.sh missing that line?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Apr 4, 2013 at 2:25 PM, Billie Rinaldi <billie@apache.org>
>>>>>> wrote:
>>>>>>>
>>>>>>> On Thu, Apr 4, 2013 at 10:27 AM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>>>
>>>>>>>> I thought about the permissions issue too. All the accumulo stuff is
>>>>>>>> under accumulo user so I started running the commands as accumulo ... only
>>>>>>>> to get the same result.
>>>>>>>> -The errors happen right away
>>>>>>>> -the box has both accumulo and hadoop on it
>>>>>>>> -the jar contains the Instance class. But note that the Instance
>>>>>>>> class is part of accumulo-core, not examples-simple-1.4.2.jar .... (can
>>>>>>>> this be the issue?)
>>>>>>>
>>>>>>>
>>>>>>> No, that isn't the issue.  tool.sh is finding the accumulo-core jar
>>>>>>> and putting it on the HADOOP_CLASSPATH and in the libjars.
>>>>>>>
>>>>>>> I wonder if your hadoop environment is set up to override the
>>>>>>> HADOOP_CLASSPATH.  Check in your hadoop-env.sh to see if HADOOP_CLASSPATH is
>>>>>>> set there.
>>>>>>>
>>>>>>> The reason your commands of the form "tool.sh lib/*jar" aren't
>>>>>>> working is that the shell glob is matching multiple jars and putting
>>>>>>> them all on the command line.  tool.sh expects at most one jar followed
>>>>>>> by a class name, so whatever jar comes second when the glob is expanded
>>>>>>> is being interpreted as a class name.
>>>>>>>
>>>>>>> Billie
>>>>>>>
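The glob pitfall is easy to reproduce without Accumulo at all. A minimal sketch in a throwaway sandbox; the jar names just mirror the thread's lib/ directory:

```shell
# The shell expands lib/*.jar before tool.sh ever runs, so every
# matching jar becomes its own argument; tool.sh then takes the second
# one as the class name, producing errors like
# ClassNotFoundException: lib.accumulo-core-1.4.2-javadoc.jar.
tmp=$(mktemp -d) && cd "$tmp"
mkdir lib
touch lib/examples-simple-1.4.2.jar \
      lib/examples-simple-1.4.2-sources.jar \
      lib/examples-simple-1.4.2-javadoc.jar

set -- lib/*.jar          # simulate tool.sh's positional arguments
echo "argc=$#"            # three jars -> three arguments
echo "jar=$1 class=$2"    # $2 is a jar, not a class name

set -- lib/examples-simple-1.4.2.jar   # naming one jar avoids this
echo "argc=$#"
```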
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Commands I ran:
>>>>>>>>
>>>>>>>> [accumulo@node accumulo]$ whoami
>>>>>>>> accumulo
>>>>>>>> [accumulo@node accumulo]$ ls -l
>>>>>>>> total 184
>>>>>>>> drwxr-xr-x 2 accumulo accumulo  4096 Apr  4 10:25 bin
>>>>>>>> -rwxr-xr-x 1 accumulo accumulo 24263 Oct 22 15:30 CHANGES
>>>>>>>> drwxr-xr-x 3 accumulo accumulo  4096 Apr  3 10:17 conf
>>>>>>>> drwxr-xr-x 2 accumulo accumulo  4096 Jan 15 13:35 contrib
>>>>>>>> -rwxr-xr-x 1 accumulo accumulo   695 Nov 18  2011 DISCLAIMER
>>>>>>>> drwxr-xr-x 5 accumulo accumulo  4096 Jan 15 13:35 docs
>>>>>>>> drwxr-xr-x 4 accumulo accumulo  4096 Jan 15 13:35 lib
>>>>>>>> -rwxr-xr-x 1 accumulo accumulo 56494 Mar 21  2012 LICENSE
>>>>>>>> drwxr-xr-x 2 accumulo accumulo 12288 Apr  3 14:43 logs
>>>>>>>> -rwxr-xr-x 1 accumulo accumulo  2085 Mar 21  2012 NOTICE
>>>>>>>> -rwxr-xr-x 1 accumulo accumulo 27814 Oct 17 08:32 pom.xml
>>>>>>>> -rwxr-xr-x 1 accumulo accumulo 12449 Oct 17 08:32 README
>>>>>>>> drwxr-xr-x 9 accumulo accumulo  4096 Nov  8 13:40 src
>>>>>>>> drwxr-xr-x 5 accumulo accumulo  4096 Nov  8 13:40 test
>>>>>>>> drwxr-xr-x 2 accumulo accumulo  4096 Apr  4 09:09 walogs
>>>>>>>> [accumulo@node accumulo]$ ls bin/
>>>>>>>> accumulo           check-slaves  etc_initd_accumulo  start-all.sh
>>>>>>>> start-server.sh  stop-here.sh    tdown.sh  tup.sh
>>>>>>>> catapultsetup.acc  config.sh     LogForwarder.sh     start-here.sh
>>>>>>>> stop-all.sh      stop-server.sh  tool.sh   upgrade.sh
>>>>>>>> [accumulo@node accumulo]$ ls lib/
>>>>>>>> accumulo-core-1.4.2.jar            accumulo-start-1.4.2.jar
>>>>>>>> commons-collections-3.2.jar    commons-logging-1.0.4.jar
>>>>>>>> jline-0.9.94.jar
>>>>>>>> accumulo-core-1.4.2-javadoc.jar    accumulo-start-1.4.2-javadoc.jar
>>>>>>>> commons-configuration-1.5.jar  commons-logging-api-1.0.4.jar
>>>>>>>> libthrift-0.6.1.jar
>>>>>>>> accumulo-core-1.4.2-sources.jar    accumulo-start-1.4.2-sources.jar
>>>>>>>> commons-io-1.4.jar             examples-simple-1.4.2.jar
>>>>>>>> log4j-1.2.16.jar
>>>>>>>> accumulo-server-1.4.2.jar          cloudtrace-1.4.2.jar
>>>>>>>> commons-jci-core-1.0.jar       examples-simple-1.4.2-javadoc.jar  native
>>>>>>>> accumulo-server-1.4.2-javadoc.jar  cloudtrace-1.4.2-javadoc.jar
>>>>>>>> commons-jci-fam-1.0.jar        examples-simple-1.4.2-sources.jar
>>>>>>>> wikisearch-ingest-1.4.2-javadoc.jar
>>>>>>>> accumulo-server-1.4.2-sources.jar  cloudtrace-1.4.2-sources.jar
>>>>>>>> commons-lang-2.4.jar           ext
>>>>>>>> wikisearch-query-1.4.2-javadoc.jar
>>>>>>>>
>>>>>>>> [accumulo@node accumulo]$ jar -tf
>>>>>>>> /opt/accumulo/lib/accumulo-core-1.4.2.jar | grep
>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>> org/apache/accumulo/core/client/Instance.class
>>>>>>>>
>>>>>>>> [accumulo@node accumulo]$ jar -tf
>>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar | grep
>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>
>>>>>>>> [accumulo@node accumulo]$ ./bin/tool.sh lib/*[^cs].jar
>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>> USERJARS=
>>>>>>>> CLASSNAME=lib/accumulo-server-1.4.2.jar
>>>>>>>>
>>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar
>>>>>>>> lib/accumulo-server-1.4.2.jar -libjars
>>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>>> lib.accumulo-server-1.4.2.jar
>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>> Method)
>>>>>>>>         at
>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>
>>>>>>>> [accumulo@node accumulo]$ ./bin/tool.sh lib/*.jar
>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>> USERJARS=
>>>>>>>> CLASSNAME=lib/accumulo-core-1.4.2-javadoc.jar
>>>>>>>>
>>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar
>>>>>>>> lib/accumulo-core-1.4.2-javadoc.jar -libjars
>>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>>> lib.accumulo-core-1.4.2-javadoc.jar
>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>> Method)
>>>>>>>>         at
>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>
>>>>>>>> [accumulo@node accumulo]$ ./bin/tool.sh lib/*[^c].jar
>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>> USERJARS=
>>>>>>>> CLASSNAME=lib/accumulo-core-1.4.2-sources.jar
>>>>>>>>
>>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar
>>>>>>>> lib/accumulo-core-1.4.2-sources.jar -libjars
>>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>>> lib.accumulo-core-1.4.2-sources.jar
>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>> Method)
>>>>>>>>         at
>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>
>>>>>>>> [accumulo@node accumulo]$ ./bin/tool.sh
>>>>>>>> lib/examples-simple-*[^c].jar
>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample default
>>>>>>>> node14.catapult.dev.boozallenet.com:2181 root password test_aj
>>>>>>>> /user/559599/input tmp/ajbulktest
>>>>>>>> USERJARS=
>>>>>>>> CLASSNAME=lib/examples-simple-1.4.2-sources.jar
>>>>>>>>
>>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/examples-simple-1.4.2.jar
>>>>>>>> lib/examples-simple-1.4.2-sources.jar -libjars
>>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>>> lib.examples-simple-1.4.2-sources.jar
>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>> Method)
>>>>>>>>         at
>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>> [accumulo@node accumulo]$
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Apr 4, 2013 at 11:55 AM, Billie Rinaldi <billie@apache.org>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> On Thu, Apr 4, 2013 at 7:46 AM, Aji Janis <aji1705@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Billie, I checked the values in tool.sh they match. I uncommented
>>>>>>>>>> the echo statements and reran the cmd here is what I have:
>>>>>>>>>>
>>>>>>>>>> $ ./bin/tool.sh ./lib/examples-simple-1.4.2.jar
>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>> instance zookeeper usr pswd table inputdir tmp/bulk
>>>>>>>>>>
>>>>>>>>>> USERJARS=
>>>>>>>>>>
>>>>>>>>>> CLASSNAME=org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>
>>>>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>>>>>>> exec /opt/hadoop/bin/hadoop jar ./lib/examples-simple-1.4.2.jar
>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>> -libjars
>>>>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>>> org.apache.accumulo.core.client.Instance
>>>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>>>> Method)
>>>>>>>>>>         at
>>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>>>         ... 3 more
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The command looks right.  Instance should be packaged in the
>>>>>>>>> accumulo core jar.  To verify that, you could run:
>>>>>>>>> jar tf /opt/accumulo/lib/accumulo-core-1.4.2.jar | grep
>>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>>
>>>>>>>>> I'm not sure what's going on here.  If that error is happening
>>>>>>>>> right away, it seems like it can't load the jar on the local machine.  If
>>>>>>>>> you're running multiple machines, and if the error were happening later
>>>>>>>>> during the MapReduce, I would suggest that you make sure accumulo is present
>>>>>>>>> on all the machines.
>>>>>>>>>
>>>>>>>>> You asked about the user; is the owner of the jars different than
>>>>>>>>> the user you're running as?  In that case, it could be a permissions issue.
>>>>>>>>> Could the permissions be set so that you can list that directory but not
>>>>>>>>> read the jar?
>>>>>>>>>
>>>>>>>>> Billie
>>>>>>>>>
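A quick way to act on Billie's permission question, sketched against a throwaway directory (substitute /opt/accumulo/lib and the real jars on the node in question):

```shell
# Listing a directory needs only read+execute on the directory itself;
# reading a jar inside it additionally needs read permission on the
# file. Check the two separately for every jar tool.sh will load.
d=$(mktemp -d)
touch "$d/accumulo-core-1.4.2.jar"

ls "$d" >/dev/null && echo "directory listable"
for jar in "$d"/*.jar; do
  if [ -r "$jar" ]; then
    echo "readable: $jar"
  else
    echo "NOT readable: $jar"   # the failure mode to rule out
  fi
done
```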
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org/apache/accumulo/core/client/Instance is located in the src/...
>>>>>>>>>> folder, which I am not sure is what is packaged in the examples-simple-[^c].jar ?
>>>>>>>>>> Sorry folks for the constant emails... just trying to get this to
>>>>>>>>>> work, but I really appreciate the help.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Apr 4, 2013 at 10:18 AM, John Vines <vines@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> If you run tool.sh with sh -x, it will step through the script so
>>>>>>>>>>> you can see what jars it is picking up and perhaps why it's missing them for
>>>>>>>>>>> you.
>>>>>>>>>>>
>>>>>>>>>>> Sent from my phone, please pardon the typos and brevity.
>>>>>>>>>>>
>>>>>>>>>>> On Apr 4, 2013 10:15 AM, "Aji Janis" <aji1705@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> What user are you running the commands as ?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Apr 4, 2013 at 9:59 AM, Aji Janis <aji1705@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Where did you put all your java files?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Apr 4, 2013 at 9:55 AM, Eric Newton
>>>>>>>>>>>>> <eric.newton@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I was able to run the example, as written in
>>>>>>>>>>>>>> docs/examples/README.bulkIngest substituting my
>>>>>>>>>>>>>> instance/zookeeper/user/password information:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> $ pwd
>>>>>>>>>>>>>> /home/ecn/workspace/1.4.3
>>>>>>>>>>>>>> $ ls
>>>>>>>>>>>>>> bin      conf     docs  LICENSE  NOTICE   README  src     test
>>>>>>>>>>>>>> CHANGES  contrib  lib   logs     pom.xml  target  walogs
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> $ ./bin/accumulo
>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.SetupTable test localhost
>>>>>>>>>>>>>> root secret test_bulk row_00000333 row_00000666
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> $ ./bin/accumulo
>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.GenerateTestData 0 1000
>>>>>>>>>>>>>> bulk/test_1.txt
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> $ ./bin/tool.sh lib/examples-simple-*[^cs].jar
>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample test
>>>>>>>>>>>>>> localhost root secret test_bulk bulk tmp/bulkWork
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> $./bin/accumulo
>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.VerifyIngest test
>>>>>>>>>>>>>> localhost root secret test_bulk 0 1000
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Eric
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Apr 4, 2013 at 9:33 AM, Aji Janis <aji1705@gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am not sure it's just a shell-globbing issue. Below is
>>>>>>>>>>>>>>> my console output. Not sure why this NoClassDefFoundError occurs. Has anyone
>>>>>>>>>>>>>>> tried to do this successfully? If you did, can you please tell me your
>>>>>>>>>>>>>>> env set up.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [user@mynode bulk]$ pwd
>>>>>>>>>>>>>>> /home/user/bulk
>>>>>>>>>>>>>>> [user@mynode bulk]$ ls
>>>>>>>>>>>>>>> BulkIngestExample.java  GenerateTestData.java
>>>>>>>>>>>>>>> SetupTable.java  test_1.txt  VerifyIngest.java
>>>>>>>>>>>>>>> [user@mynode bulk]$
>>>>>>>>>>>>>>> [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>> org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>>>>>>>> org.apache.accumulo.core.client.Instance
>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>>>>>>>>> Method)
>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>>>>>>>>         ... 3 more
>>>>>>>>>>>>>>> [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>> org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>>>>>>>> org.apache.accumulo.core.client.Instance
>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>>>>>>>>> Method)
>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>>>>>>>>         ... 3 more
>>>>>>>>>>>>>>> [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>>>>>>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>> org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>>>>>>> [user@mynode bulk]$
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 4:57 PM, Billie Rinaldi
>>>>>>>>>>>>>>> <billie@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 1:16 PM, Christopher
>>>>>>>>>>>>>>>> <ctubbsii@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Try with -libjars:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> tool.sh automatically adds libjars.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The problem is the regular expression for the
>>>>>>>>>>>>>>>> examples-simple jar.  It's trying to exclude the javadoc jar with ^c, but it
>>>>>>>>>>>>>>>> isn't excluding the sources jar.
>>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar may work, or you can just
>>>>>>>>>>>>>>>> specify the jar exactly, /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Billie
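[The glob trick above can be checked outside Accumulo. A minimal sketch with hypothetical files under /tmp that mirror the names in the lib directory:]

```shell
# The character class [^cs] matches exactly one character that is neither
# 'c' nor 's' immediately before ".jar", so "...-javadoc.jar" (ends in 'c')
# and "...-sources.jar" (ends in 's') are excluded, while
# "examples-simple-1.4.2.jar" (a '2' before ".jar") still matches.
mkdir -p /tmp/libdemo
touch /tmp/libdemo/examples-simple-1.4.2.jar \
      /tmp/libdemo/examples-simple-1.4.2-javadoc.jar \
      /tmp/libdemo/examples-simple-1.4.2-sources.jar
ls /tmp/libdemo/examples-simple-*[^cs].jar
# prints only /tmp/libdemo/examples-simple-1.4.2.jar
```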
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>>>>>>> -libjars  /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir
>>>>>>>>>>>>>>>>> tmp/bulkWork
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> Christopher L Tubbs II
>>>>>>>>>>>>>>>>> http://gravatar.com/ctubbsii
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 4:11 PM, Aji Janis
>>>>>>>>>>>>>>>>> <aji1705@gmail.com> wrote:
>>>>>>>>>>>>>>>>> > I am trying to run the BulkIngest example (on Accumulo
>>>>>>>>>>>>>>>>> > 1.4.2), and I am not
>>>>>>>>>>>>>>>>> > able to run the following steps. Here is the error I get:
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>>>>>>>>> > /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>>>>>>> > myinstance zookeepers user pswd tableName inputDir
>>>>>>>>>>>>>>>>> > tmp/bulkWork
>>>>>>>>>>>>>>>>> > Exception in thread "main"
>>>>>>>>>>>>>>>>> > java.lang.ClassNotFoundException:
>>>>>>>>>>>>>>>>> > /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>>>>>>>>>>>>>>>> >         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>>>>>>>> >         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>>>>>>>> >         at
>>>>>>>>>>>>>>>>> > org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>>>>>>> > [user@mynode bulk]$ ls /opt/accumulo/lib/
>>>>>>>>>>>>>>>>> > accumulo-core-1.4.2.jar
>>>>>>>>>>>>>>>>> > accumulo-start-1.4.2.jar
>>>>>>>>>>>>>>>>> > commons-collections-3.2.jar
>>>>>>>>>>>>>>>>> > commons-logging-1.0.4.jar
>>>>>>>>>>>>>>>>> > jline-0.9.94.jar
>>>>>>>>>>>>>>>>> > accumulo-core-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>>> > accumulo-start-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>>> > commons-configuration-1.5.jar
>>>>>>>>>>>>>>>>> > commons-logging-api-1.0.4.jar
>>>>>>>>>>>>>>>>> > libthrift-0.6.1.jar
>>>>>>>>>>>>>>>>> > accumulo-core-1.4.2-sources.jar
>>>>>>>>>>>>>>>>> > accumulo-start-1.4.2-sources.jar
>>>>>>>>>>>>>>>>> > commons-io-1.4.jar
>>>>>>>>>>>>>>>>> > examples-simple-1.4.2.jar
>>>>>>>>>>>>>>>>> > log4j-1.2.16.jar
>>>>>>>>>>>>>>>>> > accumulo-server-1.4.2.jar
>>>>>>>>>>>>>>>>> > cloudtrace-1.4.2.jar
>>>>>>>>>>>>>>>>> > commons-jci-core-1.0.jar
>>>>>>>>>>>>>>>>> > examples-simple-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>>> > native
>>>>>>>>>>>>>>>>> > accumulo-server-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>>> > cloudtrace-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>>> > commons-jci-fam-1.0.jar
>>>>>>>>>>>>>>>>> > examples-simple-1.4.2-sources.jar
>>>>>>>>>>>>>>>>> > wikisearch-ingest-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>>> > accumulo-server-1.4.2-sources.jar
>>>>>>>>>>>>>>>>> > cloudtrace-1.4.2-sources.jar
>>>>>>>>>>>>>>>>> > commons-lang-2.4.jar
>>>>>>>>>>>>>>>>> >  ext
>>>>>>>>>>>>>>>>> > wikisearch-query-1.4.2-javadoc.jar
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > Clearly, the libraries and source files exist, so I am not
>>>>>>>>>>>>>>>>> > sure what's going
>>>>>>>>>>>>>>>>> > on. I tried specifying
>>>>>>>>>>>>>>>>> > /opt/accumulo/lib/examples-simple-1.4.2-sources.jar
>>>>>>>>>>>>>>>>> > instead, but then it complains that BulkIngestExample is
>>>>>>>>>>>>>>>>> > ClassNotFound.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > Suggestions?
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > On Wed, Apr 3, 2013 at 2:36 PM, Eric Newton
>>>>>>>>>>>>>>>>> > <eric.newton@gmail.com> wrote:
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> You will have to write your own InputFormat class which
>>>>>>>>>>>>>>>>> >> will parse your
>>>>>>>>>>>>>>>>> >> file and pass records to your reducer.
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> -Eric
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> On Wed, Apr 3, 2013 at 2:29 PM, Aji Janis
>>>>>>>>>>>>>>>>> >> <aji1705@gmail.com> wrote:
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>> Looking at the BulkIngestExample, it uses
>>>>>>>>>>>>>>>>> >>> GenerateTestData to create a
>>>>>>>>>>>>>>>>> >>> .txt file which contains key:value pairs. Correct me
>>>>>>>>>>>>>>>>> >>> if I am wrong, but
>>>>>>>>>>>>>>>>> >>> each new line is a new row, right?
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>> I need to know how to have family and qualifiers also.
>>>>>>>>>>>>>>>>> >>> In other words,
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>> 1) Do I set up a .txt file that can be converted into
>>>>>>>>>>>>>>>>> >>> an Accumulo RFile
>>>>>>>>>>>>>>>>> >>> using AccumuloFileOutputFormat, which can then be
>>>>>>>>>>>>>>>>> >>> imported into my table?
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>> 2) if yes, what is the format of the .txt file.
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>> On Wed, Apr 3, 2013 at 2:19 PM, Eric Newton
>>>>>>>>>>>>>>>>> >>> <eric.newton@gmail.com>
>>>>>>>>>>>>>>>>> >>> wrote:
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>> Your data needs to be in the RFile format, and more
>>>>>>>>>>>>>>>>> >>>> importantly it needs
>>>>>>>>>>>>>>>>> >>>> to be sorted.
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>> It's handy to use a Map/Reduce job to convert/sort
>>>>>>>>>>>>>>>>> >>>> your data.  See the
>>>>>>>>>>>>>>>>> >>>> BulkIngestExample.
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>> -Eric
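[The sort requirement Eric mentions can be illustrated without Hadoop. Accumulo bulk files must be ordered by row, then column family, then qualifier; for simple space-separated text lines like the hypothetical sample below, that coincides with a plain lexicographic sort, which is the ordering a reduce phase must produce before AccumuloFileOutputFormat writes the RFile:]

```shell
# Hypothetical sample rows, deliberately out of order.
cat > /tmp/bulkdemo.txt <<'EOF'
rowid2 columnFamily1 colQualifier1 value
rowid1 columnFamily2 colQualifier1 value
rowid1 columnFamily1 colQualifier1 value
EOF
# The rows must be emitted in sorted key order; here plain sort(1)
# stands in for the shuffle/sort of a Map/Reduce job.
sort /tmp/bulkdemo.txt
# rowid1 columnFamily1 ... sorts first, rowid2 ... last
```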
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>> On Wed, Apr 3, 2013 at 2:15 PM, Aji Janis
>>>>>>>>>>>>>>>>> >>>> <aji1705@gmail.com> wrote:
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>> I have some data in a text file in the following
>>>>>>>>>>>>>>>>> >>>>> format.
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>> rowid1 columnFamily1 colQualifier1 value
>>>>>>>>>>>>>>>>> >>>>> rowid1 columnFamily1 colQualifier2 value
>>>>>>>>>>>>>>>>> >>>>> rowid1 columnFamily2 colQualifier1 value
>>>>>>>>>>>>>>>>> >>>>> rowid2 columnFamily1 colQualifier1 value
>>>>>>>>>>>>>>>>> >>>>> rowid3 columnFamily1 colQualifier1 value
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>> I want to import this data into a table in Accumulo.
>>>>>>>>>>>>>>>>> >>>>> My end goal is to
>>>>>>>>>>>>>>>>> >>>>> understand how to use the bulk import feature in
>>>>>>>>>>>>>>>>> >>>>> Accumulo. I tried to log in
>>>>>>>>>>>>>>>>> >>>>> to the Accumulo shell as root and then run:
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>> #table mytable
>>>>>>>>>>>>>>>>> >>>>> #importdirectory /home/inputDir /home/failureDir true
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>> but it didn't work. My data file was saved as
>>>>>>>>>>>>>>>>> >>>>> data.txt in
>>>>>>>>>>>>>>>>> >>>>> /home/inputDir. I tried creating the dir/file
>>>>>>>>>>>>>>>>> >>>>> structure both in HDFS and on the local
>>>>>>>>>>>>>>>>> >>>>> filesystem, but neither worked. When trying locally, it keeps
>>>>>>>>>>>>>>>>> >>>>> complaining about
>>>>>>>>>>>>>>>>> >>>>> failureDir not existing.
>>>>>>>>>>>>>>>>> >>>>> ...
>>>>>>>>>>>>>>>>> >>>>> java.io.FileNotFoundException: File does not exist:
>>>>>>>>>>>>>>>>> >>>>> failures
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>> When trying with files on hdfs, I get no error on the
>>>>>>>>>>>>>>>>> >>>>> console but the
>>>>>>>>>>>>>>>>> >>>>> logger had the following messages:
>>>>>>>>>>>>>>>>> >>>>> ...
>>>>>>>>>>>>>>>>> >>>>> [tableOps.BulkImport] WARN :
>>>>>>>>>>>>>>>>> >>>>> hdfs://node....//inputDir/data.txt does
>>>>>>>>>>>>>>>>> >>>>> not have a valid extension, ignoring
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>> or,
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>> [tableOps.BulkImport] WARN :
>>>>>>>>>>>>>>>>> >>>>> hdfs://node....//inputDir/data.txt is not
>>>>>>>>>>>>>>>>> >>>>> a map file, ignoring
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>> Suggestions? Am I not setting up the job right? Thank
>>>>>>>>>>>>>>>>> >>>>> you for help in
>>>>>>>>>>>>>>>>> >>>>> advance.
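[The two WARN lines quoted above are consistent with the importer's file-name filter: importdirectory only considers files with a recognized extension such as .rf, and even a correctly named file must actually contain RFile data written by AccumuloFileOutputFormat, so renaming a .txt will not help. A sketch of that gatekeeping behavior, using hypothetical local paths (the real check happens server-side against HDFS):]

```shell
# Hypothetical input directory: one plain text file, one RFile-named file.
mkdir -p /tmp/importdemo
touch /tmp/importdemo/data.txt /tmp/importdemo/part-00000.rf
# Files without a recognized extension are skipped with a warning,
# which is why data.txt is silently ignored by importdirectory.
for f in /tmp/importdemo/*; do
  case "$f" in
    *.rf) echo "would consider: $f" ;;
    *)    echo "ignored (unrecognized extension): $f" ;;
  esac
done
```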
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>> On Wed, Apr 3, 2013 at 2:04 PM, Aji Janis
>>>>>>>>>>>>>>>>> >>>>> <aji1705@gmail.com> wrote:
>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>> >>>>>> I have some data in a text file in the following
>>>>>>>>>>>>>>>>> >>>>>> format:
>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
