accumulo-user mailing list archives

From Billie Rinaldi <bil...@apache.org>
Subject Re: importdirectory in accumulo
Date Thu, 04 Apr 2013 18:51:41 GMT
On Thu, Apr 4, 2013 at 11:41 AM, Aji Janis <aji1705@gmail.com> wrote:

> *[accumulo@node accumulo]$ cat /opt/hadoop/conf/hadoop-env.sh | grep
> HADOOP_CLASSPATH*
> export HADOOP_CLASSPATH=./:/conf:/build/*:
>

To preserve custom HADOOP_CLASSPATHs, this line should be:
export HADOOP_CLASSPATH=./:/conf:/build/*:$HADOOP_CLASSPATH
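
As a quick check (a sketch, using the same /opt/hadoop path from the grep
above), the line should then read:

[accumulo@node accumulo]$ grep HADOOP_CLASSPATH /opt/hadoop/conf/hadoop-env.sh
export HADOOP_CLASSPATH=./:/conf:/build/*:$HADOOP_CLASSPATH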

Billie



>
> Looks like it is overwriting everything. Isn't this the default behavior?
> Is your hadoop-env.sh missing that line?
>
>
>
>
> On Thu, Apr 4, 2013 at 2:25 PM, Billie Rinaldi <billie@apache.org> wrote:
>
>> On Thu, Apr 4, 2013 at 10:27 AM, Aji Janis <aji1705@gmail.com> wrote:
>>
>>> I thought about the permissions issue too. All the accumulo stuff is
>>> under the accumulo user, so I started running the commands as accumulo ... only
>>> to get the same result.
>>> - The errors happen right away
>>> - The box has both accumulo and hadoop on it
>>> - The jar contains the Instance class. But note that the Instance class
>>> is part of accumulo-core and not examples-simple-1.4.2.jar .... (can this
>>> be the issue?)
>>>
>>
>> No, that isn't the issue.  tool.sh is finding the accumulo-core jar and
>> putting it on the HADOOP_CLASSPATH and in the libjars.
>>
>> I wonder if your hadoop environment is set up to override the
>> HADOOP_CLASSPATH.  Check in your hadoop-env.sh to see if HADOOP_CLASSPATH
>> is set there.
>>
>> The reason your commands of the form "tool.sh lib/*.jar" aren't working is
>> that the shell glob is matching multiple jars and putting them all on the
>> command line.  tool.sh expects at most one jar followed by a class name, so
>> whatever jar comes second when the glob is expanded is being interpreted
>> as a class name.
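>>
>> For example, just a sketch reusing the class and arguments from your
>> earlier command, the expected form is one jar, then the class, then its
>> arguments:
>>
>> ./bin/tool.sh lib/examples-simple-1.4.2.jar \
>>     org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample \
>>     myinstance zookeepers user pswd tableName inputDir tmp/bulkWork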
>>
>> Billie
>>
>>
>>
>>>
>>> Commands I ran:
>>>
>>> *[accumulo@node accumulo]$ whoami*
>>> accumulo
>>> *[accumulo@node accumulo]$ ls -l*
>>> total 184
>>> drwxr-xr-x 2 accumulo accumulo  4096 Apr  4 10:25 bin
>>> -rwxr-xr-x 1 accumulo accumulo 24263 Oct 22 15:30 CHANGES
>>> drwxr-xr-x 3 accumulo accumulo  4096 Apr  3 10:17 conf
>>> drwxr-xr-x 2 accumulo accumulo  4096 Jan 15 13:35 contrib
>>> -rwxr-xr-x 1 accumulo accumulo   695 Nov 18  2011 DISCLAIMER
>>> drwxr-xr-x 5 accumulo accumulo  4096 Jan 15 13:35 docs
>>> drwxr-xr-x 4 accumulo accumulo  4096 Jan 15 13:35 lib
>>> -rwxr-xr-x 1 accumulo accumulo 56494 Mar 21  2012 LICENSE
>>> drwxr-xr-x 2 accumulo accumulo 12288 Apr  3 14:43 logs
>>> -rwxr-xr-x 1 accumulo accumulo  2085 Mar 21  2012 NOTICE
>>> -rwxr-xr-x 1 accumulo accumulo 27814 Oct 17 08:32 pom.xml
>>> -rwxr-xr-x 1 accumulo accumulo 12449 Oct 17 08:32 README
>>> drwxr-xr-x 9 accumulo accumulo  4096 Nov  8 13:40 src
>>> drwxr-xr-x 5 accumulo accumulo  4096 Nov  8 13:40 test
>>> drwxr-xr-x 2 accumulo accumulo  4096 Apr  4 09:09 walogs
>>> *[accumulo@node accumulo]$ ls bin/*
>>> accumulo           check-slaves  etc_initd_accumulo  start-all.sh
>>> start-server.sh  stop-here.sh    tdown.sh  tup.sh
>>> catapultsetup.acc  config.sh     LogForwarder.sh     start-here.sh
>>>  stop-all.sh      stop-server.sh  tool.sh   upgrade.sh
>>> *[accumulo@node accumulo]$ ls lib/*
>>> accumulo-core-1.4.2.jar            accumulo-start-1.4.2.jar
>>>  commons-collections-3.2.jar    commons-logging-1.0.4.jar
>>>  jline-0.9.94.jar
>>> accumulo-core-1.4.2-javadoc.jar    accumulo-start-1.4.2-javadoc.jar
>>>  commons-configuration-1.5.jar  commons-logging-api-1.0.4.jar
>>>  libthrift-0.6.1.jar
>>> accumulo-core-1.4.2-sources.jar    accumulo-start-1.4.2-sources.jar
>>>  commons-io-1.4.jar             examples-simple-1.4.2.jar
>>>  log4j-1.2.16.jar
>>> accumulo-server-1.4.2.jar          cloudtrace-1.4.2.jar
>>>  commons-jci-core-1.0.jar       examples-simple-1.4.2-javadoc.jar  native
>>> accumulo-server-1.4.2-javadoc.jar  cloudtrace-1.4.2-javadoc.jar
>>>  commons-jci-fam-1.0.jar        examples-simple-1.4.2-sources.jar
>>>  wikisearch-ingest-1.4.2-javadoc.jar
>>> accumulo-server-1.4.2-sources.jar  cloudtrace-1.4.2-sources.jar
>>>  commons-lang-2.4.jar           ext
>>>  wikisearch-query-1.4.2-javadoc.jar
>>>
>>> *[accumulo@node accumulo]$ jar -tf
>>> /opt/accumulo/lib/accumulo-core-1.4.2.jar | grep
>>> org/apache/accumulo/core/client/Instance*
>>> org/apache/accumulo/core/client/Instance.class
>>>
>>> *[accumulo@node accumulo]$ jar -tf
>>> /opt/accumulo/lib/examples-simple-1.4.2.jar | grep
>>> org/apache/accumulo/core/client/Instance*
>>> *
>>> *
>>> *[accumulo@node accumulo]$ ./bin/tool.sh lib/*[^cs].jar
>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>> USERJARS=
>>> CLASSNAME=lib/accumulo-server-1.4.2.jar
>>>
>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar
>>> lib/accumulo-server-1.4.2.jar -libjars
>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>> lib.accumulo-server-1.4.2.jar
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>         at java.lang.Class.forName0(Native Method)
>>>         at java.lang.Class.forName(Class.java:264)
>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>
>>> *[accumulo@node accumulo]$ ./bin/tool.sh lib/*.jar
>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>> USERJARS=
>>> CLASSNAME=lib/accumulo-core-1.4.2-javadoc.jar
>>>
>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar
>>> lib/accumulo-core-1.4.2-javadoc.jar -libjars
>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>> lib.accumulo-core-1.4.2-javadoc.jar
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>         at java.lang.Class.forName0(Native Method)
>>>         at java.lang.Class.forName(Class.java:264)
>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>
>>> *[accumulo@node accumulo]$ ./bin/tool.sh lib/*[^c].jar
>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>  USERJARS=
>>> CLASSNAME=lib/accumulo-core-1.4.2-sources.jar
>>>
>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar
>>> lib/accumulo-core-1.4.2-sources.jar -libjars
>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>> lib.accumulo-core-1.4.2-sources.jar
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>         at java.lang.Class.forName0(Native Method)
>>>         at java.lang.Class.forName(Class.java:264)
>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>
>>> *[accumulo@node accumulo]$ ./bin/tool.sh lib/examples-simple-*[^c].jar
>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>> default node14.catapult.dev.boozallenet.com:2181 root password test_aj
>>> /user/559599/input tmp/ajbulktest*
>>> USERJARS=
>>> CLASSNAME=lib/examples-simple-1.4.2-sources.jar
>>>
>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>> exec /opt/hadoop/bin/hadoop jar lib/examples-simple-1.4.2.jar
>>> lib/examples-simple-1.4.2-sources.jar -libjars
>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>> lib.examples-simple-1.4.2-sources.jar
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>         at java.lang.Class.forName0(Native Method)
>>>         at java.lang.Class.forName(Class.java:264)
>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>> *[accumulo@node accumulo]$*
>>>
>>>
>>>
>>> On Thu, Apr 4, 2013 at 11:55 AM, Billie Rinaldi <billie@apache.org> wrote:
>>>
>>>> On Thu, Apr 4, 2013 at 7:46 AM, Aji Janis <aji1705@gmail.com> wrote:
>>>>
>>>>> *Billie, I checked the values in tool.sh and they match. I uncommented
>>>>> the echo statements and reran the command; here is what I have:*
>>>>> *
>>>>> *
>>>>> *$ ./bin/tool.sh ./lib/examples-simple-1.4.2.jar
>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>> instance zookeeper usr pswd table inputdir tmp/bulk*
>>>>>
>>>>> USERJARS=
>>>>>
>>>>> CLASSNAME=org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>
>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
>>>>> exec /opt/hadoop/bin/hadoop jar ./lib/examples-simple-1.4.2.jar
>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>> -libjars
>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
>>>>>  Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>> org/apache/accumulo/core/client/Instance
>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>> org.apache.accumulo.core.client.Instance
>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>         ... 3 more
>>>>>
>>>>>
>>>> The command looks right.  Instance should be packaged in the accumulo
>>>> core jar.  To verify that, you could run:
>>>> jar tf /opt/accumulo/lib/accumulo-core-1.4.2.jar | grep
>>>> org/apache/accumulo/core/client/Instance
>>>>
>>>> I'm not sure what's going on here.  If that error is happening right
>>>> away, it seems like it can't load the jar on the local machine.  If you're
>>>> running multiple machines, and if the error were happening later during the
>>>> MapReduce, I would suggest that you make sure accumulo is present on all
>>>> the machines.
>>>>
>>>> You asked about the user; is the owner of the jars different than the
>>>> user you're running as?  In that case, it could be a permissions issue.
>>>> Could the permissions be set so that you can list that directory but not
>>>> read the jar?
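>>>>
>>>> One quick way to check, as a sketch using the jars from your listing: if
>>>> the jar can be listed as the user running tool.sh, it is readable.
>>>>
>>>> ls -l /opt/accumulo/lib/accumulo-core-1.4.2.jar
>>>> jar -tf /opt/accumulo/lib/accumulo-core-1.4.2.jar > /dev/null && echo readable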
>>>>
>>>> Billie
>>>>
>>>>
>>>>
>>>>>
>>>>> *org/apache/accumulo/core/client/Instance is located in the src/...
>>>>> folder, which I am not sure is what is packaged in the
>>>>> examples-simple-[^c].jar?*
>>>>> *Sorry folks for the constant emails... just trying to get this to
>>>>> work but I really appreciate the help.*
>>>>>
>>>>>
>>>>> On Thu, Apr 4, 2013 at 10:18 AM, John Vines <vines@apache.org> wrote:
>>>>>
>>>>>> If you run tool.sh with sh -x, it will step through the script so you
>>>>>> can see what jars it is picking up and perhaps why it's missing them
>>>>>> for you.
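>>>>>>
>>>>>> For example, a sketch reusing the command from earlier in the thread
>>>>>> (bash -x works the same way if tool.sh needs bash):
>>>>>>
>>>>>> sh -x /opt/accumulo/bin/tool.sh /opt/accumulo/lib/examples-simple-1.4.2.jar \
>>>>>>     org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample \
>>>>>>     myinstance zookeepers user pswd tableName inputDir tmp/bulkWork 2>&1 | less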
>>>>>>
>>>>>> Sent from my phone, please pardon the typos and brevity.
>>>>>> On Apr 4, 2013 10:15 AM, "Aji Janis" <aji1705@gmail.com> wrote:
>>>>>>
>>>>>>> What user are you running the commands as?
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Apr 4, 2013 at 9:59 AM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>>
>>>>>>>> Where did you put all your java files?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Apr 4, 2013 at 9:55 AM, Eric Newton <eric.newton@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I was able to run the example, as written in
>>>>>>>>> docs/examples/README.bulkIngest substituting my
>>>>>>>>> instance/zookeeper/user/password information:
>>>>>>>>>
>>>>>>>>> $ pwd
>>>>>>>>> /home/ecn/workspace/1.4.3
>>>>>>>>> $ ls
>>>>>>>>> bin      conf     docs  LICENSE  NOTICE   README  src     test
>>>>>>>>> CHANGES  contrib  lib   logs     pom.xml  target  walogs
>>>>>>>>>
>>>>>>>>> $ ./bin/accumulo
>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.SetupTable test
>>>>>>>>> localhost root secret test_bulk row_00000333 row_00000666
>>>>>>>>>
>>>>>>>>> $ ./bin/accumulo
>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.GenerateTestData 0 1000
>>>>>>>>> bulk/test_1.txt
>>>>>>>>>
>>>>>>>>> $ ./bin/tool.sh lib/examples-simple-*[^cs].jar
>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample test
>>>>>>>>> localhost root secret test_bulk bulk tmp/bulkWork
>>>>>>>>>
>>>>>>>>> $ ./bin/accumulo
>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.VerifyIngest test
>>>>>>>>> localhost root secret test_bulk 0 1000
>>>>>>>>>
>>>>>>>>> -Eric
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Apr 4, 2013 at 9:33 AM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I am not sure it's just a regular expression issue. Below is my
>>>>>>>>>> console output. Not sure why this ClassDefFoundError occurs. Has
>>>>>>>>>> anyone tried to do it successfully? If you did, can you please tell
>>>>>>>>>> me your env setup.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [user@mynode bulk]$ pwd
>>>>>>>>>> /home/user/bulk
>>>>>>>>>> [user@mynode bulk]$ ls
>>>>>>>>>> BulkIngestExample.java  GenerateTestData.java  SetupTable.java  test_1.txt  VerifyIngest.java
>>>>>>>>>> [user@mynode bulk]$
>>>>>>>>>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>>> org.apache.accumulo.core.client.Instance
>>>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>>>> Method)
>>>>>>>>>>         at
>>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>>>         ... 3 more
>>>>>>>>>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>>>> org.apache.accumulo.core.client.Instance
>>>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>>>>>         at java.security.AccessController.doPrivileged(Native
>>>>>>>>>> Method)
>>>>>>>>>>         at
>>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>>>>>         ... 3 more
>>>>>>>>>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>>>>> /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>>>>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>> [user@mynode bulk]$
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 3, 2013 at 4:57 PM, Billie Rinaldi <billie@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 3, 2013 at 1:16 PM, Christopher <ctubbsii@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Try with -libjars:
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> tool.sh automatically adds libjars.
>>>>>>>>>>>
>>>>>>>>>>> The problem is the regular expression for the examples-simple
>>>>>>>>>>> jar.  It's trying to exclude the javadoc jar with ^c, but it isn't
>>>>>>>>>>> excluding the sources jar. /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>>>>>>>> may work, or you can just specify the jar exactly,
>>>>>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>>>>>>>>>
>>>>>>>>>>> */opt/accumulo/bin/tool.sh
>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
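>>>>>>>>>>>
>>>>>>>>>>> A quick check of that pattern (given the lib/ listing earlier in the
>>>>>>>>>>> thread, it should match only the main jar, since the javadoc and
>>>>>>>>>>> sources jars end in "c.jar" and "s.jar"):
>>>>>>>>>>>
>>>>>>>>>>> $ ls /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar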
>>>>>>>>>>>
>>>>>>>>>>> Billie
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> /opt/accumulo/bin/tool.sh
>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>> -libjars  /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>>
>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>> myinstance zookeepers user pswd tableName
inputDir tmp/bulkWork
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Christopher L Tubbs II
>>>>>>>>>>>> http://gravatar.com/ctubbsii
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Apr 3, 2013 at 4:11 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>>>>>>> > I am trying to run the BulkIngest example (on 1.4.2 accumulo) and I am not
>>>>>>>>>>>> > able to run the following steps. Here is the error I get:
>>>>>>>>>>>> >
>>>>>>>>>>>> > [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>>>>>> > /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>>>>>> > org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>>>>>> > myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>>>>> > Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>>>>>>> > /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>>>>>>>>>>> >         at java.lang.Class.forName0(Native Method)
>>>>>>>>>>>> >         at java.lang.Class.forName(Class.java:264)
>>>>>>>>>>>> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>> > [user@mynode bulk]$ ls /opt/accumulo/lib/
>>>>>>>>>>>> > accumulo-core-1.4.2.jar
>>>>>>>>>>>> > accumulo-start-1.4.2.jar
>>>>>>>>>>>> > commons-collections-3.2.jar
>>>>>>>>>>>> > commons-logging-1.0.4.jar
>>>>>>>>>>>> > jline-0.9.94.jar
>>>>>>>>>>>> > accumulo-core-1.4.2-javadoc.jar
>>>>>>>>>>>> > accumulo-start-1.4.2-javadoc.jar
>>>>>>>>>>>> > commons-configuration-1.5.jar
>>>>>>>>>>>> > commons-logging-api-1.0.4.jar
>>>>>>>>>>>> > libthrift-0.6.1.jar
>>>>>>>>>>>> > accumulo-core-1.4.2-sources.jar
>>>>>>>>>>>> > accumulo-start-1.4.2-sources.jar
>>>>>>>>>>>> > commons-io-1.4.jar
>>>>>>>>>>>> > examples-simple-1.4.2.jar
>>>>>>>>>>>> > log4j-1.2.16.jar
>>>>>>>>>>>> > accumulo-server-1.4.2.jar
>>>>>>>>>>>> > cloudtrace-1.4.2.jar
>>>>>>>>>>>> > commons-jci-core-1.0.jar
>>>>>>>>>>>> > examples-simple-1.4.2-javadoc.jar
>>>>>>>>>>>> > native
>>>>>>>>>>>> > accumulo-server-1.4.2-javadoc.jar
>>>>>>>>>>>> > cloudtrace-1.4.2-javadoc.jar
>>>>>>>>>>>> > commons-jci-fam-1.0.jar
>>>>>>>>>>>> > examples-simple-1.4.2-sources.jar
>>>>>>>>>>>> > wikisearch-ingest-1.4.2-javadoc.jar
>>>>>>>>>>>> > accumulo-server-1.4.2-sources.jar
>>>>>>>>>>>> > cloudtrace-1.4.2-sources.jar
>>>>>>>>>>>> > commons-lang-2.4.jar
>>>>>>>>>>>> >  ext
>>>>>>>>>>>> > wikisearch-query-1.4.2-javadoc.jar
>>>>>>>>>>>> >
>>>>>>>>>>>> > [user@mynode bulk]$
>>>>>>>>>>>> >
>>>>>>>>>>>> >
>>>>>>>>>>>> > Clearly, the libraries and source file exist, so I am not sure
>>>>>>>>>>>> > what's going on. I tried putting in
>>>>>>>>>>>> > /opt/accumulo/lib/examples-simple-1.4.2-sources.jar instead, and
>>>>>>>>>>>> > then it complains BulkIngestExample ClassNotFound.
>>>>>>>>>>>> >
>>>>>>>>>>>> > Suggestions?
>>>>>>>>>>>> >
>>>>>>>>>>>> >
>>>>>>>>>>>> > On Wed, Apr 3, 2013 at 2:36 PM, Eric Newton <eric.newton@gmail.com> wrote:
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> You will have to write your own InputFormat class which will
>>>>>>>>>>>> >> parse your file and pass records to your reducer.
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> -Eric
>>>>>>>>>>>> >>
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> On Wed, Apr 3, 2013 at 2:29 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>> Looking at the BulkIngestExample, it uses GenerateTestData and
>>>>>>>>>>>> >>> creates a .txt file which contains Key: Value pairs, and correct
>>>>>>>>>>>> >>> me if I am wrong, but each new line is a new row, right?
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>> I need to know how to have family and qualifiers also. In
>>>>>>>>>>>> >>> other words,
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>> 1) Do I set up a .txt file that can be converted into an
>>>>>>>>>>>> >>> Accumulo RFile using AccumuloFileOutputFormat, which can then be
>>>>>>>>>>>> >>> imported into my table?
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>> 2) If yes, what is the format of the .txt file?
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>> On Wed, Apr 3, 2013 at 2:19 PM, Eric Newton <eric.newton@gmail.com> wrote:
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> Your data needs to be in the RFile format, and more
>>>>>>>>>>>> >>>> importantly it needs to be sorted.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> It's handy to use a Map/Reduce job to convert/sort your
>>>>>>>>>>>> >>>> data.  See the BulkIngestExample.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> -Eric
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> On Wed, Apr 3, 2013 at 2:15 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> I have some data in a text file in the following format.
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> rowid1 columnFamily1 colQualifier1 value
>>>>>>>>>>>> >>>>> rowid1 columnFamily1 colQualifier2 value
>>>>>>>>>>>> >>>>> rowid1 columnFamily2 colQualifier1 value
>>>>>>>>>>>> >>>>> rowid2 columnFamily1 colQualifier1 value
>>>>>>>>>>>> >>>>> rowid3 columnFamily1 colQualifier1 value
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> I want to import this data into a table in accumulo. My
>>>>>>>>>>>> >>>>> end goal is to understand how to use the BulkImport feature
>>>>>>>>>>>> >>>>> in accumulo. I tried to log in to the accumulo shell as root
>>>>>>>>>>>> >>>>> and then run:
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> #table mytable
>>>>>>>>>>>> >>>>> #importdirectory /home/inputDir /home/failureDir true
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> but it didn't work. My data file was saved as data.txt in
>>>>>>>>>>>> >>>>> /home/inputDir. I tried to create the dir/file structure in
>>>>>>>>>>>> >>>>> hdfs and linux but neither worked. When trying locally, it
>>>>>>>>>>>> >>>>> keeps complaining about failureDir not existing.
>>>>>>>>>>>> >>>>> ...
>>>>>>>>>>>> >>>>> java.io.FileNotFoundException: File does not exist: failures
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> When trying with files on hdfs, I get no error on the
>>>>>>>>>>>> >>>>> console but the logger had the following messages:
>>>>>>>>>>>> >>>>> ...
>>>>>>>>>>>> >>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt
>>>>>>>>>>>> >>>>> does not have a valid extension, ignoring
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> or,
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt
>>>>>>>>>>>> >>>>> is not a map file, ignoring
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> Suggestions? Am I not setting up the job right? Thank you
>>>>>>>>>>>> >>>>> for help in advance.
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>> On Wed, Apr 3, 2013 at 2:04 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>> >>>>>> I have some data in a text file in the following format:
>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>>
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>
>>>>>>>>>>>> >
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
>
