accumulo-user mailing list archives

From Billie Rinaldi <bil...@apache.org>
Subject Re: importdirectory in accumulo
Date Thu, 04 Apr 2013 15:55:51 GMT
On Thu, Apr 4, 2013 at 7:46 AM, Aji Janis <aji1705@gmail.com> wrote:

> *Billie, I checked the values in tool.sh and they match. I uncommented the
> echo statements and reran the cmd; here is what I have:*
>
> *$ ./bin/tool.sh ./lib/examples-simple-1.4.2.jar
> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
> instance zookeeper usr pswd table inputdir tmp/bulk*
>
> USERJARS=
>
> CLASSNAME=org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>
> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
> exec /opt/hadoop/bin/hadoop jar ./lib/examples-simple-1.4.2.jar
> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
> -libjars
> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/accumulo/core/client/Instance
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:264)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.accumulo.core.client.Instance
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>         ... 3 more
>
>
The command looks right.  Instance should be packaged in the accumulo core
jar.  To verify that, you could run:
jar tf /opt/accumulo/lib/accumulo-core-1.4.2.jar | grep org/apache/accumulo/core/client/Instance
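
If you want the same check in script form, here is a rough sketch. The
function names class_to_entry and jar_has_class are made up for this
example, and it assumes the JDK jar tool is on your PATH. It converts the
dotted class name from the ClassNotFoundException into the slash-separated
entry name that jar tf prints, then looks for an exact match:

```shell
# Convert a dotted class name into the entry path used inside a jar,
# e.g. a.b.C becomes a/b/C.class (hypothetical helper).
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Succeed if the jar at $1 lists the class named $2 (hypothetical helper).
# Assumes the JDK `jar` tool is available; fails if the jar cannot be read.
jar_has_class() {
  jar tf "$1" 2>/dev/null | grep -qxF "$(class_to_entry "$2")"
}

# Example call (path taken from the output quoted above):
#   jar_has_class /opt/accumulo/lib/accumulo-core-1.4.2.jar \
#       org.apache.accumulo.core.client.Instance && echo found
```

A failure here can mean either that the class is not in the jar or that
the jar could not be read at all, so it does not distinguish the two cases.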

I'm not sure what's going on here.  If that error is happening right away,
it seems like it can't load the jar on the local machine.  If you're
running multiple machines, and if the error were happening later during the
MapReduce, I would suggest that you make sure accumulo is present on all
the machines.

You asked about the user; is the owner of the jars different than the user
you're running as?  In that case, it could be a permissions issue.  Could
the permissions be set so that you can list that directory but not read the
jar?
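
One way to check that, as a sketch only: walk the HADOOP_CLASSPATH value
that tool.sh echoed and report any entry that is missing or that the
current user cannot read. The function name check_classpath is invented
for this example; it uses only standard shell tests.

```shell
# Report classpath entries that are missing or unreadable by the current user.
# The body runs in a subshell so the IFS and globbing changes do not leak.
check_classpath() (
  status=0
  IFS=:
  set -f                            # take entries literally, no glob expansion
  for entry in $1; do               # unquoted on purpose: split on ':'
    [ -z "$entry" ] && continue     # skip empty entries, e.g. a trailing ':'
    if [ ! -e "$entry" ]; then
      echo "missing: $entry"
      status=1
    elif [ ! -r "$entry" ]; then
      echo "unreadable: $entry"
      status=1
    fi
  done
  exit "$status"
)

# Example call, run as the same user that runs tool.sh:
#   check_classpath "$HADOOP_CLASSPATH" && echo "all entries readable"
```

If every entry is readable and the error still happens right away, then
permissions are probably not the cause.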

Billie



>
> *org/apache/accumulo/core/client/Instance is located in the src/...
> folder, which I am not sure is what is packaged in the
> examples-simple-[^c].jar?*
> *Sorry folks for the constant emails... just trying to get this to work,
> but I really appreciate the help.*
>
>
> On Thu, Apr 4, 2013 at 10:18 AM, John Vines <vines@apache.org> wrote:
>
>> If you run tool.sh with sh -x, it will step through the script so you can
>> see what jars it is picking up and perhaps why it's missing them for you.
>>
>> Sent from my phone, please pardon the typos and brevity.
>> On Apr 4, 2013 10:15 AM, "Aji Janis" <aji1705@gmail.com> wrote:
>>
>>> What user are you running the commands as ?
>>>
>>>
>>> On Thu, Apr 4, 2013 at 9:59 AM, Aji Janis <aji1705@gmail.com> wrote:
>>>
>>>> Where did you put all your java files?
>>>>
>>>>
>>>> On Thu, Apr 4, 2013 at 9:55 AM, Eric Newton <eric.newton@gmail.com> wrote:
>>>>
>>>>> I was able to run the example, as written in
>>>>> docs/examples/README.bulkIngest substituting my
>>>>> instance/zookeeper/user/password information:
>>>>>
>>>>> $ pwd
>>>>> /home/ecn/workspace/1.4.3
>>>>> $ ls
>>>>> bin      conf     docs  LICENSE  NOTICE   README  src     test
>>>>> CHANGES  contrib  lib   logs     pom.xml  target  walogs
>>>>>
>>>>> $ ./bin/accumulo
>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.SetupTable test
>>>>> localhost root secret test_bulk row_00000333 row_00000666
>>>>>
>>>>> $ ./bin/accumulo
>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.GenerateTestData 0 1000
>>>>> bulk/test_1.txt
>>>>>
>>>>> $ ./bin/tool.sh lib/examples-simple-*[^cs].jar
>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample test
>>>>> localhost root secret test_bulk bulk tmp/bulkWork
>>>>>
>>>>> $ ./bin/accumulo
>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.VerifyIngest test
>>>>> localhost root secret test_bulk 0 1000
>>>>>
>>>>> -Eric
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Apr 4, 2013 at 9:33 AM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>
>>>>>> I am not sure it's just a regular expression issue. Below is my
>>>>>> console output. Not sure why this NoClassDefFoundError occurs. Has anyone
>>>>>> tried to do it successfully? Can you please tell me your env setup if you
>>>>>> did.
>>>>>>
>>>>>>
>>>>>> [user@mynode bulk]$ pwd
>>>>>> /home/user/bulk
>>>>>> [user@mynode bulk]$ ls
>>>>>> BulkIngestExample.java  GenerateTestData.java  SetupTable.java
>>>>>>  test_1.txt  VerifyIngest.java
>>>>>> [user@mynode bulk]$
>>>>>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>> org.apache.accumulo.core.client.Instance
>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>         ... 3 more
>>>>>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>> org/apache/accumulo/core/client/Instance
>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>> org.apache.accumulo.core.client.Instance
>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>>         ... 3 more
>>>>>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>> /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>> /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>> [user@mynode bulk]$
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Apr 3, 2013 at 4:57 PM, Billie Rinaldi <billie@apache.org> wrote:
>>>>>>
>>>>>>> On Wed, Apr 3, 2013 at 1:16 PM, Christopher <ctubbsii@apache.org> wrote:
>>>>>>>
>>>>>>>> Try with -libjars:
>>>>>>>>
>>>>>>>
>>>>>>> tool.sh automatically adds libjars.
>>>>>>>
>>>>>>> The problem is the regular expression for the examples-simple jar.
>>>>>>> It's trying to exclude the javadoc jar with ^c, but it isn't excluding the
>>>>>>> sources jar. /opt/accumulo/lib/examples-simple-*[^cs].jar may work, or you
>>>>>>> can just specify the jar exactly,
>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>>>>>
>>>>>>> */opt/accumulo/bin/tool.sh
>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>>>
>>>>>>> Billie
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> /opt/accumulo/bin/tool.sh
>>>>>>>> /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>> -libjars  /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>>
>>>>>>>> --
>>>>>>>> Christopher L Tubbs II
>>>>>>>> http://gravatar.com/ctubbsii
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Apr 3, 2013 at 4:11 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>>> > I am trying to run the BulkIngest example (on 1.4.2 accumulo) and I am not
>>>>>>>> > able to run the following steps. Here is the error I get:
>>>>>>>> >
>>>>>>>> > [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>>> > /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>>> > org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>>> > myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>> > Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>>> > /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>>>>>>> >         at java.lang.Class.forName0(Native Method)
>>>>>>>> >         at java.lang.Class.forName(Class.java:264)
>>>>>>>> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>>> > [user@mynode bulk]$
>>>>>>>> > [user@mynode bulk]$
>>>>>>>> > [user@mynode bulk]$
>>>>>>>> > [user@mynode bulk]$ ls /opt/accumulo/lib/
>>>>>>>> > accumulo-core-1.4.2.jar
>>>>>>>> > accumulo-start-1.4.2.jar
>>>>>>>> > commons-collections-3.2.jar
>>>>>>>> > commons-logging-1.0.4.jar
>>>>>>>> > jline-0.9.94.jar
>>>>>>>> > accumulo-core-1.4.2-javadoc.jar
>>>>>>>> > accumulo-start-1.4.2-javadoc.jar
>>>>>>>> > commons-configuration-1.5.jar
>>>>>>>> > commons-logging-api-1.0.4.jar
>>>>>>>> > libthrift-0.6.1.jar
>>>>>>>> > accumulo-core-1.4.2-sources.jar
>>>>>>>> > accumulo-start-1.4.2-sources.jar
>>>>>>>> > commons-io-1.4.jar
>>>>>>>> > examples-simple-1.4.2.jar
>>>>>>>> > log4j-1.2.16.jar
>>>>>>>> > accumulo-server-1.4.2.jar
>>>>>>>> > cloudtrace-1.4.2.jar
>>>>>>>> > commons-jci-core-1.0.jar
>>>>>>>> > examples-simple-1.4.2-javadoc.jar
>>>>>>>> > native
>>>>>>>> > accumulo-server-1.4.2-javadoc.jar
>>>>>>>> > cloudtrace-1.4.2-javadoc.jar
>>>>>>>> > commons-jci-fam-1.0.jar
>>>>>>>> > examples-simple-1.4.2-sources.jar
>>>>>>>> > wikisearch-ingest-1.4.2-javadoc.jar
>>>>>>>> > accumulo-server-1.4.2-sources.jar
>>>>>>>> > cloudtrace-1.4.2-sources.jar
>>>>>>>> > commons-lang-2.4.jar
>>>>>>>> >  ext
>>>>>>>> > wikisearch-query-1.4.2-javadoc.jar
>>>>>>>> >
>>>>>>>> > [user@mynode bulk]$
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > Clearly, the libraries and source file exist so I am not sure what's going
>>>>>>>> > on. I tried putting in /opt/accumulo/lib/examples-simple-1.4.2-sources.jar
>>>>>>>> > instead; then it complains BulkIngestExample ClassNotFound.
>>>>>>>> >
>>>>>>>> > Suggestions?
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Wed, Apr 3, 2013 at 2:36 PM, Eric Newton <eric.newton@gmail.com> wrote:
>>>>>>>> >>
>>>>>>>> >> You will have to write your own InputFormat class which will parse your
>>>>>>>> >> file and pass records to your reducer.
>>>>>>>> >>
>>>>>>>> >> -Eric
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >> On Wed, Apr 3, 2013 at 2:29 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>>> >>>
>>>>>>>> >>> Looking at the BulkIngestExample, it uses GenerateTestData and creates a
>>>>>>>> >>> .txt file which contains Key: Value pairs, and correct me if I am wrong but
>>>>>>>> >>> each new line is a new row, right?
>>>>>>>> >>>
>>>>>>>> >>> I need to know how to have family and qualifiers also. In other words,
>>>>>>>> >>>
>>>>>>>> >>> 1) Do I set up a .txt file that can be converted into an Accumulo RFile
>>>>>>>> >>> using AccumuloFileOutputFormat, which can then be imported into my table?
>>>>>>>> >>>
>>>>>>>> >>> 2) If yes, what is the format of the .txt file?
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>> On Wed, Apr 3, 2013 at 2:19 PM, Eric Newton <eric.newton@gmail.com> wrote:
>>>>>>>> >>>>
>>>>>>>> >>>> Your data needs to be in the RFile format, and more importantly it needs
>>>>>>>> >>>> to be sorted.
>>>>>>>> >>>>
>>>>>>>> >>>> It's handy to use a Map/Reduce job to convert/sort your data. See the
>>>>>>>> >>>> BulkIngestExample.
>>>>>>>> >>>>
>>>>>>>> >>>> -Eric
>>>>>>>> >>>>
>>>>>>>> >>>>
>>>>>>>> >>>> On Wed, Apr 3, 2013 at 2:15 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>>> >>>>>
>>>>>>>> >>>>> I have some data in a text file in the following format.
>>>>>>>> >>>>>
>>>>>>>> >>>>> rowid1 columnFamily1 colQualifier1 value
>>>>>>>> >>>>> rowid1 columnFamily1 colQualifier2 value
>>>>>>>> >>>>> rowid1 columnFamily2 colQualifier1 value
>>>>>>>> >>>>> rowid2 columnFamily1 colQualifier1 value
>>>>>>>> >>>>> rowid3 columnFamily1 colQualifier1 value
>>>>>>>> >>>>>
>>>>>>>> >>>>> I want to import this data into a table in accumulo. My end goal is to
>>>>>>>> >>>>> understand how to use the BulkImport feature in accumulo. I tried to log in
>>>>>>>> >>>>> to the accumulo shell as root and then run:
>>>>>>>> >>>>>
>>>>>>>> >>>>> #table mytable
>>>>>>>> >>>>> #importdirectory /home/inputDir /home/failureDir true
>>>>>>>> >>>>>
>>>>>>>> >>>>> but it didn't work. My data file was saved as data.txt in
>>>>>>>> >>>>> /home/inputDir. I tried to create the dir/file structure in hdfs and linux
>>>>>>>> >>>>> but neither worked. When trying locally, it keeps complaining about
>>>>>>>> >>>>> failureDir not existing.
>>>>>>>> >>>>> ...
>>>>>>>> >>>>> java.io.FileNotFoundException: File does not exist: failures
>>>>>>>> >>>>>
>>>>>>>> >>>>> When trying with files on hdfs, I get no error on the console but the
>>>>>>>> >>>>> logger had the following messages:
>>>>>>>> >>>>> ...
>>>>>>>> >>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt does
>>>>>>>> >>>>> not have a valid extension, ignoring
>>>>>>>> >>>>>
>>>>>>>> >>>>> or,
>>>>>>>> >>>>>
>>>>>>>> >>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt is not
>>>>>>>> >>>>> a map file, ignoring
>>>>>>>> >>>>>
>>>>>>>> >>>>>
>>>>>>>> >>>>> Suggestions? Am I not setting up the job right? Thank you for help in
>>>>>>>> >>>>> advance.
>>>>>>>> >>>>>
>>>>>>>> >>>>>
>>>>>>>> >>>>> On Wed, Apr 3, 2013 at 2:04 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>>> >>>>>>
>>>>>>>> >>>>>> I have some data in a text file in the following format:
>>>>>>>> >>>>>>
>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>>> >>>>>
>>>>>>>> >>>>>
>>>>>>>> >>>>
>>>>>>>> >>>
>>>>>>>> >>
>>>>>>>> >
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>
