accumulo-user mailing list archives

From Aji Janis <aji1...@gmail.com>
Subject Re: importdirectory in accumulo
Date Thu, 04 Apr 2013 14:46:24 GMT
*Billie, I checked the values in tool.sh and they match. I uncommented the echo
statements and reran the command; here is what I have:*

*$ ./bin/tool.sh ./lib/examples-simple-1.4.2.jar
org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
instance zookeeper usr pswd table inputdir tmp/bulk*

USERJARS=
CLASSNAME=org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar:
exec /opt/hadoop/bin/hadoop jar ./lib/examples-simple-1.4.2.jar
org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
-libjars
"/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar"
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/accumulo/core/client/Instance
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
Caused by: java.lang.ClassNotFoundException:
org.apache.accumulo.core.client.Instance
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        ... 3 more
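
One way to narrow this down is to check, for each entry in the HADOOP_CLASSPATH printed above, whether the file exists and whether it actually contains org/apache/accumulo/core/client/Instance.class. Below is a minimal Python sketch under that assumption. find_class is a hypothetical helper, not part of Accumulo or Hadoop, and it only inspects jars on the local filesystem; it does not show what classpath the hadoop launcher ends up using.

```python
import os
import zipfile


def find_class(classpath, class_name):
    """Scan a colon-separated classpath for a fully qualified class name.

    Returns (jars_containing_class, entries_that_do_not_exist).
    Hypothetical helper for illustration only.
    """
    resource = class_name.replace(".", "/") + ".class"
    found, missing = [], []
    # filter(None, ...) skips empty entries such as the one after a trailing ':'
    for entry in filter(None, classpath.split(":")):
        if not os.path.exists(entry):
            missing.append(entry)
        elif zipfile.is_zipfile(entry):
            with zipfile.ZipFile(entry) as jar:
                if resource in jar.namelist():
                    found.append(entry)
    return found, missing


if __name__ == "__main__":
    cp = os.environ.get("HADOOP_CLASSPATH", "")
    found, missing = find_class(cp, "org.apache.accumulo.core.client.Instance")
    print("jars containing the class:", found)
    print("classpath entries that do not exist:", missing)
```

If the class shows up in accumulo-core-1.4.2.jar and no entry is missing, the jars themselves are fine and the question becomes whether the hadoop command actually sees that classpath.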



*org/apache/accumulo/core/client/Instance is located in the src/... folder,
which I am not sure is what is packaged in the examples-simple-[^c].jar?*
*Sorry folks for the constant emails... just trying to get this to work but
I really appreciate the help.*
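
On the jar pattern itself: as Billie points out further down the thread, examples-simple-*[^c].jar excludes the javadoc jar but still matches the sources jar, so the shell hands tool.sh two jar paths instead of one. Here is a small Python sketch of that matching, using the jar names from the lib listing in the thread. Python's fnmatch writes the negated class as [!c], whereas bash accepts both [!c] and [^c]; the matches helper is made up for illustration.

```python
from fnmatch import fnmatchcase

# Jar names taken from the `ls /opt/accumulo/lib/` listing in the thread.
JARS = [
    "examples-simple-1.4.2.jar",
    "examples-simple-1.4.2-javadoc.jar",
    "examples-simple-1.4.2-sources.jar",
]


def matches(pattern):
    """Return the jar names that a glob-style pattern would select."""
    return [name for name in JARS if fnmatchcase(name, pattern)]


# [!c] rejects only names whose last character before ".jar" is 'c' (javadoc).
print(matches("examples-simple-*[!c].jar"))
# ['examples-simple-1.4.2.jar', 'examples-simple-1.4.2-sources.jar']

# [!cs] also rejects names ending in 's' (sources), leaving a single jar.
print(matches("examples-simple-*[!cs].jar"))
# ['examples-simple-1.4.2.jar']
```

Naming the jar exactly, /opt/accumulo/lib/examples-simple-1.4.2.jar, avoids the pattern altogether.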


On Thu, Apr 4, 2013 at 10:18 AM, John Vines <vines@apache.org> wrote:

> If you run tool.sh with sh -x, it will step through the script so you can
> see what jars it is picking up and perhaps why it's missing them for you.
>
> Sent from my phone, please pardon the typos and brevity.
> On Apr 4, 2013 10:15 AM, "Aji Janis" <aji1705@gmail.com> wrote:
>
>> What user are you running the commands as ?
>>
>>
>> On Thu, Apr 4, 2013 at 9:59 AM, Aji Janis <aji1705@gmail.com> wrote:
>>
>>> Where did you put all your java files?
>>>
>>>
>>> On Thu, Apr 4, 2013 at 9:55 AM, Eric Newton <eric.newton@gmail.com> wrote:
>>>
>>>> I was able to run the example, as written in
>>>> docs/examples/README.bulkIngest substituting my
>>>> instance/zookeeper/user/password information:
>>>>
>>>> $ pwd
>>>> /home/ecn/workspace/1.4.3
>>>> $ ls
>>>> bin      conf     docs  LICENSE  NOTICE   README  src     test
>>>> CHANGES  contrib  lib   logs     pom.xml  target  walogs
>>>>
>>>> $ ./bin/accumulo
>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.SetupTable test
>>>> localhost root secret test_bulk row_00000333 row_00000666
>>>>
>>>> $ ./bin/accumulo
>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.GenerateTestData 0 1000
>>>> bulk/test_1.txt
>>>>
>>>> $ ./bin/tool.sh lib/examples-simple-*[^cs].jar
>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample test
>>>> localhost root secret test_bulk bulk tmp/bulkWork
>>>>
>>>> $ ./bin/accumulo
>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.VerifyIngest test
>>>> localhost root secret test_bulk 0 1000
>>>>
>>>> -Eric
>>>>
>>>>
>>>>
>>>> On Thu, Apr 4, 2013 at 9:33 AM, Aji Janis <aji1705@gmail.com> wrote:
>>>>
>>>>> I am not sure it's just a regular expression issue. Below is my console
>>>>> output. Not sure why this NoClassDefFoundError occurs. Has anyone run
>>>>> it successfully? If so, can you please tell me your environment setup?
>>>>>
>>>>>
>>>>> [user@mynode bulk]$ pwd
>>>>> /home/user/bulk
>>>>> [user@mynode bulk]$ ls
>>>>> BulkIngestExample.java  GenerateTestData.java  SetupTable.java
>>>>>  test_1.txt  VerifyIngest.java
>>>>> [user@mynode bulk]$
>>>>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>> org/apache/accumulo/core/client/Instance
>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>> org.apache.accumulo.core.client.Instance
>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>         ... 3 more
>>>>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>> Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>> org/apache/accumulo/core/client/Instance
>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>> org.apache.accumulo.core.client.Instance
>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>>         ... 3 more
>>>>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>> /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>> Exception in thread "main" java.lang.ClassNotFoundException:
>>>>> /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>         at java.lang.Class.forName(Class.java:264)
>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>> [user@mynode bulk]$
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Apr 3, 2013 at 4:57 PM, Billie Rinaldi <billie@apache.org> wrote:
>>>>>
>>>>>> On Wed, Apr 3, 2013 at 1:16 PM, Christopher <ctubbsii@apache.org> wrote:
>>>>>>
>>>>>>> Try with -libjars:
>>>>>>>
>>>>>>
>>>>>> tool.sh automatically adds libjars.
>>>>>>
>>>>>> The problem is the regular expression for the examples-simple jar.
>>>>>> It's trying to exclude the javadoc jar with ^c, but it isn't excluding the
>>>>>> sources jar. /opt/accumulo/lib/examples-simple-*[^cs].jar may work, or you
>>>>>> can just specify the jar exactly,
>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>>>>
>>>>>> */opt/accumulo/bin/tool.sh
>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar
>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>>>>
>>>>>> Billie
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> /opt/accumulo/bin/tool.sh /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>> -libjars  /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>>
>>>>>>> --
>>>>>>> Christopher L Tubbs II
>>>>>>> http://gravatar.com/ctubbsii
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Apr 3, 2013 at 4:11 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>> > I am trying to run the BulkIngest example (on 1.4.2 accumulo) and I am not
>>>>>>> > able to run the following steps. Here is the error I get:
>>>>>>> >
>>>>>>> > [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>>>>> > /opt/accumulo/lib/examples-simple-*[^c].jar
>>>>>>> > org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>>>>> > myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>>>> > Exception in thread "main" java.lang.ClassNotFoundException:
>>>>>>> > /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>>>>>> >         at java.lang.Class.forName0(Native Method)
>>>>>>> >         at java.lang.Class.forName(Class.java:264)
>>>>>>> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>>>>> > [user@mynode bulk]$
>>>>>>> > [user@mynode bulk]$
>>>>>>> > [user@mynode bulk]$
>>>>>>> > [user@mynode bulk]$ ls /opt/accumulo/lib/
>>>>>>> > accumulo-core-1.4.2.jar
>>>>>>> > accumulo-start-1.4.2.jar
>>>>>>> > commons-collections-3.2.jar
>>>>>>> > commons-logging-1.0.4.jar
>>>>>>> > jline-0.9.94.jar
>>>>>>> > accumulo-core-1.4.2-javadoc.jar
>>>>>>> > accumulo-start-1.4.2-javadoc.jar
>>>>>>> > commons-configuration-1.5.jar
>>>>>>> > commons-logging-api-1.0.4.jar
>>>>>>> > libthrift-0.6.1.jar
>>>>>>> > accumulo-core-1.4.2-sources.jar
>>>>>>> > accumulo-start-1.4.2-sources.jar
>>>>>>> > commons-io-1.4.jar
>>>>>>> > examples-simple-1.4.2.jar
>>>>>>> > log4j-1.2.16.jar
>>>>>>> > accumulo-server-1.4.2.jar
>>>>>>> > cloudtrace-1.4.2.jar
>>>>>>> > commons-jci-core-1.0.jar
>>>>>>> > examples-simple-1.4.2-javadoc.jar
>>>>>>> > native
>>>>>>> > accumulo-server-1.4.2-javadoc.jar
>>>>>>> > cloudtrace-1.4.2-javadoc.jar
>>>>>>> > commons-jci-fam-1.0.jar
>>>>>>> > examples-simple-1.4.2-sources.jar
>>>>>>> > wikisearch-ingest-1.4.2-javadoc.jar
>>>>>>> > accumulo-server-1.4.2-sources.jar
>>>>>>> > cloudtrace-1.4.2-sources.jar
>>>>>>> > commons-lang-2.4.jar
>>>>>>> >  ext
>>>>>>> > wikisearch-query-1.4.2-javadoc.jar
>>>>>>> >
>>>>>>> > [user@mynode bulk]$
>>>>>>> >
>>>>>>> >
>>>>>>> > Clearly, the libraries and source file exist, so I am not sure what's going
>>>>>>> > on. I tried putting in /opt/accumulo/lib/examples-simple-1.4.2-sources.jar
>>>>>>> > instead, but then it complains BulkIngestExample ClassNotFound.
>>>>>>> >
>>>>>>> > Suggestions?
>>>>>>> >
>>>>>>> >
>>>>>>> > On Wed, Apr 3, 2013 at 2:36 PM, Eric Newton <eric.newton@gmail.com> wrote:
>>>>>>> >>
>>>>>>> >> You will have to write your own InputFormat class which will parse your
>>>>>>> >> file and pass records to your reducer.
>>>>>>> >>
>>>>>>> >> -Eric
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Wed, Apr 3, 2013 at 2:29 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>> >>>
>>>>>>> >>> Looking at the BulkIngestExample, it uses GenerateTestData and creates a
>>>>>>> >>> .txt file which contains Key: Value pairs, and correct me if I am wrong, but
>>>>>>> >>> each new line is a new row, right?
>>>>>>> >>>
>>>>>>> >>> I need to know how to have family and qualifiers also. In other words,
>>>>>>> >>>
>>>>>>> >>> 1) Do I set up a .txt file that can be converted into an Accumulo RF File
>>>>>>> >>> using AccumuloFileOutputFormat, which can then be imported into my table?
>>>>>>> >>>
>>>>>>> >>> 2) If yes, what is the format of the .txt file?
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>> On Wed, Apr 3, 2013 at 2:19 PM, Eric Newton <eric.newton@gmail.com> wrote:
>>>>>>> >>>>
>>>>>>> >>>> Your data needs to be in the RFile format, and more importantly it needs
>>>>>>> >>>> to be sorted.
>>>>>>> >>>>
>>>>>>> >>>> It's handy to use a Map/Reduce job to convert/sort your data. See the
>>>>>>> >>>> BulkIngestExample.
>>>>>>> >>>>
>>>>>>> >>>> -Eric
>>>>>>> >>>>
>>>>>>> >>>>
>>>>>>> >>>> On Wed, Apr 3, 2013 at 2:15 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>> >>>>>
>>>>>>> >>>>> I have some data in a text file in the following format.
>>>>>>> >>>>>
>>>>>>> >>>>> rowid1 columnFamily1 colQualifier1 value
>>>>>>> >>>>> rowid1 columnFamily1 colQualifier2 value
>>>>>>> >>>>> rowid1 columnFamily2 colQualifier1 value
>>>>>>> >>>>> rowid2 columnFamily1 colQualifier1 value
>>>>>>> >>>>> rowid3 columnFamily1 colQualifier1 value
>>>>>>> >>>>>
>>>>>>> >>>>> I want to import this data into a table in accumulo. My end goal is to
>>>>>>> >>>>> understand how to use the BulkImport feature in accumulo. I tried to log in
>>>>>>> >>>>> to the accumulo shell as root and then run:
>>>>>>> >>>>>
>>>>>>> >>>>> #table mytable
>>>>>>> >>>>> #importdirectory /home/inputDir /home/failureDir true
>>>>>>> >>>>>
>>>>>>> >>>>> but it didn't work. My data file was saved as data.txt in
>>>>>>> >>>>> /home/inputDir. I tried to create the dir/file structure in hdfs and linux
>>>>>>> >>>>> but neither worked. When trying locally, it keeps complaining about
>>>>>>> >>>>> failureDir not existing.
>>>>>>> >>>>> ...
>>>>>>> >>>>> java.io.FileNotFoundException: File does not exist: failures
>>>>>>> >>>>>
>>>>>>> >>>>> When trying with files on hdfs, I get no error on the console but the
>>>>>>> >>>>> logger had the following messages:
>>>>>>> >>>>> ...
>>>>>>> >>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt does
>>>>>>> >>>>> not have a valid extension, ignoring
>>>>>>> >>>>>
>>>>>>> >>>>> or,
>>>>>>> >>>>>
>>>>>>> >>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt is not
>>>>>>> >>>>> a map file, ignoring
>>>>>>> >>>>>
>>>>>>> >>>>>
>>>>>>> >>>>> Suggestions? Am I not setting up the job right? Thank you for help in
>>>>>>> >>>>> advance.
>>>>>>> >>>>>
>>>>>>> >>>>>
>>>>>>> >>>>> On Wed, Apr 3, 2013 at 2:04 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>>>>> >>>>>>
>>>>>>> >>>>>> I have some data in a text file in the following format:
>>>>>>> >>>>>>
>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>>>>> >>>>>
>>>>>>> >>>>>
>>>>>>> >>>>
>>>>>>> >>>
>>>>>>> >>
>>>>>>> >
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
