accumulo-user mailing list archives

From Aji Janis <aji1...@gmail.com>
Subject Re: importdirectory in accumulo
Date Thu, 04 Apr 2013 13:59:01 GMT
Where did you put all your java files?


On Thu, Apr 4, 2013 at 9:55 AM, Eric Newton <eric.newton@gmail.com> wrote:

> I was able to run the example, as written in
> docs/examples/README.bulkIngest substituting my
> instance/zookeeper/user/password information:
>
> $ pwd
> /home/ecn/workspace/1.4.3
> $ ls
> bin      conf     docs  LICENSE  NOTICE   README  src     test
> CHANGES  contrib  lib   logs     pom.xml  target  walogs
>
> $ ./bin/accumulo
> org.apache.accumulo.examples.simple.mapreduce.bulk.SetupTable test
> localhost root secret test_bulk row_00000333 row_00000666
>
> $ ./bin/accumulo
> org.apache.accumulo.examples.simple.mapreduce.bulk.GenerateTestData 0 1000
> bulk/test_1.txt
>
> $ ./bin/tool.sh lib/examples-simple-*[^cs].jar
> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample test
> localhost root secret test_bulk bulk tmp/bulkWork
>
> $ ./bin/accumulo
> org.apache.accumulo.examples.simple.mapreduce.bulk.VerifyIngest test
> localhost root secret test_bulk 0 1000
>
> -Eric
>
>
>
> On Thu, Apr 4, 2013 at 9:33 AM, Aji Janis <aji1705@gmail.com> wrote:
>
>> I am not sure it's just a regular expression issue. Below is my console
>> output. Not sure why this NoClassDefFoundError occurs. Has anyone tried to
>> do this successfully? If you did, can you please describe your environment
>> setup?
>>
>>
>> [user@mynode bulk]$ pwd
>> /home/user/bulk
>> [user@mynode bulk]$ ls
>> BulkIngestExample.java  GenerateTestData.java  SetupTable.java
>>  test_1.txt  VerifyIngest.java
>> [user@mynode bulk]$
>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>> Exception in thread "main" java.lang.NoClassDefFoundError:
>> org/apache/accumulo/core/client/Instance
>>         at java.lang.Class.forName0(Native Method)
>>         at java.lang.Class.forName(Class.java:264)
>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.accumulo.core.client.Instance
>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>         ... 3 more
>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>> /opt/accumulo/lib/examples-simple-*[^cs].jar
>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>> Exception in thread "main" java.lang.NoClassDefFoundError:
>> org/apache/accumulo/core/client/Instance
>>         at java.lang.Class.forName0(Native Method)
>>         at java.lang.Class.forName(Class.java:264)
>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.accumulo.core.client.Instance
>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>         ... 3 more
>> *[user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>> /opt/accumulo/lib/examples-simple-*[^c].jar
>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>> Exception in thread "main" java.lang.ClassNotFoundException:
>> /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>         at java.lang.Class.forName0(Native Method)
>>         at java.lang.Class.forName(Class.java:264)
>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>> [user@mynode bulk]$
>>
>>
>>
>> On Wed, Apr 3, 2013 at 4:57 PM, Billie Rinaldi <billie@apache.org> wrote:
>>
>>> On Wed, Apr 3, 2013 at 1:16 PM, Christopher <ctubbsii@apache.org> wrote:
>>>
>>>> Try with -libjars:
>>>>
>>>
>>> tool.sh automatically adds libjars.
>>>
>>> The problem is the regular expression for the examples-simple jar.  It's
>>> trying to exclude the javadoc jar with ^c, but it isn't excluding the
>>> sources jar. /opt/accumulo/lib/examples-simple-*[^cs].jar may work, or you
>>> can just specify the jar exactly,
>>> /opt/accumulo/lib/examples-simple-1.4.2.jar
>>>
>>> */opt/accumulo/bin/tool.sh /opt/accumulo/lib/examples-simple-*[^cs].jar
>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork*
>>>
>>> Billie
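
(Editor's note: the pattern here is a shell glob rather than a true regular
expression, and `[^c]` only rejects names whose final character before `.jar`
is `c`, so the `-sources` jar still matches. A self-contained sketch using
dummy files named like the jars in this thread, under a hypothetical /tmp
directory:)

```shell
# Editor's sketch: reproduce the glob matching with empty files named like
# the jars in /opt/accumulo/lib (dummy names, not the real jars).
mkdir -p /tmp/globdemo && cd /tmp/globdemo
touch examples-simple-1.4.2.jar \
      examples-simple-1.4.2-javadoc.jar \
      examples-simple-1.4.2-sources.jar

# [^c] only rejects a 'c' right before ".jar", so it drops the javadoc jar
# but still matches the sources jar: the glob expands to TWO files, and the
# extra one shifts the arguments tool.sh passes along (hence the
# ClassNotFoundException naming the sources jar).
echo examples-simple-*[^c].jar

# [^cs] rejects both 'c' (javadoc) and 's' (sources), leaving the main jar.
echo examples-simple-*[^cs].jar
```

Specifying the jar exactly, as Billie suggests, avoids depending on glob
expansion at all.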
>>>
>>>
>>>
>>>>
>>>> /opt/accumulo/bin/tool.sh /opt/accumulo/lib/examples-simple-*[^c].jar
>>>> -libjars  /opt/accumulo/lib/examples-simple-*[^c].jar
>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>>
>>>> --
>>>> Christopher L Tubbs II
>>>> http://gravatar.com/ctubbsii
>>>>
>>>>
>>>> On Wed, Apr 3, 2013 at 4:11 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>> > I am trying to run the BulkIngest example (on Accumulo 1.4.2) and I am
>>>> > not able to run the following steps. Here is the error I get:
>>>> >
>>>> > [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
>>>> > /opt/accumulo/lib/examples-simple-*[^c].jar
>>>> > org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
>>>> > myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>>>> > Exception in thread "main" java.lang.ClassNotFoundException:
>>>> > /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>>>> >         at java.lang.Class.forName0(Native Method)
>>>> >         at java.lang.Class.forName(Class.java:264)
>>>> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>>>> > [user@mynode bulk]$
>>>> > [user@mynode bulk]$
>>>> > [user@mynode bulk]$
>>>> > [user@mynode bulk]$ ls /opt/accumulo/lib/
>>>> > accumulo-core-1.4.2.jar
>>>> > accumulo-start-1.4.2.jar
>>>> > commons-collections-3.2.jar
>>>> > commons-logging-1.0.4.jar
>>>> > jline-0.9.94.jar
>>>> > accumulo-core-1.4.2-javadoc.jar
>>>> > accumulo-start-1.4.2-javadoc.jar
>>>> > commons-configuration-1.5.jar
>>>> > commons-logging-api-1.0.4.jar
>>>> > libthrift-0.6.1.jar
>>>> > accumulo-core-1.4.2-sources.jar
>>>> > accumulo-start-1.4.2-sources.jar
>>>> > commons-io-1.4.jar
>>>> > examples-simple-1.4.2.jar
>>>> > log4j-1.2.16.jar
>>>> > accumulo-server-1.4.2.jar
>>>> > cloudtrace-1.4.2.jar
>>>> > commons-jci-core-1.0.jar
>>>> > examples-simple-1.4.2-javadoc.jar
>>>> > native
>>>> > accumulo-server-1.4.2-javadoc.jar
>>>> > cloudtrace-1.4.2-javadoc.jar
>>>> > commons-jci-fam-1.0.jar
>>>> > examples-simple-1.4.2-sources.jar
>>>> > wikisearch-ingest-1.4.2-javadoc.jar
>>>> > accumulo-server-1.4.2-sources.jar
>>>> > cloudtrace-1.4.2-sources.jar
>>>> > commons-lang-2.4.jar
>>>> >  ext
>>>> > wikisearch-query-1.4.2-javadoc.jar
>>>> >
>>>> > [user@mynode bulk]$
>>>> >
>>>> >
>>>> > Clearly, the libraries and source files exist, so I am not sure what's
>>>> > going on. If I put in /opt/accumulo/lib/examples-simple-1.4.2-sources.jar
>>>> > instead, it complains that BulkIngestExample is not found.
>>>> >
>>>> > Suggestions?
>>>> >
>>>> >
>>>> > On Wed, Apr 3, 2013 at 2:36 PM, Eric Newton <eric.newton@gmail.com> wrote:
>>>> >>
>>>> >> You will have to write your own InputFormat class which will parse
>>>> your
>>>> >> file and pass records to your reducer.
>>>> >>
>>>> >> -Eric
>>>> >>
>>>> >>
>>>> >>> On Wed, Apr 3, 2013 at 2:29 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>> >>>
>>>> >>> Looking at the BulkIngestExample, it uses GenerateTestData and creates
>>>> >>> a .txt file which contains key: value pairs. Correct me if I am wrong,
>>>> >>> but each new line is a new row, right?
>>>> >>>
>>>> >>> I need to know how to include families and qualifiers as well. In
>>>> >>> other words:
>>>> >>>
>>>> >>> 1) Do I set up a .txt file that can be converted into an Accumulo RFile
>>>> >>> using AccumuloFileOutputFormat, which can then be imported into my
>>>> >>> table?
>>>> >>>
>>>> >>> 2) If yes, what is the format of the .txt file?
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On Wed, Apr 3, 2013 at 2:19 PM, Eric Newton <eric.newton@gmail.com> wrote:
>>>> >>>>
>>>> >>>> Your data needs to be in the RFile format, and more importantly it
>>>> >>>> needs to be sorted.
>>>> >>>>
>>>> >>>> It's handy to use a Map/Reduce job to convert/sort your data. See the
>>>> >>>> BulkIngestExample.
>>>> >>>>
>>>> >>>> -Eric
>>>> >>>>
>>>> >>>>
>>>> >>>> On Wed, Apr 3, 2013 at 2:15 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>> >>>>>
>>>> >>>>> I have some data in a text file in the following format.
>>>> >>>>>
>>>> >>>>> rowid1 columnFamily1 colQualifier1 value
>>>> >>>>> rowid1 columnFamily1 colQualifier2 value
>>>> >>>>> rowid1 columnFamily2 colQualifier1 value
>>>> >>>>> rowid2 columnFamily1 colQualifier1 value
>>>> >>>>> rowid3 columnFamily1 colQualifier1 value
>>>> >>>>>
>>>> >>>>> I want to import this data into a table in Accumulo. My end goal is
>>>> >>>>> to understand how to use the bulk import feature in Accumulo. I tried
>>>> >>>>> to log in to the Accumulo shell as root and then run:
>>>> >>>>>
>>>> >>>>> #table mytable
>>>> >>>>> #importdirectory /home/inputDir /home/failureDir true
>>>> >>>>>
>>>> >>>>> but it didn't work. My data file was saved as data.txt in
>>>> >>>>> /home/inputDir. I tried to create the dir/file structure in HDFS and
>>>> >>>>> Linux, but neither worked. When trying locally, it keeps complaining
>>>> >>>>> about failureDir not existing:
>>>> >>>>> ...
>>>> >>>>> java.io.FileNotFoundException: File does not exist: failures
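
(Editor's note: the local failure above is expected. In Accumulo 1.4 both
directories passed to `importdirectory` live in HDFS, the failures directory
must already exist and be empty, and the input directory must contain sorted
RFiles rather than plain .txt files; the WARN messages quoted below are the
file-type check rejecting data.txt. A hedged sketch with hypothetical paths,
assuming a running HDFS and Accumulo instance:)

```shell
# Hypothetical paths; requires a running HDFS and Accumulo 1.4 instance.
hadoop fs -mkdir /bulk/failures   # failures dir: must pre-exist and be empty
hadoop fs -ls /bulk/input         # must contain sorted RFiles (e.g. produced
                                  # by AccumuloFileOutputFormat), not .txt
# then, from the Accumulo shell:
#   table mytable
#   importdirectory /bulk/input /bulk/failures true
```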
>>>> >>>>>
>>>> >>>>> When trying with files on HDFS, I get no error on the console, but
>>>> >>>>> the logger had the following messages:
>>>> >>>>> ...
>>>> >>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt does
>>>> >>>>> not have a valid extension, ignoring
>>>> >>>>>
>>>> >>>>> or,
>>>> >>>>>
>>>> >>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt is
>>>> >>>>> not a map file, ignoring
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> Suggestions? Am I not setting up the job right? Thank you in
>>>> >>>>> advance for your help.
>>>> >>>>>
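
(Editor's note: to make the sorting requirement Eric mentions above concrete
for rows like these, here is a minimal local sketch; the Map/Reduce job in
BulkIngestExample performs, in effect, a distributed version of this
row/family/qualifier sort before writing RFiles. The /tmp path and sample
values are hypothetical.)

```shell
# Editor's sketch: the bulk-ingest pipeline requires key-sorted data. For
# whitespace-separated "row family qualifier value" text, a plain sort on
# the first three fields shows the order the RFiles must contain.
cat > /tmp/bulkdemo.txt <<'EOF'
rowid3 columnFamily1 colQualifier1 value
rowid1 columnFamily2 colQualifier1 value
rowid1 columnFamily1 colQualifier2 value
rowid1 columnFamily1 colQualifier1 value
rowid2 columnFamily1 colQualifier1 value
EOF
# LC_ALL=C gives byte-order sorting, matching Accumulo's lexicographic keys.
LC_ALL=C sort -k1,1 -k2,2 -k3,3 /tmp/bulkdemo.txt
```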
>>>> >>>>>
>>>> >>>>> On Wed, Apr 3, 2013 at 2:04 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>> >>>>>>
>>>> >>>>>> I have some data in a text file in the following format:
>>>> >>>>>>
>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>> >>>>>> rowid1 columnFamily colQualifier value
>>>> >>>>>
>>>> >>>>>
>>>> >>>>
>>>> >>>
>>>> >>
>>>> >
>>>>
>>>
>>>
>>
>
