Subject: Re: importdirectory in accumulo
From: Christopher
To: user@accumulo.apache.org
Date: Wed, 3 Apr 2013 16:16:29 -0400

Try with -libjars:

/opt/accumulo/bin/tool.sh /opt/accumulo/lib/examples-simple-*[^c].jar \
    -libjars /opt/accumulo/lib/examples-simple-*[^c].jar \
    org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample \
    myinstance zookeepers user pswd tableName inputDir tmp/bulkWork

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Wed,
Apr 3, 2013 at 4:11 PM, Aji Janis wrote:
> I am trying to run the BulkIngest example (on 1.4.2 Accumulo) and I am
> not able to run the following steps. Here is the error I get:
>
> [user@mynode bulk]$ /opt/accumulo/bin/tool.sh /opt/accumulo/lib/examples-simple-*[^c].jar org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
> Exception in thread "main" java.lang.ClassNotFoundException: /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:264)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>
> [user@mynode bulk]$ ls /opt/accumulo/lib/
> accumulo-core-1.4.2.jar
> accumulo-core-1.4.2-javadoc.jar
> accumulo-core-1.4.2-sources.jar
> accumulo-server-1.4.2.jar
> accumulo-server-1.4.2-javadoc.jar
> accumulo-server-1.4.2-sources.jar
> accumulo-start-1.4.2.jar
> accumulo-start-1.4.2-javadoc.jar
> accumulo-start-1.4.2-sources.jar
> cloudtrace-1.4.2.jar
> cloudtrace-1.4.2-javadoc.jar
> cloudtrace-1.4.2-sources.jar
> commons-collections-3.2.jar
> commons-configuration-1.5.jar
> commons-io-1.4.jar
> commons-jci-core-1.0.jar
> commons-jci-fam-1.0.jar
> commons-lang-2.4.jar
> commons-logging-1.0.4.jar
> commons-logging-api-1.0.4.jar
> examples-simple-1.4.2.jar
> examples-simple-1.4.2-javadoc.jar
> examples-simple-1.4.2-sources.jar
> ext
> jline-0.9.94.jar
> libthrift-0.6.1.jar
> log4j-1.2.16.jar
> native
> wikisearch-ingest-1.4.2-javadoc.jar
> wikisearch-query-1.4.2-javadoc.jar
>
> Clearly, the libraries and source file exist, so I am not sure what's
> going on. I tried putting in
> /opt/accumulo/lib/examples-simple-1.4.2-sources.jar instead; then it
> complains that BulkIngestExample was not found (ClassNotFoundException).
>
> Suggestions?
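A quick way to see why tool.sh choked on the glob: the pattern `examples-simple-*[^c].jar` excludes only names whose last character before `.jar` is `c` (the javadoc jar), so it still matches both the main jar and the sources jar. Sketch with dummy files in a scratch directory (paths hypothetical):

```shell
# Recreate the relevant jar names from the lib listing above
mkdir -p /tmp/globdemo && cd /tmp/globdemo
touch examples-simple-1.4.2.jar \
      examples-simple-1.4.2-sources.jar \
      examples-simple-1.4.2-javadoc.jar

# The javadoc jar is excluded, but TWO jars still match:
echo examples-simple-*[^c].jar
```

Because the glob expands to two paths, tool.sh receives an extra jar argument where a class name is expected, and that path gets mangled into the `examples-simple-1/4/2-sources/jar` "class" in the exception above. Passing a single explicit jar (or adding -libjars, as suggested in the reply) avoids the ambiguity.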
>
> On Wed, Apr 3, 2013 at 2:36 PM, Eric Newton wrote:
>> You will have to write your own InputFormat class which will parse your
>> file and pass records to your reducer.
>>
>> -Eric
>>
>> On Wed, Apr 3, 2013 at 2:29 PM, Aji Janis wrote:
>>> Looking at the BulkIngestExample, it uses GenerateTestData and creates
>>> a .txt file which contains key/value pairs and, correct me if I am
>>> wrong, each new line is a new row, right?
>>>
>>> I need to know how to have family and qualifiers also. In other words:
>>>
>>> 1) Do I set up a .txt file that can be converted into an Accumulo
>>> RFile using AccumuloFileOutputFormat, which can then be imported into
>>> my table?
>>> 2) If yes, what is the format of the .txt file?
>>>
>>> On Wed, Apr 3, 2013 at 2:19 PM, Eric Newton wrote:
>>>> Your data needs to be in the RFile format, and more importantly it
>>>> needs to be sorted.
>>>>
>>>> It's handy to use a Map/Reduce job to convert/sort your data. See the
>>>> BulkIngestExample.
>>>>
>>>> -Eric
>>>>
>>>> On Wed, Apr 3, 2013 at 2:15 PM, Aji Janis wrote:
>>>>> I have some data in a text file in the following format:
>>>>>
>>>>> rowid1 columnFamily1 colQualifier1 value
>>>>> rowid1 columnFamily1 colQualifier2 value
>>>>> rowid1 columnFamily2 colQualifier1 value
>>>>> rowid2 columnFamily1 colQualifier1 value
>>>>> rowid3 columnFamily1 colQualifier1 value
>>>>>
>>>>> I want to import this data into a table in Accumulo. My end goal is
>>>>> to understand how to use the bulk import feature in Accumulo. I
>>>>> tried to log in to the Accumulo shell as root and then run:
>>>>>
>>>>> #table mytable
>>>>> #importdirectory /home/inputDir /home/failureDir true
>>>>>
>>>>> but it didn't work. My data file was saved as data.txt in
>>>>> /home/inputDir. I tried to create the dir/file structure in HDFS and
>>>>> on the local filesystem, but neither worked. When trying locally, it
>>>>> keeps complaining about failureDir not existing:
>>>>> ...
>>>>> java.io.FileNotFoundException: File does not exist: failures
>>>>>
>>>>> When trying with files on HDFS, I get no error on the console, but
>>>>> the logger had the following messages:
>>>>> ...
>>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt does
>>>>> not have a valid extension, ignoring
>>>>>
>>>>> or:
>>>>>
>>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt is
>>>>> not a map file, ignoring
>>>>>
>>>>> Suggestions? Am I not setting up the job right? Thank you in advance
>>>>> for the help.
>>>>>
>>>>> On Wed, Apr 3, 2013 at 2:04 PM, Aji Janis wrote:
>>>>>> I have some data in a text file in the following format:
>>>>>>
>>>>>> rowid1 columnFamily colQualifier value
>>>>>> rowid1 columnFamily colQualifier value
>>>>>> rowid1 columnFamily colQualifier value
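Summing up the thread: importdirectory only loads files it recognizes as sorted Accumulo files (RFiles, which carry an .rf extension, or the older map-file directories), which is why a plain data.txt is skipped with the warnings above, and the failure directory must already exist in HDFS before the command runs. Both constraints can be sketched with plain shell (file names hypothetical; the real conversion to RFiles is what the BulkIngestExample M/R job does):

```shell
# 1) Bulk-loaded data must be sorted by (row, family, qualifier).
#    The whitespace-separated text format above sorts correctly with
#    a byte-order sort:
mkdir -p /tmp/bulkdemo
printf '%s\n' \
  'rowid2 columnFamily1 colQualifier1 value' \
  'rowid1 columnFamily1 colQualifier2 value' \
  'rowid1 columnFamily1 colQualifier1 value' > /tmp/bulkdemo/data.txt
LC_ALL=C sort /tmp/bulkdemo/data.txt -o /tmp/bulkdemo/data.sorted.txt

# 2) importdirectory ignores files without a recognized extension,
#    mimicking the "does not have a valid extension" warning:
touch /tmp/bulkdemo/part-00000.rf
for f in /tmp/bulkdemo/*; do
  case "$f" in
    *.rf) echo "would import: $f" ;;   # part-00000.rf would be imported
    *)    echo "ignored: $f" ;;        # the .txt files are skipped
  esac
done
```

The failureDir complaint is a separate issue: the failures directory has to be created beforehand (e.g. with hadoop fs -mkdir) in the same HDFS instance that Accumulo uses, not on the local filesystem.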