accumulo-dev mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Ingest speed
Date Tue, 05 May 2015 14:09:59 GMT
Hi Revan,

You likely don't want to use the shell as a means of ingest, because you
will get abysmal performance (each record you insert will create a
BatchWriter, write one record, and close the BatchWriter).

But, if bad performance is ok, using a large file of shell commands with 
the `shell -f ..` option you noted would work. This would be slightly 
more efficient, as you said (not re-authenticating with Accumulo for 
every insert). Before you start inserting records, you can switch to a 
table context using the `table <tablename>` command.
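
As a rough illustration of that command-file approach, here is a minimal Java sketch that writes a `table <tablename>` line followed by one `insert` line per record. The class `ShellCommandFile`, the table name `logs`, the file name `inserts.txt`, and the sample records are all hypothetical. The quoting rule (double quotes with backslash escapes) is an assumption about how the shell tokenizes arguments, so check it against your Accumulo version before relying on it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Accumulo): builds a command file
// intended for `accumulo shell -f <file>`.
class ShellCommandFile {

    // Assumption: the shell accepts double-quoted arguments and
    // backslash-escaped quotes/backslashes inside them.
    static String quote(String field) {
        if (field.contains("\n") || field.contains("\r")) {
            // The file holds one command per line, so line breaks cannot be kept.
            throw new IllegalArgumentException("field must not contain a line break");
        }
        return "\"" + field.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // The first line switches the table context; every later line is one insert.
    // Each record is {row, family, qualifier, value}.
    static List<String> buildCommands(String table, List<String[]> records) {
        List<String> lines = new ArrayList<>();
        lines.add("table " + table);
        for (String[] r : records) {
            lines.add("insert " + quote(r[0]) + " " + quote(r[1]) + " "
                    + quote(r[2]) + " " + quote(r[3]));
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        List<String[]> records = Arrays.asList(
                new String[] {"row1", "log", "msg", "service started"},
                new String[] {"row2", "log", "msg", "user \"bob\" logged in"});
        Path out = Paths.get("inserts.txt");
        Files.write(out, buildCommands("logs", records), StandardCharsets.UTF_8);
    }
}
```

The resulting file could then be passed to the shell in a single invocation, e.g. `./bin/accumulo shell -u username -p passw -f inserts.txt`. This only avoids the repeated login; each `insert` still pays the per-record BatchWriter cost described above.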

Feel free to open an issue on JIRA to add a "-t" option to the insert 
command, as this would be a good addition 
(https://issues.apache.org/jira/secure/CreateIssue!default.jspa).


Revan1988 wrote:
> Hi,
> I'm an Italian student and I'm about to graduate with a thesis about
> using Accumulo.
> I've developed a little Java application that reads logs from a .json file
> and inserts them into Accumulo.
> In my virtual machine, my app's insert rate is about 6,000 per second.
> I see that there are some benchmark tests that score about 50,000 in my VM.
> (They are in the $ACCUMULO_HOME/test/system/* folder.)
> Those tests use a shell call to insert a large amount of data.
> So I think that I could write all my inserts using an accumulo shell call
> in my app.
> I've seen that I can execute this command:
>
> ./bin/accumulo shell -u username -p passw -e "insert row fam qual val [vis]
> [timestamp]"
>
> but there is a problem: I need to set the table first with the table
> command (in fact, the insert command has no -t option).
>
> So is there any way to execute two commands in a row in the accumulo shell?
> I tried &&, ;, and other separators, but with no success.
>
> The other option that I have is to write a big command file with all the
> commands that I need and send it to the accumulo shell using the command
>
> ./bin/accumulo shell -f <file>
>
> It may be a better solution because I'll connect to Accumulo just one time
> (and not for every insert)...
>
> Any suggestions?
>
> Thank you everybody and sorry for my bad English.
>
> Revan
>
>
>
> --
> View this message in context: http://apache-accumulo.1065345.n5.nabble.com/Ingest-speed-tp14005.html
> Sent from the Developers mailing list archive at Nabble.com.
