accumulo-user mailing list archives

From "Terry P." <>
Subject Re: OutOfMemoryError: Java heap space after data load
Date Mon, 29 Apr 2013 22:30:15 GMT
Hi Eric,
Yes, I loaded 4.5 million entries with the shell to verify that things were
working properly.  With three shells running, and 99% of the data going to
a single TabletServer (due to my quick-hack RowKey structure, which I'm
changing to better mimic what the real rowkey structure will be), it
ingested 1,400 entries per second.  Again, it flushed every 10,000 records
(30,000 entries).

Here is the start of the file and 2 logical records (3 entries each) of the
junk data text file:

table junkmeta
insert 20130426172656.954191-3300-04 attr vehicle "3300"
insert 20130426172656.954191-3300-04 attr stream "04"
insert 20130426172656.954191-3300-04 data rawmsg "NzBwWP5xETl7B6eX7GHA9Kb4nv5rt7gx7HZkqtrMRjfwWZmnAIO h"
insert 20130426172656.954348-2200-11 attr vehicle "2200"
insert 20130426172656.954348-2200-11 attr stream "11"
insert 20130426172656.954348-2200-11 data rawmsg
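
A load file in this format could be generated along the following lines. This is only a minimal sketch: the function names gen_record and gen_loadfile, the counter-based timestamps, and the "payloadN" values are hypothetical placeholders, not the generator that actually produced the junk data above.

```shell
#!/bin/sh
# Hypothetical helper: print the three shell "insert" commands that make up
# one logical record. Arguments: timestamp, vehicle id, stream id, payload.
gen_record() {
    row="$1-$2-$3"
    printf 'insert %s attr vehicle "%s"\n' "$row" "$2"
    printf 'insert %s attr stream "%s"\n' "$row" "$3"
    printf 'insert %s data rawmsg "%s"\n' "$row" "$4"
}

# Hypothetical helper: emit the "table" line followed by COUNT records.
# Arguments: table name, record count.
gen_loadfile() {
    table="$1"
    count="$2"
    printf 'table %s\n' "$table"
    i=0
    while [ "$i" -lt "$count" ]; do
        # Placeholder values only; real data would vary the timestamp,
        # vehicle, stream and payload.
        gen_record "$(printf '20130426172656.%06d' "$i")" 3300 04 "payload$i"
        i=$((i + 1))
    done
}
```

For example, gen_loadfile junkmeta 1000 > junk.txt would write one table line plus 3,000 insert lines.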

I ran it from this simple shell script:



echo $(date) Starting Accumulo shell to load $LOADFILE, with output piped to $LOG ... | tee $LOG
/usr/lib/accumulo/bin/accumulo shell -u $AUSER -p $AUSERPWD < $LOADFILE >>
echo $(date) Load complete. | tee $LOG
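
The same wrapper could be written as a small function, sketched below. Two things here are assumptions rather than part of the script as posted: the redirect target on the load line is cut off in the message, so the sketch assumes the shell output is appended to the log file, and the final message uses tee -a, because a plain tee would truncate the log that was just written. The function name run_load and the ACCUMULO_BIN override are hypothetical.

```shell
#!/bin/sh
# Path taken from the message; overridable (hypothetical variable) so the
# function can be tried without a running cluster.
ACCUMULO_BIN="${ACCUMULO_BIN:-/usr/lib/accumulo/bin/accumulo}"

# Arguments: load file, log file, Accumulo user, Accumulo password.
run_load() {
    loadfile="$1"
    log="$2"
    user="$3"
    pass="$4"
    # The first tee creates (or truncates) the log and echoes to the terminal.
    echo "$(date) Starting Accumulo shell to load $loadfile, with output piped to $log ..." | tee "$log"
    # Assumption: shell output is appended to the same log file.
    "$ACCUMULO_BIN" shell -u "$user" -p "$pass" < "$loadfile" >> "$log"
    # tee -a appends, so the earlier log contents are kept.
    echo "$(date) Load complete." | tee -a "$log"
}
```

It would be called as run_load "$LOADFILE" "$LOG" "$AUSER" "$AUSERPWD".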

Crude, but effective enough to validate the cluster is functioning well so
the developers can poke at it with their real programs. ;-)

On Apr 29, 2013, at 2:32 PM, Eric Newton <> wrote:

> > For a quick test I have a text file I generated to load 500,000 rows of
> > sample data using the Accumulo shell.
>
> So you used the shell to insert lots of data?  One cell at a time?

