accumulo-dev mailing list archives

From Josh Elser <>
Subject Re: Ingest speed
Date Tue, 05 May 2015 17:47:09 GMT
Yes, a BatchWriter is for one table only. If you're writing to multiple 
tables, the MultiTableBatchWriter might be helpful. The 
MultiTableBatchWriter does the same thing that managing multiple 
BatchWriters would do but shares the memory usage.
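
For illustration only, here is a rough, dependency-free sketch of the idea: records are routed by type to one logical writer per table, while all tables draw on a single shared buffer budget. The class name, the table names (ids_alert, ids_dns, ids_http) and the size accounting are all hypothetical, and this is a toy model rather than Accumulo's implementation. With the real client you would obtain a MultiTableBatchWriter from Connector.createMultiTableBatchWriter(BatchWriterConfig) and call getBatchWriter(tableName) on it to get the per-table BatchWriter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model (NOT the Accumulo API): one pending queue per table,
// but a single buffer budget shared by all of them.
class SharedBufferRouter {
    private final long maxBufferedSize;   // shared budget, in characters (approximation)
    private long bufferedSize = 0;
    private final Map<String, List<String>> pending = new LinkedHashMap<>();
    private final Map<String, Integer> flushedCounts = new LinkedHashMap<>();

    SharedBufferRouter(long maxBufferedSize) {
        this.maxBufferedSize = maxBufferedSize;
    }

    // Hypothetical mapping from IDS record type to table name.
    static String tableFor(String recordType) {
        switch (recordType) {
            case "ALERT": return "ids_alert";
            case "DNS":   return "ids_dns";
            case "HTTP":  return "ids_http";
            default:
                throw new IllegalArgumentException("Unknown record type: " + recordType);
        }
    }

    // Queue a record for its table; flush every table once the shared budget is reached.
    void add(String recordType, String json) {
        pending.computeIfAbsent(tableFor(recordType), t -> new ArrayList<>()).add(json);
        bufferedSize += json.length();
        if (bufferedSize >= maxBufferedSize) {
            flush();
        }
    }

    // Stand-in for sending queued data to the servers: count and clear each queue.
    void flush() {
        for (Map.Entry<String, List<String>> e : pending.entrySet()) {
            flushedCounts.merge(e.getKey(), e.getValue().size(), Integer::sum);
            e.getValue().clear();
        }
        bufferedSize = 0;
    }

    int flushedCount(String table) {
        return flushedCounts.getOrDefault(table, 0);
    }

    long bufferedSize() {
        return bufferedSize;
    }
}
```

The point of the sketch is that a burst of ALERT records and a trickle of DNS records share one budget, instead of each table reserving its own buffer as three independent BatchWriters would.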

Are you familiar with Hadoop's MapReduce framework?

MapReduce jobs accept data from InputFormats and write data to 
OutputFormats. Specifically, the FileInputFormat allows your MapReduce 
jobs to read data from HDFS and the AccumuloOutputFormat will write 
Mutations to an Accumulo table. Unless you have many nodes with lots and 
lots of data constantly flowing in, MapReduce might be overkill. I just 
thought I'd mention it though.

Keep in touch -- wouldn't want to keep you from being able to graduate :)

Revan1988 wrote:
> Each BatchWriter is for only one table (isn't it?).
> I need to split my JSON records across 3 tables (my records come from an IDS, so
> I have to separate the ALERT, DNS and HTTP record types).
> So maybe I can use 3 BatchWriters... I'll try!!
> And what about FileInputFormat and AccumuloOutputFormat? I'm sorry, but I
> don't know them very well... do you have any website, PDF or sample that I can
> study?
> Thank you again!
> I want to do good work because it is the project for my MSc
> graduation... but here at my university no one knows much about Accumulo.
> -----
> Andrea Leoni
> Italy
> Computer Engineering
