accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Bulk import
Date Wed, 12 Oct 2016 02:39:07 GMT
For only 4GB of data, you don't need to do bulk ingest. That is serious 
overkill.
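
If the client is simply outrunning the tablet servers, the easiest fix is to
keep using the batch writer but shrink its buffer and thread count so it
flushes smaller batches more often. A minimal sketch with pyaccumulo (the
host, port, credentials, table name, the record_iter() generator over your
parsed BSON documents, and the row/column mapping are all placeholders here,
and the create_batch_writer keyword arguments assume a reasonably recent
pyaccumulo):

    from pyaccumulo import Accumulo, Mutation

    conn = Accumulo(host="proxy-host", port=42424, user="root", password="secret")
    table = "mytable"
    if not conn.table_exists(table):
        conn.create_table(table)

    # Small client-side buffer and few send threads so the writer cannot
    # outrun the tablet servers.
    writer = conn.create_batch_writer(table, max_memory=10 * 1024 * 1024,
                                      latency_ms=30 * 1000, threads=4)

    for doc in record_iter():          # placeholder: yields parsed BSON documents
        m = Mutation(doc["row"])       # placeholder row/column mapping
        m.put(cf="data", cq="json", val=str(doc))
        writer.add_mutation(m)

    writer.close()
    conn.close()

If the tservers still run out of memory with a small client buffer, the
problem is the server-side sizing, not the ingest path.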

I don't know why the master would have died/become unresponsive. It is 
minimally involved with the write-pipeline.

Can you share your current accumulo-env.sh/accumulo-site.xml? Have you 
followed the Accumulo user manual to adjust the configuration to match 
the resources actually available on the 3 nodes where Accumulo is running?
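
The stock files are sized for a tiny demo instance, and the usual culprit is
the tserver heap versus the in-memory map and cache sizes. Purely as an
illustration (these values are placeholders, not recommendations; size them
to the memory actually free on your nodes), the accumulo-site.xml properties
I'd look at first are along these lines:

    <!-- illustrative sizes only: keep the map plus caches well under the
         tserver's -Xmx, or move the map off-heap by enabling native maps -->
    <property>
      <name>tserver.memory.maps.max</name>
      <value>1G</value>
    </property>
    <property>
      <name>tserver.cache.data.size</name>
      <value>256M</value>
    </property>
    <property>
      <name>tserver.cache.index.size</name>
      <value>128M</value>
    </property>

The -Xmx in ACCUMULO_TSERVER_OPTS in accumulo-env.sh then has to be big
enough to hold the map and caches, unless you enable the native maps from
the second link below, in which case the in-memory map lives off the Java heap.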

http://accumulo.apache.org/1.7/accumulo_user_manual.html#_pre_splitting_new_tables

http://accumulo.apache.org/1.7/accumulo_user_manual.html#_native_map

http://accumulo.apache.org/1.7/accumulo_user_manual.html#_troubleshooting
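
On the pre-splitting point specifically: a brand-new table is a single tablet
hosted by a single tserver, so every write initially lands on one node. Adding
split points that match your row keys before you start writing spreads the
ingest across all three servers. The split points below are made up (pick
boundaries that actually divide your key space), and I'm going through the
proxy's addSplits call on the underlying Thrift client since I don't recall
whether pyaccumulo wraps it directly; the shell's addsplits command does the
same job:

    # Hypothetical split points; choose boundaries that divide your keys evenly.
    splits = {"row_2", "row_4", "row_6", "row_8"}
    conn.client.addSplits(conn.login, table, splits)

From the shell, the equivalent is: addsplits row_2 row_4 row_6 row_8 -t mytable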

Yamini Joshi wrote:
> Hello
>
> I am trying to import data from a BSON file into a 3-node Accumulo cluster
> using pyaccumulo. The BSON file is 4 GB and has a lot of records, all to
> be stored in one table. I tried a very naive approach and used
> pyaccumulo's batch writer to write to the table. After parsing some
> records, my master became unresponsive and shut down, with the tserver
> threads stuck on a low-memory error. I am assuming that the records are
> created faster than the proxy/master can handle. Is there any other
> way to go about this? I am thinking of using bulk ingest, but I am not
> sure exactly how.
>
> Best regards,
> Yamini Joshi
