accumulo-user mailing list archives

From Yamini Joshi
Subject Bulk import
Date Tue, 11 Oct 2016 21:24:52 GMT

I am trying to import data from a BSON file into a 3-node Accumulo cluster
using pyaccumulo. The file is 4 GB and contains a large number of records,
all of which go into one table. I tried a naive approach first: writing
every record through a pyaccumulo batch writer. After some of the records
had been parsed, my master became unresponsive and shut down, with the
tserver threads stuck on a low-memory error. I assume the records are being
created faster than the proxy/master can handle. Is there any other way to
go about this? I am thinking of using bulk ingest, but I am not sure how
exactly to set it up.
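
A minimal sketch of reading the file in bounded batches rather than all at
once, assuming the file is a plain concatenation of BSON documents (each one
prefixed with its own little-endian int32 length). The helper names
iter_bson_documents and batched, and the batch size, are hypothetical; the
actual write to Accumulo is left out because it depends on the proxy setup.

```python
import io
import struct
from itertools import islice


def iter_bson_documents(stream):
    """Yield the raw bytes of each BSON document in a binary stream.

    Only one document is held in memory at a time.
    """
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of file
        if len(header) < 4:
            raise ValueError("truncated BSON length prefix")
        # The length counts the whole document, including these 4 bytes.
        (length,) = struct.unpack("<i", header)
        if length < 5:
            raise ValueError("invalid BSON document length: %d" % length)
        body = stream.read(length - 4)
        if len(body) != length - 4:
            raise ValueError("truncated BSON document")
        yield header + body


def batched(items, size):
    """Group an iterable into lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Five empty BSON documents stand in for a real dump file here.
    sample = io.BytesIO(b"\x05\x00\x00\x00\x00" * 5)
    for batch in batched(iter_bson_documents(sample), 2):
        # Hypothetical step: decode each document, write the batch,
        # then flush before reading the next batch.
        print(len(batch))
```

With a real file, the stream would come from open(path, "rb"), and each
batch would be written and flushed before the next one is read, so the
client never holds more than one batch of records at a time.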

Best regards,
Yamini Joshi
