accumulo-user mailing list archives

From Anthony F <afc...@gmail.com>
Subject Bulk ingest losing tablet server
Date Mon, 13 Jan 2014 13:11:23 GMT
When I bulk import the results of a MapReduce job, I lose one or more
tservers.  After the job finishes and the bulk import is kicked off, I
observe the following in the lost tserver's logs:

2014-01-10 23:14:21,312 [zookeeper.DistributedWorkQueue] INFO : Got unexpected zookeeper event: None for /accumulo/f76cacfa-e117-4999-893a-1eba79920f2c/recovery
2014-01-10 23:14:21,312 [zookeeper.DistributedWorkQueue] INFO : Got unexpected zookeeper event: None for /accumulo/f76cacfa-e117-4999-893a-1eba79920f2c/bulk_failed_copyq
2014-01-10 23:14:21,369 [zookeeper.DistributedWorkQueue] ERROR: Failed to look for work
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /accumulo/f76cacfa-e117-4999-893a-1eba79920f2c/bulk_failed_copyq

However, the bulk import itself succeeds and the data in the table is
intact.  I have to restart the tserver each time this happens, which is
not viable in production.

I am using Accumulo 1.5.0.  Tservers have 12G of RAM and index caching, CF
bloom filters, and groups are turned on for the table in question.  Any
ideas why this might be happening?

Thanks,
Anthony
