accumulo-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Eugene Cheipesh <>
Subject AccumuloInputFormat and spark
Date Mon, 16 Feb 2015 20:10:07 GMT

This is more of a use-case report and a request for comment.

I am using Accumulo as a source for Spark RDDs through AccumuloInputFormat. My index is based
on a z-order space filing curve. When I decompose  a bounding box into index ranges I can
end up with a large number of Ranges, 3k+ is not too unusual. Getting a fast response from
Accumulo is not at all an issue. It would be possible to generate approximate ranges and use
a Filter to refine them on server side but that only delays the problem.

The ideal scenario is for Spark executors to be co-located with Accumulo tservers and number
of splits per server to be roughly equal to the number of cores on the machine. 

However, AccumuloInputFormat maps each range to a Split and Spark maps every split to a
Task. It is nature of z-order curve that some of these ranges contain only a few tiles while
others contain a pretty big chunk. Having significantly more splits than cores prevents good
concurrency on fetches. This is a problem that BatchScanner is designed to fix but it’s
not used in AccumuloInputFormat.

I noticed that TabletLocator which is used by AccumuloInputFormat returns a structure that
looks like it breaks down ranges by host and then by tablet. I hacked together an InputFormat
that generates a split per tablet and a Reader that uses a BatchScanner. The performance
for spark use-case was orders of magnitude better. I end up with about 50 splits for the same
dataset.  I can’t give exact numbers because I gave up timing the original source. This
seems is a pretty good compromise since the number of splits can be dynamically controlled to
tune the distribution and granularity of calculation batches.

A drawback is that most modes can not support this operation directly: client side, offline,
and isolated scans require a single range iterator. So some additional code would be required
for juggling them.

What are your thoughts on this use case and its requirements? Is this a legitimate use of

It would be nice if AccumuloInputFormat was able to use BatchScanner, perhaps as an option.
Accumulo is designed to crunch through large number of ranges so I would guess this to be
a common issue. I’d be willing to take a stab at a PR if there is agreement on that.

Eugene Cheipesh

View raw message