accumulo-user mailing list archives

From Dylan Hutchison <dhutc...@mit.edu>
Subject Scanning with many singleton ranges?
Date Thu, 02 Apr 2015 22:16:37 GMT
A friend of mine has a use case where he wants to scan ~1M individual rows,
scattered across a ~15GB table.  He performed the following:

1. Gather a List of Range objects, each one a singleton range spanning an
entire row.
2. Create a BatchScanner with one read thread.
3. Set the ranges via BatchScanner.setRanges().
4. Start iterating through the scanner.

Performing these steps crashed the TabletServer for my friend (I haven't had
time to verify it myself yet). We're using a single-node standalone Accumulo
1.6.1 instance.

Is this a bad way to use Accumulo?  I advised my friend to batch the reads
into groups of ~10k ranges and see if that helps.  I wanted to check with
the community and see if we're doing something weird.  If this should have
worked, I can try to put together a test case that reproduces it by creating
a table with many entries and then scanning it with many ranges.
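
For reference, the batching I suggested could look roughly like the sketch
below. It is deliberately generic and does not touch the Accumulo API: the
class name RangeBatcher, the method forEachBatch, and the scanBatch callback
are hypothetical names made up for illustration. In real use the element type
would be Range and the callback would hand each group to
BatchScanner.setRanges() and then iterate the scanner. It uses Java 8 lambdas
for brevity, and I have not checked whether this actually avoids the crash.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: splits a large list of ranges into fixed-size groups
// and passes each group to a caller-supplied scan callback.
final class RangeBatcher {
    private RangeBatcher() {}

    // Calls scanBatch once per group of at most batchSize ranges, in order.
    // Returns the number of groups processed.
    static <R> int forEachBatch(List<R> ranges, int batchSize, Consumer<List<R>> scanBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int start = 0; start < ranges.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ranges.size());
            // subList is a view of the original list, so no ranges are copied.
            scanBatch.accept(ranges.subList(start, end));
            batches++;
        }
        return batches;
    }
}
```

With ~1M ranges and a batch size of 10,000 this would make about 100 calls to
the callback, each one a separate setRanges() followed by a full iteration.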

Thanks,
Dylan Hutchison
