accumulo-dev mailing list archives

From keith-turner <...@git.apache.org>
Subject [GitHub] accumulo pull request: ACCUMULO-3602 BatchScanner optimization for...
Date Wed, 08 Apr 2015 21:12:31 GMT
Github user keith-turner commented on the pull request:

    https://github.com/apache/accumulo/pull/25#issuecomment-91038638
  
    > Agreed, but is this a new issue?
    
    I think the intent of this PR is new.  I think the intent is to efficiently support a
MapReduce job that reads from many small ranges.  If we are going to do that, then let's do
it in a way that scales better.
    
    > However, I still think that for many things, it's probably better to simply use an
iterator with some filter criteria.
    
    This approach would work well for the existing input format.  The only drawback to this approach
is that it requires adding custom iterators to the tservers' classpaths.  The batch scanner approach
allows efficient reads of many small ranges from a tablet without custom iterators.  It just needs
a custom function in the MapReduce job to generate the ranges.
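
    As a rough sketch of what such a range-generating function might look like: the
`RangeGenerator` and `RowRange` names below are hypothetical stand-ins so the example runs
without Accumulo on the classpath, and nothing here is taken from the PR itself.  It assumes
the job already knows the row IDs it wants and simply turns them into sorted, de-duplicated
single-row ranges.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class RangeGenerator {

    /** Hypothetical stand-in for a scan range: an inclusive [startRow, endRow] span. */
    public static final class RowRange {
        public final String startRow;
        public final String endRow;

        public RowRange(String startRow, String endRow) {
            this.startRow = startRow;
            this.endRow = endRow;
        }

        @Override
        public String toString() {
            return "[" + startRow + ", " + endRow + "]";
        }
    }

    /**
     * Builds one single-row range per distinct row ID.
     * Sorting and de-duplicating up front is an assumption of this sketch,
     * made so that each row is requested only once.
     */
    public static List<RowRange> rangesForRows(Iterable<String> rowIds) {
        TreeSet<String> sorted = new TreeSet<>();
        for (String rowId : rowIds) {
            sorted.add(rowId);
        }
        List<RowRange> ranges = new ArrayList<>();
        for (String rowId : sorted) {
            ranges.add(new RowRange(rowId, rowId));
        }
        return ranges;
    }
}
```

    In a real job the stand-in type would presumably be replaced by Accumulo's own `Range`
(for example `new Range(row)`), and the resulting collection handed to
`BatchScanner.setRanges(...)`.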


