accumulo-dev mailing list archives

From phrocker <...@git.apache.org>
Subject [GitHub] accumulo issue #260: ACCUMULO-4643 initial implementation
Date Thu, 01 Jun 2017 18:55:00 GMT
Github user phrocker commented on the issue:

    https://github.com/apache/accumulo/pull/260
  
    @ivakegg  Yields are not aware of other yields, so they are completely independent and
cannot cooperate with any scheduling mechanism.  My old operating systems book calls this
"uncooperative yielding," but I can see how that can be confusing. Let's call it isolated
yielding.
    
    To your point that "they do it to themselves": an iterator is one among a stack, and
the system may be multi-user. If one iterator would skip just five more keys before
completing but is pre-empted because of another iterator, you can get a yield when one is
not desired. The only way to combat this is solid metrics. Without them you don't know how
many additional RPC calls the yields cause (simply setting the key yield threshold
incorrectly can increase RPCs), and you don't know the I/O load or how many skipped keys is
reasonable. Further, one key is not the same as another: parts of a table could have much
smaller keys. So again, these metrics would tell us what we need to know: how much time was
spent before a yield, the size of the keys skipped, and so on.
    
    Hence those metrics would be useful to show whether this mechanism works as intended in production.
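
    A minimal sketch of the kind of per-scan metrics meant here, written in Python for
brevity (Accumulo itself is Java). The class and method names are hypothetical and are not
part of any Accumulo API; it only illustrates tracking time before a yield, the size of
skipped keys, and the number of yields.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class YieldMetrics:
    """Hypothetical per-scan counters; not an Accumulo class."""

    yields: int = 0
    keys_skipped: int = 0
    bytes_skipped: int = 0
    seconds_before_yield: List[float] = field(default_factory=list)

    def record_skip(self, key: bytes) -> None:
        # Track both the count and the size, since keys differ in size.
        self.keys_skipped += 1
        self.bytes_skipped += len(key)

    def record_yield(self, elapsed_seconds: float) -> None:
        # Each yield implies another round trip for the client.
        self.yields += 1
        self.seconds_before_yield.append(elapsed_seconds)

    def mean_skipped_key_size(self) -> float:
        if self.keys_skipped == 0:
            return 0.0
        return self.bytes_skipped / self.keys_skipped

    def mean_seconds_before_yield(self) -> float:
        if not self.seconds_before_yield:
            return 0.0
        return sum(self.seconds_before_yield) / len(self.seconds_before_yield)
```

    With counters like these, a yield threshold could be judged against observed key sizes
and time spent rather than guessed.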

    
    Then, after metrics, a nice-to-have would be a mechanism that allows the entire scan to
stop. If you are going to set a limit and "yield," you must have a cessation point. I agree
that long-running scans can happen, but the RPC increase and context switching are a problem
that we cannot stop with the current solution. You also need a point at which you have yielded
enough and must stop entirely.
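
    The cessation point could look roughly like the sketch below, again in Python and
under stated assumptions: `scan_with_yield_budget`, `ScanStopped`, and both limits are
hypothetical names for illustration, and the loop is a simplified model of a scan, not
Accumulo's implementation.

```python
from typing import Callable, Iterable, List, Tuple


class ScanStopped(Exception):
    """Raised when a scan has yielded more often than its budget allows."""


def scan_with_yield_budget(
    keys: Iterable[str],
    accept: Callable[[str], bool],
    max_skips_per_yield: int,
    max_yields: int,
) -> Tuple[List[str], int]:
    """Return (accepted keys, number of yields), or raise ScanStopped."""
    matches: List[str] = []
    yields = 0
    skipped = 0
    for key in keys:
        if accept(key):
            matches.append(key)
            skipped = 0  # simplification: an accepted key resets the skip count
            continue
        skipped += 1
        if skipped >= max_skips_per_yield:
            # In a real system control would return to the client here,
            # costing an extra RPC before the scan resumes.
            yields += 1
            skipped = 0
            if yields > max_yields:
                raise ScanStopped(f"stopped after {yields} yields")
    return matches, yields
```

    For example, scanning `["a1", "b1", "b2", "b3", "b4", "a2"]` while accepting keys that
start with `"a"`, with `max_skips_per_yield=2`, yields twice; with `max_yields=1` the same
scan is stopped at the second yield instead of continuing.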


