accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-1188) Sandbox iterators
Date Mon, 18 Mar 2013 20:11:17 GMT


Keith Turner commented on ACCUMULO-1188:

Seems like executing iterators in a separate process would be a good way to go.  

 * Have a pool of Java processes for executing iterators, because starting a Java process
is not quick.  Reusing processes can have its own problems.  Should probably throw a process
away any time there is an exception in the iterator stack.
 * External processes could read RFiles directly from HDFS.
 * External processes would not benefit from the tserver's current in-memory file cache.  Store
the cache in shared memory?  Take a look at the latest HBase cache code.
 * For compactions, external processes could also write files directly to HDFS.
 * For scans, need to figure out the best way to get data from the external process back to
the client.  The client connects to the tserver process.
 * Scans and minor compactions will need to read the in-memory map from the tablet server.
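
A minimal sketch of the first point above (reuse workers, but throw one away whenever its
task fails).  The names Worker and WorkerPool are hypothetical and not part of Accumulo; a
real Worker would presumably wrap a java.lang.Process started with ProcessBuilder, with
destroy() calling Process.destroy().  Pool sizing and limits are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical stand-in for one external iterator process. */
interface Worker {
    void destroy();
}

/** Reuses idle workers; discards any worker whose task did not complete normally. */
final class WorkerPool<W extends Worker> {
    private final Deque<W> idle = new ArrayDeque<>();
    private final Supplier<W> factory;

    WorkerPool(Supplier<W> factory) {
        this.factory = factory;
    }

    <R> R run(Function<W, R> task) {
        W worker;
        synchronized (idle) {
            worker = idle.pollFirst();
        }
        if (worker == null) {
            worker = factory.get(); // slow path: nothing idle, start a new worker
        }
        boolean ok = false;
        try {
            R result = task.apply(worker);
            ok = true;
            return result;
        } finally {
            if (ok) {
                synchronized (idle) {
                    idle.addFirst(worker); // healthy worker goes back for reuse
                }
            } else {
                worker.destroy(); // state is unknown after a failure, so do not reuse
            }
        }
    }

    int idleCount() {
        synchronized (idle) {
            return idle.size();
        }
    }
}
```

The finally block, rather than a catch, is used so that a worker is also discarded when the
task throws an Error and not only a RuntimeException.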

This seems really easy for major compactions: the external process would just read RFiles
and write an RFile using the iterator stack.  Scans seem trickier because of the cache, the
in-memory map, and efficiently getting data back to the client.
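
To illustrate why the major compaction case looks simple, here is a rough sketch of its
shape: read several sorted inputs, pass each entry through user logic, write one sorted
output.  Everything in it is an assumption made for illustration.  In-memory sorted maps
stand in for RFiles, a per-entry filter stands in for the iterator stack (real iterators can
also transform and combine entries), a later input simply overrides an earlier one for the
same key, and nothing is streamed.  CompactionSketch is a hypothetical name, not Accumulo code.

```java
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.function.BiPredicate;

final class CompactionSketch {
    /**
     * Merges the inputs (ordered oldest first) and keeps only the entries
     * accepted by userFilter, returning a single sorted result.
     */
    static SortedMap<String, String> compact(
            List<SortedMap<String, String>> inputs,
            BiPredicate<String, String> userFilter) {
        // Merge step: a later (newer) input overrides an earlier one per key.
        SortedMap<String, String> merged = new TreeMap<>();
        for (SortedMap<String, String> input : inputs) {
            merged.putAll(input);
        }
        // "Iterator stack" step: user code decides which entries survive.
        SortedMap<String, String> output = new TreeMap<>();
        for (Map.Entry<String, String> e : merged.entrySet()) {
            if (userFilter.test(e.getKey(), e.getValue())) {
                output.put(e.getKey(), e.getValue());
            }
        }
        return output;
    }
}
```

Inputs and output here are plain files-in, file-out data with no client connection or shared
cache involved, which is the property that makes compactions the easier case to move out of
the tablet server.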

> Sandbox iterators
> -----------------
>                 Key: ACCUMULO-1188
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>            Reporter: Keith Turner
>             Fix For: 1.6.0
> It's possible that a user iterator can bring down a tablet server, for example if it
hits an OOM or creates too many threads.  It would be nice if iterators could be sandboxed in
some way.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
