accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-624) iterators may open lots of compressors
Date Mon, 28 Dec 2015 18:13:49 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072964#comment-15072964 ]

Eric Newton commented on ACCUMULO-624:
--------------------------------------

I wrote a little experiment: 10 threads allocate 100K decompressors each.

Using {{gz.returnDecompressor(gz.getDecompressor())}} all threads complete in 1.4 seconds.

Using {{gz.getCodec().createDecompressor()}} all threads complete in 20 seconds.

So the pool is about 14 times faster. But allocating decompressors without the pool
still takes less than a millisecond each: 100K allocations per thread in 20 seconds is about 0.2 ms per allocation.
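
For anyone who wants to try something similar, here is a minimal sketch of that kind of experiment using only the JDK. It is not the code I ran: {{java.util.zip.Inflater}} stands in for the Hadoop decompressor, and {{InflaterPool}} is a hypothetical, unbounded pool that stands in for {{gz.getDecompressor()}} / {{gz.returnDecompressor()}}. Timings from this sketch should not be expected to match the numbers above.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.zip.Inflater;

class DecompressorAllocationBenchmark {

  // Hypothetical stand-in for a codec pool: an unbounded queue of idle Inflaters.
  // Like the pool described in this issue, it never shrinks.
  static final class InflaterPool {
    private final ConcurrentLinkedQueue<Inflater> idle = new ConcurrentLinkedQueue<>();

    Inflater borrow() {
      Inflater inflater = idle.poll();
      return inflater != null ? inflater : new Inflater();
    }

    void giveBack(Inflater inflater) {
      inflater.reset(); // clear state so the instance can be reused
      idle.add(inflater);
    }

    int idleCount() {
      return idle.size();
    }
  }

  // Runs task perThread times on each of threads threads; returns elapsed wall-clock millis.
  static long timeMillis(int threads, int perThread, Runnable task) {
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    long start = System.nanoTime();
    for (int t = 0; t < threads; t++) {
      executor.submit(() -> {
        for (int i = 0; i < perThread; i++) {
          task.run();
        }
      });
    }
    executor.shutdown();
    try {
      if (!executor.awaitTermination(10, TimeUnit.MINUTES)) {
        throw new IllegalStateException("benchmark timed out");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
  }

  public static void main(String[] args) {
    int threads = 10;
    int perThread = 100_000;
    InflaterPool pool = new InflaterPool();

    // Pooled: borrow an instance and hand it straight back.
    long pooledMs = timeMillis(threads, perThread, () -> pool.giveBack(pool.borrow()));

    // Unpooled: create a fresh instance each time and release its native memory.
    long freshMs = timeMillis(threads, perThread, () -> new Inflater().end());

    System.out.println("pooled:   " + pooledMs + " ms");
    System.out.println("unpooled: " + freshMs + " ms");
    System.out.println("idle instances left in pool: " + pool.idleCount());
  }
}
```

The last line printed is the number of instances the pool is still holding after the run, which is the part that matters for the memory problem in this issue.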

It seems we are not the only ones that think that [codec reuse may not be worth it | https://github.com/prestodb/presto-hive-apache/blob/master/src/main/java/org/apache/hadoop/hive/ql/io/CodecPool.java].

> iterators may open lots of compressors
> --------------------------------------
>
>                 Key: ACCUMULO-624
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-624
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>            Reporter: Eric Newton
>
> A large iterator tree may create many instances of Compressors.  These instances are
> pulled from a pool that never decreases in size.  So, if 50 simultaneous queries are run over
> dozens of files, each with a complex iterator stack, there will be thousands of compressors
> created.  Each of these holds a large buffer.  This can cause the server to run out of memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
