accumulo-notifications mailing list archives

From "marco polo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-4419) Create Compressor factory allowing Compression settings to be updated
Date Tue, 23 Aug 2016 15:25:22 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433006#comment-15433006 ]

marco polo commented on ACCUMULO-4419:
--------------------------------------

Yes. Hadoop's CodecPool (linked below) does not remove unused compressors. The implementation
provided here trims unused compressors via a background thread. The CompressorFactory also
provides the option of not using a pool at all. Internally we've used the non-pooled method,
since CodecPool did not trim compressors when they weren't being used. In some cases a large
number of sources increases the memory footprint, so that choice was made to limit memory
usage; however, I added the ability to choose a factory to accommodate those who want to
make their own choice.

http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-common/2.6.0/org/apache/hadoop/io/compress/CodecPool.java#CodecPool.payback%28java.util.Map%2Corg.apache.hadoop.io.compress.Decompressor%29
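
As a rough illustration of the trimming idea described above (not the code in the PR), the
sketch below keeps idle compressors with the time they were returned and frees any that have
sat unused for too long. The class name TrimmingDeflaterPool, its methods, and the use of
java.util.zip.Deflater in place of a Hadoop Compressor are all assumptions made for this
example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.zip.Deflater;

/** Hypothetical sketch of a pool that trims idle compressors; not the PR's class. */
public class TrimmingDeflaterPool {
    /** An idle deflater plus the time it was handed back. */
    private static final class Idle {
        final Deflater deflater;
        final long returnedAtMillis;
        Idle(Deflater deflater, long returnedAtMillis) {
            this.deflater = deflater;
            this.returnedAtMillis = returnedAtMillis;
        }
    }

    // Head = most recently returned, tail = idle the longest.
    private final Deque<Idle> idle = new ArrayDeque<>();
    private final long maxIdleMillis;

    public TrimmingDeflaterPool(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Reuses the most recently returned deflater, or creates a new one. */
    public synchronized Deflater borrow() {
        Idle entry = idle.pollFirst();
        return entry != null ? entry.deflater : new Deflater();
    }

    /** Resets the deflater and records when it became idle. */
    public synchronized void giveBack(Deflater deflater, long nowMillis) {
        deflater.reset();
        idle.addFirst(new Idle(deflater, nowMillis));
    }

    /** Frees deflaters idle for longer than maxIdleMillis; returns how many were freed. */
    public synchronized int trim(long nowMillis) {
        int freed = 0;
        while (!idle.isEmpty()
                && nowMillis - idle.peekLast().returnedAtMillis > maxIdleMillis) {
            idle.pollLast().deflater.end(); // release native memory
            freed++;
        }
        return freed;
    }

    public synchronized int idleCount() {
        return idle.size();
    }

    /** Runs trim() periodically on a daemon thread; the caller shuts the executor down. */
    public ScheduledExecutorService startTrimmer(long periodMillis) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "deflater-pool-trimmer");
            t.setDaemon(true);
            return t;
        });
        ses.scheduleAtFixedRate(() -> trim(System.currentTimeMillis()),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return ses;
    }
}
```

Passing the clock value into giveBack and trim keeps the eviction logic testable without
sleeping; the background thread only supplies System.currentTimeMillis().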

Additionally, I made some options modifiable: specifically, the input/output buffer sizes for
the ByteInputStream and ByteOutputStream used when a compression stream is obtained. The PR
gives us a mechanism by which these options can be updated. In testing I found significant
performance increases from raising the buffer size above 1K. Allowing this to be changed on
the fly would be helpful. I didn't want to confuse the focus of the PR, but I did want the
CompressorFactory implementation and its configuration to be updatable.
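
One way an on-the-fly buffer size update could look is sketched below. This is only an
illustration under assumptions: UpdatableBufferConfig and its methods are hypothetical names,
and java.io.BufferedOutputStream over java.util.zip.DeflaterOutputStream stands in for the
streams the PR actually wraps. Only the 1K default comes from the discussion above.

```java
import java.io.BufferedOutputStream;
import java.io.OutputStream;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.zip.DeflaterOutputStream;

/** Hypothetical holder for a buffer size that can be changed while running. */
public class UpdatableBufferConfig {
    // Starts at the 1K size mentioned above.
    private final AtomicInteger outputBufferSize = new AtomicInteger(1024);

    /** Changes the size used for streams created after this call. */
    public void update(int newSizeBytes) {
        if (newSizeBytes <= 0) {
            throw new IllegalArgumentException("buffer size must be positive");
        }
        outputBufferSize.set(newSizeBytes);
    }

    public int current() {
        return outputBufferSize.get();
    }

    /** Wraps the sink in a compressing stream buffered with the current size. */
    public OutputStream wrap(OutputStream sink) {
        return new BufferedOutputStream(new DeflaterOutputStream(sink), outputBufferSize.get());
    }
}
```

Streams that are already open keep the size they were created with; only streams obtained
after update() see the new value.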


> Create Compressor factory allowing Compression settings to be updated
> ---------------------------------------------------------------------
>
>                 Key: ACCUMULO-4419
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4419
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: marco polo
>            Assignee: marco polo
>            Priority: Minor
>              Labels: core
>             Fix For: 1.7.3, 1.8.1
>
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> This ticket is to account for work done elsewhere in which I've made the compression
> pool configurable such that we either don't use the pool at all or use an adjustable pool
> based on commons-pool.
> Other configuration options are now updated through a CompressionUpdate mechanism.
> This PR will move us away from CodecPool and will give us greater control over trimming
> codecs from the pool itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
