commons-issues mailing list archives

From "Thomas Meyer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COMPRESS-207) add notifier support for new block in BZip2CompressorInputStream
Date Sun, 03 Apr 2016 12:28:25 GMT

    [ https://issues.apache.org/jira/browse/COMPRESS-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15223246#comment-15223246 ]

Thomas Meyer commented on COMPRESS-207:
---------------------------------------

Some background: I was inspired by the possibility of processing a bzip2 stream block
by block while reading the book Hadoop: The Definitive Guide (http://shop.oreilly.com/product/0636920033448.do).
Hadoop has so-called splittable compression streams, which AFAIK work like this: it divides
the total length of the compressed input file by, e.g., 2 and then searches the stream for
the pi marker (the start-of-block magic number). Once the marker is found, it starts to
decompress from there. This should mostly work, but I guess one can construct a bzip2 stream
in which the compression algorithm itself emits the digits of pi as output, so that they
are mistaken for a block start. That case is very theoretical, though.
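
The marker search described above can be sketched as follows. This is only an illustration
and not the Hadoop or commons-compress code: the class and method names are made up for this
example. It assumes the 48-bit bzip2 block magic 0x314159265359 and MSB-first bit order, and
it scans bit by bit because bzip2 blocks are not byte-aligned.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Hadoop or commons-compress.
final class BlockMarkerScan {
    // 48-bit bzip2 block header magic: the BCD digits of pi (3.14159265359).
    static final long BLOCK_MAGIC = 0x314159265359L;
    static final long MASK_48 = 0xFFFFFFFFFFFFL;

    /** Returns the bit offsets (>= startBit) at which the 48-bit block magic begins. */
    static List<Long> findBlockMarkers(byte[] data, long startBit) {
        List<Long> hits = new ArrayList<>();
        long window = 0;                       // sliding window over the last 48 bits
        long totalBits = (long) data.length * 8;
        for (long bit = startBit; bit < totalBits; bit++) {
            int b = data[(int) (bit >>> 3)] & 0xFF;
            int value = (b >>> (7 - (int) (bit & 7))) & 1;   // MSB-first within a byte
            window = ((window << 1) | value) & MASK_48;
            long seen = bit - startBit + 1;    // bits consumed since startBit
            if (seen >= 48 && window == BLOCK_MAGIC) {
                hits.add(bit - 47);            // offset of the first bit of the magic
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Place the magic 3 bits into a 7-byte buffer to show a non-byte-aligned hit.
        long v = BLOCK_MAGIC << 5;
        byte[] data = new byte[7];
        for (int i = 0; i < 7; i++) {
            data[i] = (byte) (v >>> (8 * (6 - i)));
        }
        System.out.println(findBlockMarkers(data, 0)); // prints [3]
    }
}
```

A reader for the second half of a file would call this with startBit set to the middle of
the input and begin decoding at the first hit. As noted above, a hit may in principle be a
false positive, because the same 48 bits can occur inside compressed data.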

See also:
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/BZip2Codec.java
and 
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/bzip2/CBZip2InputStream.java
(which appears to have been copied from the commons-compress implementation; at least
it looks very similar)



> add notifier support for new block in BZip2CompressorInputStream
> ----------------------------------------------------------------
>
>                 Key: COMPRESS-207
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-207
>             Project: Commons Compress
>          Issue Type: New Feature
>          Components: Compressors
>    Affects Versions: 1.4.1
>            Reporter: Thomas Meyer
>            Priority: Minor
>              Labels: API, bzip
>         Attachments: 0001-Add-notifier-support-for-new-block-in-BZip2Compresso.patch,
BZip2CompressorInputStream-add-newBlock-notifier.patch, BZip2CompressorInputStream-add-newBlock-notifier.patch,
BZip2CompressorInputStream-add-newBlock-notifier.patch
>
>
> Hi,
> the attached patch enables a program to add a listener that is called when a new bzip2
> block is detected.
> The notifier is called with:
>  - xxx.newBlock(this, currBlockPosition)
>  - this = the current BZip2CompressorInputStream object
>  - currBlockPosition = the offset (i.e. start position) in the compressed
>    input stream of the current block
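
For readers without the attachment, the callback shape described in the issue might look
roughly like the sketch below. Only the call newBlock(this, currBlockPosition) comes from
the issue text. The names NewBlockListener, FakeBlockSource, addListener and blockStartsAt
are hypothetical, and the real patch may differ in naming and registration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback type; the attached patch may name and shape it differently. */
interface NewBlockListener<S> {
    /** source = the stream reporting the block, currBlockPosition = offset of the block start. */
    void newBlock(S source, long currBlockPosition);
}

/** Stand-in for a decompressor that reports block starts; NOT the real BZip2CompressorInputStream. */
final class FakeBlockSource {
    private final List<NewBlockListener<FakeBlockSource>> listeners = new ArrayList<>();

    void addListener(NewBlockListener<FakeBlockSource> listener) {
        listeners.add(listener);
    }

    /** Simulates the decoder reaching a block header at the given compressed offset. */
    void blockStartsAt(long currBlockPosition) {
        for (NewBlockListener<FakeBlockSource> listener : listeners) {
            listener.newBlock(this, currBlockPosition);
        }
    }
}
```

A caller could register a lambda such as (source, pos) -> offsets.add(pos) to collect block
offsets while reading, for example to build an index for later block-wise processing.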



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
