flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1981) Add GZip support
Date Tue, 02 Jun 2015 19:39:51 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569625#comment-14569625 ]

ASF GitHub Bot commented on FLINK-1981:
---------------------------------------

Github user sekruse commented on a diff in the pull request:

    https://github.com/apache/flink/pull/762#discussion_r31562256
  
    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/io/FileInputFormat.java ---
    @@ -628,9 +692,10 @@ public void open(FileInputSplit fileSplit) throws IOException {
     	 * @see org.apache.flink.api.common.io.InputStreamFSInputWrapper
     	 */
     	protected FSDataInputStream decorateInputStream(FSDataInputStream inputStream, FileInputSplit fileSplit) throws Throwable {
    -		// Wrap stream in a extracting (decompressing) stream if file ends with .deflate.
    -		if (fileSplit.getPath().getName().endsWith(DEFLATE_SUFFIX)) {
    -			return new InflaterInputStreamFSInputWrapper(stream);
    +		// Wrap stream in a extracting (decompressing) stream if file ends with a known compression file extension.
    +		InflaterInputStreamFactory<?> inflaterInputStreamFactory = getInflaterInputStreamFactory(fileSplit.getPath());
    +		if (inflaterInputStreamFactory != null) {
    +			return new InputStreamFSInputWrapper(inflaterInputStreamFactory.create(stream));
    --- End diff --
    
    It might also be the case that the stream was not compressed at all. It would of course be nice to react appropriately to a missing codec, but how would we know whether the current input split belongs to an uncompressed file or to a compressed file with an unknown codec?
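
    To make the question concrete, here is a minimal, self-contained sketch. It is not the code from the pull request. `CompressionLookupSketch`, `StreamFactory`, `factoryFor` and `looksLikeGzip` are hypothetical names, and the registry only stands in for what `getInflaterInputStreamFactory` appears to do in the diff. It shows that an extension lookup returning null cannot tell "uncompressed" apart from "unknown codec", and that peeking at the GZip magic bytes (0x1f 0x8b) is one possible, format-specific way to notice compressed content under an unrecognized name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

class CompressionLookupSketch {

    /** Hypothetical stand-in for the InflaterInputStreamFactory seen in the diff. */
    interface StreamFactory {
        InputStream create(InputStream in) throws IOException;
    }

    // Assumed registry: file extension (without the dot) -> decompressing wrapper.
    private static final Map<String, StreamFactory> FACTORIES = new HashMap<>();
    static {
        FACTORIES.put("deflate", InflaterInputStream::new);
        FACTORIES.put("gz", GZIPInputStream::new);
    }

    /**
     * Returns the factory registered for the file's extension, or null.
     * Null is ambiguous: the file may be uncompressed, or it may use a
     * codec that simply is not registered here.
     */
    static StreamFactory factoryFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null;
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return FACTORIES.get(extension);
    }

    /**
     * Peeks at the first two bytes and pushes them back, so the stream can
     * still be read from the start. The PushbackInputStream must have been
     * created with a pushback buffer of at least 2 bytes.
     */
    static boolean looksLikeGzip(PushbackInputStream in) throws IOException {
        int b0 = in.read();
        if (b0 < 0) {
            return false; // empty stream
        }
        int b1 = in.read();
        if (b1 >= 0) {
            in.unread(b1);
        }
        in.unread(b0); // unread in reverse order so b0 is read first again
        return b0 == 0x1f && b1 == 0x8b;
    }

    public static void main(String[] args) throws IOException {
        // GZip content stored under a name the registry does not know.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write("hello".getBytes(StandardCharsets.UTF_8));
        }
        PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(buffer.toByteArray()), 2);

        System.out.println("factory by extension: " + factoryFor("part-0.bin"));
        System.out.println("gzip magic bytes present: " + looksLikeGzip(in));
    }
}
```

    Content sniffing of this kind only helps for formats with a recognizable header, and it would have to run at the start of the file, not at an arbitrary split offset, so it does not settle the question in general.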


> Add GZip support
> ----------------
>
>                 Key: FLINK-1981
>                 URL: https://issues.apache.org/jira/browse/FLINK-1981
>             Project: Flink
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Sebastian Kruse
>            Assignee: Sebastian Kruse
>            Priority: Minor
>
> GZip, as a commonly used compression format, should be supported in the same way as the already supported deflate files. This allows GZip files to be used with any subclass of FileInputFormat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
