hadoop-hdfs-dev mailing list archives

From "Jan Filipiak (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-8836) Skip newline on empty files with getMerge -nl
Date Wed, 29 Jul 2015 09:17:04 GMT
Jan Filipiak created HDFS-8836:

             Summary: Skip newline on empty files with getMerge -nl
                 Key: HDFS-8836
                 URL: https://issues.apache.org/jira/browse/HDFS-8836
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs-client
    Affects Versions: 2.7.1, 2.6.0
            Reporter: Jan Filipiak
            Priority: Trivial

Hello everyone,

I recently needed to use the newline option -nl with getMerge because the files
I had to merge did not end with a newline. I was merging all the files from one directory, and
unfortunately this directory also included empty files, which led to multiple consecutive
newlines being appended after some files. I had to remove them manually afterwards.

In this situation it might be good to have another argument that allows
skipping empty files.

Things one could try in order to implement this feature:

The call IOUtils.copyBytes(in, out, getConf(), false); doesn't
return the number of bytes copied. That would be convenient, as one could
skip appending the newline when 0 bytes were copied. Alternatively, one could check the file size beforehand.
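
To illustrate the first approach, here is a minimal sketch of a merge loop that counts the bytes it copies and skips the newline for empty inputs. It is not the Hadoop implementation: it uses plain java.nio files as a stand-in for HDFS paths, and the names MergeSkipEmpty, copyCounting, merge and the skipEmpty flag are hypothetical, chosen only for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class MergeSkipEmpty {

    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copyCounting(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    /**
     * Concatenates the source files into out.
     * addNewline mimics -nl; skipEmpty is the hypothetical extra option
     * that suppresses the newline after files that contributed 0 bytes.
     */
    static void merge(List<Path> sources, OutputStream out,
                      boolean addNewline, boolean skipEmpty) throws IOException {
        byte[] newline = "\n".getBytes(StandardCharsets.UTF_8);
        for (Path src : sources) {
            long copied;
            try (InputStream in = Files.newInputStream(src)) {
                copied = copyCounting(in, out);
            }
            // Append the newline unless this file was empty and skipping is enabled.
            if (addNewline && !(skipEmpty && copied == 0)) {
                out.write(newline);
            }
        }
    }
}
```

The second approach, checking the size beforehand, would avoid changing the copy routine at all; in this stand-in it would amount to testing Files.size(src) == 0 before opening the file.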

I posted this idea on the mailing list http://mail-archives.apache.org/mod_mbox/hadoop-user/201507.mbox/%3C55B25140.3060005%40trivago.com%3E
but I didn't really get many responses, so I thought I might try this way.

This message was sent by Atlassian JIRA
