apex-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (APEXMALHAR-2484) BlockWriter for writing the part files into the specified directory
Date Thu, 27 Apr 2017 10:00:10 GMT

    [ https://issues.apache.org/jira/browse/APEXMALHAR-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15986303#comment-15986303 ]

ASF GitHub Bot commented on APEXMALHAR-2484:

Github user chaithu14 closed the pull request at:


> BlockWriter for writing the part files into the specified directory
> -------------------------------------------------------------------
>                 Key: APEXMALHAR-2484
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2484
>             Project: Apache Apex Malhar
>          Issue Type: Task
>            Reporter: Chaitanya
>            Assignee: Chaitanya
> Use case: Suppose the source file (f1.txt) is 1 GB and the block size is 128 MB.
> I want to copy the file to the destination as the following part files:
> f1.txt.part1
> f1.txt.part2
> ....
> By default, each part file is 128 MB, except possibly the last one.
> Design: Currently, the BlockWriter can only write the part files to the HDFS on which
> the app is running. To support the above use case, the operator needs the block index
> and the relative path of the file. BlockMetadata, which the BlockWriter receives on its
> input port, doesn't carry this information.
> So, I am creating a new operator (PartFileWriter), which extends BlockWriter and adds
> an input port of type FileMetadata.
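
For readers following the use case, here is a minimal sketch of the part-file arithmetic it describes: deriving the part name from a relative path and a block index, and working out how many parts a file needs and how large each one is. It is plain Java and does not use the Malhar API. The class and method names (PartFileNaming, partName, partCount, partSize) are hypothetical, and the 1-based block index and 1 GB = 1024 MB are assumptions for illustration, not details confirmed by the issue.

```java
public class PartFileNaming {

    /** Hypothetical helper: part file name for a 1-based block index (assumed convention). */
    public static String partName(String relativePath, long blockIndex) {
        return relativePath + ".part" + blockIndex;
    }

    /** Number of part files needed for a file of fileSize bytes (ceiling division). */
    public static long partCount(long fileSize, long blockSize) {
        return (fileSize + blockSize - 1) / blockSize;
    }

    /** Size in bytes of the part with the given 1-based index; only the last part can be shorter. */
    public static long partSize(long fileSize, long blockSize, long blockIndex) {
        long start = (blockIndex - 1) * blockSize;
        return Math.min(blockSize, fileSize - start);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileSize = 1024 * mb;  // 1 GB source file, as in the use case
        long blockSize = 128 * mb;  // 128 MB block size, as in the use case

        long parts = partCount(fileSize, blockSize);
        for (long i = 1; i <= parts; i++) {
            System.out.println(partName("f1.txt", i) + " " + partSize(fileSize, blockSize, i) + " bytes");
        }
    }
}
```

This also shows why the writer needs both pieces of information named in the design: the relative path fixes the base name of each part file, and the block index fixes its suffix.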

This message was sent by Atlassian JIRA
