apex-dev mailing list archives

From Mohit Jotwani <mo...@datatorrent.com>
Subject Re: HDFS file copy module for Malhar
Date Thu, 10 Mar 2016 03:46:21 GMT
+1

Regards,
Mohit
On 9 Mar 2016 21:09, "Chinmay Kolhatkar" <chinmay@apache.org> wrote:

> +1.
>
> On Wed, Mar 9, 2016 at 8:38 PM, Yogi Devendra <yogidevendra@apache.org>
> wrote:
>
> > Hi,
> >
> > As I mentioned earlier here:
> >
> > http://mail-archives.apache.org/mod_mbox/apex-dev/201602.mbox/%3CCAHekGF9xNa6qvvt4ySGBC4SmCN7_Hn2r9rj2SQSV%2BE1Cc5A0fQ%40mail.gmail.com%3E
> >
> > I am proposing an HDFS file copy module.
> > The JIRA created for this work is available here:
> > https://issues.apache.org/jira/browse/APEXMALHAR-2013
> >
> > Please note that this work is related to, but different from,
> > https://issues.apache.org/jira/browse/APEXMALHAR-2009, which talks about a
> > concrete operator for writing data to HDFS tuple by tuple.
> >
> > The main difference is that the file copy module has to retain the block
> > sequence of each file. Thus, we need to pass on additional information
> > such as FileMetaData and BlockMetaData from the upstream operator.
> >
> > Use case
> > --------
> > This module can be used with the HDFS input module to copy files from HDFS to
> > HDFS.
> > Large files will be copied block by block.
> >
> > Functionality
> > -----------------
> >
> >    1. Writing files to HDFS using the FileMetaData, BlockMetaData, and
> >    BlockData emitted by the HDFS input module.
> >    2. Block data has to be synchronized to retain the original sequence
> >    from the source.
> >    3. Support for copying multiple files, recursive copy of a directory
> >    structure, etc.
> >    4. Metrics for summary information on the progress of the file copy.
> >
> > Let me know your thoughts on this. You may post your comments on the JIRA
> > https://issues.apache.org/jira/browse/APEXMALHAR-2013
> >
> > ~ Yogi
> >
>
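
For illustration only, the requirement quoted above that block data be synchronized to retain the original sequence could be sketched roughly as follows. This is not code from Malhar or from the proposal: the class name BlockSequencer, its methods, and the assumption that each block of a file carries a zero-based sequence number are all hypothetical. The sketch buffers blocks that arrive early and releases them only once every preceding block has been seen.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical helper (not a Malhar class): releases the blocks of a single
 * file in their original order, even if they arrive out of order.
 * Assumes each block has a zero-based sequence number and non-null data.
 */
class BlockSequencer {
    // Blocks that arrived before their predecessors, keyed by sequence number.
    private final Map<Long, byte[]> pending = new HashMap<>();
    // Sequence number of the next block that may be written.
    private long nextSequence = 0;

    /**
     * Accepts one block and returns every block that can now be written,
     * in order. The list is empty if an earlier block is still missing.
     */
    List<byte[]> accept(long sequence, byte[] data) {
        pending.put(sequence, data);
        List<byte[]> ready = new ArrayList<>();
        byte[] next;
        // Drain the buffer for as long as the next expected block is present.
        while ((next = pending.remove(nextSequence)) != null) {
            ready.add(next);
            nextSequence++;
        }
        return ready;
    }

    /** Number of blocks still waiting for a predecessor. */
    int pendingCount() {
        return pending.size();
    }
}
```

With this sketch, a writer would append the returned blocks to the destination file in list order. A real module would also need one such sequencer per file and some bound on the buffered data, both of which are left out here.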
