apex-dev mailing list archives

From DT-Priyanka <...@git.apache.org>
Subject [GitHub] incubator-apex-malhar pull request: Apexmalhar-2008: hdfs file rea...
Date Tue, 08 Mar 2016 10:14:48 GMT
GitHub user DT-Priyanka opened a pull request:


    APEXMALHAR-2008: HDFS file reader module

    Code to add an HDFS file reader module.
    1. The module reads a file or a list of files (a directory is also accepted) and emits the file
    2. The module can be configured to emit blocks in order or out of order.
    3. The module reads file blocks in parallel. The number of parallel readers is configurable;
       if it is not configured, the module increases or decreases the number of readers
       dynamically according to the input data rate.
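
To make points 2 and 3 concrete, here is a minimal, self-contained sketch of splitting a file into blocks and reading them with a fixed number of parallel readers, emitting results either in block order or in completion order. It does not use the Apex or Hadoop APIs, and every name in it (`BlockReaderSketch`, `splitIntoBlocks`, `readBlocks`, `read`) is hypothetical rather than taken from the pull request. Dynamic scaling of the reader count with the input data rate is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not the module's actual implementation.
class BlockReaderSketch {

  // A contiguous byte range [offset, offset + length) of one file.
  static final class Block {
    final int index;
    final long offset;
    final long length;

    Block(int index, long offset, long length) {
      this.index = index;
      this.offset = offset;
      this.length = length;
    }
  }

  // Split a file of fileLength bytes into fixed-size blocks; the last block may be shorter.
  static List<Block> splitIntoBlocks(long fileLength, long blockSize) {
    if (blockSize <= 0) {
      throw new IllegalArgumentException("blockSize must be positive");
    }
    List<Block> blocks = new ArrayList<>();
    int index = 0;
    for (long offset = 0; offset < fileLength; offset += blockSize) {
      blocks.add(new Block(index++, offset, Math.min(blockSize, fileLength - offset)));
    }
    return blocks;
  }

  // Read all blocks using `readers` threads and return the block indices in emission order.
  static List<Integer> readBlocks(List<Block> blocks, int readers, boolean inOrder) {
    ExecutorService pool = Executors.newFixedThreadPool(readers);
    try {
      List<Integer> emitted = new ArrayList<>();
      if (inOrder) {
        // Waiting on the futures in submission order preserves block order.
        List<Future<Block>> futures = new ArrayList<>();
        for (Block b : blocks) {
          futures.add(pool.submit(() -> read(b)));
        }
        for (Future<Block> f : futures) {
          emitted.add(f.get().index);
        }
      } else {
        // A completion service hands back whichever block finishes first.
        ExecutorCompletionService<Block> done = new ExecutorCompletionService<>(pool);
        for (Block b : blocks) {
          done.submit(() -> read(b));
        }
        for (int i = 0; i < blocks.size(); i++) {
          emitted.add(done.take().get().index);
        }
      }
      return emitted;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("block read failed", e);
    } finally {
      pool.shutdown();
    }
  }

  // Stand-in for a real read of the block's byte range (assumption: no I/O is performed here).
  static Block read(Block b) {
    return b;
  }
}
```

In-order emission trades some latency for predictable output, since a slow early block holds back later ones; out-of-order emission releases each block as soon as its reader finishes.
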
    Also updated the FileSplitterInput code with some improvements:
    1. Track the last file reference times of each folder separately, to avoid duplicates
       (duplicates could arise when multiple files or subdirectories share the same relative path).
    2. Small code improvements.
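
One possible reading of the per-folder tracking in point 1 is sketched below: reference times are kept in a separate map for each scanned folder, so two files with the same relative path under different folders do not overwrite each other's entry. The class and method names (`ReferenceTimeTrackerSketch`, `shouldEmit`) and the use of a modification timestamp are assumptions for illustration; the actual data structure in FileSplitterInput is not shown in this message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the FileSplitterInput implementation.
class ReferenceTimeTrackerSketch {

  // Scanned folder -> (relative path -> last modification time seen under that folder).
  private final Map<String, Map<String, Long>> referenceTimes = new HashMap<>();

  // Returns true if the file is new or has changed under this folder since it was last seen,
  // and records the new time. Returns false for a file already seen with the same or a newer time.
  boolean shouldEmit(String folder, String relativePath, long modifiedTime) {
    Map<String, Long> perFolder = referenceTimes.computeIfAbsent(folder, k -> new HashMap<>());
    Long last = perFolder.get(relativePath);
    if (last != null && last >= modifiedTime) {
      return false;
    }
    perFolder.put(relativePath, modifiedTime);
    return true;
  }
}
```

With a single map keyed only by relative path, an entry recorded for `data/part-0` under one folder could suppress or mask the file with the same relative path under another folder; keying by folder first avoids that collision.
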

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/DT-Priyanka/incubator-apex-malhar APEXMALHAR-2008-hdfs-input-module

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #206

commit 46d1d00e7ec860ef1010503f70a1a64f360df5ca
Author: Priyanka Gugale <priyanka@datatorrent.com>
Date:   2016-03-08T08:42:13Z

    APEXMALHAR-2008: Create HDFS File Reader module

commit 8f078150b4d4a97ed2baee41019d50c4c1d1ca44
Author: Priyanka Gugale <priyanka@datatorrent.com>
Date:   2016-03-08T09:06:28Z

    Adding headers


If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and would like it, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
