apex-dev mailing list archives

From "Chandni Singh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (APEXMALHAR-2063) Integrate WAL to FS WindowDataManager
Date Thu, 19 May 2016 04:52:12 GMT

    [ https://issues.apache.org/jira/browse/APEXMALHAR-2063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15290461#comment-15290461 ]

Chandni Singh commented on APEXMALHAR-2063:
-------------------------------------------

@thw WindowDataManager has been moved to the org.apache.apex.malhar package in the current
master branch.
WindowDataManager is annotated @Evolving, so moving it did not cause any backward incompatibility.
IMO it is convenient to have the move in the 3.4.0 release, so that the next release does not
need any compatibility changes (changes in the operators that use it).

> Integrate WAL to FS WindowDataManager
> -------------------------------------
>
>                 Key: APEXMALHAR-2063
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2063
>             Project: Apache Apex Malhar
>          Issue Type: Improvement
>            Reporter: Chandni Singh
>            Assignee: Chandni Singh
>
> FS Window Data Manager saves meta-data that helps replay the tuples of every completed
> application window after a failure. To do this, it saves the meta-data in one file per window.
> Having many small files on HDFS causes issues, as highlighted here:
> http://blog.cloudera.com/blog/2009/02/the-small-files-problem/
> Instead, FS Window Data Manager can use the WAL to write the data and maintain a mapping
> of how much data was flushed to the WAL in each window.
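
The idea in the description can be sketched roughly as follows. This is a minimal illustration, not the Malhar implementation: the class name WindowIndexedLog and its save/retrieve methods are hypothetical, an in-memory buffer stands in for the single WAL file on HDFS, and windows are assumed to be saved in increasing window-id order.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: one append-only log plus a per-window offset index. */
class WindowIndexedLog {
    // Stand-in for a single WAL file; an in-memory buffer keeps the sketch runnable.
    private final ByteArrayOutputStream log = new ByteArrayOutputStream();
    // windowId -> offset in the log at which that window's data ends.
    private final TreeMap<Long, Integer> endOffsets = new TreeMap<>();

    /** Appends the meta-data of a completed window and records where it ends. */
    void save(long windowId, byte[] metaData) {
        log.write(metaData, 0, metaData.length);
        endOffsets.put(windowId, log.size());
    }

    /** Returns the bytes written for the given window, or null if the window is unknown. */
    byte[] retrieve(long windowId) {
        Integer end = endOffsets.get(windowId);
        if (end == null) {
            return null;
        }
        // The window's data starts where the previous window's data ended.
        Map.Entry<Long, Integer> previous = endOffsets.lowerEntry(windowId);
        int start = (previous == null) ? 0 : previous.getValue();
        return Arrays.copyOfRange(log.toByteArray(), start, end);
    }
}
```

With this layout all windows share one log, and the offset index is the only per-window state, which is what avoids creating one small file per window.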



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
