apex-dev mailing list archives

From "Chandni Singh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (APEXMALHAR-2063) Integrate WAL to FS WindowDataManager
Date Thu, 19 May 2016 00:53:12 GMT

    [ https://issues.apache.org/jira/browse/APEXMALHAR-2063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15290200#comment-15290200 ]

Chandni Singh commented on APEXMALHAR-2063:
-------------------------------------------

Implementation approach:

1. The WindowDataManager API currently extends StorageAgent, which lives in Apex core.
This does not need to change: operators already use WindowDataManager, so changing the
API would mean changing those operators as well.

2. Change FSWindowDataManager to use FileSystemWAL internally instead of FSStorageAgent.
All the methods in the API will then be implemented on top of FileSystemWAL.
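
A minimal sketch of the idea behind step 2, not the actual Malhar code: instead of one file per window, all window data is appended to a single log and a small index records where each window's bytes start and how long they are. The class WalWindowStoreSketch and its methods append, read and largestWindow are hypothetical names used only for illustration, and an in-memory buffer stands in for the WAL file. The real FileSystemWAL and WindowDataManager APIs are not reproduced here.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.TreeMap;

/** Hypothetical sketch: one append-only log plus a per-window index. */
final class WalWindowStoreSketch {
    // Stand-in for the WAL file; a real implementation would append to a file on HDFS.
    private final ByteArrayOutputStream log = new ByteArrayOutputStream();

    // windowId -> {start offset in the log, number of bytes written for that window}.
    private final TreeMap<Long, long[]> index = new TreeMap<>();

    /** Appends the data of one window to the log and records its position. */
    void append(long windowId, byte[] data) {
        if (!index.isEmpty() && windowId <= index.lastKey()) {
            throw new IllegalArgumentException("windows must be appended in increasing order");
        }
        long start = log.size();
        log.write(data, 0, data.length);
        index.put(windowId, new long[] {start, data.length});
    }

    /** Returns the bytes saved for a window, or null if nothing was saved for it. */
    byte[] read(long windowId) {
        long[] entry = index.get(windowId);
        if (entry == null) {
            return null;
        }
        byte[] all = log.toByteArray();
        return Arrays.copyOfRange(all, (int) entry[0], (int) (entry[0] + entry[1]));
    }

    /** Returns the largest window id that has data in the log, or -1 if the log is empty. */
    long largestWindow() {
        return index.isEmpty() ? -1L : index.lastKey();
    }
}
```

With this layout, replay after a failure only needs the index to find each window's slice of the log, so the number of files no longer grows with the number of windows.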

> Integrate WAL to FS WindowDataManager
> -------------------------------------
>
>                 Key: APEXMALHAR-2063
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2063
>             Project: Apache Apex Malhar
>          Issue Type: Improvement
>            Reporter: Chandni Singh
>            Assignee: Chandni Singh
>
> FS Window Data Manager saves meta-data that helps in replaying the tuples of every
> completed application window after a failure. For this it saves the meta-data in one file per window.
> Having many small files on HDFS causes issues, as highlighted here:
> http://blog.cloudera.com/blog/2009/02/the-small-files-problem/
> Instead, FS Window Data Manager can use the WAL to write data and maintain a mapping
> of how much data was flushed to the WAL in each window.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
