gearpump-dev mailing list archives

From "darion yaphet (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GEARPUMP-63) Gearpump Storage framework
Date Wed, 19 Oct 2016 08:13:58 GMT

    [ https://issues.apache.org/jira/browse/GEARPUMP-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15588010#comment-15588010 ]

darion yaphet commented on GEARPUMP-63:
---------------------------------------

AFAIK a lot of users who use Gearpump read data from Kafka topics, so maybe we can push
application logs into a topic, or push them to HDFS daily or hourly. It seems Log4j supports
HDFS and Kafka appenders.
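
As a minimal sketch of the hourly push suggested above: the `LogSink` trait, the `InMemorySink` class, and the `logs/app<id>/<yyyy-MM-dd>/<HH>` bucket layout are hypothetical names made up for illustration, not Gearpump or Log4j APIs. The bucket string stands in for either an HDFS directory or a Kafka message key; a real setup would more likely configure an existing Log4j appender than write this by hand.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter
import scala.collection.mutable

// Hypothetical sink: stands in for a Kafka topic or an HDFS directory.
trait LogSink {
  def write(bucket: String, line: String): Unit
}

// In-memory stand-in so the sketch runs without Kafka or HDFS.
class InMemorySink extends LogSink {
  val buckets: mutable.Map[String, Vector[String]] = mutable.Map.empty

  def write(bucket: String, line: String): Unit =
    buckets(bucket) = buckets.getOrElse(bucket, Vector.empty) :+ line
}

object HourlyLogs {
  // One bucket per application per UTC hour (assumed layout).
  private val hourFormat =
    DateTimeFormatter.ofPattern("yyyy-MM-dd/HH").withZone(ZoneOffset.UTC)

  def bucketFor(appId: Int, timestamp: Instant): String =
    s"logs/app$appId/${hourFormat.format(timestamp)}"

  def append(sink: LogSink, appId: Int, timestamp: Instant, line: String): Unit =
    sink.write(bucketFor(appId, timestamp), line)
}
```

Grouping by application and hour keeps all of one application's log lines for a given hour together, regardless of which node produced them, which is the analysis problem the issue description raises.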

> Gearpump Storage framework
> --------------------------
>
>                 Key: GEARPUMP-63
>                 URL: https://issues.apache.org/jira/browse/GEARPUMP-63
>             Project: Apache Gearpump
>          Issue Type: New Feature
>            Reporter: Manu Zhang
>            Assignee: Weihua Jiang
>
> Imported from https://github.com/gearpump/gearpump/issues/1197 on behalf of [~whjiang]. His original proposal:
> In general, a Gearpump application requires the following storage support:
> # Jar-file storage to store the application jar file(s).
> # application log. Currently logs are stored on each node, which makes application log analysis difficult.
> # application metrics.
> # application configuration.
> # data source offset store (for at-least-once semantics of streaming applications)
> # application state checkpoint store (for transaction semantics)
> The general idea is:
> # Provide a storage system that satisfies the above requirements.
> # Assume this storage is highly available. That is, it is the user's duty to provide such a storage. For testing, the user can use a non-HA storage system, but in production use it shall be HA.
> # Isolate usage from implementation. That is, Gearpump doesn't rely on Hadoop-common or HDFS or any one specific implementation to provide such storage. Users are free to implement their own storage.
> # This is a daemon-provided functionality and can be used by every Gearpump application.
> # This storage shall provide data retention functionality and access control.
> # This storage provides a set of APIs to meet the above requirements instead of one low-level API.
> # Users can override the system setting to provide a dedicated implementation for a certain sub-storage system, e.g. the checkpoint store.
> # Akka replication shall store minimal info for an application and leave the majority to this storage system, i.e. Akka replication is more like a seed to this storage system.
> # In a release, each storage implementation (e.g. storage-hdfs) is a standalone module/artifact.
> A draft of this storage API looks like this (very preliminary and subject to change):
> {code}
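> import scala.util.Try
> // AppName, AppId, ProcessorId, UserConfig, AppMetricsStore and LogAppender
> // are not defined in this draft; String paths and Unit results are assumed.
> trait Storage {
>     def createAppStorage(appName: AppName, appId: AppId): AppStorage
>     def getAppStorage(appId: AppId): Option[AppStorage]
> }
> trait AppStorage {
>     def open(): Unit
>     def close(): Unit
>     def getJarStore: JarStore
>     def getMetricsStore: AppMetricsStore
>     def getKVStore[K, V]: KVStore[K, V]
>     def getLogAppender: LogAppender
>     def getConfiguration(processorId: ProcessorId): UserConfig
>     def setConfiguration(processorId: ProcessorId, config: UserConfig): Unit
> }
> trait JarStore {
>     def copyFromLocal(localPath: String, remotePath: String): Unit
>     def copyToLocal(remotePath: String, localPath: String): Unit
> }
> // assume K is sortable
> trait KVStore[K, V] {
>     def append(key: K, value: V): Unit
>     def read(key: K): Try[Option[V]]
> }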
> {code}
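
To make the `KVStore` part of the draft more concrete, here is a minimal in-memory sketch. The draft leaves the parameter types and return values open, so the following are assumptions for illustration only: `append` returning `Unit`, the `Ordering[K]` constraint (one possible reading of "assume K is sortable"), and the `InMemoryKVStore` class with its `latest` helper. It is the kind of non-HA store the proposal allows for testing, not a production implementation.

```scala
import scala.collection.immutable.TreeMap
import scala.util.Try

// Shape taken from the draft; parameter and return types are assumed.
trait KVStore[K, V] {
  def append(key: K, value: V): Unit
  def read(key: K): Try[Option[V]]
}

// Hypothetical non-HA implementation, suitable only for tests.
class InMemoryKVStore[K: Ordering, V] extends KVStore[K, V] {
  // Sorted by key, so the newest entry is always the last one.
  private var entries = TreeMap.empty[K, V]

  def append(key: K, value: V): Unit = synchronized {
    entries = entries.updated(key, value)
  }

  def read(key: K): Try[Option[V]] = Try(synchronized(entries.get(key)))

  // Entry with the largest key, e.g. the most recent checkpoint.
  def latest: Option[(K, V)] = synchronized(entries.lastOption)
}
```

With timestamps as keys, a source could call `append(timestamp, offset)` as it makes progress and use `latest` after a restart to find where to resume, which is how such a store could back the offset store and checkpoint store listed above.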



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
