streams-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (STREAMS-293) allow for missing metadata fields in streams-persist-hdfs
Date Wed, 11 Mar 2015 18:59:38 GMT


ASF GitHub Bot commented on STREAMS-293:

GitHub user steveblackmon opened a pull request:


    resolves STREAMS-293

You can merge this pull request into a Git repository by running:

    $ git pull STREAMS-293

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #195

commit aac486338b61aea4aac3d969618da6b642297b3e
Author: sblackmon <>
Date:   2015-03-03T17:21:57Z

    implements missing metadata
    needs tests

commit 8d2b3e219fd1ba82078bb0cb09c37595cd0e0d00
Author: sblackmon <>
Date:   2015-03-04T01:55:03Z

    implements ordered fields and metadata
    needs tests

commit 0d953487f8ecb93706759863f4fe65d3f3306289
Author: sblackmon <>
Date:   2015-03-11T18:57:55Z

    added tests and fixes to make tests work


> allow for missing metadata fields in streams-persist-hdfs
> ---------------------------------------------------------
>                 Key: STREAMS-293
>                 URL:
>             Project: Streams
>          Issue Type: Improvement
>            Reporter: Steve Blackmon
>            Assignee: Steve Blackmon
> Currently the streams-persist-hdfs writer creates (and the reader expects) exactly four columns.
> This could be made much more flexible without too much effort.
> Update the reader to support additional use cases:
> a) file paths containing one json document per line
> b) file paths containing just id and json on each line
> c) file paths containing id, timestamp, and json document on each line
> Update the writer to support:
> a) ids only
> b) ids and timestamp only
> c) ids, timestamp, and json only
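
One way to cover all of the reader and writer cases listed above is to drive both sides from a single ordered list of field names. The sketch below is illustrative only and is not the code in the pull request: the function names, the `fields` parameter, the tab delimiter, and the field names "id", "timestamp", "metadata" and "json" are all assumptions, and it is written in Python for brevity rather than in the project's own language.

```python
import json
from typing import Any, Dict, Sequence

# Hypothetical names: fields serialized as JSON rather than as plain strings.
JSON_FIELDS = ("json", "metadata")


def format_line(record: Dict[str, Any], fields: Sequence[str],
                delimiter: str = "\t") -> str:
    """Write only the configured fields, in the configured order."""
    columns = []
    for name in fields:
        value = record.get(name)
        if name in JSON_FIELDS:
            # json.dumps escapes tabs inside strings, so the output
            # never contains a raw tab that would split the column.
            columns.append(json.dumps(value))
        else:
            columns.append("" if value is None else str(value))
    return delimiter.join(columns)


def parse_line(line: str, fields: Sequence[str],
               delimiter: str = "\t") -> Dict[str, Any]:
    """Read one line whose columns match the configured field list."""
    parts = line.rstrip("\n").split(delimiter)
    if len(parts) != len(fields):
        raise ValueError(
            f"expected {len(fields)} columns, got {len(parts)}")
    record: Dict[str, Any] = dict(zip(fields, parts))
    for name in JSON_FIELDS:
        if name in record:
            record[name] = json.loads(record[name])
    return record
```

With this shape, reader case (a) is `parse_line(line, ["json"])` and writer case (b) is `format_line(record, ["id", "timestamp"])`. An explicit field list avoids guessing the layout from the column count, which would be ambiguous: a single column could be either a json document or a bare id.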

This message was sent by Atlassian JIRA