streams-dev mailing list archives

From "Steve Blackmon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (STREAMS-293) allow for missing metadata fields in streams-persist-hdfs
Date Tue, 03 Mar 2015 22:05:04 GMT

     [ https://issues.apache.org/jira/browse/STREAMS-293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Blackmon updated STREAMS-293:
-----------------------------------
    Description: 
Currently the streams-persist-hdfs writer creates (and the reader expects) exactly four columns.
This could be made much more flexible without too much effort.

Update the reader and writer to support additional use cases:
a) files containing one JSON document per line
b) files containing just an id and a JSON document on each line
c) files containing an id, a timestamp, and a JSON document on each line
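
The three layouts above, plus the existing four-column one, could be read by a single parser along the following lines. This is only a rough sketch, not the streams-persist-hdfs implementation: the class and field names are hypothetical, the column order (id, timestamp, metadata, JSON) is an assumption, and so is the tab field delimiter. The JSON document is assumed to always be the last column.

```java
// Hypothetical sketch; none of these names come from Apache Streams.
class FlexibleLineParser {

    /** Holder for one parsed line. Missing columns are left as null. */
    static final class ParsedLine {
        final String id;
        final String timestamp;
        final String metadata;
        final String json;

        ParsedLine(String id, String timestamp, String metadata, String json) {
            this.id = id;
            this.timestamp = timestamp;
            this.metadata = metadata;
            this.json = json;
        }
    }

    /**
     * Splits a line into at most four tab-separated columns.
     * Assumed layouts: json | id,json | id,timestamp,json | id,timestamp,metadata,json.
     */
    static ParsedLine parse(String line) {
        // A limit of 4 means any further tabs stay inside the last column.
        String[] fields = line.split("\t", 4);
        int n = fields.length;

        String json = fields[n - 1];                 // last column is always the document
        String id = n >= 2 ? fields[0] : null;       // present in the 2-, 3-, and 4-column layouts
        String timestamp = n >= 3 ? fields[1] : null; // present in the 3- and 4-column layouts
        String metadata = n == 4 ? fields[2] : null;  // present only in the 4-column layout

        return new ParsedLine(id, timestamp, metadata, json);
    }

    public static void main(String[] args) {
        ParsedLine p = parse("doc-1\t2015-03-03T22:05:04Z\t{\"text\":\"hello\"}");
        System.out.println(p.id + " / " + p.timestamp + " / " + p.metadata + " / " + p.json);
    }
}
```

Deciding the layout from the column count keeps the reader free of extra configuration. The cost is that a one- or two-column line whose JSON contains literal tab characters would be misread, so a real implementation might prefer an explicit setting for the expected fields.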




  was:
Currently the streams-persist-hdfs writer creates (and the reader expects) exactly four columns.
This could be made much more flexible without too much effort.

Update the reader and writer to support additional use cases:
a) files with a field delimiter other than \t
b) files with a line delimiter other than \n
c) files containing one JSON document per line
d) files containing just an id and a JSON document on each line
e) files containing an id, a timestamp, and a JSON document on each line





> allow for missing metadata fields in streams-persist-hdfs
> ---------------------------------------------------------
>
>                 Key: STREAMS-293
>                 URL: https://issues.apache.org/jira/browse/STREAMS-293
>             Project: Streams
>          Issue Type: Improvement
>            Reporter: Steve Blackmon
>
> Currently the streams-persist-hdfs writer creates (and the reader expects) exactly four columns.
> This could be made much more flexible without too much effort.
> Update the reader and writer to support additional use cases:
> a) files containing one JSON document per line
> b) files containing just an id and a JSON document on each line
> c) files containing an id, a timestamp, and a JSON document on each line



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
