griffin-dev mailing list archives

From "Johnnie (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (GRIFFIN-278) AvroBatchDataConnector handle input is directory
Date Tue, 13 Aug 2019 23:17:00 GMT

     [ https://issues.apache.org/jira/browse/GRIFFIN-278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johnnie updated GRIFFIN-278:
----------------------------
    Description: 
The Griffin data connector is designed to compare a dataset's accuracy between source and target.

However, in the big data ecosystem, most sources are huge and have hundreds of files
in one folder. I think it would be great if Griffin could handle the source as a folder
instead of a single file by default.

In addition, Spark normally reads data from a folder, so in this case we don't need to
union all the files in one folder.

  was:
The Griffin data connector is designed to compare a dataset's accuracy between source and target.

However, in the big data ecosystem, most sources are huge and have hundreds of files
in one folder. I think it would be great if Griffin could handle the source as a folder
instead of a single file.

 


> AvroBatchDataConnector handle input is directory
> ------------------------------------------------
>
>                 Key: GRIFFIN-278
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-278
>             Project: Griffin
>          Issue Type: Improvement
>            Reporter: Johnnie
>            Priority: Major
>
> The Griffin data connector is designed to compare a dataset's accuracy between source
> and target.
> However, in the big data ecosystem, most sources are huge and have hundreds of files
> in one folder. I think it would be great if Griffin could handle the source as a folder
> instead of a single file by default.
> In addition, Spark normally reads data from a folder, so in this case we don't need to
> union all the files in one folder.
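
The request above boils down to accepting either a single file or a folder as the connector input. The following is a minimal sketch of that path handling in plain Python, not Griffin's actual AvroBatchDataConnector code. The function name `resolve_input_files` and the `.avro` suffix filter are assumptions made for illustration only.

```python
import os


def resolve_input_files(path, suffix=".avro"):
    """Return the data files behind `path`, which may be a file or a folder.

    Hypothetical helper for illustration; not part of Griffin.
    """
    if os.path.isdir(path):
        # Folder input: collect every matching file directly inside it,
        # skipping marker files such as _SUCCESS and any subfolders.
        return sorted(
            os.path.join(path, name)
            for name in os.listdir(path)
            if name.endswith(suffix)
            and os.path.isfile(os.path.join(path, name))
        )
    if os.path.isfile(path):
        # Single-file input keeps working as before.
        return [path]
    raise FileNotFoundError(path)
```

With a helper like this, the caller no longer needs to know whether the configured path is a file or a folder. When Spark itself does the reading, a folder path can usually be handed to the reader directly, which is why no explicit union of the individual files is needed.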



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
