beam-commits mailing list archives

From Jean-Baptiste Onofré (JIRA) <j...@apache.org>
Subject [jira] [Updated] (BEAM-1592) Unify HdfsIO and HadoopInputFormatIO
Date Thu, 02 Mar 2017 20:23:45 GMT

     [ https://issues.apache.org/jira/browse/BEAM-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Baptiste Onofré updated BEAM-1592:
---------------------------------------
    Component/s:     (was: sdk-java-core)
                 sdk-java-extensions

> Unify HdfsIO and HadoopInputFormatIO
> ------------------------------------
>
>                 Key: BEAM-1592
>                 URL: https://issues.apache.org/jira/browse/BEAM-1592
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>            Reporter: Stephen Sisk
>            Assignee: Jean-Baptiste Onofré
>
> HadoopInputFormatIO (HIFIO) is currently in PR (https://github.com/apache/beam/pull/1994).
> As per the discussion in
> https://lists.apache.org/thread.html/803857877804165e798cf31edf079e6603eb9682b7690d52124c31e7@%3Cdev.beam.apache.org%3E,
> we'd like to check HIFIO in as-is, then unify it with HdfsIO, since the two share a lot of code.
> [~dhalperi@google.com] has mentioned: "the FileInputFormat reader gets to call some special APIs that the
> generic InputFormat reader cannot -- so they are not completely redundant. Specifically,
> FileInputFormat reader can do size-based splitting."
> Dan recommended: "See if we can "inline" the FileInputFormat specific parts of HdfsIO
> inside of HadoopInputFormatIO via reflection. If so, we can get the best of both worlds with
> shared code."
> This seems reasonable to me.
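
To make the reflection idea above concrete, here is a minimal sketch of how a single reader might detect a file-based format at runtime and only then use file-specific APIs. It is not taken from the PR or from Beam. All class and method names (GenericFormat, FileBasedFormat, totalBytes, SplitStrategySketch) are hypothetical stand-ins for Hadoop's InputFormat and FileInputFormat, chosen so the example runs without Hadoop on the classpath.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for a generic input format (assumption, not a Hadoop class).
abstract class GenericFormat {}

// Hypothetical stand-in for a file-based format that exposes an extra, file-only API.
abstract class FileBasedFormat extends GenericFormat {
    // Assumed file-only capability: report the total input size in bytes.
    public long totalBytes() {
        return 1024L;
    }
}

final class TextLikeFormat extends FileBasedFormat {}

final class TableLikeFormat extends GenericFormat {}

final class SplitStrategySketch {
    // Pick a splitting strategy by inspecting the format class at runtime.
    static String chooseStrategy(Class<? extends GenericFormat> formatClass) {
        return FileBasedFormat.class.isAssignableFrom(formatClass)
            ? "size-based"
            : "generic";
    }

    // Call the file-only API reflectively; return -1 when the format is not file-based.
    static long estimateSizeBytes(GenericFormat format) {
        if (!FileBasedFormat.class.isInstance(format)) {
            return -1L;
        }
        try {
            Method m = FileBasedFormat.class.getMethod("totalBytes");
            return (Long) m.invoke(format);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("file-specific API unavailable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(chooseStrategy(TextLikeFormat.class));
        System.out.println(chooseStrategy(TableLikeFormat.class));
        System.out.println(estimateSizeBytes(new TextLikeFormat()));
    }
}
```

The point of the sketch is that the generic code path stays the default and the file-specific branch is taken only when the runtime type allows it, which is one way the two readers could share a single implementation.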



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
