incubator-crunch-dev mailing list archives

From "Dave Beech (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-143) CrunchInputSplit should be public
Date Tue, 15 Jan 2013 20:58:12 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554331#comment-13554331 ]

Dave Beech commented on CRUNCH-143:
-----------------------------------

Hi Josh - I did try that actually, which is fine if the methods you need are on the InputSplit
interface. But because CrunchInputSplit is package-private, it's not possible to cast to it and
access the actual input split underneath via CrunchInputFormat.getInputSplit().
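
To make the two casts described above concrete, here is a minimal, self-contained sketch. The classes are simplified stand-ins, not the real Hadoop or Crunch types (for instance, the real FileSplit.getPath() returns a Path rather than a String). WrappedSplit plays the role of CrunchInputSplit, and SplitPaths.pathOf is a hypothetical helper. With a package-private wrapper class, the first cast would not compile from outside its package.

```java
// Simplified stand-ins for the Hadoop types, so the sketch compiles on its own.
abstract class InputSplit {
    abstract long getLength();
}

final class FileSplit extends InputSplit {
    private final String path;
    private final long length;

    FileSplit(String path, long length) {
        this.path = path;
        this.length = length;
    }

    String getPath() { return path; }

    @Override
    long getLength() { return length; }
}

// Hypothetical stand-in for CrunchInputSplit: wraps and delegates to the real split.
final class WrappedSplit extends InputSplit {
    private final InputSplit delegate;

    WrappedSplit(InputSplit delegate) { this.delegate = delegate; }

    InputSplit getInputSplit() { return delegate; }

    @Override
    long getLength() { return delegate.getLength(); }
}

final class SplitPaths {
    private SplitPaths() {}

    // Returns the file path behind a split, or null if the split is not file-based.
    static String pathOf(InputSplit split) {
        InputSplit inner = split;
        if (split instanceof WrappedSplit) {
            // First cast: only possible if the wrapper class is accessible here.
            inner = ((WrappedSplit) split).getInputSplit();
        }
        // Second cast: reach the file-specific details such as the path.
        return (inner instanceof FileSplit) ? ((FileSplit) inner).getPath() : null;
    }
}
```

Anything on the InputSplit interface itself, such as getLength(), is reachable through the wrapper without a cast. Only the file-specific details need the unwrapping shown in pathOf.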

I haven't used the in-memory pipeline seriously yet, so let me have a play around and I'll
come back with thoughts about an API. To be honest, anything which avoids the ugly casts would
be great!
                
> CrunchInputSplit should be public
> ---------------------------------
>
>                 Key: CRUNCH-143
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-143
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.4.0
>            Reporter: Dave Beech
>            Assignee: Josh Wills
>            Priority: Minor
>
> Similar to MAPREDUCE-2226 - it's currently not possible to access the underlying input
> split details, for instance the path on HDFS.
> Is there a nice way to make this information available from DoFn instances while keeping
> with the Crunch abstraction?
> Also - MAPREDUCE-4923 might be applicable to CrunchInputSplit

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
