crunch-dev mailing list archives

From "mac champion (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CRUNCH-565) CSVInputFormat needs to be more defensive when configuring itself
Date Thu, 01 Oct 2015 15:55:28 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939926#comment-14939926 ]

mac champion edited comment on CRUNCH-565 at 10/1/15 3:55 PM:
--------------------------------------------------------------

[~mkwhitacre]
Well, at first I started using it just because that's what I'm comfortable with. But later
I realized I wasn't completely certain how to manipulate it into returning null instead of
blank strings. With Mockito that's easy: just don't mock anything and the return value will
be null.

BUT, if I switch all of these to get(opt, default), I will have to do some extra stuff, but
I shouldn't have to handle nulls or do anything weird like that. Can you take another look
here? https://github.com/champgm/crunch/pull/7
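
For reference, a minimal sketch of the difference between the two lookups. It uses
java.util.Properties as a stand-in so it runs without Hadoop on the classpath; Hadoop's
Configuration offers the same pair of calls, get(name) and get(name, defaultValue). The
class name, option key, and default value below are hypothetical and are not taken from
CSVInputFormat.

```java
import java.util.Properties;

public class DefensiveConfigSketch {
    // Hypothetical option key and default, chosen for illustration only.
    static final String DELIMITER_KEY = "example.csv.delimiter";
    static final String DEFAULT_DELIMITER = ",";

    // Fragile: the single-argument lookup returns null for an unset key,
    // so the charAt(0) call throws NullPointerException.
    static char fragileDelimiter(Properties conf) {
        return conf.getProperty(DELIMITER_KEY).charAt(0);
    }

    // Defensive: the two-argument lookup substitutes the default for an
    // unset key, so no null check is needed at the call site.
    // (An explicitly empty value is still not handled in this sketch.)
    static char defensiveDelimiter(Properties conf) {
        return conf.getProperty(DELIMITER_KEY, DEFAULT_DELIMITER).charAt(0);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();

        // Key unset: the default delimiter is used.
        System.out.println(defensiveDelimiter(conf));

        // Key set: the configured value wins over the default.
        conf.setProperty(DELIMITER_KEY, ";");
        System.out.println(defensiveDelimiter(conf));

        // Key unset with the single-argument lookup: fails on the null.
        try {
            fragileDelimiter(new Properties());
        } catch (NullPointerException e) {
            System.out.println("unset key -> NullPointerException");
        }
    }
}
```

The trade-off is that every option needs a sensible default declared somewhere, which is
the "extra stuff" mentioned above.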

Also, sorry about the pull request to apache/crunch. I've forked that and I use it to play around
and create pull requests so I can have a nice place to review and comment on the diffs. When
the code looks good and it builds, I'll squash, create a patch, and attach it to the JIRA.
Is that an okay workflow? The official one is pretty sparse and doesn't include any kind of
review steps: https://cwiki.apache.org/confluence/display/CRUNCH/Committer+Workflow





was (Author: champgm):
[~mkwhitacre]
Well, at first I started using it just because that's what I'm comfortable with. But later
I realized I wasn't completely certain how to manipulate it into returning null instead of
blank strings. With Mockito that's easy: just don't mock anything and the return value will
be null.

BUT, if I switch all of these to get(opt, default), I will have to do some extra stuff, but
I shouldn't have to handle nulls or do anything weird like that. Can you take another look
here? https://github.com/champgm/crunch/pull/7

Also, sorry about the pull request to apache/crunch. I've forked that and I use it to play around
and create pull requests so I can have a nice place to review and comment on the diffs. When
the code looks good and it builds, I'll create a patch and attach it to the JIRA. Is that
an okay workflow? The official one is pretty sparse and doesn't include any kind of review
steps: https://cwiki.apache.org/confluence/display/CRUNCH/Committer+Workflow




> CSVInputFormat needs to be more defensive when configuring itself
> -----------------------------------------------------------------
>
>                 Key: CRUNCH-565
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-565
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.10.0, 0.8.3
>            Reporter: mac champion
>            Assignee: mac champion
>            Priority: Minor
>              Labels: csv, csvparser
>
> It seems that some behavior has changed somewhere along the line where Hadoop Configuration
> is concerned. It is possible that a call to .get(OPTION) will return null. CSVInputFormat
> does not handle that case gracefully:
> https://github.com/apache/crunch/blob/apache-crunch-0.10.0/crunch-core/src/main/java/org/apache/crunch/io/text/csv/CSVInputFormat.java#L178-L183
> Some more relevant details can be found in this JIRA:
> https://issues.apache.org/jira/browse/CRUNCH-564?focusedCommentId=14938186&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14938186



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
