crunch-dev mailing list archives

From "Danny Morgan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-429) The CSVFileSource does not always function properly
Date Mon, 01 Dec 2014 20:57:12 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230442#comment-14230442
] 

Danny Morgan commented on CRUNCH-429:
-------------------------------------

[~mkwhitacre] Moving FileSystem retrieval outside the loop broke reading from alternative
filesystem sources.

If the Crunch job is running on a Hadoop cluster and the input paths are on S3, then:

{code:java}
FileSystem fileSystem = FileSystem.get(job.getConfiguration());
{code}

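isn't correct: it returns the cluster's default filesystem (typically HDFS), not the filesystem that the S3 input paths live on.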
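
For illustration, a minimal sketch of a per-path lookup, not the actual CRUNCH-429 patch. The class name PerPathFileSystem and the helper fileSystemFor are hypothetical; Path.getFileSystem(Configuration) is the standard Hadoop call that resolves a filesystem from the path's own scheme and authority.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class, used only to illustrate the lookup.
public class PerPathFileSystem {

  // Resolve the FileSystem from the path's own URI (scheme and authority).
  // Unlike FileSystem.get(conf), this does not assume the default filesystem,
  // so an s3 path and an hdfs path each get their matching implementation.
  static FileSystem fileSystemFor(Path path, Configuration conf) throws IOException {
    return path.getFileSystem(conf);
  }

  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    // A local path is used here so the sketch runs without a cluster or S3 credentials.
    Path input = new Path("file:///tmp");
    FileSystem fs = fileSystemFor(input, conf);
    System.out.println(input + " -> " + fs.getUri());
  }
}
```

Inside a loop over input paths, this lookup would presumably have to run once per path (or once per distinct scheme and authority) rather than once before the loop.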

> The CSVFileSource does not always function properly
> ---------------------------------------------------
>
>                 Key: CRUNCH-429
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-429
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.3
>            Reporter: mac champion
>            Assignee: mac champion
>            Priority: Minor
>              Labels: csv, csvparser
>             Fix For: 0.8.4, 0.11.0
>
>         Attachments: 0001-CRUNCH-429-Fix-CSVInputFormat.patch, CRUNCH-429_a.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> The "configure" method of CSVInputFormat does not have any effect on its configuration
and is never called. Instead, the class needs to implement Configurable and set its configuration
options in an overridden setConf method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
