Mailing-List: contact dev-help@crunch.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@crunch.apache.org
Date: Tue, 30 Dec 2014 22:14:13 +0000 (UTC)
From: "Micah Whitacre (JIRA)" <jira@apache.org>
To: crunch-dev@incubator.apache.org
Message-ID: <JIRA.12723645.1403707311000.117729.1419977653648@Atlassian.JIRA>
In-Reply-To: <JIRA.12723645.1403707311000@Atlassian.JIRA>
References: <JIRA.12723645.1403707311000@Atlassian.JIRA>
 <JIRA.12723645.1403707311590@arcas>
Subject: [jira] [Commented] (CRUNCH-429) The CSVFileSource does not always
 function properly
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CRUNCH-429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261560#comment-14261560 ] 

Micah Whitacre commented on CRUNCH-429:
---------------------------------------

[~unluckyboy], interesting; I don't typically use S3.  My suggestion was to cut down on retrieving the FileSystem object, because for a given Source it typically would not change.  In your S3 use case, do you interact with multiple instances, so that you would need to vary the configuration with each path?  Or do you mix reading CSV files from HDFS and S3 inside a single Source?  The reason I ask is that you should still be able to use the current CSVFileSource by configuring the S3 connection information through the Source's inputConf(...) method [1].

If that is prohibitive, feel free to open another issue and we can enhance the Source code.

[1] - http://crunch.apache.org/apidocs/0.8.4/org/apache/crunch/Source.html#inputConf(java.lang.String, java.lang.String)
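To illustrate the per-source configuration idea, here is a minimal, self-contained sketch. It does not use the real Crunch or Hadoop classes: SketchSource is a hypothetical stand-in that only mimics the chaining shape of Source.inputConf(String, String), and effectiveConf is an invented helper showing how per-source values could override pipeline-wide ones. The fs.s3n.* keys in the usage note are examples only; the right keys depend on the S3 filesystem implementation in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a Crunch Source: it only records the path and
// the per-source settings passed to inputConf. Not the real Crunch API.
class SketchSource {
    private final String path;
    private final Map<String, String> overrides = new LinkedHashMap<>();

    SketchSource(String path) {
        this.path = path;
    }

    // Mimics the shape of Source.inputConf(String, String): stores the
    // key/value pair for this source only and returns this for chaining.
    SketchSource inputConf(String key, String value) {
        overrides.put(key, value);
        return this;
    }

    // Invented helper: settings seen when reading this source, i.e. the
    // pipeline-wide values with this source's overrides applied on top.
    Map<String, String> effectiveConf(Map<String, String> pipelineConf) {
        Map<String, String> merged = new LinkedHashMap<>(pipelineConf);
        merged.putAll(overrides);
        return merged;
    }

    String getPath() {
        return path;
    }
}
```

With this shape, something like new SketchSource("s3n://some-bucket/input.csv").inputConf("fs.s3n.awsAccessKeyId", "...").inputConf("fs.s3n.awsSecretAccessKey", "...") keeps the S3 settings attached to that one source, while a second source reading from HDFS in the same pipeline is left untouched.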

> The CSVFileSource does not always function properly
> ---------------------------------------------------
>
>                 Key: CRUNCH-429
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-429
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.3
>            Reporter: mac champion
>            Assignee: mac champion
>            Priority: Minor
>              Labels: csv, csvparser
>             Fix For: 0.8.4, 0.11.0
>
>         Attachments: 0001-CRUNCH-429-Fix-CSVInputFormat.patch, CRUNCH-429_a.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> The "configure" method of CSVInputFormat does not have any effect on its configuration and is never called. Instead, the class needs to implement Configurable and set its configuration options in an overridden setConf method.
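The pattern described in the issue summary can be sketched as follows. This is a simplified, self-contained model, not the actual patch: SketchConfigurable stands in for Hadoop's Configurable interface, a plain Map stands in for Configuration, SketchFramework stands in for the framework code that instantiates input formats reflectively, and the key name and default buffer size are invented for illustration.

```java
import java.util.Map;

// Hypothetical stand-in for org.apache.hadoop.conf.Configurable.
interface SketchConfigurable {
    void setConf(Map<String, String> conf);
    Map<String, String> getConf();
}

// Hypothetical input format; key name and default value are invented.
class SketchCsvInputFormat implements SketchConfigurable {
    static final String BUFFER_SIZE_KEY = "sketch.csv.buffersize";
    static final int DEFAULT_BUFFER_SIZE = 4096;

    private Map<String, String> conf;
    private int bufferSize = DEFAULT_BUFFER_SIZE;

    // The problematic shape: nothing in the framework calls this method,
    // so any options read here never take effect.
    void configure(Map<String, String> conf) {
        this.bufferSize = Integer.parseInt(conf.get(BUFFER_SIZE_KEY));
    }

    // The fix: read options in setConf, which the framework does call
    // on any instance that implements the configurable interface.
    @Override
    public void setConf(Map<String, String> conf) {
        this.conf = conf;
        String raw = conf.get(BUFFER_SIZE_KEY);
        this.bufferSize = (raw == null) ? DEFAULT_BUFFER_SIZE : Integer.parseInt(raw);
    }

    @Override
    public Map<String, String> getConf() {
        return conf;
    }

    int getBufferSize() {
        return bufferSize;
    }
}

// Hypothetical stand-in for the framework's reflective factory.
class SketchFramework {
    static <T> T newInstance(Class<T> type, Map<String, String> conf) {
        try {
            T instance = type.getDeclaredConstructor().newInstance();
            // Only setConf is invoked; a method named configure is ignored.
            if (instance instanceof SketchConfigurable) {
                ((SketchConfigurable) instance).setConf(conf);
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + type, e);
        }
    }
}
```

The point of the sketch is that the framework decides which hook gets called. Options handled only in configure are silently dropped, while options handled in setConf are applied as soon as the instance is created.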

