crunch-dev mailing list archives

From "Deepak Subhramanian (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-220) Crunch not working with S3
Date Tue, 18 Jun 2013 16:36:22 GMT


Deepak Subhramanian commented on CRUNCH-220:

[~joshwills] cc [~davebeech]

Hi Josh, when I run with the 0.6 version it works fine locally but not on the
cluster, since it is looking for job.jar in the wrong place.

I tried to get the latest code from the git master branch and compile it. For some reason it
gives an error, and I cannot find the test classes in the project repository.

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile
(default-testCompile) on project crunch-core: Compilation failure
[ERROR] /Users/deepak/github/crunch/crunch-core/src/it/java/org/apache/crunch/io/avro/[74,11]
cannot find symbol
[ERROR] symbol  : constructor Person(java.lang.String,int,java.util.List<java.lang.CharSequence>)
[ERROR] location: class org.apache.crunch.test.Person
[ERROR] -> [Help 1]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] For more information about the errors and possible solutions, please read the following
[ERROR] [Help 1]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :crunch-core
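
The org.apache.crunch.test.Person class named in the error appears to be an Avro-generated
test record, so a missing-constructor failure at test-compile time usually means the generated
sources and the integration test have drifted apart (for example, a stale generated class, or
one produced by an Avro version that does not emit the all-args constructor). A minimal sketch
of the shape such a generated record takes; the field names and types are assumptions for
illustration, not the actual Crunch test schema:

    import java.util.List;

    // Hypothetical shape of an Avro-generated record such as Person: the code
    // generator emits a no-arg constructor plus an all-args constructor whose
    // parameters follow the schema fields, and the integration test's call has
    // to match one of those signatures or javac reports "cannot find symbol".
    public class Person {
      private CharSequence name;
      private int age;
      private List<CharSequence> siblingnames;

      public Person() { }

      public Person(CharSequence name, int age, List<CharSequence> siblingnames) {
        this.name = name;
        this.age = age;
        this.siblingnames = siblingnames;
      }
    }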

> Crunch not working with S3
> --------------------------
>                 Key: CRUNCH-220
>                 URL:
>             Project: Crunch
>          Issue Type: Bug
>          Components: IO
>    Affects Versions: 0.6.0
>         Environment: Cloudera Hadoop with Amazon S3
>            Reporter: Deepak Subhramanian
>            Assignee: Josh Wills
>            Priority: Minor
>             Fix For: 0.7.0
>         Attachments: CRUNCH-220.patch
> I am trying to use Crunch to read a file from S3 and write to S3. I am able to read the
> file, but it gives an error while writing to S3. I am not sure if it is a bug or if I am
> missing a Hadoop configuration. I am able to read from S3 and write to a local file or
> HDFS directly. Here is the code and the error. I am passing the S3 key and secret as
> parameters.
> PCollection<String> lines,   Writables.strings()));
>     PCollection<String> textline = lines.parallelDo(new DoFn<String, String>() {
>         public void process(String line, Emitter<String> emitter) {
>             if (headerNotWritten) {
>                 //emitter.emit("Writing Header");
>                 emitter.emit(table_header.getTable_header());
>                 emitter.emit(line);
>                 headerNotWritten = false;
>             } else {
>                 emitter.emit(line);
>             }
>         }
>       }, Writables.strings()); // Indicates the serialization format
>     pipeline.writeTextFile(textline, outputdir);
> Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: s3n://bktname/testcsv, expected: hdfs://ip-address.compute.internal
> [] out: 	at org.apache.hadoop.fs.FileSystem.checkPath(
> [] out: 	at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(
> [] out: 	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(
> [] out: 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(
> [] out: 	at org.apache.hadoop.fs.FileSystem.exists(
> [] out: 	at
> [] out: 	at
> [] out: 	at
> [] out: 	at
> [] out: 	at
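
The "Wrong FS" exception above is thrown by Hadoop's FileSystem.checkPath: an exists() call
made against the cluster's default (HDFS) filesystem instance is handed an s3n:// path whose
scheme does not match. A minimal sketch illustrating the mismatch and the path-based resolution
that avoids it; the class name, bucket, and credential values are illustrative, this is not the
attached CRUNCH-220 patch, and it assumes the s3n connector plus the usual
fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey properties:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class WrongFsSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Standard s3n credential properties (placeholder values).
        conf.set("fs.s3n.awsAccessKeyId", "MY_ACCESS_KEY");
        conf.set("fs.s3n.awsSecretAccessKey", "MY_SECRET_KEY");

        Path out = new Path("s3n://bktname/testcsv");

        // FileSystem.get(conf) returns the *default* filesystem -- hdfs://... on
        // the cluster -- so handing it an s3n:// path fails checkPath() with
        // "Wrong FS", which is exactly the stack trace above.
        FileSystem defaultFs = FileSystem.get(conf);
        // defaultFs.exists(out);  // would throw IllegalArgumentException: Wrong FS

        // Resolving the filesystem from the path itself picks the s3n
        // implementation, so the existence check works against the bucket.
        FileSystem outFs = out.getFileSystem(conf);
        System.out.println("exists? " + outFs.exists(out));
      }
    }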

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
