hadoop-pig-dev mailing list archives

From "Craig Macdonald (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-175) Reading compressed files in local mode + MiniMRCluster
Date Fri, 08 May 2009 19:00:46 GMT

    [ https://issues.apache.org/jira/browse/PIG-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707468#action_12707468 ]

Craig Macdonald commented on PIG-175:
-------------------------------------

Enclosed are updated results for Pig 0.2.0. In this version, MapReduce mode can now always
parse gzip and bzip2 files; however, local mode cannot.
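
As a rough illustration of why the codec matters, here is a minimal sketch in Python. It is not Pig's implementation: the helper names are hypothetical, and choosing a decompressor by file extension is an assumption made for this example. Reading a gzip or bzip2 file as raw bytes yields the compressed stream, whereas opening it through the matching decompressor recovers the lines A, B, C.

```python
import bz2
import gzip
import os
import tempfile

# The three lines used by the test files in this report.
LINES = b"A\nB\nC\n"


def write_fixtures(directory):
    """Write the same content uncompressed, gzipped, and bzip2-compressed.

    The file names mirror those visible in the logs (test.normal, test.gz,
    test.bz2); the function itself is hypothetical.
    """
    paths = {
        "normal": os.path.join(directory, "test.normal"),
        "gz": os.path.join(directory, "test.gz"),
        "bz2": os.path.join(directory, "test.bz2"),
    }
    with open(paths["normal"], "wb") as f:
        f.write(LINES)
    with gzip.open(paths["gz"], "wb") as f:
        f.write(LINES)
    with bz2.open(paths["bz2"], "wb") as f:
        f.write(LINES)
    return paths


def read_raw(path):
    """Read the file's bytes without any decompression."""
    with open(path, "rb") as f:
        return f.read()


def read_by_extension(path):
    """Pick a decompressor from the file extension, then read the content."""
    if path.endswith(".gz"):
        opener = gzip.open
    elif path.endswith(".bz2"):
        opener = bz2.open
    else:
        opener = open
    with opener(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name, path in write_fixtures(d).items():
            print(name, "raw prefix:", read_raw(path)[:10],
                  "decoded:", read_by_extension(path))
```

The raw bzip2 bytes begin with `BZh91AY&SY`, the same prefix that appears in the garbled local-mode output quoted in the original report.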


{noformat}
==========================================
Bashs good friend: cat
==========================================
Normal
A
B
C
bz2
A
B
C
gzip
A
B
C
==========================================
MiniMRCluster
==========================================
test.all.pig
2009-05-08 19:56:51,715 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
- Connecting to hadoop file system at: file:///
2009-05-08 19:56:52,034 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing
JVM Metrics with processName=JobTracker, sessionId=
2009-05-08 19:56:54,686 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2009-05-08 19:56:54,717 [Thread-3] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser
for parsing the arguments. Applications should implement Tool for the same.
2009-05-08 19:56:55,718 [Thread-9] INFO  org.apache.hadoop.mapred.MapTask - numReduceTasks:
0
2009-05-08 19:56:56,015 [Thread-9] INFO  org.apache.hadoop.mapred.LocalJobRunner - 
2009-05-08 19:56:56,020 [Thread-9] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'attempt_local_0001_m_000000_0'
done.
2009-05-08 19:56:56,030 [Thread-9] INFO  org.apache.hadoop.mapred.TaskRunner - Saved output
of task 'attempt_local_0001_m_000000_0' to file:/tmp/temp442336691/tmp1233577046
2009-05-08 19:56:59,714 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2009-05-08 19:57:04,720 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2009-05-08 19:57:04,720 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Success!
(A)
(B)
(C)
2009-05-08 19:57:06,148 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2009-05-08 19:57:06,153 [Thread-10] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser
for parsing the arguments. Applications should implement Tool for the same.
2009-05-08 19:57:06,450 [Thread-16] INFO  org.apache.hadoop.mapred.MapTask - numReduceTasks:
0
2009-05-08 19:57:06,512 [Thread-16] INFO  org.apache.hadoop.mapred.LocalJobRunner - 
2009-05-08 19:57:06,514 [Thread-16] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'attempt_local_0002_m_000000_0'
done.
2009-05-08 19:57:06,519 [Thread-16] INFO  org.apache.hadoop.mapred.TaskRunner - Saved output
of task 'attempt_local_0002_m_000000_0' to file:/tmp/temp442336691/tmp-1848149730
2009-05-08 19:57:11,152 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2009-05-08 19:57:16,154 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2009-05-08 19:57:16,154 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Success!
(A)
(B)
(C)
2009-05-08 19:57:17,114 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2009-05-08 19:57:17,118 [Thread-17] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser
for parsing the arguments. Applications should implement Tool for the same.
2009-05-08 19:57:17,359 [Thread-23] INFO  org.apache.hadoop.mapred.MapTask - numReduceTasks:
0
2009-05-08 19:57:17,520 [Thread-23] INFO  org.apache.hadoop.mapred.LocalJobRunner - 
2009-05-08 19:57:17,523 [Thread-23] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'attempt_local_0003_m_000000_0'
done.
2009-05-08 19:57:17,528 [Thread-23] INFO  org.apache.hadoop.mapred.TaskRunner - Saved output
of task 'attempt_local_0003_m_000000_0' to file:/tmp/temp442336691/tmp97423898
2009-05-08 19:57:22,119 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2009-05-08 19:57:27,122 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2009-05-08 19:57:27,122 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Success!
(A)
(B)
(C)
test.bz2.pig
2009-05-08 19:57:28,096 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
- Connecting to hadoop file system at: file:///
2009-05-08 19:57:28,401 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing
JVM Metrics with processName=JobTracker, sessionId=
2009-05-08 19:57:31,376 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2009-05-08 19:57:31,406 [Thread-3] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser
for parsing the arguments. Applications should implement Tool for the same.
2009-05-08 19:57:32,085 [Thread-9] INFO  org.apache.hadoop.mapred.MapTask - numReduceTasks:
0
2009-05-08 19:57:32,202 [Thread-9] INFO  org.apache.hadoop.mapred.LocalJobRunner - 
2009-05-08 19:57:32,206 [Thread-9] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'attempt_local_0001_m_000000_0'
done.
2009-05-08 19:57:32,214 [Thread-9] INFO  org.apache.hadoop.mapred.TaskRunner - Saved output
of task 'attempt_local_0001_m_000000_0' to file:/tmp/temp1305251735/tmp-365770513
2009-05-08 19:57:36,400 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2009-05-08 19:57:41,404 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2009-05-08 19:57:41,404 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Success!
(A)
(B)
(C)
test.gz.pig
2009-05-08 19:57:42,419 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
- Connecting to hadoop file system at: file:///
2009-05-08 19:57:42,775 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing
JVM Metrics with processName=JobTracker, sessionId=
2009-05-08 19:57:45,289 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2009-05-08 19:57:45,332 [Thread-3] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser
for parsing the arguments. Applications should implement Tool for the same.
2009-05-08 19:57:45,980 [Thread-9] INFO  org.apache.hadoop.mapred.MapTask - numReduceTasks:
0
2009-05-08 19:57:46,070 [Thread-9] INFO  org.apache.hadoop.mapred.LocalJobRunner - 
2009-05-08 19:57:46,073 [Thread-9] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'attempt_local_0001_m_000000_0'
done.
2009-05-08 19:57:46,078 [Thread-9] INFO  org.apache.hadoop.mapred.TaskRunner - Saved output
of task 'attempt_local_0001_m_000000_0' to file:/tmp/temp-1528873036/tmp-704202139
2009-05-08 19:57:50,298 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2009-05-08 19:57:55,394 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2009-05-08 19:57:55,394 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Success!
(A)
(B)
(C)
test.normal.pig
2009-05-08 19:57:56,406 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
- Connecting to hadoop file system at: file:///
2009-05-08 19:57:56,781 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing
JVM Metrics with processName=JobTracker, sessionId=
2009-05-08 19:57:59,138 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize
JVM Metrics with processName=JobTracker, sessionId= - already initialized
2009-05-08 19:57:59,231 [Thread-3] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser
for parsing the arguments. Applications should implement Tool for the same.
2009-05-08 19:57:59,909 [Thread-9] INFO  org.apache.hadoop.mapred.MapTask - numReduceTasks:
0
2009-05-08 19:58:00,122 [Thread-9] INFO  org.apache.hadoop.mapred.LocalJobRunner - 
2009-05-08 19:58:00,125 [Thread-9] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'attempt_local_0001_m_000000_0'
done.
2009-05-08 19:58:00,130 [Thread-9] INFO  org.apache.hadoop.mapred.TaskRunner - Saved output
of task 'attempt_local_0001_m_000000_0' to file:/tmp/temp2083418815/tmp1247603366
2009-05-08 19:58:04,196 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2009-05-08 19:58:09,202 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2009-05-08 19:58:09,202 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Success!
(A)
(B)
(C)
==========================================
Local execution mode
==========================================
test.all.pig
2009-05-08 19:58:11,292 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher
- 100% complete!
2009-05-08 19:58:11,293 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher
- Success!!
(A)
(B)
(C)
2009-05-08 19:58:11,318 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher
- 100% complete!
2009-05-08 19:58:11,318 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher
- Success!!
2009-05-08 19:58:11,343 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher
- 100% complete!
2009-05-08 19:58:11,343 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher
- Success!!
test.bz2.pig
2009-05-08 19:58:12,804 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher
- 100% complete!
2009-05-08 19:58:12,805 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher
- Success!!
test.gz.pig
2009-05-08 19:58:14,885 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher
- 100% complete!
2009-05-08 19:58:14,886 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher
- Success!!
test.normal.pig
2009-05-08 19:58:17,208 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher
- 100% complete!
2009-05-08 19:58:17,209 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher
- Success!!
(A)
(B)
(C)
{noformat}
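
To read the results above at a glance, the dumped tuples can be separated from the log lines. The following is a minimal sketch, assuming that dumped tuples are exactly the lines that start with `(` and end with `)`; the function names are hypothetical and are not part of testCompressed.sh.

```python
import re

# A dumped tuple occupies a whole line, e.g. "(A)". Log lines start with a
# timestamp, and their wrapped continuations start with "- " or plain text.
TUPLE_LINE = re.compile(r"^\(.*\)$")


def extract_tuples(output):
    """Return only the tuple lines from a Pig run's console output."""
    return [line.strip() for line in output.splitlines()
            if TUPLE_LINE.match(line.strip())]


def check(output, expected):
    """Classify one run by comparing its dumped tuples with the expected ones."""
    tuples = extract_tuples(output)
    if tuples == expected:
        return "OK"
    if not tuples:
        return "not OK (no output)"
    return "not OK (unexpected output)"


if __name__ == "__main__":
    expected = ["(A)", "(B)", "(C)"]
    # Console output copied from the local-mode test.bz2.pig run above.
    bz2_run = (
        "2009-05-08 19:58:12,804 [main] INFO  "
        "org.apache.pig.backend.local.executionengine.LocalPigLauncher\n"
        "- 100% complete!\n"
        "2009-05-08 19:58:12,805 [main] INFO  "
        "org.apache.pig.backend.local.executionengine.LocalPigLauncher\n"
        "- Success!!\n"
    )
    print("test.bz2.pig (local):", check(bz2_run, expected))
```

Applied to the local-mode runs above, test.bz2.pig and test.gz.pig report success but dump no tuples, while test.normal.pig dumps the expected three.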

> Reading compressed files in local mode + MiniMRCluster
> ------------------------------------------------------
>
>                 Key: PIG-175
>                 URL: https://issues.apache.org/jira/browse/PIG-175
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Craig Macdonald
>         Attachments: testCompressed.sh
>
>
> I have written a small test script that tests whether three simple files, compressed and
> uncompressed, can be loaded successfully. Essentially, it writes a file, compresses it using gzip
> and bzip2, and checks whether Pig can load each version. I use both local execution mode and the MiniMRCluster.
> Here are my results:
> MiniMRCluster
>  * uncompressed: OK
>  * gzip: OK
>  * bzip2: OK
>  * All three at once: not OK
> Local Execution Mode
>  * uncompressed: OK
>  * gzip: not OK (garbled output)
>  * bzip2: not OK (garbled output)
>  * All three at once: not OK (expected)
> I'm not sure what the problem is with the MiniMRCluster: there is an NPE in PigSplit.getLocations().
> I suspect that getFileCacheHints() is returning null, which usually indicates a non-existent
> file.
> However, for the local execution mode, I'm fairly confident that this mode has no support
> for compressed files.
> Craig
> {noformat}
> ==========================================
> Bashs good friend: cat
> ==========================================
> Normal
> A
> B
> C
> bz2
> A
> B
> C
> gzip
> A
> B
> C
> ==========================================
> MiniMRCluster
> ==========================================
> test.all.pig
> 2008-03-29 12:07:22,103 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
- Connecting to hadoop file system at: file:///
> 2008-03-29 12:07:22,241 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing
JVM Metrics with processName=JobTracker, sessionId=
> 2008-03-29 12:07:22,555 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- ----- MapReduce Job -----
> 2008-03-29 12:07:22,556 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Input: [/users/grad/craigm/src/pig/FROMApache/trunk4/trunk/test.normal:org.apache.pig.builtin.PigStorage()]
> 2008-03-29 12:07:22,556 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Map: [[*]]
> 2008-03-29 12:07:22,556 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Group: null
> 2008-03-29 12:07:22,556 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Combine: null
> 2008-03-29 12:07:22,556 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Reduce: null
> 2008-03-29 12:07:22,556 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Output: /tmp/temp-1403805719/tmp1733057091:org.apache.pig.builtin.BinStorage
> 2008-03-29 12:07:22,556 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Split: null
> 2008-03-29 12:07:22,556 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Map parallelism: -1
> 2008-03-29 12:07:22,557 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Reduce parallelism: -1
> 2008-03-29 12:07:23,427 [Thread-0] INFO  org.apache.hadoop.mapred.MapTask - numReduceTasks:
1
> 2008-03-29 12:07:23,544 [Thread-0] INFO  org.apache.hadoop.mapred.LocalJobRunner -
> 2008-03-29 12:07:23,545 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'map_0000'
done.
> 2008-03-29 12:07:23,581 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Saved
output of task 'map_0000' to file:/tmp/temp-1403805719/tmp1733057091
> 2008-03-29 12:07:23,625 [Thread-0] INFO  org.apache.hadoop.mapred.LocalJobRunner - reduce
> reduce
> 2008-03-29 12:07:23,626 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'reduce_cibps7'
done.
> 2008-03-29 12:07:23,630 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Saved
output of task 'reduce_cibps7' to file:/tmp/temp-1403805719/tmp1733057091
> 2008-03-29 12:07:24,383 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
- Pig progress = 100%
> (A)
> (B)
> (C)
> 2008-03-29 12:07:24,415 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- ----- MapReduce Job -----
> 2008-03-29 12:07:24,415 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Input: [/user/craigm/test.gz:org.apache.pig.builtin.PigStorage()]
> 2008-03-29 12:07:24,416 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Map: [[*]]
> 2008-03-29 12:07:24,416 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Group: null
> 2008-03-29 12:07:24,416 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Combine: null
> 2008-03-29 12:07:24,416 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Reduce: null
> 2008-03-29 12:07:24,416 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Output: /tmp/temp-1403805719/tmp-1191951534:org.apache.pig.builtin.BinStorage
> 2008-03-29 12:07:24,416 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Split: null
> 2008-03-29 12:07:24,416 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Map parallelism: -1
> 2008-03-29 12:07:24,417 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Reduce parallelism: -1
> java.lang.NullPointerException
>         at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigSplit.getLocations(PigSplit.java:107)
>         at org.apache.hadoop.mapred.JobClient.writeSplitsFile(JobClient.java:638)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:540)
>         at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher.launchPig(MapReduceLauncher.java:260)
>         at org.apache.pig.backend.hadoop.executionengine.POMapreduce.open(POMapreduce.java:176)
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:274)
>         at org.apache.pig.PigServer.openIterator(PigServer.java:314)
>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:255)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:160)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:63)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:60)
>         at org.apache.pig.Main.main(Main.java:265)
> 2008-03-29 12:07:24,868 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.io.IOException:
Unable to open iterator for alias: gz
>         at org.apache.pig.impl.util.WrappedIOException.wrap(WrappedIOException.java:16)
>         at org.apache.pig.PigServer.openIterator(PigServer.java:325)
>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:255)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:160)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:63)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:60)
>         at org.apache.pig.Main.main(Main.java:265)
> Caused by: org.apache.pig.backend.executionengine.ExecException: java.io.IOException
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:288)
>         at org.apache.pig.PigServer.openIterator(PigServer.java:314)
>         ... 5 more
> Caused by: java.io.IOException
>         at org.apache.pig.impl.util.WrappedIOException.wrap(WrappedIOException.java:16)
>         at org.apache.pig.impl.util.WrappedIOException.wrap(WrappedIOException.java:12)
>         at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher.launchPig(MapReduceLauncher.java:380)
>         at org.apache.pig.backend.hadoop.executionengine.POMapreduce.open(POMapreduce.java:176)
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:274)
>         ... 6 more
> Caused by: java.lang.NullPointerException
>         at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigSplit.getLocations(PigSplit.java:107)
>         at org.apache.hadoop.mapred.JobClient.writeSplitsFile(JobClient.java:638)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:540)
>         at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher.launchPig(MapReduceLauncher.java:260)
>         ... 8 more
> 2008-03-29 12:07:24,869 [main] ERROR org.apache.pig.tools.grunt.Grunt - Unable to open
iterator for alias: gz
> test.bz2.pig
> 2008-03-29 12:07:25,349 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
- Connecting to hadoop file system at: file:///
> 2008-03-29 12:07:25,486 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing
JVM Metrics with processName=JobTracker, sessionId=
> 2008-03-29 12:07:25,761 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- ----- MapReduce Job -----
> 2008-03-29 12:07:25,761 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Input: [/users/grad/craigm/src/pig/FROMApache/trunk4/trunk/test.bz2:org.apache.pig.builtin.PigStorage()]
> 2008-03-29 12:07:25,761 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Map: [[*]]
> 2008-03-29 12:07:25,762 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Group: null
> 2008-03-29 12:07:25,762 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Combine: null
> 2008-03-29 12:07:25,762 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Reduce: null
> 2008-03-29 12:07:25,762 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Output: /tmp/temp-142293823/tmp-1682881533:org.apache.pig.builtin.BinStorage
> 2008-03-29 12:07:25,762 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Split: null
> 2008-03-29 12:07:25,762 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Map parallelism: -1
> 2008-03-29 12:07:25,762 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Reduce parallelism: -1
> 2008-03-29 12:07:26,585 [Thread-0] INFO  org.apache.hadoop.mapred.MapTask - numReduceTasks:
1
> 2008-03-29 12:07:26,802 [Thread-0] INFO  org.apache.hadoop.mapred.LocalJobRunner -
> 2008-03-29 12:07:26,802 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'map_0000'
done.
> 2008-03-29 12:07:26,809 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Saved
output of task 'map_0000' to file:/tmp/temp-142293823/tmp-1682881533
> 2008-03-29 12:07:26,852 [Thread-0] INFO  org.apache.hadoop.mapred.LocalJobRunner - reduce
> reduce
> 2008-03-29 12:07:26,852 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'reduce_r75h48'
done.
> 2008-03-29 12:07:26,859 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Saved
output of task 'reduce_r75h48' to file:/tmp/temp-142293823/tmp-1682881533
> 2008-03-29 12:07:27,547 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
- Pig progress = 100%
> (A)
> (B)
> (C)
> test.gz.pig
> 2008-03-29 12:07:28,110 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
- Connecting to hadoop file system at: file:///
> 2008-03-29 12:07:28,266 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing
JVM Metrics with processName=JobTracker, sessionId=
> 2008-03-29 12:07:28,582 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- ----- MapReduce Job -----
> 2008-03-29 12:07:28,583 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Input: [/users/grad/craigm/src/pig/FROMApache/trunk4/trunk/test.gz:org.apache.pig.builtin.PigStorage()]
> 2008-03-29 12:07:28,583 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Map: [[*]]
> 2008-03-29 12:07:28,583 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Group: null
> 2008-03-29 12:07:28,583 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Combine: null
> 2008-03-29 12:07:28,584 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Reduce: null
> 2008-03-29 12:07:28,584 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Output: /tmp/temp-1552662535/tmp1393315176:org.apache.pig.builtin.BinStorage
> 2008-03-29 12:07:28,584 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Split: null
> 2008-03-29 12:07:28,584 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Map parallelism: -1
> 2008-03-29 12:07:28,584 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Reduce parallelism: -1
> 2008-03-29 12:07:29,621 [Thread-0] INFO  org.apache.hadoop.mapred.MapTask - numReduceTasks:
1
> 2008-03-29 12:07:29,677 [Thread-0] WARN  org.apache.hadoop.util.NativeCodeLoader - Unable
to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2008-03-29 12:07:29,830 [Thread-0] INFO  org.apache.hadoop.mapred.LocalJobRunner -
> 2008-03-29 12:07:29,831 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'map_0000'
done.
> 2008-03-29 12:07:29,875 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Saved
output of task 'map_0000' to file:/tmp/temp-1552662535/tmp1393315176
> 2008-03-29 12:07:30,096 [Thread-0] INFO  org.apache.hadoop.mapred.LocalJobRunner - reduce
> reduce
> 2008-03-29 12:07:30,097 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'reduce_kan4fo'
done.
> 2008-03-29 12:07:30,103 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Saved
output of task 'reduce_kan4fo' to file:/tmp/temp-1552662535/tmp1393315176
> 2008-03-29 12:07:30,583 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
- Pig progress = 100%
> (A)
> (B)
> (C)
> test.normal.pig
> 2008-03-29 12:07:31,114 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
- Connecting to hadoop file system at: file:///
> 2008-03-29 12:07:31,270 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing
JVM Metrics with processName=JobTracker, sessionId=
> 2008-03-29 12:07:31,556 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- ----- MapReduce Job -----
> 2008-03-29 12:07:31,556 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Input: [/users/grad/craigm/src/pig/FROMApache/trunk4/trunk/test.normal:org.apache.pig.builtin.PigStorage()]
> 2008-03-29 12:07:31,556 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Map: [[*]]
> 2008-03-29 12:07:31,556 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Group: null
> 2008-03-29 12:07:31,557 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Combine: null
> 2008-03-29 12:07:31,557 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Reduce: null
> 2008-03-29 12:07:31,557 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Output: /tmp/temp-323341057/tmp-1104693095:org.apache.pig.builtin.BinStorage
> 2008-03-29 12:07:31,557 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Split: null
> 2008-03-29 12:07:31,557 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Map parallelism: -1
> 2008-03-29 12:07:31,557 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce
- Reduce parallelism: -1
> 2008-03-29 12:07:32,402 [Thread-0] INFO  org.apache.hadoop.mapred.MapTask - numReduceTasks:
1
> 2008-03-29 12:07:32,514 [Thread-0] INFO  org.apache.hadoop.mapred.LocalJobRunner -
> 2008-03-29 12:07:32,514 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'map_0000'
done.
> 2008-03-29 12:07:32,521 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Saved
output of task 'map_0000' to file:/tmp/temp-323341057/tmp-1104693095
> 2008-03-29 12:07:32,568 [Thread-0] INFO  org.apache.hadoop.mapred.LocalJobRunner - reduce
> reduce
> 2008-03-29 12:07:32,568 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Task 'reduce_4q573x'
done.
> 2008-03-29 12:07:32,572 [Thread-0] INFO  org.apache.hadoop.mapred.TaskRunner - Saved
output of task 'reduce_4q573x' to file:/tmp/temp-323341057/tmp-1104693095
> 2008-03-29 12:07:33,369 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
- Pig progress = 100%
> (A)
> (B)
> (C)
> ==========================================
> Local execution mode
> ==========================================
> test.all.pig
> (A)
> (B)
> (C)
> (?0?Gs?r?r?s?}8)
> (BZh91AY&SY????8 !?h3M???"?(HP??)
> test.bz2.pig
> (BZh91AY&SY????8 !?h3M???"?(HP??)
> test.gz.pig
> (?0?Gs?r?r?s?}8)
> test.normal.pig
> (A)
> (B)
> (C)
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

