pig-dev mailing list archives

From: Julien Le Dem <jul...@twitter.com>
Subject: Re: [tw-hadoop-users] ERROR 2017: Internal error creating job configuration.
Date: Thu, 15 Nov 2012 17:36:46 GMT
Hi Kurt,
From the stack trace, it looks like it runs into an error while
estimating the size of the input. (From the log you pasted, the first
job compiles fine and gets its jar built; it's the second job that dies
during that estimation.)
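For context on those frames: before submitting, Pig sums the bytes
behind every input path to pick a reducer count, and the NPE is coming
out of the FileSystem.globStatus call it uses to do that. A rough
sketch of the logic (paraphrased from the methods in your trace, not
the exact Pig source; bytesPerReducer and maxReducers map to
pig.exec.reducers.bytes.per.reducer and pig.exec.reducers.max, the
1610612736 / 999 in your log):

    import java.io.IOException;
    import java.util.List;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    class ReducerEstimateSketch {
        // Paraphrase of getInputSize + estimateNumberOfReducers.
        static int estimateReducers(FileSystem fs, List<Path> inputs,
                                    long bytesPerReducer, int maxReducers)
                throws IOException {
            long totalInputFileSize = 0;
            for (Path input : inputs) {
                // The globStatus call that blows up in your trace.
                FileStatus[] matches = fs.globStatus(input);
                if (matches == null) {
                    // null means the (non-glob) path does not exist;
                    // an empty array means a glob matched nothing.
                    throw new IOException("input does not exist: " + input);
                }
                for (FileStatus status : matches) {
                    totalInputFileSize += status.getLen();
                }
            }
            int reducers = (int) Math.ceil((double) totalInputFileSize / bytesPerReducer);
            return Math.min(maxReducers, Math.max(1, reducers));
        }
    }

With your numbers that would be ceil(1456250098769 / 1610612736) = 905
reducers, capped at 999; the log instead says "Setting Parallelism to
15", presumably because the 'set default_parallel 15' passed via
SET_PARALLEL takes precedence over the estimate.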
Are all of the paths it's loading from actually present under hdfs:///user/kurt?
Does it work with pig_11? Add --pig_version pig_11 to the oink command.
Also, please send the command line you are using.
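If it helps, here is a quick standalone probe (GlobProbe is just a
placeholder name, not something that ships anywhere) that runs the same
globStatus call against each path your script LOADs and reports what it
finds; plain 'hadoop fs -ls <path>' on each glob is the low-tech version
of the same check:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class GlobProbe {
        public static void main(String[] args) throws Exception {
            // Picks up the cluster config from the usual HADOOP_CONF_DIR.
            FileSystem fs = FileSystem.get(new Configuration());
            for (String arg : args) {
                FileStatus[] matches = fs.globStatus(new Path(arg));
                if (matches == null) {
                    System.out.println("MISSING (path does not exist): " + arg);
                } else if (matches.length == 0) {
                    System.out.println("EMPTY (glob matched nothing): " + arg);
                } else {
                    System.out.println("OK (" + matches.length + " matches): " + arg);
                }
            }
        }
    }

Compile it against the cluster's Hadoop jars and point it at every path
the script reads, e.g. the INPROCESS_DIR / OUTPUT_DIR style paths under
/user/kurt from your command line below.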
Thanks,
Julien

On Thu, Nov 15, 2012 at 9:15 AM, Kurt Smith <kurt@twitter.com> wrote:
> I'm getting this error when doing a manual run of the search_simplified
> twadoop query. This query has run fine before. Any idea what the issue is?
>
> Pig Stack Trace
> ---------------
> ERROR 2017: Internal error creating job configuration.
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:738)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:264)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:150)
>         at org.apache.pig.PigServer.launchPlan(PigServer.java:1267)
>         at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1252)
>         at org.apache.pig.PigServer.execute(PigServer.java:1242)
>         at org.apache.pig.PigServer.executeBatch(PigServer.java:356)
>         at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:132)
>         at org.apache.pig.tools.grunt.GruntParser.processScript(GruntParser.java:452)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.Script(PigScriptParser.java:752)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:423)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
>         at org.apache.pig.Main.run(Main.java:561)
>         at org.apache.pig.Main.main(Main.java:111)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:971)
>         at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:944)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getInputSize(JobControlCompiler.java:840)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.estimateNumberOfReducers(JobControlCompiler.java:810)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.adjustNumReducers(JobControlCompiler.java:750)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:396)
>         ... 20 more
> ================================================================================
>
>
> from the output:
> ---
>
> 2012-11-15 07:54:02,554 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2012-11-15 07:54:02,555 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2012-11-15 07:54:03,005 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1610612736 maxReducers=999 totalInputFileSize=1456250098769
> 2012-11-15 07:54:03,005 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 15
> 2012-11-15 07:54:04,655 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job917013994443138523.jar
> 2012-11-15 07:54:07,963 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job917013994443138523.jar created
> 2012-11-15 07:54:07,972 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up multi store job
> 2012-11-15 07:54:08,199 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2012-11-15 07:54:08,199 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2012-11-15 07:54:08,404 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2017: Internal error creating job configuration.
> Details at logfile: /var/log/pig/pig_1352950582587.log
> 2012-11-15 07:54:08,706 [Thread-5] INFO org.apache.hcatalog.common.HiveClientCache - Cleaning up hive client cache in ShutDown hook
> [2012-11-15 07:54:10] Pig job failed: return code 6 from running
> /usr/bin/pig_9 -t ColumnMapKeyPrune -F
>     -param TWADOOP_HOME=/home/kurt/twadoop
>     -p END_DATE="20121102"
>     -p END_TIME_UNIX="1351814400"
>     -p LATEST_PATH_FILE="/user/kurt/processed/search/search_simplified/daily/_latest"
>     -p START_TIME="2012-11-01-00:00:00"
>     -p PREVIOUS_ONE_WEEK="2012/10/25"
>     -p REGISTER_HCAT="'register /usr/lib/hcatalog/share/hcatalog/hcatalog-{core,pig-adapter}-*-*.jar'"
>     -p START_DATE="20121101"
>     -p PREVIOUS_ONE_MONTH="2012/10/01"
>     -p BATCH_ID="0"
>     -p SCHEDULER_POOL="'set mapred.fairscheduler.pool search'"
>     -p PREVIOUS_FOUR_WEEKS_AGO="2012/10/05"
>     -p PREVIOUS_ONE_DAY="2012/10/31"
>     -p REGISTER_DAL="'register /usr/lib/dal/dal.jar;'"
>     -p REGISTER_HIVE="'register /usr/lib/hive/lib/{hive-exec-*-*,hive-metastore-*-*,libfb303-*}.jar'"
>     -p INPROCESS_DIR="/user/kurt/in_process/processed/search/search_simplified/daily/2012/11/01"
>     -p JOB_NAME="search_simplified:daily_2012/11/01_to_2012/11/02"
>     -p BATCH_DESC="'oink search_simplified:daily'"
>     -p END_TIME_DAY_OF_WEEK="4"
>     -p SET_PARALLEL="'set default_parallel 15'"
>     -p ALL_DATES_4_WEEKS_AGO_TO_TODAY="2012/10/05,2012/10/06,2012/10/07,2012/10/08,2012/10/09,2012/10/10,2012/10/11,2012/10/12,2012/10/13,2012/10/14,2012/10/15,2012/10/16,2012/10/17,2012/10/18,2012/10/19,2012/10/20,2012/10/21,2012/10/22,2012/10/23,2012/10/24,2012/10/25,2012/10/26,2012/10/27,2012/10/28,2012/10/29,2012/10/30,2012/10/31,2012/11/01"
>     -p ALL_DATES_2_WEEKS_AGO_TO_TODAY="2012/10/19,2012/10/20,2012/10/21,2012/10/22,2012/10/23,2012/10/24,2012/10/25,2012/10/26,2012/10/27,2012/10/28,2012/10/29,2012/10/30,2012/10/31,2012/11/01"
>     -p RAND="816544"
>     -p ALL_DATES_8_WEEKS_TO_4_WEEKS_AGO="2012/09/07,2012/09/08,2012/09/09,2012/09/10,2012/09/11,2012/09/12,2012/09/13,2012/09/14,2012/09/15,2012/09/16,2012/09/17,2012/09/18,2012/09/19,2012/09/20,2012/09/21,2012/09/22,2012/09/23,2012/09/24,2012/09/25,2012/09/26,2012/09/27,2012/09/28,2012/09/29,2012/09/30,2012/10/01,2012/10/02,2012/10/03,2012/10/04"
>     -p PREVIOUS_TWO_WEEKS_AGO="2012/10/19"
>     -p OUTPUT_DIR="/user/kurt/processed/search/search_simplified/daily/2012/11/01"
>     -p OUTPUT_DIR_PARENT="/user/kurt/processed/search/search_simplified/daily/2012/11"
>     -p OUTPUT_BASE="dal://smf1-dw-hcat.kurt.search_search_simplified_daily"
>     -p START_TIME_DAY_OF_WEEK="3"
>     -p END_TIME="2012-11-02-00:00:00"
>     -p BATCH_STEP="86400"
>     -p END_DATE_MINUS_ONE_WITH_SLASHES="2012/11/01"
>     -p START_TIME_UNIX="1351728000"
>     -p START_HOUR="00"
>     -p END_HOUR="00"
>     -p DEBUG="off"
>     -p PART="'part_dt=20121101T000000Z'"
>     /tmp/oink20121115-33234-42y95k-0.pig at Thu Nov 15 07:54:10 +0000 2012.
> Exiting...
> [2012-11-15 07:54:10] Oink failed because of unhandled exception:
> Pig job failed: return code 6 from running /usr/bin/pig_9 with the same
> flags and parameters as above, /tmp/oink20121115-33234-42y95k-0.pig at
> Thu Nov 15 07:54:10 +0000 2012.
> Exiting...
>         /home/kurt/twadoop/oink/lib/runner.rb:327:in `fail'
>         /home/kurt/twadoop/oink/lib/runner.rb:172:in `run'
>         /home/kurt/twadoop/oink/oink.rb:25:in `go'
>         /home/kurt/twadoop/oink/oink.rb:46
>
>
> --
> Kurt Smith
> Senior Data Scientist, Analytics | Twitter, Inc
> @kurtosis0
>
