hadoop-mapreduce-user mailing list archives

From Steve Lewis <lordjoe2...@gmail.com>
Subject Writing large output kills job with timeout - need ideas
Date Wed, 18 Jan 2012 17:49:59 GMT
I am running a mapper job that generates a large number of output records
for every input record:
about 32,000,000,000 output records from about 150 mappers, each record
about 200 bytes.
The job is failing with timeouts.
When I alter the code to do exactly what it did before but emit only
1 in 100 output records, it runs to completion with no
difficulty.
I believe I am saturating some local resource on the mapper, but this is
WAY beyond my knowledge of what is going on internally.
Any bright ideas?
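[A quick back-of-envelope calculation, using only the figures quoted above (~32e9 records, ~200 bytes each, ~150 mappers), shows the scale of output each mapper has to push through local disk and the shuffle:]

```python
# Rough sizing sketch from the numbers in the email above.
records = 32_000_000_000      # total output records (from the email)
bytes_per_record = 200        # approximate record size (from the email)
mappers = 150                 # approximate mapper count (from the email)

total_bytes = records * bytes_per_record       # total mapper output in bytes
per_mapper_bytes = total_bytes // mappers      # bytes written per mapper
per_mapper_records = records // mappers        # records emitted per mapper

print(f"total output:   {total_bytes / 1e12:.1f} TB")
print(f"per mapper:     {per_mapper_bytes / 1e9:.1f} GB")
print(f"records/mapper: {per_mapper_records / 1e6:.0f} million")
```

[That is roughly 6.4 TB of mapper output in total, on the order of 43 GB and 213 million records per mapper, which is consistent with the theory that some per-mapper local resource is being saturated.]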
-- 
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com
