Date: Wed, 18 Jan 2012 09:49:59 -0800
Subject: Writing large output kills job with timeout _ need ideas
From: Steve Lewis
To: mapreduce-user
Reply-To: mapreduce-user@hadoop.apache.org

I am running a mapper job which generates a large number of output records for every input record: about 32,000,000,000 output records from about 150 mappers, each record about 200 bytes.

The job is failing with timeouts.

When I alter the code to do exactly what it did previously but only output 1 in 100 output records, it runs to completion with no difficulty.

I believe I am saturating some local resource on the mapper, but this gets WAY beyond my knowledge of what is going on internally.

Any bright ideas?

--
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com
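
For a rough sense of scale, the figures quoted in the message (about 32,000,000,000 records, about 200 bytes each, about 150 mappers) can be turned into a back-of-the-envelope volume estimate. This is only a sketch: it assumes the output is spread evenly across mappers and ignores compression, serialization overhead, and any intermediate copies, none of which the message specifies. The function name `estimate_output` is hypothetical.

```python
# Back-of-the-envelope estimate of map output volume.
# Inputs are the approximate figures from the message; an even spread
# across mappers is an assumption, not something the message states.
TOTAL_RECORDS = 32_000_000_000   # "about 32,000,000,000 output records"
BYTES_PER_RECORD = 200           # "each record about 200 bytes"
NUM_MAPPERS = 150                # "about 150 mappers"


def estimate_output(total_records: int, bytes_per_record: int, num_mappers: int) -> dict:
    """Return the total raw byte volume plus per-mapper records and bytes."""
    total_bytes = total_records * bytes_per_record
    return {
        "total_bytes": total_bytes,
        "records_per_mapper": total_records / num_mappers,
        "bytes_per_mapper": total_bytes / num_mappers,
    }


est = estimate_output(TOTAL_RECORDS, BYTES_PER_RECORD, NUM_MAPPERS)
print(f"total output:       {est['total_bytes'] / 1e12:.1f} TB")
print(f"records per mapper: {est['records_per_mapper'] / 1e6:.0f} million")
print(f"bytes per mapper:   {est['bytes_per_mapper'] / 1e9:.1f} GB")
```

Under these assumptions the full run writes roughly 6.4 TB of raw records, or about 42.7 GB per mapper, while the 1-in-100 run that completes writes about 1% of that.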