Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Reply-To: user@hadoop.apache.org
Date: Mon, 24 Sep 2012 17:54:25 +0200
Subject: Re: Not able to place enough replicas in Reduce
From: Bertrand Dechoux
To: user@hadoop.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=20cf3074b5ce9b894904ca749a84

--20cf3074b5ce9b894904ca749a84
Content-Type: text/plain; charset=ISO-8859-1

Do you have any space remaining in your HDFS? (Or do you have a quota set?
The message should be different in that case, I guess.)
What metrics do you get from the NameNode? Are all DataNodes (you have only
one?) live?

http://localhost:50070/dfshealth.jsp

As long as you only consume data (map) you don't need much space in HDFS,
but when you produce data (reduce) you definitely need some.
As Ted pointed out, your error is a standard one when Hadoop is unable to
replicate a block. It should not be related to the reduce itself, and even
less to your specific logic.

Regards

Bertrand

On Mon, Sep 24, 2012 at 5:41 PM, Jason Yang wrote:

> Hi, Ted
>
> here is the result of jps:
> yanglin@ubuntu:~$ jps
> 3286 TaskTracker
> 14053 Jps
> 2623 DataNode
> 2996 JobTracker
> 2329 NameNode
> 2925 SecondaryNameNode
> ---
> It seems that the DN is working.
>
> And it does not fail immediately when entering the reduce phase; it
> always fails after processing some data.
>
>
> 2012/9/24 Steve Loughran
>
>>
>> On 24 September 2012 15:47, Ted Reynolds wrote:
>>
>>> Jason,
>>>
>>> The line in the JobTracker log - "Could only be replicated to 0 nodes,
>>> instead of 1" points to a problem with your data node. It generally means
>>> that your DataNode is either down or not functioning correctly. What is
>>> the output of the "jps" command? ("jps" is found in <JAVA_HOME>/bin).
>>>
>>
>> see also: http://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo
>>
>> -steve
>>
>
> --
> YANG, Lin

--
Bertrand Dechoux
--20cf3074b5ce9b894904ca749a84--
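
The jps check discussed in this thread can be automated. The following is only a minimal sketch in Python: it parses jps output in the "<pid> <name>" form quoted above and reports which daemons are absent. The names EXPECTED_DAEMONS, running_daemons and missing_daemons are hypothetical, and the expected daemon list is an assumption taken from the single-node jps output in the thread, not from any Hadoop API.

```python
# Daemons expected on a single-node setup. This list is an assumption
# based on the jps output quoted in the thread, not a Hadoop-defined set.
EXPECTED_DAEMONS = {
    "NameNode",
    "SecondaryNameNode",
    "DataNode",
    "JobTracker",
    "TaskTracker",
}


def running_daemons(jps_output):
    """Return the process names found in jps output ("<pid> <name>" per line)."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Skip blank lines, prompts, and lines without a numeric pid and a name.
        if len(parts) >= 2 and parts[0].isdigit():
            names.add(parts[1])
    return names


def missing_daemons(jps_output, expected=EXPECTED_DAEMONS):
    """Return the expected daemons that do not appear in the output, sorted."""
    return sorted(set(expected) - running_daemons(jps_output))


# The jps output from the thread, used here as sample input.
SAMPLE = """\
3286 TaskTracker
14053 Jps
2623 DataNode
2996 JobTracker
2329 NameNode
2925 SecondaryNameNode
"""

if __name__ == "__main__":
    print(missing_daemons(SAMPLE))
```

Note that an empty result only shows that the JVM processes are up. Whether the NameNode counts the DataNode as live, and whether it has space left, is what the dfshealth.jsp page mentioned above reports.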