Date: Sun, 30 Oct 2011 11:02:26 +0200
Subject: HDFS DataNode daily log growing really high and fast
From: Ronen Itkin
To: common-user@hadoop.apache.org

Hey all!

I am having an issue with Hadoop's daily DataNode log growing to more than 1.8 GB.
I have 3 nodes in my HDFS cluster, all sharing the same configuration (including the same log4j.properties). Operations and jobs run equally (automatically) across all of the nodes, yet only one of them (DataNode *03*) has this issue with the log growing so large:

/var/log/hadoop/hadoop-hadoop-datanode-ip-10-10-10-4.log

The log does not show any exceptions, just many HDFS operations (read + write).

I am currently running *Cloudera* *hadoop-0.20.2-cdh3u1*, and this is my architecture:

*MasterServer*: NameNode, JobTracker, HBase HMaster (*hbase-0.90.3-cdh3u1*), ZooKeeper01
*Node01*: Data Node, TaskTracker, HBase HRegion
*Node02*: Data Node, TaskTracker, HBase HRegion
*Node03*: Data Node, TaskTracker, HBase HRegion
*SecondaryServer*: Secondary NameNode, HBase Backup HMaster, ZooKeeper02
*ServerX*: ZooKeeper03

Can anyone think of a good reason why this happens? Why on one specific node? Is it related to HBase operations? To the HDFS block scanner?

Here is a sample of the log file:

2011-10-30 08:52:27,313 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.10.4:50010, dest: /10.10.10.4:43447, bytes: 66564, op: HDFS_READ, cliID: DFSClient_hb_rs_ip-10-10-10-4.ec2.internal,60020,1318334166605_1318334167243, offset: 34500096, srvID: DS-75443592-10.93.67.113-50010-1318335522512, blockid: blk_2773771462926694276_25674, duration: 274702

2011-10-30 08:52:27,314 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.10.4:50010, dest: /10.10.10.4:43448, bytes: 66564, op: HDFS_READ, cliID: DFSClient_hb_rs_ip-10-10-10-4.ec2.internal,60020,1318334166605_1318334167243, offset: 34631168, srvID: DS-75443592-10.93.67.113-50010-1318335522512, blockid: blk_2773771462926694276_25674, duration: 236691

Thanks,
*Ronen.*
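
To see which client and operation account for most of the entries, the clienttrace lines can be tallied per (op, cliID) pair. The following is only a rough sketch: the regular expression is an assumption derived from the two sample lines in this message, not from a documented log format, and the names `TRACE`, `summarize`, and `SAMPLE` are made up for illustration. The sample lines are shortened copies of the ones above with the emphasis asterisks removed.

```python
import re
from collections import Counter

# Field layout assumed from the two sample lines in this message.
TRACE = re.compile(
    r"clienttrace: src: /(?P<src>[\d.]+):\d+, dest: /(?P<dest>[\d.]+):\d+, "
    r"bytes: (?P<bytes>\d+), op: (?P<op>\w+), cliID: (?P<cli>.+?), offset: "
)


def summarize(lines):
    """Count entries and sum bytes per (op, cliID) pair."""
    counts = Counter()
    volume = Counter()
    for line in lines:
        m = TRACE.search(line)
        if m is None:
            continue  # not a clienttrace line, or a layout this sketch does not know
        key = (m.group("op"), m.group("cli"))
        counts[key] += 1
        volume[key] += int(m.group("bytes"))
    return counts, volume


# The two sample entries from the message (trailing fields shortened).
SAMPLE = [
    "2011-10-30 08:52:27,313 INFO org.apache.hadoop.hdfs.server.datanode."
    "DataNode.clienttrace: src: /10.10.10.4:50010, dest: /10.10.10.4:43447, "
    "bytes: 66564, op: HDFS_READ, cliID: DFSClient_hb_rs_ip-10-10-10-4."
    "ec2.internal,60020,1318334166605_1318334167243, offset: 34500096, "
    "srvID: DS-75443592-10.93.67.113-50010-1318335522512",
    "2011-10-30 08:52:27,314 INFO org.apache.hadoop.hdfs.server.datanode."
    "DataNode.clienttrace: src: /10.10.10.4:50010, dest: /10.10.10.4:43448, "
    "bytes: 66564, op: HDFS_READ, cliID: DFSClient_hb_rs_ip-10-10-10-4."
    "ec2.internal,60020,1318334166605_1318334167243, offset: 34631168, "
    "srvID: DS-75443592-10.93.67.113-50010-1318335522512",
]

counts, volume = summarize(SAMPLE)
for key, n in counts.most_common():
    print(key, n, "entries,", volume[key], "bytes")
```

Against the real file, the same function could be fed an open file object, for example `summarize(open("/var/log/hadoop/hadoop-hadoop-datanode-ip-10-10-10-4.log"))`, and the result compared with the other two nodes.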