Date: Wed, 11 May 2011 22:30:17 +0800
Subject: How could I make sure the famous "xceiver" parameters works in the data node?
From: Stanley Xu
To: user@hbase.apache.org

Dear all,

We are running hadoop 0.20.2 with a couple of patches, and hbase 0.20.6. While running a MapReduce job that does a lot of random access against an hbase table, we see many log entries like the following, appearing at the same time on the region server and the data node.

For RegionServer:

"INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_7212216405058183301_3974453 from any node: java.io.IOException: No live nodes contain current block"

"WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.0.2.44:50010 for file /hbase/CookieTag/197333923/VisitStrength/151799904199528367 for block 7212216405058183301:java.io.IOException: Got error in response to OP_READ_BLOCK for file /hbase/CookieTag/197333923/VisitStrength/151799904199528367 for block 7212216405058183301"

For DataNode:

"ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.0.2.26:50010, storageID=DS-1332752738-192.168.11.99-50010-1285486780176, infoPort=50075, ipcPort=50020):DataXceiver
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)"

We changed the dfs.datanode.max.xcievers parameter to 4096 in hdfs-site.xml, in both the hadoop and hdfs configurations. But when we connected VisualVM to a data node, we found fewer than 100 DataXceiver threads. The number is close to 100: we counted 97 in a thread dump, and I am guessing the missing 3 had just finished or just started. The thread dump entries look like the following:

"org.apache.hadoop.hdfs.server.datanode.DataXceiver@3546286a" - Thread t@2941003
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)

I guess our setting in hdfs-site.xml did not really take effect. We restarted the hadoop cluster with stop-dfs.sh and start-dfs.sh, and restarted hbase as well.

How can I make sure the xceiver parameter is actually in effect? Is there anything I should do besides restarting dfs and hbase? Can I check it in the web interface or anywhere else?

Thanks in advance.

Best wishes,
Stanley Xu
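
P.S. To make the two manual checks described above repeatable, here is a minimal sketch in Python. It reads the dfs.datanode.max.xcievers value out of the text of an hdfs-site.xml file and counts DataXceiver thread headers in a saved thread dump. The function names and the inline sample data are hypothetical and only for illustration. It assumes the thread dump uses the same header format as the entry quoted above. Note that it only reports what the file says, not what the running data node actually loaded.

```python
import re
import xml.etree.ElementTree as ET

# Property name exactly as written in the message above.
XCEIVER_KEY = "dfs.datanode.max.xcievers"

# Matches thread headers such as
#   "org.apache.hadoop.hdfs.server.datanode.DataXceiver@3546286a" - Thread t@2941003
# The "@" right after "DataXceiver" keeps DataXceiverServer threads out of the count.
XCEIVER_THREAD = re.compile(
    r'^\s*"org\.apache\.hadoop\.hdfs\.server\.datanode\.DataXceiver@',
    re.MULTILINE,
)


def read_property(hdfs_site_xml, name=XCEIVER_KEY):
    """Return the first value configured for `name`, or None if it is absent."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


def count_xceiver_threads(thread_dump):
    """Count DataXceiver thread headers in a plain-text thread dump."""
    return len(XCEIVER_THREAD.findall(thread_dump))


if __name__ == "__main__":
    # Hypothetical sample inputs; in practice, pass the contents of the
    # hdfs-site.xml the data node starts with and a saved thread dump.
    sample_xml = """
    <configuration>
      <property>
        <name>dfs.datanode.max.xcievers</name>
        <value>4096</value>
      </property>
    </configuration>
    """
    sample_dump = (
        '"org.apache.hadoop.hdfs.server.datanode.DataXceiver@3546286a"'
        " - Thread t@2941003\n"
        "   java.lang.Thread.State: RUNNABLE\n"
    )
    print("configured value:", read_property(sample_xml))
    print("DataXceiver threads:", count_xceiver_threads(sample_dump))
```

Running this against the hdfs-site.xml on each data node host would at least show whether the edited file is the one present on that machine. Whether the data node process picked the value up is still the open question.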