Return-Path: Delivered-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org Received: (qmail 78661 invoked from network); 2 Mar 2011 02:43:59 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 2 Mar 2011 02:43:59 -0000 Received: (qmail 21589 invoked by uid 500); 2 Mar 2011 02:43:59 -0000 Delivered-To: apmail-hadoop-hdfs-issues-archive@hadoop.apache.org Received: (qmail 21495 invoked by uid 500); 2 Mar 2011 02:43:58 -0000 Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: hdfs-issues@hadoop.apache.org Delivered-To: mailing list hdfs-issues@hadoop.apache.org Received: (qmail 21487 invoked by uid 99); 2 Mar 2011 02:43:58 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 02 Mar 2011 02:43:57 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=5.0 tests=ALL_TRUSTED,T_RP_MATCHES_RCVD X-Spam-Check-By: apache.org Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 02 Mar 2011 02:43:56 +0000 Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116]) by hel.zones.apache.org (Postfix) with ESMTP id DC46C4BB89 for ; Wed, 2 Mar 2011 02:43:36 +0000 (UTC) Date: Wed, 2 Mar 2011 02:43:36 +0000 (UTC) From: "Brian Bockelman (JIRA)" To: hdfs-issues@hadoop.apache.org Message-ID: <933926331.6967.1299033816898.JavaMail.tomcat@hel.zones.apache.org> Subject: [jira] Updated: (HDFS-420) fuse_dfs is unable to connect to the dfs after a copying a large number of files into the dfs over fuse MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/HDFS-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Bockelman updated HDFS-420: --------------------------------- Fix Version/s: 0.20.3 Assignee: Brian Bockelman Affects Version/s: 0.20.2 Status: Patch Available (was: Open) > fuse_dfs is unable to connect to the dfs after a copying a large number of files into the dfs over fuse > ------------------------------------------------------------------------------------------------------- > > Key: HDFS-420 > URL: https://issues.apache.org/jira/browse/HDFS-420 > Project: Hadoop HDFS > Issue Type: Bug > Components: contrib/fuse-dfs > Affects Versions: 0.20.2 > Environment: Fedora core 10, x86_64, 2.6.27.7-134.fc10.x86_64 #1 SMP (AMD 64), gcc 4.3.2, java 1.6.0 (IcedTea6 1.4 (fedora-7.b12.fc10-x86_64) Runtime Environment (build 1.6.0_0-b12) OpenJDK 64-Bit Server VM (build 10.0-b19, mixed mode) > Reporter: Dima Brodsky > Assignee: Brian Bockelman > Fix For: 0.20.3 > > Attachments: fuse_dfs_020_memleaks.patch > > > I run the following test: > 1. Run hadoop DFS in single node mode > 2. start up fuse_dfs > 3. copy my source tree, about 250 megs, into the DFS > cp -av * /mnt/hdfs/ > in /var/log/messages I keep seeing: > Dec 22 09:02:08 bodum fuse_dfs: ERROR: hdfs trying to utime /bar/backend-trunk2/src/machinery/hadoop/output/2008/11/19 to 1229385138/1229963739 > and then eventually > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 > Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 > and the file system hangs. hadoop is still running and I don't see any errors in it's logs. I have to unmount the dfs and restart fuse_dfs and then everything is fine again. At some point I see the following messages in the /var/log/messages: > ERROR: dfs problem - could not close file_handle(139677114350528) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8339-93825052368848-1229278807.log fuse_dfs.c:1464 > Dec 22 09:04:49 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139676770220176) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8140-93825025883216-1229278759.log fuse_dfs.c:1464 > Dec 22 09:05:13 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139677114812832) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8138-93825070138960-1229251587.log fuse_dfs.c:1464 > Is this a known issue? Am I just flooding the system too much. All of this is being performed on a single, dual core, machine. > Thanks! > ttyl > Dima -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira