Subject: Re: File Descriptor leak
From: Matthew Conway
Date: Sun, 13 Jun 2010 15:11:58 -0400
Message-Id: <3DA849BB-54DB-421E-8362-D153352E59A6@backupify.com>
To: user@cassandra.apache.org

Pretty sure, as the list of file descriptors below shows (at this point the client has exited, so doubly sure it's not open sockets):

# lsof -p `ps ax | grep [C]assandraDaemon | awk '{print $1}'` | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
      2 /usr/local/apache-cassandra-2010-06-11_12-30-33/lib/slf4j-log4j12-1.5.8.jar
      2 /usr/local/apache-cassandra-2010-06-11_12-30-33/lib/snakeyaml-1.6.jar
      2 /usr/share/java/gnome-java-bridge.jar
   1003 /mnt/cassandra/data/MyKeyspace/MySuperColumn-c-1-Data.db
   1003 /mnt/cassandra/data/MyKeyspace/MySuperColumn-c-2-Data.db

On Jun 11, 2010, at Fri Jun 11, 7:34 PM, Jonathan Ellis wrote:

> it goes up by exactly 2000, which is the number of loop iterations
> exactly?  are you sure this isn't just counting your open sockets?
>
> On Fri, Jun 11, 2010 at 1:53 PM, Matthew Conway wrote:
>> Thanks, I just tried apache-cassandra-2010-06-11_12-30-33 (hudson 462)
>> but my tests are still reporting a leak (though not as bad). I do the
>> following (ruby tests using cassandra_object/cassandra, but you should
>> be able to get the idea):
>>
>>    should "not leak file descriptors" do
>>      cassandra_pid = `ps ax | grep [C]assandraDaemon | awk '{print $1}'`
>>      original_count = `lsof -p #{cassandra_pid}`.lines.to_a.size
>>      assert original_count > 0
>>      count = 1000
>>      count.times do |n|
>>        ChildMetadatum.new(:service_id => 4, :child_id => "def#{n}", :updated => Time.now, :labels => ["label2", "label3"]).save!
>>      end
>>      count.times do |n|
>>        ChildMetadatum.find_by_natural_key(:service_id => 4, :child_id => "def#{n}")
>>        ChildMetadatum.find_all_by_service_id(3)
>>      end
>>      new_count = `lsof -p #{cassandra_pid}`.lines.to_a.size
>>      assert new_count > 0
>>      assert new_count < original_count * 1.1, "File descriptors leaked from #{original_count} to #{new_count}"
>>    end
>>
>> Which reports: File descriptors leaked from 112 to 2112.
>> Should I reopen the bug or create a new one?
>>
>> Matt
>>
>> On Jun 10, 2010, at Thu Jun 10, 6:40 PM, Jonathan Ellis wrote:
>>
>>> Fixed in https://issues.apache.org/jira/browse/CASSANDRA-1178
>>>
>>> On Thu, Jun 10, 2010 at 9:01 AM, Matt Conway wrote:
>>>> Hi All,
>>>> I'm running a small 4-node cluster with minimal load using
>>>> the 2010-06-08_12-31-16 build from trunk, and it's exhausting file
>>>> descriptors pretty quickly (65K in less than an hour).  Here's a
>>>> list of the files I see it leaking; I can do a more specific query
>>>> if you'd like.  Am I doing something wrong, is this a known problem,
>>>> something being done wrong from the client side, or something else?
>>>> Any help appreciated, thanks,
>>>> Matt
>>>> root@cassandra01:~# lsof -p `ps ax | grep [C]assandraDaemon | awk '{print $1}'` | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
>>>>       3 /mnt/cassandra/data/system/Schema-c-2-Data.db
>>>>    1278 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-7-Data.db
>>>>    1405 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-9-Data.db
>>>>    1895 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-5-Data.db
>>>>   26655 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-11-Data.db
>>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of Riptano, the source for professional Cassandra support
>>> http://riptano.com
>>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
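
For anyone wanting to repeat the per-file tally from inside a Ruby test, the `lsof ... | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5` pipeline used in this thread could be sketched roughly as below. The helper names `count_open_files` and `top_open_files` are hypothetical and not part of cassandra_object or any library. The sketch makes the same assumption as the awk command: the file name is the 9th whitespace-separated field of each lsof line, so paths containing spaces would be truncated.

```ruby
# Hypothetical helper: tally open files per path from the text that
# `lsof -p PID` prints. Assumes the file name is the 9th
# whitespace-separated field, as awk '{print $9}' does in the thread.
def count_open_files(lsof_output)
  counts = Hash.new(0)
  # Skip the header row (COMMAND PID USER ...), then count each name.
  lsof_output.each_line.drop(1).each do |line|
    name = line.split[8]
    counts[name] += 1 if name
  end
  counts
end

# Hypothetical helper: the `limit` most frequently opened paths,
# highest count first, as [path, count] pairs.
def top_open_files(lsof_output, limit = 5)
  count_open_files(lsof_output).sort_by { |_, n| -n }.first(limit)
end
```

In a test it might be called as `top_open_files(`lsof -p #{cassandra_pid}`)`, which makes it possible to assert on the count for one particular `-Data.db` file rather than on the total number of lsof lines (the total also includes sockets and jars).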