Mailing-List: contact cassandra-user-help@incubator.apache.org; run by ezmlm
Reply-To: cassandra-user@incubator.apache.org
From: Brian Burruss
To: "cassandra-user@incubator.apache.org"
Date: Wed, 16 Dec 2009 17:01:52 -0800
Subject: RE: OOM Exception
Thread-Topic: OOM Exception
Message-ID: <766B5A29D28DA442AB229AAEE2AFC44507D7B914FA@SEAMBX.corp.real.com>
References:
<766B5A29D28DA442AB229AAEE2AFC44507D7B914F6@SEAMBX.corp.real.com> <766B5A29D28DA442AB229AAEE2AFC44507D7B914F8@SEAMBX.corp.real.com> <766B5A29D28DA442AB229AAEE2AFC44507D7B914F9@SEAMBX.corp.real.com>,
In-Reply-To:
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: yes
X-MS-TNEF-Correlator:
acceptlanguage: en-US
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=_jor-el.real.com-7464-1261011842-0001-2"

This is a MIME-formatted message. If you see this text it means that your
E-mail software does not support MIME-formatted messages.

--=_jor-el.real.com-7464-1261011842-0001-2
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable

attached ... the log starts when i restarted server. notice that not too far into it is when the other node went down because of OOM and i restarted it as well.
________________________________________
From: Jonathan Ellis [jbellis@gmail.com]
Sent: Wednesday, December 16, 2009 4:53 PM
To: cassandra-user@incubator.apache.org
Subject: Re: OOM Exception

sorry, i meant the system.log the 2nd time (clear it out before
replaying so it's not confused w/ other info, pls)

On Wed, Dec 16, 2009 at 5:39 PM, Brian Burruss wrote:
> is this what you want? they are big - i'd rather not spam everyone with them. if you need them or the hprof files i can tar them and send them to you.
>
> thx!
>
>
> [bburruss@gen-app02 cassandra]$ ls -l ~/cassandra/btoddb/commitlog/
> total 597228
> -rw-rw-r-- 1 bburruss bburruss 134219796 Dec 16 13:52 CommitLog-1260995895123.log
> -rw-rw-r-- 1 bburruss bburruss 134218547 Dec 16 13:52 CommitLog-1260997811317.log
> -rw-rw-r-- 1 bburruss bburruss 134218331 Dec 16 13:52 CommitLog-1260998497744.log
> -rw-rw-r-- 1 bburruss bburruss 134219677 Dec 16 13:53 CommitLog-1261000330587.log
> -rw-rw-r-- 1 bburruss bburruss 74055680 Dec 16 14:49 CommitLog-1261000439079.log
> [bburruss@gen-app02 cassandra]$
>
> ________________________________________
> From: Jonathan Ellis [jbellis@gmail.com]
> Sent: Wednesday, December 16, 2009 3:29 PM
> To: cassandra-user@incubator.apache.org
> Subject: Re: OOM Exception
>
> How large are the log files being replayed?
>
> Can you attach the log from a replay attempt?
>
> On Wed, Dec 16, 2009 at 5:21 PM, Brian Burruss wrote:
>> sorry, thought i included everything ;)
>>
>> however, i am using beta2
>>
>> ________________________________________
>> From: Jonathan Ellis [jbellis@gmail.com]
>> Sent: Wednesday, December 16, 2009 3:18 PM
>> To: cassandra-user@incubator.apache.org
>> Subject: Re: OOM Exception
>>
>> What version are you using? 0.5 beta2 fixes the
>> using-more-memory-on-startup problem.
>>
>> On Wed, Dec 16, 2009 at 5:16 PM, Brian Burruss wrote:
>>> i'll put my question first:
>>>
>>> - how can i determine how much RAM is required by cassandra? (for normal operation and restarting server)
>>>
>>> *** i've attached my storage-conf.xml
>>>
>>> i've gotten several more OOM exceptions since i mentioned it a week or so ago. i started from a fresh database a couple days ago and have been adding 2k blocks of data keyed off a random integer at the rate of about 400/sec. i have a 2 node cluster, RF=2, Consistency for read/write is ONE. there are ~70,420,082 2k blocks of data in the database.
>>>
>>> i used the default memory setup of Xmx1G when i started a couple days ago.
>>> as the database grew to ~180G (reported by unix du command) both servers OOM'ed at about the same time, within 10 minutes of each other. well needless to say, my cluster is dead. so i upped the memory to 3G and the servers tried to come back up, but one died again with OOM.
>>>
>>> Before cleaning the disk and starting over a couple days ago, i played the game of "jack up the RAM", but eventually i didn't want to up it anymore when i got to 5G. the parameter, SSTable.INDEX_INTERVAL, was discussed a few days ago that would change the number of "keys" cached in memory, so i could modify that at the cost of read performance, but doing the math, 3G should be plenty of room.
>>>
>>> it seems like startup requires more RAM than just normal running.
>>>
>>> so this of course concerns me.
>>>
>>> i have the hprof files from when the server initially crashed and when it crashed trying to restart if anyone wants them
>>>
>>
>

--=_jor-el.real.com-7464-1261011842-0001-2
Content-Type: application/gzip; name="system.log.tar.gz"
Content-Transfer-Encoding: base64
Content-Description: system.log.tar.gz
Content-Disposition: attachment; filename="system.log.tar.gz"; size=3446; creation-date="Wed, 16 Dec 2009 17:02:43 GMT"; modification-date="Wed, 16 Dec 2009 17:02:43 GMT"

H4sIALuCKUsAA+2cW3PaSBbH87rzKfSQqiFVQvS9JWZ3a72+xVuxyQKpPKT8IJAwmoDkkUQy/vZ7
NOnmtYWr1a11mozv/X5uVhdfkvTO8eHTMHQe5TlR4swb1Q79IExr934+/MfP+zRHLNPyz+/sqrxK lx7Bq1pOYSv0qoW79HpbwIy0iV7N9e2wn3wL04drP/bvYO06hUQJs/68H/kPBpG5u34yHkf5KLlr nE5/+5DcgXlFPMa4klow8ybGfvZuj3uukJxvebegLuXU3fZupjxGPHP3evuZsAl/bH+79bl+/al7 Mo2aTvfk8rzJvM1xwz0IqllHW0M/s9LQDGhgRXlm5cM0zIbJKPjFyr5HeX9oei2KLd8amAvWdTjO zUgXS1pOV3eURAnlt9Z5/NsknBiPg9EEnCWDR3+1md53/6KSUi04Yeo5CRcfPnXe1zutdve8vcwl C4Nzvy647STpNH/s6kuvEVh1h2+tYPv+ZRTuVcJzSYFHWCkuK+j+/burr8/tqzJfHvj6nEa7jyXn tpZ0F1+MklnpmdXGaXcat3vlPHfjpOElEsu7o6zrubClK47eHV7J3IXDWpC4q3OX6Uv0p/MXCSmn 438DSd3kaxhbg2QSB01LAgXC87THmZaEQBQJRZjQWruKQJmFHy2Lnaq1BRcEGVSbTph+i/qL7tAS /Ob+DPvJvZXB1TC17pIsi+4LDGvIW6uGTxeddeaH4ySet0hA5z9esbKlD8dxCgzDUoQ9Gr68vjib Dt7lVMeiQkoBJfImCUKrQTVzmHYocx3Gp6kqTr5bvyZRHAar5qVqEgkzWW2dt9uttvWle/rR6sBk sQ89Ys1rMDjr9u9PkziG09GiFdwU5fO4D6ORhymMzlXr/Pd+eG/usOCn/3h/0zIPODFMMfpDH86O MqeTGD5OZx+/9EdJFga3P83uC3Nn7uzRYNNauofMO5jA7T/9zc+tbBLPDf9o8gowhpNh/+v8wdqN n0ffQgiifJgE755/eBDFUTZcPL1+fdbtHpuZKpxTmZb82HPzTqkV9ad4ztRiXBZToyD5mCb9MMtq q1fmmPEdDaaTeIMpot8VkqPZ4xp5B3IoRNYpDLmBPkzTxNCzxGVrWgoFaYkoI8ovQVlLRBlRRpQR ZUQZUUaUEeW/OMqu9hBlRPkFKBtyEGVEGVFGlBFlRBlRRpQRZUQZUUaUEWVEGVFGlBFlRBlRRpQR ZUQZUUaUEWVEGVFGlBFlRPloKHuc/bFQNoIQZUQZUUaUEWVEGVFGlBFlRBlRRpQRZUQZUUaUEWVE GVFGlBFlRBlRRpQRZUQZUUaUEWVEGVFGlBFlRBlRRpQRZUT5L4SyKxBlRPklKC//32tEGVFGlBFl RBlRRpQRZUQZUUaUEWVEGVFGlBFlRBlRRpQRZUQZUUaUEWVEGVFGlBFlRBlRRpQRZUQZUUaUEWVE GVFGlBFlRBlRRpQRZUQZUUaUEWVEGVFGlBFlRBlRRpQRZUQZUUaUEWVE+ZgoK5twhSgjyjujPCMH UUaUEWVEGVFGlBFlRBlRRpQRZUQZUUaUEWVEGVFGlBFlRBlRRpT/vCi7LqKMKL8EZSAHUUaUEWVE GVFGlBFlRBlRRpQRZUQZUUaUEWVEGVFGlBFlRBlRRpT/nCgLYbt0+Wct0ThM6xQ0XCZZFt0/NsYD cK+g/SdBkELvWQ2qmcO0Q5nrMG5FmRUn360g9APnBx8cHm0S16aaLny0W5/r15+6J92r1k290z25 PG9C276cJqPJOL7wx9HooQP9Ec4b7oGa91GcZ09vsIZ+Bkyb8QmsKM+sfAiyhsko+MXKvkd5f2j6 KYot3xqYC4DwOPd7o7BQnHD5y8RRQrkJ6N8m4cQ4HIwm4CsZPLqrrSl/9y8qPY+63OXsOTEXHz51 3tc7rXb3vF3/2Gp9aJqRWdieS3BBQSdJc+P/Nd1+bl+VufXA7ec02sstJbakbBe3jBIIgwRySZjD 
yE/72yhoDJNx2Oj1Jmk6ybLGY8jWiSMdUu+Fuc8avTwJgl4j8HO/kT1keThurAmu0/oZXHeC3opc RgFFWzOvhBQDQynHvYmJydeCd19Jz9M702sGkSnPlUqJ5yS8HNltfAliu9rdn9OtfGnbI/oocE6C rDGTWNdeCZBGonoM29PW9fVVt/6hdTnXOR358TjKPyR389Kr2K11BgVn2h1JL0uMVKgj5i5rlNw1 5zJmZ+BE49ECeFbE8zQRQhLmwLUVQVya8SFSLgRdXl+cNddriRRQS26SIFwtIiYsTBX5NYnicK2O cA3VGdKULKGd0UoDcF9J2wegcJWUrqdXOV1TsH/8lbmStnR3ytfl4VfiiglbcXH86GMbom8m0fO8 aqPP41JITYqiz4MesMUyOxYVA11teOwpaYf6xIXmEsoQf07CK9SnMl/M5uoV5lHb+OLE5t5Ovg4T IXxThKxKrChCBBNSyoIIEbTJtO0KUoJjtQGyp6Jd5m+uqwijTD+n4DXmbyW+XFu/7vxtsy8ubbU6 Lz1GfIgN8TGX6NKK40MSAkFbHB/HEaQlc2mRIEgizGZEldUPVm3E7ilph5AVQguqPEKek/AKIVvm i9tcvmZJK/EFywdPquOHrNwUsiZIbELciiNECU2ZKI6QowjS0O+ssMZKI0joshVRxTV2qkgyfega KzXXHDrnOf/7h2uJJ2YLtdM7i/JgLfHkweKHHj9U1aZQlU1JbCJE1ZHhMU4KI0ObPqO87E1zta8P 91S0fWRwLjwBnbuG0aqA/UOjxBXgQF9z7lnmykypj/Ne+4fg0JuCYybRqzo4XOXBUqM4OI4iCBaL rltUWCVpEvPlRFnVqPht/76Sto9XRjkjVDJvdVW1JmH/gC31BXVTkdeL2DJfVNhK/AHeprgbQtZI lBXP9CiBg3pMbYiQ4wjygL+i5atkMPy2ZmXljFW7WtxX0g6TTyYIYa5YfZO+puAVZp+bXU1fJe30 yuWZ6edmV5JDHvoDLBW9TfG6KrGi8KCaUFcVh8dRBDHJpVj/ulCY7y8ls73lPOnV956AD2a2NnD5 uFDZ9JUkd7dx8enjun0GCyF5uP0z4IObTUNcuYdqA4cfW3F52DZIanN+sHEQTWqWvvSQbRBN6dpK HawNcvqNolaHbIPXZGYZRA7Vhql9BnfMt0a+v7rpnp/V35/cnLUuLpap+SzsTe7uTHLuQmn1g49J Mjr/PexPoPjNWykhXZ+bjW6m0Ibza9Zgkk9SSMnZ19nGx0kejcwOwT6k7zDOnZmRKImfbIOc3jjy 4zunNclbAygLSfowNd20/gPXrGHo31vZvd8Pp3v/Cg1fTB13wfHbzkPcd6I4DtPLMK8tL8zrjJzv SXzGjHNX8DCltGT/4RM7pR3oD/IwnX0Ka1t0tShTXPBUOok/J+nXMK1tNEmpu5PNtzOD0+2Um4wq wp/YnA7o7NYnT83v5OrdT6e+2fFq9R52J6Cw96PE+fdkMDD7dttwJhmf9M1G0osIZhhRHOW1jVfn QJcNbZnxvxvr/3zOvBAHNc91uflOp2sYa8MYwDAC2x+TLDLxV/vxyixGPFViLeg5g2gECC+MdkZR P7yCE75hYa638No8e21n3Dz530mYPlzMToDoudHZfHxhtbZ25xyycjfrk3pw8PTkldkEHvuj2qbp PyTSPV1sNE3dMhr3My1JuWmz6TIM3sPn1mCw2DOdhXFwDcjBh1rhDQvjL7AdhKPoW5hON3t2k/M4 uE/g1xI3jD3TO4WP+tOYeQslscS0fkED3vJphivRq72SbLvIotnb9iSODd8ngX9vOO77o1Ht8fLM ltB0q/r1pAyC2bVKxrnYrgymBQ9TpQ5QkQyWjuNYzIL0H65MXbhuEg+mkN4Bp1/CLMNszfVietS+ 
vrLM3wAsd/vXomAUvjNLsQUUZz5Uq/lfAWiQcwHrzRFMiBZ/NgKzo3zaZmvW9I1GbWmbSzBYOTRt /ocjWxbFN3jggQceePzfH/8DhRLpfAAYAQA= --=_jor-el.real.com-7464-1261011842-0001-2--