Mailing-List: contact dev-help@accumulo.apache.org; run by ezmlm
Reply-To: dev@accumulo.apache.org
References: <1609498285.73408.1341004784298.JavaMail.jiratomcat@issues-vm>
Date: Sat, 30 Jun 2012 20:00:37 -0400
Subject: Re: [jira] [Created] (ACCUMULO-665) large values, complex iterator stacks, and RFile readers can consume a surprising amount of memory
From: William Slacum
To: dev@accumulo.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=e89a8f839e8f0c059f04c3b95fa0

--e89a8f839e8f0c059f04c3b95fa0
Content-Type: text/plain; charset=ISO-8859-1

He's referring to something like the BooleanLogic iterator stack in the
Wikipedia example. It's a tree of user iterators that merge streams of
key-value pairs together, so you end up with many open readers, and possibly
many RFile blocks spread across many HDFS blocks, all at the same time.

On Sat, Jun 30, 2012 at 7:40 PM, David Medinets wrote:

> How would you define complex iterator stack? Can you outline the elements?
> On Jun 29, 2012 5:19 PM, "Eric Newton (JIRA)" wrote:
>
> > Eric Newton created ACCUMULO-665:
> > ------------------------------------
> >
> > Summary: large values, complex iterator stacks, and RFile readers can consume a surprising amount of memory
> > Key: ACCUMULO-665
> > URL: https://issues.apache.org/jira/browse/ACCUMULO-665
> > Project: Accumulo
> > Issue Type: Bug
> > Components: tserver
> > Affects Versions: 1.5.0, 1.4.0
> > Environment: large cluster
> > Reporter: Eric Newton
> > Assignee: Eric Newton
> > Priority: Minor
> >
> >
> > On a production cluster, with a complex iterator tree, a large value
> > (~350M) was causing a 4G tserver to fail with out-of-memory.
> >
> > There were several factors contributing to the problem:
> > # a bug: the query should not have been looking to the big data
> > # complex iterator tree, causing many copies of the data to be held at the same time
> > # RFile doubles the buffer it uses to load values, and continues to use that large buffer for future values
> >
> > This ticket is for the last point. If we know we're not even going to
> > look at the value, we can read past it without storing it in memory. It is
> > surprising that skipping past a large value would cause the server to run
> > out of memory, especially since it should fit into memory enough times to
> > be returned to the caller.

--e89a8f839e8f0c059f04c3b95fa0--
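
To make the "tree of merging iterators" point in the reply concrete, here is a rough sketch in plain Python. It is not Accumulo code: `LeafReader` and `merge_tree` are hypothetical names invented for illustration, and `heapq.merge` from the standard library stands in for a user-written merging iterator. It only shows that a merge tree keeps every leaf reader open and has pulled one value from each leaf before it can emit its first result. With values of ~350M each, that alone would be one large value held per leaf.

```python
import heapq


class LeafReader:
    """Hypothetical stand-in for one open reader over sorted (key, value) pairs."""

    open_now = 0     # readers currently open
    values_read = 0  # values pulled from any leaf so far

    def __init__(self, pairs):
        self._it = iter(pairs)
        self._closed = False
        LeafReader.open_now += 1

    def __iter__(self):
        return self

    def __next__(self):
        try:
            pair = next(self._it)
        except StopIteration:
            # Treat exhaustion as closing the reader (counted only once).
            if not self._closed:
                self._closed = True
                LeafReader.open_now -= 1
            raise
        LeafReader.values_read += 1
        return pair


def merge_tree(sources):
    """Combine a non-empty list of sorted sources pairwise into a tree of merges."""
    nodes = list(sources)
    while len(nodes) > 1:
        nodes = [heapq.merge(*nodes[i:i + 2]) for i in range(0, len(nodes), 2)]
    return nodes[0]


leaves = [
    LeafReader([("a", "v1"), ("e", "v5")]),
    LeafReader([("b", "v2"), ("f", "v6")]),
    LeafReader([("c", "v3"), ("g", "v7")]),
    LeafReader([("d", "v4"), ("h", "v8")]),
]
merged = merge_tree(leaves)

first = next(merged)                   # smallest pair overall: ("a", "v1")
open_at_first = LeafReader.open_now    # 4: every leaf reader is still open
read_at_first = LeafReader.values_read # 4: one value already pulled per leaf
rest = list(merged)                    # drain the remaining pairs in key order
```

The counts scale with the number of leaves, so a wider or deeper tree holds proportionally more values at once. Python only shares references between tree levels; an implementation that copies values at each level, as the ticket suggests can happen, would hold more.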