Message-ID: <526EEA19.9070606@gmail.com>
Date: Mon, 28 Oct 2013 18:50:01 -0400
From: Josh Elser
To: user@accumulo.apache.org
Subject: Re: How to reduce number of entries in memory

It kind of sounds like you should be more concerned about nodes randomly
dropping out of your cluster :)

If you're stuck on the 1.4 series, you can try raising the property
'tserver.logger.count' to '3' from the default of '2' to improve your
chances of not losing a WAL replica. With 1.5, you'll get your HDFS
replication (which is likely 3, as well). I'm not sure off the top of my
head whether the 1.4 loggers have any rack locality for nodes (e.g. one
replica on rack, and one off rack).

Regardless, trying to avoid network glitches is likely the better
approach. If a tablet has any WAL file (regardless of the amount of data
stored in it), you're still going to have to recover/replay that WAL
before the tablet can come back online in the event of a failure.

On 10/28/13, 6:20 PM, Terry P. wrote:
> Thanks for the replies. I was approaching it from a data integrity
> perspective, as in wanting it flushed to disk in case of a TabletServer
> failure.
> Last weekend we saw two TabletServers exit the cluster due to
> a network glitch, and wouldn't you know that the 04 node was secondary
> logger for the 03 node.
>
> In our case, these entries are hanging around in memory /for hours/, as
> the ingest rate is not that high.
>
> Perhaps an hourly flush of the table via the shell to get it out to disk
> would be the way to go?
>
>
> On Mon, Oct 28, 2013 at 4:30 PM, Mike Drob wrote:
>
>     What are you trying to accomplish by reducing the number of entries
>     in memory? A tablet server will not minor compact (flush) until the
>     native map fills up, but keeping things in memory isn't really a
>     performance concern.
>
>     You can force a one-time minor compaction via the shell using the
>     'flush' command.
>
>
>     On Mon, Oct 28, 2013 at 5:19 PM, Terry P. wrote:
>
>         Greetings all,
>         For a growing table that currently from zero to 70 million
>         entries this weekend, I'm seeing 4.4 million entries still in
>         memory, though the client programs are supposed to be flushing
>         their entries.
>
>         Is there a server-side setting to help reduce the number of
>         entries that are in memory (not yet flushed to disk)? Our
>         system has fairly light performance requirements, so I'm okay if
>         a tweak may result in reduced ingest performance.
>
>         Thanks in advance,
>         Terry
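
The hourly flush via the shell that Terry asks about could be scripted roughly as follows. This is only a sketch under assumptions: it presumes the 'accumulo' launcher is on the PATH and that the shell accepts '-u', '-p' and '-e', and that 'flush' accepts '-t' and '-w'; check 'help flush' in your own shell version first. The table name, user and password are placeholders, and the helper names are made up for illustration.

```python
import subprocess


def build_flush_command(table, user, password, wait=True):
    """Build the argv for a one-shot Accumulo shell 'flush' of one table.

    Assumption: the shell options (-u, -p, -e) and the flush options
    (-t, -w) match your Accumulo version; verify before relying on this.
    """
    shell_cmd = "flush -t " + table
    if wait:
        # -w asks the shell to block until the flush has completed.
        shell_cmd += " -w"
    return ["accumulo", "shell", "-u", user, "-p", password, "-e", shell_cmd]


def flush_table(table, user, password):
    """Run the flush once; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_flush_command(table, user, password), check=True)
```

Calling flush_table("mytable", "someuser", "somepassword") from a small script run by cron once an hour would give the periodic flush. Note that a password passed with '-p' is visible in the process list, so this suits a locked-down host only.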