Mailing-List: contact user-help@accumulo.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@accumulo.apache.org
Received-SPF: pass (athena.apache.org: domain of josh.elser@gmail.com
 designates 209.85.161.169 as permitted sender)
Message-ID: <526EEA19.9070606@gmail.com>
Date: Mon, 28 Oct 2013 18:50:01 -0400
From: Josh Elser <josh.elser@gmail.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9;
 rv:24.0) Gecko/20100101 Thunderbird/24.0.1
MIME-Version: 1.0
To: user@accumulo.apache.org
Subject: Re: How to reduce number of entries in memory
References: 
 <CAPnhrdv951LOAnH=QemD=cL_HyJMtQeuwES_WUsy4Y52dx7Yhw@mail.gmail.com>
	<CAJRvFdrOhnBo5eL=tPpKZ5sYH+wuXkTDCgccz7v4XaosOaDjFw@mail.gmail.com>
 <CAPnhrdtGAqUo-C_tY_pXu-u8OfDA3xUe+iqWeZU8WVCX_=OKXg@mail.gmail.com>
In-Reply-To: 
 <CAPnhrdtGAqUo-C_tY_pXu-u8OfDA3xUe+iqWeZU8WVCX_=OKXg@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

It kind of sounds like you should be more concerned about nodes randomly 
dropping out of your cluster :)

If you're stuck on the 1.4 series, you can try raising the property 
'tserver.logger.count' from the default of '2' to '3' to improve your 
chances of keeping a usable WAL replica.

With 1.5, the WALs live in HDFS, so you'll get your HDFS replication 
(which is likely 3, as well). I'm not sure off the top of my head whether 
the 1.4 loggers have any rack locality for nodes (e.g. one replica 
on-rack and one off-rack).

Regardless, trying to avoid network glitches is likely the better 
approach. If a tablet has any WAL file (regardless of the amount of data 
stored in it), you're still going to have to recover/replay that WAL 
before the tablet can come back online after a failure.
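
If you still want the periodic flush mentioned below, here is a rough 
sketch of one way to schedule it from a small client program rather than 
by hand in the shell. It only shows the scheduling part: the class name 
PeriodicFlusher and the placeholder action are made up for illustration, 
and the Accumulo call in the comment (with the table name "mytable") is 
an assumption to check against the TableOperations javadoc for your 
version.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs a caller-supplied "flush" action on a fixed delay.
public class PeriodicFlusher {

    /**
     * Runs flushAction immediately, then again `period` units after each
     * run finishes, on a single daemon thread.
     */
    public static ScheduledExecutorService start(final Runnable flushAction,
                                                 long period, TimeUnit unit) {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "periodic-flush");
                t.setDaemon(true); // don't keep the JVM alive just for this
                return t;
            });

        // An uncaught exception would cancel all future runs, so log and
        // carry on instead.
        Runnable guarded = () -> {
            try {
                flushAction.run();
            } catch (RuntimeException e) {
                System.err.println("flush failed: " + e);
            }
        };

        scheduler.scheduleWithFixedDelay(guarded, 0, period, unit);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder action. In a real client this would be the Accumulo
        // flush, presumably something like (assumption, not verified here):
        //   connector.tableOperations().flush("mytable", null, null, true);
        Runnable flush = () -> System.out.println("flush requested");

        ScheduledExecutorService scheduler = start(flush, 1, TimeUnit.HOURS);
        Thread.sleep(100);       // let the first run happen
        scheduler.shutdownNow(); // a real program would keep this running
    }
}
```

Keep in mind this only narrows the window of unflushed data; it does 
nothing about the nodes dropping out in the first place.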

On 10/28/13, 6:20 PM, Terry P. wrote:
> Thanks for the replies. I was approaching it from a data integrity
> perspective, as in wanting it flushed to disk in case of a TabletServer
> failure.  Last weekend we saw two TabletServers exit the cluster due to
> a network glitch, and wouldn't you know that the 04 node was secondary
> logger for the 03 node.
>
> In our case, these entries are hanging around in memory /for hours/, as
> the ingest rate is not that high.
>
> Perhaps an hourly flush of the table via the shell to get it out to disk
> would be the way to go?
>
>
> On Mon, Oct 28, 2013 at 4:30 PM, Mike Drob <mdrob@mdrob.com
> <mailto:mdrob@mdrob.com>> wrote:
>
>     What are you trying to accomplish by reducing the number of entries
>     in memory? A tablet server will not minor compact (flush) until the
>     native map fills up, but keeping things in memory isn't really a
>     performance concern.
>
>     You can force a one-time minor compaction via the shell using the
>     'flush' command.
>
>
>     On Mon, Oct 28, 2013 at 5:19 PM, Terry P. <texpilot@gmail.com
>     <mailto:texpilot@gmail.com>> wrote:
>
>         Greetings all,
>         For a growing table that went from zero to 70 million
>         entries this weekend, I'm seeing 4.4 million entries still in
>         memory, though the client programs are supposed to be flushing
>         their entries.
>
>         Is there a server-side setting to help reduce the number of
>         entries that are in memory (not yet flushed to disk)?  Our
>         system has fairly light performance requirements, so I'm okay if
>         a tweak may result in reduced ingest performance.
>
>         Thanks in advance,
>         Terry
>
>
>