From: Eric Newton
To: "user@accumulo.apache.org"
Date: Thu, 4 Apr 2013 14:25:39 -0400
Subject: Re: Increasing Ingest Rate

Have you pre-split your tablet to spread the load out to all the machines?
Does the data distribution match your splits?
Is the ingest data already sorted (that is, it always writes to the last tablet)?
How much memory and how many threads are you using in your batchwriters?
Check the ingest rates on tablet server monitor page and look for hot spots.

-Eric

On Thu, Apr 4, 2013 at 2:01 PM, Jimmy Lin wrote:

> Hello,
> I am fairly new to Accumulo and am trying to figure out what is preventing
> my system from ingesting data at a faster rate. We have 15 nodes running a
> simple Java program that reads and writes to Accumulo and then indexes some
> data into Solr. The rate of ingest is not scaling linearly with the number
> of nodes that we start up. I have tried increasing several parameters
> including:
> - limit of file descriptors in linux
> - max zookeeper connections
> - tserver.memory.maps.max
> - tserver_opts memory size
> - tserver.mutation_queue.max
> - tserver.scan.files.open.max
> - tserver.walog.max.size
> - tserver.cache.data.size
> - tserver.cache.index.size
> - hdfs setting for xceivers
> No matter what changes we make, we cannot get the ingest rate to go over
> 100k entries/s and about 6 Mb/s. I know Accumulo should be able to ingest
> faster than this.
> Thanks in advance,
>
> Jimmy Lin
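
For anyone following the thread, here is a rough sketch of why the questions about splits and sorted input matter. It is plain Java with no Accumulo dependency and is only a model, not how Accumulo itself assigns rows. The class name, the timestamp-style row keys, the date split points, and the hash-prefix key scheme are all made up for illustration. The only Accumulo behaviour assumed is that a tablet holds the rows after the previous split point, up to and including its own end row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Accumulo: models which tablet a row lands in.
final class TabletLoadSketch {

    // Evenly spaced two-hex-digit split points (assumes 2 <= tablets <= 256).
    static List<String> hexSplits(int tablets) {
        List<String> splits = new ArrayList<>();
        for (int i = 1; i < tablets; i++) {
            splits.add(String.format("%02x", i * 256 / tablets));
        }
        return splits;
    }

    // Tablet i covers rows in (split[i-1], split[i]]; the last tablet is open-ended.
    static int tabletIndex(List<String> sortedSplits, String row) {
        int pos = Collections.binarySearch(sortedSplits, row);
        return pos >= 0 ? pos : -(pos + 1);
    }

    // Count how many rows land in each tablet.
    static int[] rowsPerTablet(List<String> sortedSplits, List<String> rows) {
        int[] counts = new int[sortedSplits.size() + 1];
        for (String row : rows) {
            counts[tabletIndex(sortedSplits, row)]++;
        }
        return counts;
    }

    // Assumed key scheme: prefix the row with one hash-derived byte in hex.
    static String hashedRow(String row) {
        return String.format("%02x_%s", row.hashCode() & 0xff, row);
    }

    // Monotonically increasing rows, like timestamp-ordered ingest data.
    static List<String> sortedRows(int n) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            rows.add(String.format("20130404-%06d", i));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> rows = sortedRows(1000);

        // Splits taken from older data: every new row sorts after the last split,
        // so one tablet (and one tablet server) takes all the writes.
        List<String> dateSplits = Arrays.asList("20130401", "20130402", "20130403");
        System.out.println("sorted keys: "
                + Arrays.toString(rowsPerTablet(dateSplits, rows))); // [0, 0, 0, 1000]

        // The same rows with a hash prefix, against evenly spaced hex splits.
        List<String> hashed = new ArrayList<>();
        for (String r : rows) {
            hashed.add(hashedRow(r));
        }
        System.out.println("hashed keys: "
                + Arrays.toString(rowsPerTablet(hexSplits(4), hashed)));
    }
}
```

The first line of output shows the hot spot: all 1000 rows fall into the last tablet. The second shows the same rows spread across the four tablets once the keys no longer arrive in sorted order relative to the splits. A hash prefix is only one way to get that effect, and it changes how range scans over the original keys have to be done, so treat it as an illustration rather than a recommendation.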