Subject: Re: Bulk Ingest
From: Roshan Punnoose
Date: Fri, 17 Jun 2016 11:04:40 +0000
To: user@accumulo.apache.org
References: <57635F59.9090505@gmail.com>

Thanks guys! Awesome stuff.

On Thu, Jun 16, 2016, 11:41 PM Russ Weeks <rweeks@newbrightidea.com> wrote:
> Whoops, forgot the link to GroupedKeyPartitioner from the excellent
> accumulo-recipes project:
> https://github.com/calrissian/accumulo-recipes/blob/master/thirdparty/spark/src/main/scala/org/calrissian/accumulorecipes/spark/support/GroupedKeyPartitioner.scala
>
> On Thu, Jun 16, 2016 at 8:40 PM Russ Weeks <rweeks@newbrightidea.com> wrote:
>
>> > 1) Avoid lots of small files. Target as large of files as you can,
>> > relative to your ingest latency requirements and your max file size
>> > (set on your instance or table)
>>
>> If you're using Spark to produce the RFiles, one trick for this is to
>> call coalesce() on your RDD to reduce the number of RFiles that are
>> written to HDFS.
>>
>> > 2) Avoid having to import one file to multiple tablets.
>>
>> This is huge. Again, if you're using Spark you must not use the
>> HashPartitioner to create RDDs or you'll wind up in a situation where
>> every tablet owns a piece of every RFile. Ideally you would use
>> something like the GroupedKeyPartitioner [1] to align the RDD
>> partitions with the tablet splits, but even the built-in
>> RangePartitioner will be much better than the HashPartitioner.
>>
>> -Russ
>>
>> On Thu, Jun 16, 2016 at 7:24 PM Josh Elser <josh.elser@gmail.com> wrote:
>>
>>> There are two big things that are required to really scale up bulk
>>> loading. Sadly (I guess) they are both things you would need to
>>> implement on your own:
>>>
>>> 1) Avoid lots of small files. Target as large of files as you can,
>>> relative to your ingest latency requirements and your max file size
>>> (set on your instance or table)
>>>
>>> 2) Avoid having to import one file to multiple tablets. Remember that
>>> the majority of the metadata update for Accumulo is updating the
>>> tablet row with the new file. When you have one file which spans many
>>> tablets, you now create N metadata updates instead of just one. When
>>> you create the files, take into account the split points of your
>>> table, and use that to try to target one file per tablet.
>>>
>>> Roshan Punnoose wrote:
>>> > We are trying to perform bulk ingest at scale and wanted to get some
>>> > quick thoughts on how to increase performance and stability. One of
>>> > the problems we have is that we sometimes import thousands of small
>>> > files, and I don't believe there is a good way around this in the
>>> > architecture as of yet. Already I have run into an RPC timeout issue
>>> > because the import process is taking longer than 5m, and another
>>> > issue where we have so many files after a bulk import that we have
>>> > had to bump the tserver.scan.files.open.max to 1K.
>>> >
>>> > Here are some other configs that we have been toying with:
>>> > - master.fate.threadpool.size: 20
>>> > - master.bulk.threadpool.size: 20
>>> > - master.bulk.timeout: 20m
>>> > - tserver.bulk.process.threads: 20
>>> > - tserver.bulk.assign.threads: 20
>>> > - tserver.bulk.timeout: 20m
>>> > - tserver.compaction.major.concurrent.max: 20
>>> > - tserver.scan.files.open.max: 1200
>>> > - tserver.server.threads.minimum: 64
>>> > - table.file.max: 64
>>> > - table.compaction.major.ratio: 20
>>> >
>>> > (HDFS)
>>> > - dfs.namenode.handler.count: 100
>>> > - dfs.datanode.handler.count: 50
>>> >
>>> > Just want to get any quick ideas for performing bulk ingest at scale.
>>> > Thanks guys
>>> >
>>> > p.s. This is on Accumulo 1.6.5
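
To make point 1 concrete, here is a minimal sketch of writing the bulk
files from Spark with coalesce(), assuming Spark 1.x and the Accumulo 1.6
client on the classpath. The names (RFileWriter, outputDir, numFiles) are
illustrative, and the input RDD is assumed to already be range-partitioned
and sorted by Key, since RFiles require sorted data.

import org.apache.accumulo.core.client.mapreduce.AccumuloFileOutputFormat
import org.apache.accumulo.core.data.{Key, Value}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.rdd.RDD

object RFileWriter {
  // Write an already-sorted RDD of Accumulo (Key, Value) pairs as RFiles,
  // one file per partition. coalesce() without a shuffle merges adjacent
  // partitions, so a range-partitioned, sorted RDD stays sorted and the
  // bulk import sees fewer, larger files.
  def writeRFiles(sortedKVs: RDD[(Key, Value)], outputDir: String, numFiles: Int): Unit = {
    val job = Job.getInstance(new Configuration())
    sortedKVs
      .coalesce(numFiles)
      .saveAsNewAPIHadoopFile(
        outputDir,
        classOf[Key],
        classOf[Value],
        classOf[AccumuloFileOutputFormat],
        job.getConfiguration)
  }
}

Each coalesced partition becomes one file under outputDir, so numFiles is
effectively the number of RFiles handed to the bulk import.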

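
For point 2, a rough sketch of aligning partitions with the table's current
split points, in the spirit of (but much simpler than) the
GroupedKeyPartitioner linked above. The class and method names are made up;
it assumes row IDs are UTF-8 strings and that Key/Value have a registered
serializer (e.g. Kryo), since the repartition shuffles them.

import scala.collection.JavaConverters._
import org.apache.accumulo.core.client.Connector
import org.apache.accumulo.core.data.{Key, Value}
import org.apache.spark.Partitioner
import org.apache.spark.rdd.RDD

// Partition i holds rows <= splitPoints(i); the last partition holds rows
// after the final split point, mirroring how tablets divide the row space.
// Split points are kept as Strings so the partitioner itself is
// serializable; binary row IDs would need a byte-wise comparison instead.
class TabletSplitPartitioner(splitPoints: Array[String]) extends Partitioner {
  override def numPartitions: Int = splitPoints.length + 1
  override def getPartition(key: Any): Int = {
    val row = key.asInstanceOf[Key].getRow.toString
    // Linear scan for clarity; a real implementation would binary-search.
    val idx = splitPoints.indexWhere(split => row <= split)
    if (idx < 0) splitPoints.length else idx
  }
}

object TabletAlignedRDD {
  // Accumulo's Key is Comparable, so a total order for the sort is easy to supply.
  implicit val keyOrdering: Ordering[Key] = new Ordering[Key] {
    override def compare(a: Key, b: Key): Int = a.compareTo(b)
  }

  // Repartition so each partition maps to exactly one tablet and is sorted,
  // ready to be written out as one RFile per tablet.
  def alignToTablets(conn: Connector, table: String,
                     kvs: RDD[(Key, Value)]): RDD[(Key, Value)] = {
    val splitPoints = conn.tableOperations().listSplits(table)
      .asScala.map(_.toString).toArray.sorted
    kvs.repartitionAndSortWithinPartitions(new TabletSplitPartitioner(splitPoints))
  }
}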
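
For completeness, the import call itself, which is the step that was
running into the RPC timeout in the original message. The instance name,
ZooKeeper hosts, credentials, table, and paths below are placeholders.

import org.apache.accumulo.core.client.ZooKeeperInstance
import org.apache.accumulo.core.client.security.tokens.PasswordToken

object BulkImport {
  def main(args: Array[String]): Unit = {
    // Placeholder connection details.
    val instance = new ZooKeeperInstance("myInstance", "zk1:2181,zk2:2181,zk3:2181")
    val conn = instance.getConnector("bulkUser", new PasswordToken("secret"))

    // The failure directory must already exist and be empty; any files the
    // tablet servers cannot load are moved there.
    conn.tableOperations().importDirectory(
      "myTable",        // destination table
      "/bulk/files",    // directory of RFiles produced by the Spark job
      "/bulk/failures", // failure directory
      false)            // setTime = false: keep the timestamps already in the files
  }
}

The master.bulk.timeout and tserver.bulk.timeout values in the config list
above are the server-side timeouts that were already raised to 20m to
accommodate this step.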
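
Finally, the table-level settings from that list can be applied through the
client API as well as the shell; a small sketch, with illustrative names:

import org.apache.accumulo.core.client.Connector

object TuneBulkIngest {
  // Per-table properties take effect at runtime. System-wide properties can
  // only be changed this way if they are ZooKeeper-mutable; otherwise they
  // belong in accumulo-site.xml and require a restart.
  def applySettings(conn: Connector, table: String): Unit = {
    conn.tableOperations().setProperty(table, "table.file.max", "64")
    conn.tableOperations().setProperty(table, "table.compaction.major.ratio", "20")
    // Only effective if this tserver property can be changed via ZooKeeper.
    conn.instanceOperations().setProperty("tserver.scan.files.open.max", "1200")
  }
}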