accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Configuring batch writers
Date Wed, 20 Jul 2016 02:49:29 GMT
Hundreds of tablets per server.

The total number of tablets in your system (across all tables) divided by
the number of tablet servers should likely be in the hundreds.
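
As a rough illustration of that division, here is a minimal Java sketch. The class name, the method names, and the reading of "hundreds" as 100 to 999 are assumptions made for this example, not anything defined by Accumulo; the sample counts are made up.

```java
// Hypothetical helper, not part of Accumulo: estimates tablets per tablet server.
final class TabletDensity {

    // Average number of tablets hosted per tablet server, counting every table.
    static double tabletsPerServer(long totalTablets, int tabletServers) {
        if (tabletServers <= 0) {
            throw new IllegalArgumentException("tabletServers must be positive");
        }
        return (double) totalTablets / tabletServers;
    }

    // Assumption: "in the hundreds" is read here as 100 (inclusive) to 1000 (exclusive).
    static boolean inHundreds(double perServer) {
        return perServer >= 100.0 && perServer < 1000.0;
    }

    public static void main(String[] args) {
        // Made-up cluster: 4500 tablets in total, spread over 15 tablet servers.
        double perServer = tabletsPerServer(4500, 15);
        System.out.println(perServer + " tablets per server, in the hundreds: "
                + inHundreds(perServer));
    }
}
```

Since the recommendation is a loose one, the hard cut-offs in `inHundreds` are only a convenience for the sketch.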

Jamie Johnson wrote:
> hundreds of tablets for a particular table or all tables?
>
> On Tue, Jul 19, 2016 at 1:43 PM, Josh Elser <josh.elser@gmail.com> wrote:
>
>     It's very dependent on the requirements of your application and the
>     amount of data your application is serving. A general recommendation,
>     which should apply universally, is to try to limit each server to
>     hundreds of tablets. This, like everything else, is also a loose
>     recommendation.
>
>     Likely, this will require experimentation on your end. If you can
>     share more details about the specifics of your data set and
>     requirements, we might be able to give you some more direction.
>
>     On Tue, Jul 19, 2016 at 12:35 PM, Jamie Johnson <jej2003@gmail.com> wrote:
>      > Thank you, this was helpful. What about the number of splits for a
>      > table? Is there a general rule of thumb for how many splits there
>      > should be, and what size they should be, when trying to balance
>      > ingest/query performance?
>      >
>      > On Fri, Jul 15, 2016 at 2:38 PM, Emilio Lahr-Vivaz
>      > <elahrvivaz@ccri.com> wrote:
>      >>
>      >> Another thing to consider is how many tablet servers the mutations
>      >> are being sent to - if they're all going to a single split, that's
>      >> going to reduce your throughput a lot.
>      >>
>      >>
>      >> On 07/15/2016 02:33 PM, dlmarion@comcast.net wrote:
>      >>
>      >> The batch writer has several knobs (latency time, memory buffer,
>      >> etc.) that you can tune to meet your requirements. The values for
>      >> those settings will depend on a lot of variables, including:
>      >>
>      >>   - number of tablet servers
>      >>   - size of mutations
>      >>   - desired latency
>      >>   - memory buffer
>      >>   - configuration settings on the table(s) and tablet servers.
>      >>
>      >>  I suggest picking a starting point and seeing how it works for
>      >> you, such as:
>      >>
>      >>   threads - equal to the number of tablet servers (unless you have a
>      >> really large number of tablet servers)
>      >>   buffer - 100MB
>      >>   latency - 10 seconds
>      >>
>      >>  If you are hitting a wall with those settings, you could increase
>      >> the buffer and latency and/or change some settings on the server
>      >> side that have to do with the write ahead logs.
>      >>
>      >> ________________________________
>      >> From: "Jamie Johnson" <jej2003@gmail.com>
>      >> To: user@accumulo.apache.org
>      >> Sent: Friday, July 15, 2016 2:16:40 PM
>      >> Subject: Configuring batch writers
>      >>
>      >> Is there any documentation that outlines reasonable settings for
>      >> batch writers given a known ingest rate? For instance, if I have a
>      >> source that is producing in the neighborhood of 15MB of mutations
>      >> per second, what would a reasonable configuration for the batch
>      >> writer be to handle ingest at this rate? What are reasonable rules
>      >> of thumb to follow to ensure that the writers don't block, etc.?
>      >>
>      >>
>      >
>
>
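
To make the starting point quoted above (threads equal to the number of tablet servers, a 100MB buffer, 10 seconds of latency) a little more concrete, the following back-of-the-envelope Java sketch relates it to the 15MB per second ingest rate from the original question. It is plain arithmetic and does not call Accumulo; the class name, the method names, and the thread cap of 32 are hypothetical, and the real writer's flush behavior may differ from this simple model. In the Accumulo client API the corresponding knobs are, as far as I know, set through `BatchWriterConfig` (`setMaxMemory`, `setMaxLatency`, `setMaxWriteThreads`).

```java
// Hypothetical sizing helper, not part of Accumulo: plain arithmetic only.
final class BatchWriterSizing {

    static final long MB = 1024L * 1024L;

    // Seconds needed to fill a client-side buffer at a steady ingest rate.
    static double secondsToFill(long bufferBytes, long ingestBytesPerSecond) {
        if (ingestBytesPerSecond <= 0) {
            throw new IllegalArgumentException("ingest rate must be positive");
        }
        return (double) bufferBytes / ingestBytesPerSecond;
    }

    // True when the buffer would fill before the latency limit elapses, meaning
    // memory pressure rather than the latency timer would drive flushing
    // (under this simplified model).
    static boolean bufferFillsBeforeLatency(long bufferBytes,
                                            long ingestBytesPerSecond,
                                            long maxLatencySeconds) {
        return secondsToFill(bufferBytes, ingestBytesPerSecond) < maxLatencySeconds;
    }

    // One write thread per tablet server, capped for very large clusters.
    // The cap is an assumption; the thread only says "really large".
    static int writeThreads(int tabletServers, int cap) {
        return Math.max(1, Math.min(tabletServers, cap));
    }

    public static void main(String[] args) {
        long buffer = 100 * MB;   // suggested starting buffer
        long rate = 15 * MB;      // ingest rate from the original question
        System.out.println("Seconds to fill buffer: " + secondsToFill(buffer, rate));
        System.out.println("Fills before 10s latency: "
                + bufferFillsBeforeLatency(buffer, rate, 10));
        System.out.println("Write threads for 8 tablet servers: " + writeThreads(8, 32));
    }
}
```

At roughly 15MB per second, a 100MB buffer fills in under 7 seconds, so with these numbers the buffer size rather than the 10 second latency would likely be the limiting setting. That is one reason to experiment with a larger buffer if writers appear to block.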
