accumulo-user mailing list archives

From Jamie Johnson <jej2...@gmail.com>
Subject Re: Configuring batch writers
Date Tue, 19 Jul 2016 19:32:03 GMT
Is that hundreds of tablets for a particular table, or across all tables?

On Tue, Jul 19, 2016 at 1:43 PM, Josh Elser <josh.elser@gmail.com> wrote:

> It's very dependent on the requirements of your application and the
> amount of data your application is serving. A general recommendation,
> which should be universal, is to try to limit each server to hundreds
> of tablets. This, like everything else, is also a loose recommendation.
>
> Likely, this will require experimentation on your end. If you can
> share more details about the specifics of your data set and
> requirements, we might be able to give you some more direction.
>
> On Tue, Jul 19, 2016 at 12:35 PM, Jamie Johnson <jej2003@gmail.com> wrote:
> > Thank you, this was helpful. What about the number of splits for a table?
> > Is there a general rule of thumb for how many splits and what size they
> > should be when trying to balance ingest/query performance?
> >
> > On Fri, Jul 15, 2016 at 2:38 PM, Emilio Lahr-Vivaz <elahrvivaz@ccri.com>
> > wrote:
> >>
> >> Another thing to consider is how many tablet servers the mutations are
> >> being sent to - if they're all going to a single split, that's going to
> >> reduce your throughput a lot.
> >>
> >>
> >> On 07/15/2016 02:33 PM, dlmarion@comcast.net wrote:
> >>
> >> The batch writer has several knobs (latency time, memory buffer, etc.) that
> >> you can tune to meet your requirements. The values for those settings will
> >> depend on a lot of variables, including:
> >>
> >>   - number of tablet servers
> >>   - size of mutations
> >>   - desired latency
> >>   - memory buffer
> >>   - configuration settings on the table(s) and tablet servers.
> >>
> >> I suggest picking a starting point and seeing how it works for you, such as:
> >>
> >>   threads - equal to the number of tablet servers (unless you have a
> >> really large number of tablet servers)
> >>   buffer - 100MB
> >>   latency - 10 seconds
> >>
> >> If you are hitting a wall with those settings, you could increase the
> >> buffer and latency and/or change some settings on the server side that have
> >> to do with the write ahead logs.
> >>
> >> ________________________________
> >> From: "Jamie Johnson" <jej2003@gmail.com>
> >> To: user@accumulo.apache.org
> >> Sent: Friday, July 15, 2016 2:16:40 PM
> >> Subject: Configuring batch writers
> >>
> >> Is there any documentation that outlines reasonable settings for batch
> >> writers given a known ingest rate? For instance if I have a source that is
> >> producing in the neighborhood of 15MB of mutations per second, what would a
> >> reasonable configuration for the batch writer be to handle an ingest at this
> >> rate? What are reasonable rules of thumb to follow to ensure that the
> >> writers don't block, etc?
> >>
> >>
> >
>
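
For reference, the starting point suggested in the thread (threads equal to the number of tablet servers, a 100MB buffer, a 10 second latency) can be checked against the 15MB/s ingest rate from the original question with a little arithmetic. The sketch below is only a back-of-envelope model, not Accumulo code: the function name, the dictionary keys, and the max_threads cap of 16 are hypothetical, and the cap merely stands in for the "unless you have a really large number of tablet servers" caveat. In the Java client these three values correspond to BatchWriterConfig's setMaxWriteThreads, setMaxMemory, and setMaxLatency.

```python
def batch_writer_starting_point(num_tablet_servers, ingest_mb_per_sec,
                                buffer_mb=100, latency_sec=10, max_threads=16):
    """Return the thread's suggested starting values plus a rough fill time.

    max_threads is a hypothetical cap for clusters with a very large number
    of tablet servers; the thread does not give a specific number.
    """
    # Threads: one per tablet server, capped for very large clusters.
    threads = min(num_tablet_servers, max_threads)

    # Simplified model: how long a steady source takes to fill the buffer.
    seconds_to_fill_buffer = buffer_mb / ingest_mb_per_sec

    return {
        "threads": threads,
        "buffer_mb": buffer_mb,
        "latency_sec": latency_sec,
        "seconds_to_fill_buffer": seconds_to_fill_buffer,
        # True when the buffer would fill before the latency timer expires.
        "buffer_fills_before_latency": seconds_to_fill_buffer < latency_sec,
    }


if __name__ == "__main__":
    # Example: a hypothetical 10-server cluster at the 15MB/s rate
    # mentioned in the original question.
    for key, value in batch_writer_starting_point(10, 15).items():
        print(key, value)
```

Under this simplified model, a 15MB/s source fills a 100MB buffer in a little under 7 seconds, which is shorter than the 10 second latency, so buffer size rather than the latency timer would likely govern how often data is sent. Actual behavior depends on the other variables listed in the thread, so treat this only as a first estimate before experimenting.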
