accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: How to control Minor Compaction by programming
Date Fri, 31 Jul 2015 15:11:40 GMT

Hai Pham wrote:
> Hi Keith,
>
>
> I have 4 tablet servers + 1 master. I also did a pre-split before
> ingesting and it increased the speed a lot.
>
>
> And you're right: when I created too many ingest threads, many of them
> were queued in the thread pools and the hold time increased. During
> some intense ingests, there was a case where a tablet was killed by the
> master because the hold time exceeded 5 min. In this situation, all
> tablets were stuck. Only after that one was dead did ingest resume at a
> comparable speed. But the entries in the dead tablet were all gone and
> lost to the table.

You're saying that you lost data? If a server dies, all of the tablets 
that were hosted there are reassigned to other servers. This is done in 
a manner that guarantees that there is no data lost in this transition. 
If you actually lost data, this would be a critical bug, but I would 
certainly hope you just didn't realize that the data was automatically 
being hosted by another server.

> I have no idea how to fix this other than limiting the number of
> ingest threads and the ingest speed to make it friendlier to
> Accumulo itself.
>
>
> Another mystery to me is that I pre-split into, e.g., 8 tablets,
> but as ingest proceeds, the number of tablets increases (e.g.
> 10, 14 or more). Any idea?

Yep, Accumulo will naturally split tablets when they exceed a certain 
size (1GB by default for normal tables). Unless you increase the 
property table.split.threshold, as you ingest more data, you will 
observe more tablets.
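
To make that concrete, here is a rough toy model of why a table pre-split 
into 8 tablets can end up with 10 or 14. It is only a sketch: Accumulo 
picks a split row from the data rather than halving sizes exactly, and 
the function name and the example sizes are made up for illustration.

```python
def tablets_after_splits(tablet_sizes_mb, threshold_mb=1024):
    """Toy model: any tablet larger than the threshold splits in two
    (roughly in half) until every tablet fits under the threshold.
    threshold_mb stands in for table.split.threshold (1GB by default)."""
    if threshold_mb < 1:
        raise ValueError("threshold_mb must be at least 1")
    count = 0
    pending = list(tablet_sizes_mb)
    while pending:
        size = pending.pop()
        if size > threshold_mb:
            half = size // 2
            # Hypothetical even split; real split points depend on the rows.
            pending.extend([half, size - half])
        else:
            count += 1
    return count


# 8 pre-split tablets, two of which received more data than the others.
sizes = [500] * 6 + [1500] * 2
print(tablets_after_splits(sizes))  # 10
```

Raising the threshold in this model (for example threshold_mb=2048) keeps 
the count at 8, which mirrors what increasing table.split.threshold does.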

Given enough time, Accumulo will naturally split your table enough. 
Pre-splitting quickly gets you to a good level of performance right away.
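
If it helps, this is one way you might compute evenly spaced split 
points for a pre-split. It assumes your row IDs start with a uniformly 
distributed, fixed-width lowercase hex prefix (e.g. a hash), which may 
not match your schema; the helper name is hypothetical.

```python
def even_hex_splits(num_tablets, width=2):
    """Return num_tablets - 1 split points that divide a keyspace of
    fixed-width hex row prefixes into roughly equal ranges."""
    keyspace = 16 ** width
    if not 1 <= num_tablets <= keyspace:
        raise ValueError("num_tablets must be between 1 and 16**width")
    return [
        format(i * keyspace // num_tablets, f"0{width}x")
        for i in range(1, num_tablets)
    ]


# Seven split points yield eight tablets.
print(even_hex_splits(8))  # ['20', '40', '60', '80', 'a0', 'c0', 'e0']
```

The resulting strings would then be given to the shell's addsplits 
command or to TableOperations.addSplits in the Java client.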

>
> Hai
> ------------------------------------------------------------------------
> *From:* Keith Turner <keith@deenlo.com>
> *Sent:* Friday, July 31, 2015 8:39 AM
> *To:* user@accumulo.apache.org
> *Subject:* Re: How to control Minor Compaction by programming
> How many tablets do you have? Entire tablets are minor compacted at
> once. If you have 1 tablet per tablet server, then minor compactions
> will have a lot of work to do at once. While this work is being done,
> the tablet server's memory may fill up, leading to writes being held.
>
> If you have 10 tablets per tablet server, then tablets can be compacted
> in parallel w/ less work to do at any given point in time. This can
> avoid memory filling up and writes being held.
>
> In short, it's possible that adding good split points to the table (and
> therefore creating more tablets) may help w/ this issue.
>
> Also, are you seeing hold times?
>
> On Thu, Jul 30, 2015 at 11:24 PM, Hai Pham <htp0005@tigermail.auburn.edu
> <mailto:htp0005@tigermail.auburn.edu>> wrote:
>
>     Hey William, Josh and David,
>
>     Thanks for explaining, I might not have been clear: I used the web
>     interface with port 50095 to monitor the real-time charts (ingest,
>     scan, load average, minor compaction, major compaction, ...).
>
>     Nonetheless, as I witnessed, when I ingested about 100k entries ->
>     then minor compaction happened -> ingest was stuck -> the level of
>     minor compaction on the charts was just about 1.0, 2.0 and max 3.0
>     while about >20k entries were forced out of memory (I knew this by
>     looking at the number of entries in memory w.r.t the table being
>     ingested to) -> then when minor compaction ended, ingest resumed,
>     somewhat faster.
>
>     Thus I presume the levels 1.0, 2.0, 3.0 are not representative of
>     the number of files being minor-compacted from memory?
>
>     Hai
>     ________________________________________
>     From: Josh Elser <josh.elser@gmail.com <mailto:josh.elser@gmail.com>>
>     Sent: Thursday, July 30, 2015 7:12 PM
>     To: user@accumulo.apache.org <mailto:user@accumulo.apache.org>
>     Subject: Re: How to control Minor Compaction by programming
>
>     >
>      > Also, can you please explain the number 0, 1.0, 2.0, ... in
>      > charts (web monitoring) denoting the level of Minor Compaction
>      > and Major Compaction?
>
>     On the monitor, the number of compactions is shown in the form:
>
>     active (queued)
>
>     e.g. 4 (2), would mean that 4 are running and 2 are queued.
>
>      >
>      >
>      > Thank you!
>      >
>      > Hai Pham
>      >
>      >
>      >
>      >
>
>
