kudu-user mailing list archives

From Boris Tyukin <bo...@boristyukin.com>
Subject Re: Re: Recommended maximum amount of stored data per tablet server
Date Sat, 04 Aug 2018 19:57:04 GMT
How much space is typically allocated just for the WAL and metadata? We have
2 400GB SSDs in raid5 for the OS and 12 12TB HDDs. Is it still a good idea to
carve out maybe 100GB on the SSDs, or should we use a dedicated HDD?
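
[Editor's note] The dedicated-disk suggestion quoted below maps onto the
tablet server's directory flags. The following is a minimal sketch, assuming
the flags --fs_wal_dir, --fs_data_dirs and --fs_metadata_dir (the last one
only in Kudu releases that support it) and purely hypothetical mount points.
It only assembles the flag strings; it does not say how much space the WAL
and metadata need.

```python
def tserver_fs_flags(wal_dir, data_dirs, metadata_dir=None):
    """Build Kudu tablet server directory flags as a list of strings.

    All paths are supplied by the caller; nothing here is validated
    against a real Kudu installation.
    """
    if not data_dirs:
        raise ValueError("at least one data directory is required")
    flags = [
        f"--fs_wal_dir={wal_dir}",
        "--fs_data_dirs=" + ",".join(data_dirs),
    ]
    # Metadata is only given its own flag when a directory is passed in.
    if metadata_dir is not None:
        flags.append(f"--fs_metadata_dir={metadata_dir}")
    return flags


if __name__ == "__main__":
    # Hypothetical layout: WAL and metadata on the SSD volume, data on 12 HDDs.
    hdds = [f"/data/{i}/kudu/data" for i in range(1, 13)]
    for flag in tserver_fs_flags("/ssd/kudu/wal", hdds, "/ssd/kudu/meta"):
        print(flag)
```

Keeping the WAL path separate from the data directory list is what lets the
WAL sit on its own spindle or SSD while the data directories are shared.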

On Thu, Aug 2, 2018, 20:36 Todd Lipcon <todd@cloudera.com> wrote:

> On Thu, Aug 2, 2018 at 4:54 PM, Quanlong Huang <huang_quanlong@126.com>
> wrote:
>> Thanks, Adar and Todd! We'd like to contribute when we can.
>> Are there any concerns if we share the machines with HDFS DataNodes and
>> Yarn NodeManagers? The network bandwidth is 10Gbps. I think it's ok if they
>> don't share the same disks, e.g. 4 disks for kudu and the other 11 disks
>> for DataNode and NodeManager, and leave enough CPU & mem for kudu. Is that
>> right?
> That should be fine. Typically we actually recommend sharing all the disks
> for all of the services. There is a trade-off between static partitioning
> (exclusive access to a smaller number of disks) vs dynamic sharing
> (potential contention but more available resources). Unless your workload
> is very latency sensitive I usually think it's better to have the bigger
> pool of resources available even if it needs to share with other systems.
> One recommendation, though, is to consider using a dedicated disk for the
> Kudu WAL and metadata, which can help performance, since the WAL can be
> sensitive to other heavy workloads monopolizing bandwidth on the same
> spindle.
> -Todd
>> At 2018-08-03 02:26:37, "Todd Lipcon" <todd@cloudera.com> wrote:
>> +1 to what Adar said.
>> One tension we have currently for scaling is that we don't want to scale
>> individual tablets too large, because of problems like the superblock that
>> Adar mentioned. However, the solution of just having more tablets is also
>> not a great one, since many of our startup time problems are primarily
>> affected by the number of tablets more than their size (see KUDU-38 as the
>> prime, ancient, example). Additionally, having lots of tablets increases
>> Raft heartbeat traffic, and you may need to dial back the heartbeat
>> intervals to keep things stable.
>> All of these things can be addressed in time and with some work. If you
>> are interested in working on these areas to improve density that would be a
>> great contribution.
>> -Todd
>> On Thu, Aug 2, 2018 at 11:17 AM, Adar Lieber-Dembo <adar@cloudera.com>
>> wrote:
>>> The 8TB limit isn't a hard one; it's just a reflection of the scale
>>> that Kudu developers commonly test. Beyond 8TB we can't vouch for
>>> Kudu's stability and performance. For example, we know that as the
>>> amount of on-disk data grows, node restart times get longer and longer
>>> (see KUDU-2014 for some ideas on how to improve that). Furthermore, as
>>> tablets accrue more data blocks, their superblocks become larger,
>>> raising the minimum amount of I/O for any operation that rewrites a
>>> superblock (such as a flush or compaction). Lastly, the tablet copy
>>> protocol used in rereplication tries to copy the entire superblock in
>>> one RPC message; if the superblock is too large, it'll run up against
>>> the default 50 MB RPC transfer size (see src/kudu/rpc/transfer.cc).
>>> These examples are just off the top of my head; there may be others
>>> lurking. So this goes back to what I led with: beyond the recommended
>>> limit we aren't quite sure how Kudu's performance and stability are
>>> affected.
>>> All that said, you're welcome to try it out and report back with your
>>> findings.
>>> On Thu, Aug 2, 2018 at 7:23 AM Quanlong Huang <huang_quanlong@126.com>
>>> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > In the "Known Issues and Limitations" document, it's recommended
>>> > that the "maximum amount of stored data, post-replication and
>>> > post-compression, per tablet server is 8TB". How is the 8TB calculated?
>>> >
>>> > We have some machines, each with 15 * 4TB spinning disk drives,
>>> > 256GB RAM, and 48 CPU cores. Does that mean the other 52 TB
>>> > (= 15 * 4 - 8) should be left for other systems? We would prefer to
>>> > dedicate the machines to Kudu. Can a tablet server use the whole space
>>> > efficiently?
>>> >
>>> > Thanks,
>>> > Quanlong
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
> --
> Todd Lipcon
> Software Engineer, Cloudera
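
[Editor's note] The capacity arithmetic in the original question can be
written out as a short sketch. The function name and default are illustrative
only; the 8 TB figure is the recommended (not hard) per-tablet-server limit
discussed in the thread, and the result ignores filesystem overhead and space
used by anything other than Kudu.

```python
def leftover_tb(num_disks, disk_tb, recommended_kudu_tb=8):
    """Raw disk capacity minus the recommended Kudu data volume, in TB."""
    raw_tb = num_disks * disk_tb
    return raw_tb - recommended_kudu_tb


# The machines from the original question: 15 disks of 4 TB each.
print(leftover_tb(15, 4))  # 52
```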
