accumulo-user mailing list archives

From Cyrille Savelief <csavel...@gmail.com>
Subject Re: maximize usage of cluster resources during ingestion
Date Wed, 05 Jul 2017 15:05:25 GMT
Hi Massimilian,

Using a MultiTableBatchWriter we are able to ingest about 600K entries/s on
a single node (30GB of memory, 8 vCPU) running Hadoop, ZooKeeper, Accumulo
and our ingest process. For us, "valleys" came from huge GC pauses.
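
One rough way to check whether GC is behind the stalls is to sample the JVM's
collector counters and see how much of each time window went to collection.
The sketch below is only an illustration: the class name GcPauseProbe and its
methods are made up for this example, it uses only the standard
java.lang.management API, and it samples the JVM it runs in (for instance an
ingest client), not a remote tablet server. Note also that getCollectionTime()
is accumulated collection time, which for concurrent collectors is not
strictly stop-the-world pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not part of Accumulo): estimates the share of a
// wall-clock window that the current JVM spent in garbage collection.
public class GcPauseProbe {

    /** Sum of accumulated collection time (ms) over all collectors in this JVM. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Fraction of the window spent collecting, clamped to the range [0, 1]. */
    public static double gcFraction(long gcMillisBefore, long gcMillisAfter, long windowMillis) {
        if (windowMillis <= 0) {
            return 0.0;
        }
        double f = (double) (gcMillisAfter - gcMillisBefore) / windowMillis;
        return Math.max(0.0, Math.min(1.0, f));
    }

    public static void main(String[] args) throws InterruptedException {
        long windowMillis = 1000; // sampling window; an arbitrary choice
        long before = totalGcMillis();
        Thread.sleep(windowMillis);
        long after = totalGcMillis();
        System.out.printf("GC fraction over the last %d ms: %.3f%n",
                windowMillis, gcFraction(before, after, windowMillis));
    }
}
```

If windows with a high GC fraction line up with the zero-rate periods on the
Monitor graph, GC is a likely suspect; for the tablet servers themselves, the
JVM's GC logs would give the same information.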

Best,

Cyrille

On Wed, Jul 5, 2017 at 14:37, Massimilian Mattetti <MASSIMIL@il.ibm.com>
wrote:

> Hi all,
>
> I have an Accumulo 1.8.1 cluster made up of 12 bare-metal servers. Each
> server has 256GB of RAM and 2 x 10-core CPUs. 2 machines are used as
> masters (running HDFS NameNodes, Accumulo Master and Monitor). The other 10
> machines have 12 disks of 1 TB each (11 used by the HDFS DataNode process)
> and are running Accumulo TServer processes. All the machines are connected
> via a 10Gb network and 3 of them are running ZooKeeper. I have run some
> heavy ingestion tests on this cluster but I have never been able to reach
> more than 20% CPU usage on each Tablet Server. I am running an ingestion
> process (using batch writers) on each data node. The table is pre-split in
> order to have 4 tablets per tablet server. Monitoring the network, I have
> seen that data is received/sent from each node with a peak rate of about
> 120MB/s / 100MB/s while the aggregated disk write throughput on each tablet
> server is around 120MB/s.
>
> The table configuration settings I am experimenting with are:
> "table.file.replication": "2",
> "table.compaction.minor.logs.threshold": "10",
> "table.durability": "flush",
> "table.file.max": "30",
> "table.compaction.major.ratio": "9",
> "table.split.threshold": "1G"
>
> while the tablet server configuration is:
> "tserver.wal.blocksize": "2G",
> "tserver.walog.max.size": "8G",
> "tserver.memory.maps.max": "32G",
> "tserver.compaction.minor.concurrent.max": "50",
> "tserver.compaction.major.concurrent.max": "8",
> "tserver.total.mutation.queue.max": "50M",
> "tserver.wal.replication": "2",
> "tserver.compaction.major.thread.files.open.max": "15"
>
> The tablet server heap has been set to 32GB.
>
> From the Monitor UI: [ingest rate graph, image not preserved]
>
> As you can see, I have a lot of valleys in which the ingestion rate drops
> to 0.
> What would be a good procedure to identify the bottleneck that causes these
> periods of zero ingestion?
> Thanks.
>
> Best Regards,
> Max
>
>
