accumulo-user mailing list archives

From James Srinivasan <james.sriniva...@gmail.com>
Subject Re: Accumulo performance on various hardware configurations
Date Wed, 29 Aug 2018 14:19:57 GMT
In my limited experience of cloud services, I/O bandwidth seems to be
pretty low. Can you run a benchmark, e.g. bonnie++?
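
If bonnie++ is not installed, a rough sequential-write check along the
following lines can give a first impression of disk bandwidth. This is only a
minimal sketch and not a substitute for bonnie++: the function name
sequential_write_mb_per_sec and the default sizes are made up for
illustration, and the directory to test (for example wherever HDFS stores its
blocks) is an assumption you would need to fill in yourself.

```python
import os
import tempfile
import time


def sequential_write_mb_per_sec(directory, total_mb=64, block_kb=1024):
    """Write total_mb of data to a temp file in `directory` and return MB/s.

    Hypothetical helper for illustration only; it measures one sequential
    write followed by fsync, nothing more.
    """
    # Random data so that compressing or sparse-aware filesystems
    # cannot shortcut the write.
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # force the data to the device before timing stops
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return total_mb / elapsed


if __name__ == "__main__":
    # Assumption: the system temp directory stands in for the disk under test.
    rate = sequential_write_mb_per_sec(tempfile.gettempdir())
    print(f"sequential write: {rate:.1f} MB/s")
```

A small total_mb mostly measures caches and fsync latency, so a size well
above the machine's RAM gives a more honest number.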

On Wed, 29 Aug 2018, 14:39 guy sharon, <guy.sharon.1977@gmail.com> wrote:

> Well, in one experiment I used a machine with 48 cores and 192GB and the
> results actually came out worse. And in another I had 7 tservers on servers
> with 4 cores. I think I'm not configuring things correctly because I'd
> expect the improved hardware to improve performance and that doesn't seem
> to be the case.
>
> On Wed, Aug 29, 2018 at 4:00 PM Jeremy Kepner <kepner@ll.mit.edu> wrote:
>
>> Your node is fairly underpowered (2 cores and 8 GB RAM) and is less
>> powerful than most laptops. That said,
>>
>> 6M / 12sec = 500K/sec
>>
>> is good for a single node Accumulo instance on this hardware.
>>
>> Splitting might not help since you only have 2 cores, so the added
>> parallelism can't be exploited.
>>
>> Why do you think 500K/sec is slow?
>>
>> To determine slowness one would have to compare with other database
>> technology on the same platform.
>>
>>
>> On Wed, Aug 29, 2018 at 03:04:51PM +0300, guy sharon wrote:
>> > hi,
>> >
>> > Continuing my performance benchmarks, I'm still trying to figure out if
>> > the results I'm getting are reasonable and why throwing more hardware at
>> > the problem doesn't help. What I'm doing is a full table scan on a table
>> > with 6M entries. This is Accumulo 1.7.4 with Zookeeper 3.4.12 and Hadoop
>> > 2.8.4.
>> > The table is populated by
>> > org.apache.accumulo.examples.simple.helloworld.InsertWithBatchWriter
>> > modified to write 6M entries instead of 50k. Reads are performed by
>> > "bin/accumulo org.apache.accumulo.examples.simple.helloworld.ReadData -i
>> > muchos -z localhost:2181 -u root -t hellotable -p secret". Here are the
>> > results I got:
>> >
>> > 1. 5 tserver cluster as configured by Muchos (
>> > https://github.com/apache/fluo-muchos), running on m5d.large AWS machines
>> > (2vCPU, 8GB RAM) running CentOS 7. Master is on a separate server. Scan
>> > took 12 seconds.
>> > 2. As above except with m5d.xlarge (4vCPU, 16GB RAM). Same results.
>> > 3. Splitting the table into 4 tablets causes the runtime to increase to
>> > 16 seconds.
>> > 4. 7 tserver cluster running m5d.xlarge servers. 12 seconds.
>> > 5. Single node cluster on m5d.12xlarge (48 cores, 192GB RAM), running
>> > Amazon Linux. Configuration as provided by Uno (
>> > https://github.com/apache/fluo-uno). Total time was 26 seconds.
>> >
>> > Offhand I would say this is very slow. I'm guessing I'm making some sort
>> > of newbie (possibly configuration) mistake but I can't figure out what it
>> > is. Can anyone point me to something that might help me find out what it
>> > is?
>> >
>> > thanks,
>> > Guy.
>>
>
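
For reference, the scan times quoted above can be turned into entries per
second in the same way as the 6M / 12sec = 500K/sec figure. The short sketch
below only redoes that arithmetic on the numbers reported in the thread. The
labels are shorthand, and the post does not say which cluster the 4-tablet
run used, so that entry is labelled by the split alone.

```python
# Scan times in seconds as reported in the thread; every run scanned 6M entries.
ENTRIES = 6_000_000

scan_seconds = {
    "5 tservers, m5d.large": 12,
    "5 tservers, m5d.xlarge": 12,           # reported as "same results"
    "table split into 4 tablets": 16,       # cluster not stated in the post
    "7 tservers, m5d.xlarge": 12,
    "single node, m5d.12xlarge (Uno)": 26,
}


def entries_per_second(entries, seconds):
    """Throughput of a full table scan in entries per second."""
    return entries / seconds


rates = {
    name: entries_per_second(ENTRIES, seconds)
    for name, seconds in scan_seconds.items()
}

for name, rate in rates.items():
    print(f"{name:35s} {rate:>10,.0f} entries/sec")
```

Seen this way, every multi-node run lands at roughly the same rate regardless
of instance size or tserver count, which is the pattern the original question
is asking about.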
