accumulo-user mailing list archives

From: Marc <phroc...@apache.org>
Subject: Re: Accumulo performance on various hardware configurations
Date: Wed, 29 Aug 2018 19:18:13 GMT
Guy,
   In the case where you added servers and splits, did you check the
tablet locations to see if they migrated to separate hosts?
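
One quick way to check is to look at the current assignments recorded in the
accumulo.metadata table; the "loc" column family holds the tserver hosting each
tablet. A minimal sketch (instance name, zookeepers and credentials are
illustrative, and the user needs permission to read the metadata table):

    import java.util.Map.Entry;

    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.Scanner;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.client.security.tokens.PasswordToken;
    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.security.Authorizations;
    import org.apache.hadoop.io.Text;

    public class ShowTabletLocations {
      public static void main(String[] args) throws Exception {
        // Instance name, zookeepers and credentials are illustrative -- match your cluster.
        Connector conn = new ZooKeeperInstance("muchos", "localhost:2181")
            .getConnector("root", new PasswordToken("secret"));

        // Current tablet assignments live in the "loc" column family of accumulo.metadata.
        Scanner scanner = conn.createScanner("accumulo.metadata", Authorizations.EMPTY);
        scanner.fetchColumnFamily(new Text("loc"));
        for (Entry<Key,Value> e : scanner) {
          // row is <tableId>;<endRow> (or <tableId>< for the last tablet), value is the tserver host:port
          System.out.println(e.getKey().getRow() + " -> " + e.getValue());
        }
      }
    }

The shell shows the same thing with "scan -t accumulo.metadata -c loc", and
"tables -l" maps table names to the ids used in the metadata rows.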

On Wed, Aug 29, 2018 at 3:12 PM guy sharon <guy.sharon.1977@gmail.com> wrote:
>
> hi Mike,
>
> As per Mike Miller's suggestion I started using org.apache.accumulo.examples.simple.helloworld.ReadData
> from Accumulo with debugging turned off and a BatchScanner with 10 threads. I redid all the
> measurements, and although this was 20% faster than using the shell, there was no difference
> once I started playing with the hardware configurations.
>
> Guy.
>
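
(For anyone following along: a direct BatchScanner count is roughly the sketch
below, not the exact ReadData code. It avoids pushing every entry through the
display the way the shell pipeline does. Connection details are taken from the
command quoted further down and are otherwise illustrative.)

    import java.util.Collections;
    import java.util.Map.Entry;

    import org.apache.accumulo.core.client.BatchScanner;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.client.security.tokens.PasswordToken;
    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Range;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.security.Authorizations;

    public class CountEntries {
      public static void main(String[] args) throws Exception {
        // Connection details are illustrative -- match your cluster.
        Connector conn = new ZooKeeperInstance("muchos", "localhost:2181")
            .getConnector("root", new PasswordToken("secret"));

        // BatchScanner with 10 query threads over the full table range.
        BatchScanner bs = conn.createBatchScanner("hellotable", Authorizations.EMPTY, 10);
        bs.setRanges(Collections.singleton(new Range()));

        long count = 0;
        for (Entry<Key,Value> entry : bs) {
          count++; // just touch each entry; nothing is serialized to a display
        }
        bs.close();
        System.out.println("entries: " + count);
      }
    }
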
> On Wed, Aug 29, 2018 at 10:06 PM Michael Wall <mjwall@gmail.com> wrote:
>>
>> Guy,
>>
>> Can you go into specifics about how you are measuring this? Are you still using
>> "bin/accumulo shell -u root -p secret -e "scan -t hellotable -np" | wc -l" as you mentioned
>> earlier in the thread? As Mike Miller suggested, serializing that back to the display and
>> then counting 6M entries is going to take some time. Try using a BatchScanner directly.
>>
>> Mike
>>
>> On Wed, Aug 29, 2018 at 2:56 PM guy sharon <guy.sharon.1977@gmail.com> wrote:
>>>
>>> Yes, I tried the high performance configuration, which translates to a 4G heap size,
>>> but that didn't affect performance. Neither did setting table.scan.max.memory to 4096k
>>> (the default is 512k). Even if I accept that the read performance here is reasonable,
>>> I don't understand why none of the hardware configuration changes (except going to
>>> 48 cores, which made things worse) made any difference.
>>>
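
(Side note: that per-table override can be applied from the shell with
"config -t hellotable -s table.scan.max.memory=4096K" or programmatically.
A minimal sketch of the latter, with illustrative connection details:)

    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.client.security.tokens.PasswordToken;

    public class BumpScanBuffer {
      public static void main(String[] args) throws Exception {
        // Connection details are illustrative.
        Connector conn = new ZooKeeperInstance("muchos", "localhost:2181")
            .getConnector("root", new PasswordToken("secret"));

        // Per-table override of the scan-time buffer, same effect as the shell's
        // "config -t hellotable -s table.scan.max.memory=4096K".
        conn.tableOperations().setProperty("hellotable", "table.scan.max.memory", "4096K");
      }
    }
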
>>> On Wed, Aug 29, 2018 at 8:33 PM Mike Walch <mwalch@apache.org> wrote:
>>>>
>>>> Muchos does not automatically change its Accumulo configuration to take advantage
>>>> of better hardware. However, it does have a performance profile setting in its
>>>> configuration (see link below) where you can select a profile (or create your own)
>>>> based on the hardware you are using.
>>>>
>>>> https://github.com/apache/fluo-muchos/blob/master/conf/muchos.props.example#L94
>>>>
>>>> On Wed, Aug 29, 2018 at 11:35 AM Josh Elser <elserj@apache.org> wrote:
>>>>>
>>>>> Does Muchos actually change the Accumulo configuration when you are
>>>>> changing the underlying hardware?
>>>>>
>>>>> On 8/29/18 8:04 AM, guy sharon wrote:
>>>>> > hi,
>>>>> >
>>>>> > Continuing my performance benchmarks, I'm still trying to figure out if
>>>>> > the results I'm getting are reasonable and why throwing more hardware at
>>>>> > the problem doesn't help. What I'm doing is a full table scan on a table
>>>>> > with 6M entries. This is Accumulo 1.7.4 with Zookeeper 3.4.12 and Hadoop
>>>>> > 2.8.4. The table is populated by
>>>>> > org.apache.accumulo.examples.simple.helloworld.InsertWithBatchWriter
>>>>> > modified to write 6M entries instead of 50k. Reads are performed by
>>>>> > "bin/accumulo org.apache.accumulo.examples.simple.helloworld.ReadData -i
>>>>> > muchos -z localhost:2181 -u root -t hellotable -p secret". Here are the
>>>>> > results I got:
>>>>> >
>>>>> > 1. 5 tserver cluster as configured by Muchos
>>>>> > (https://github.com/apache/fluo-muchos), running on m5d.large AWS
>>>>> > machines (2vCPU, 8GB RAM) running CentOS 7. Master is on a separate
>>>>> > server. Scan took 12 seconds.
>>>>> > 2. As above except with m5d.xlarge (4vCPU, 16GB RAM). Same results.
>>>>> > 3. Splitting the table to 4 tablets causes the runtime to increase to 16
>>>>> > seconds.
>>>>> > 4. 7 tserver cluster running m5d.xlarge servers. 12 seconds.
>>>>> > 5. Single node cluster on m5d.12xlarge (48 cores, 192GB RAM), running
>>>>> > Amazon Linux. Configuration as provided by Uno
>>>>> > (https://github.com/apache/fluo-uno). Total time was 26 seconds.
>>>>> >
>>>>> > Offhand I would say this is very slow. I'm guessing I'm making some sort
>>>>> > of newbie (possibly configuration) mistake but I can't figure out what
>>>>> > it is. Can anyone point me to something that might help me find out what
>>>>> > it is?
>>>>> >
>>>>> > thanks,
>>>>> > Guy.
>>>>> >
>>>>> >
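
For completeness, a loader of the kind described above ("InsertWithBatchWriter
modified to write 6M entries") boils down to a BatchWriter loop along these
lines; the row/column layout and connection details are illustrative rather
than the example program's exact ones:

    import java.nio.charset.StandardCharsets;

    import org.apache.accumulo.core.client.BatchWriter;
    import org.apache.accumulo.core.client.BatchWriterConfig;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.client.security.tokens.PasswordToken;
    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;
    import org.apache.hadoop.io.Text;

    public class LoadSixMillion {
      public static void main(String[] args) throws Exception {
        // Connection details are illustrative.
        Connector conn = new ZooKeeperInstance("muchos", "localhost:2181")
            .getConnector("root", new PasswordToken("secret"));

        BatchWriter bw = conn.createBatchWriter("hellotable", new BatchWriterConfig());
        for (int i = 0; i < 6_000_000; i++) {
          // Row/column layout is made up for illustration, not the exact one the example uses.
          Mutation m = new Mutation(new Text(String.format("row_%08d", i)));
          m.put(new Text("cf"), new Text("cq"), new Value(("value_" + i).getBytes(StandardCharsets.UTF_8)));
          bw.addMutation(m);
        }
        bw.close();
      }
    }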
