cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: Upgredesstables doing 4K reads
Date Mon, 12 Sep 2016 15:53:53 GMT
Glad to hear,

Thanks for dropping this here for the record ;-).

C*heers,
-----------------------
Alain Rodriguez - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-09-12 17:15 GMT+02:00 Jacek Luczak <difrost.kernel@gmail.com>:

> Hi Alain,
>
> that was actually a HW issue. The nodes that were behaving badly had a
> buggy BIOS that was mishandling power management. As a result, P-states
> were handled incorrectly and the CPUs never reached full speed. It took
> me a while to find this out, but now all is fine and we are running at
> full speed.
>
> Cheers,
> -Jacek
>
> 2016-06-23 9:48 GMT+02:00 Alain RODRIGUEZ <arodrime@gmail.com>:
> > Hi,
> >
> > Sorry no one has gotten back to you yet. Do you still have the issue?
> >
> > It's not yet clear to me what causes this. A few ideas though:
> >
> >> We are quite pedantic about OS settings. All nodes got same settings
> >> and C* configuration.
> >
> >
> > Considering this hypothesis, I hope that's 100% true.
> >
> > 2 nodes out of 6 behaving badly makes me think of an unbalanced
> > cluster. Do you use RF=2 there? Do you have wide rows or unbalanced
> > data (partition keys not well distributed)?
> >
> > Could you check and paste the output from nodetool cfstats and nodetool
> > cfhistograms for the most affected tables?
> >
> > Could those nodes have hardware issues of some kind?
> >
> > C*heers,
> > -----------------------
> > Alain Rodriguez - alain@thelastpickle.com
> > France
> >
> > The Last Pickle - Apache Cassandra Consulting
> > http://www.thelastpickle.com
> >
> >
> >
> > 2016-06-02 13:43 GMT+02:00 Jacek Luczak <difrost.kernel@gmail.com>:
> >>
> >> Hi,
> >>
> >> I've got a 6 node C* cluster (all nodes are equal both in OS and HW
> >> setup, they are DL380 Gen9 with Smart Array RAID 50,3 on SAS 15K HDDs)
> >> which has recently been upgraded from 2.2.5 to 3.5. As part of the
> >> upgrade I ran upgradesstables.
> >>
> >> On 4 nodes the average request size issued to the block dev was never
> >> higher than 8 (that maps to 4K reads) while on the remaining 2 nodes it
> >> was basically always maxed out at 512 (256K reads).
> >>
> >> The nodes doing 4K reads were pumping up to 2K read IOPS while the
> >> other 2 nodes never went above 30 IOPS.
> >>
> >> We are quite pedantic about OS settings. All nodes got same settings
> >> and C* configuration. On all nodes the block dev has the noop scheduler
> >> set and read-ahead aligned with the strip size.
> >>
> >> During heavy read workloads we've also noticed that those 4 nodes can
> >> swing up to 10K IOPS to get data from storage, while the other 2 stay
> >> far below that.
> >>
> >> What could cause such a difference?
> >>
> >> -Jacek
> >
> >
>
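
For the record, the request sizes quoted in the original message can be turned into bytes and rough throughput with a few lines of Python. This is only a sketch: it assumes the figures (8 and 512) are in 512-byte sectors, as iostat's avgrq-sz column reports them, and the function names are made up for illustration.

```python
# Assumption: request sizes are given in 512-byte sectors (iostat avgrq-sz).
SECTOR_BYTES = 512


def request_size_kib(avg_sectors: float) -> float:
    """Convert an average request size in sectors to KiB."""
    return avg_sectors * SECTOR_BYTES / 1024


def throughput_mib_s(iops: float, avg_sectors: float) -> float:
    """Approximate read throughput in MiB/s as IOPS times request size."""
    return iops * request_size_kib(avg_sectors) / 1024


if __name__ == "__main__":
    # IOPS and request sizes as reported in the thread.
    for label, iops, sectors in [("4K-read nodes", 2000, 8),
                                 ("256K-read nodes", 30, 512)]:
        print(f"{label}: {request_size_kib(sectors):.0f} KiB/request, "
              f"~{throughput_mib_s(iops, sectors):.2f} MiB/s")
```

With the numbers from the thread, both groups come out at roughly 7.5 to 7.8 MiB/s, so the two slow nodes were issuing far fewer but much larger requests rather than moving a very different volume of data.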

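Since the root cause turned out to be CPUs not reaching full speed, a quick check along the following lines may help anyone hitting something similar. It is a hedged sketch, not what was actually used in this thread: it assumes a Linux host that exposes the cpufreq sysfs files (scaling_cur_freq, cpuinfo_max_freq), the slow_cpus helper and its 0.9 threshold are hypothetical, and a buggy BIOS may well hide the problem from these counters. Run it while the node is under load, since idle cores legitimately clock down.

```python
from pathlib import Path


def read_khz(path: Path):
    """Return the integer kHz value stored in a sysfs file, or None."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def slow_cpus(sysfs_root="/sys/devices/system/cpu", threshold=0.9):
    """List (cpu, current_khz, max_khz) for CPUs running below
    threshold * max frequency. The threshold is an arbitrary choice."""
    slow = []
    for cpufreq in sorted(Path(sysfs_root).glob("cpu[0-9]*/cpufreq")):
        cur = read_khz(cpufreq / "scaling_cur_freq")
        top = read_khz(cpufreq / "cpuinfo_max_freq")
        if cur is None or not top:
            continue  # file missing or unreadable on this host
        if cur < threshold * top:
            slow.append((cpufreq.parent.name, cur, top))
    return slow


if __name__ == "__main__":
    for name, cur, top in slow_cpus():
        print(f"{name}: {cur / 1000:.0f} MHz of {top / 1000:.0f} MHz max")
```

Comparing the output between a healthy node and a suspect one under the same workload is more telling than any single reading.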