cassandra-user mailing list archives

From Roni Balthazar <ronibaltha...@gmail.com>
Subject Re: Many pending compactions
Date Wed, 18 Feb 2015 13:49:11 GMT
Are you running repairs within gc_grace_seconds? (default is 10 days)
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
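
As a rough illustration of the check above, here is a minimal sketch that compares the time since the last repair with gc_grace_seconds. The function name and the timestamps are hypothetical; only the 10-day default comes from this message.

```python
from datetime import datetime, timedelta

# Default gc_grace_seconds mentioned above: 10 days, expressed in seconds.
DEFAULT_GC_GRACE_SECONDS = 10 * 24 * 60 * 60


def repair_overdue(last_repair, now, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return True if more than gc_grace_seconds have passed since last_repair."""
    return (now - last_repair) > timedelta(seconds=gc_grace_seconds)


# Hypothetical timestamps, for illustration only.
now = datetime(2015, 2, 18, 13, 49)
print(repair_overdue(datetime(2015, 2, 10), now))  # repaired about 8 days ago
print(repair_overdue(datetime(2015, 2, 1), now))   # repaired about 17 days ago
```

If the check returns True for a node, repairs on that node are not completing within the grace period.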

Double check if you set cold_reads_to_omit to 0.0 on tables with STCS
that you do not read often.
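
For reference, a minimal sketch of the CQL statement that would apply this setting. The helper name, keyspace and table are placeholders, and the 0.0 to 1.0 range check is an assumption about the valid values. Note that the compaction map is set as a whole, so any other subproperties you rely on (for example the thresholds) would have to be restated in the same statement.

```python
def stcs_cold_reads_statement(keyspace, table, cold_reads_to_omit=0.0):
    """Build an ALTER TABLE statement that sets cold_reads_to_omit for STCS."""
    # Assumption: the option takes a fraction between 0.0 and 1.0.
    if not 0.0 <= cold_reads_to_omit <= 1.0:
        raise ValueError("cold_reads_to_omit must be between 0.0 and 1.0")
    return (
        "ALTER TABLE {ks}.{tbl} WITH compaction = "
        "{{'class': 'SizeTieredCompactionStrategy', "
        "'cold_reads_to_omit': '{value}'}};"
    ).format(ks=keyspace, tbl=table, value=cold_reads_to_omit)


# Placeholder names; paste the printed statement into cqlsh after checking it.
print(stcs_cold_reads_statement("my_keyspace", "my_table"))
```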

Are you using default values for the properties
min_compaction_threshold(4) and max_compaction_threshold(32)?

Which Consistency Level are you using for read operations? Check that
you are not reading from DC_B because of your Replication Factor and CL.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
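
To make the RF/CL point concrete, here is a small sketch of the quorum arithmetic. The replication factors are hypothetical (the thread does not state them). QUORUM counts replicas across all datacenters, so with a multi-DC keyspace it may not be satisfiable from the local DC alone, while LOCAL_QUORUM only counts local replicas.

```python
def quorum_size(rf_by_dc):
    """QUORUM needs a majority of all replicas across every datacenter."""
    return sum(rf_by_dc.values()) // 2 + 1


def local_quorum_size(rf_by_dc, local_dc):
    """LOCAL_QUORUM needs a majority of the replicas in the local datacenter."""
    return rf_by_dc[local_dc] // 2 + 1


def quorum_needs_remote_replicas(rf_by_dc, local_dc):
    """True if QUORUM cannot be satisfied by replicas in local_dc alone."""
    return quorum_size(rf_by_dc) > rf_by_dc[local_dc]


rf = {"DC_A": 3, "DC_B": 3}  # hypothetical replication factors
print(quorum_size(rf))                           # 4
print(local_quorum_size(rf, "DC_A"))             # 2
print(quorum_needs_remote_replicas(rf, "DC_A"))  # True
```

With these example values, a QUORUM read coordinated in DC_A has to wait for at least one replica in DC_B, whereas LOCAL_QUORUM stays inside DC_A.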


Cheers,

Roni Balthazar

On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam <ptrstpppp@gmail.com> wrote:
> I don't have problems with DC_B (the replica). Only in DC_A (my system
> writes only to it) do I have read timeouts.
>
> I checked the SSTable count in OpsCenter and I have:
> 1) in DC_A: the same +-10% for the last week, with a small increase over
> the last 24h (it is more than 15000-20000 SSTables, depending on the node)
> 2) in DC_B: the last 24h show up to a 50% decrease, which is a good sign.
> Now I have fewer than 1000 SSTables
>
> What did you measure during system optimizations? Or do you have an idea
> what more I should check?
> 1) I look at CPU idle (one node is 50% idle, the rest 70% idle)
> 2) Disk queue -> it is mostly near zero: avg 0.09. Sometimes there are
> spikes
> 3) system RAM usage is almost full
> 4) In Total Bytes Compacted most lines are below 3MB/s. For DC_A in total
> it is less than 10MB/s; in DC_B it looks much better (avg is about 17MB/s)
>
> Anything else?
>
>
>
> On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar <ronibalthazar@gmail.com>
> wrote:
>>
>> Hi,
>>
>> You can check if the number of SSTables is decreasing. Look for the
>> "SSTable count" information of your tables using "nodetool cfstats".
>> The compaction history can be viewed using "nodetool
>> compactionhistory".
>>
>> About the timeouts, check this out:
>> http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
>> Also try running "nodetool tpstats" to see the thread pool statistics. It
>> can tell you whether you are having performance problems. If you
>> have too many pending tasks or dropped messages, you may
>> need to tune your system (e.g. driver timeouts, concurrent reads and
>> so on).
>>
>> Regards,
>>
>> Roni Balthazar
>>
>> On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam <ptrstpppp@gmail.com> wrote:
>> > Hi,
>> > Thanks for your tip. It looks like something changed, but I still don't
>> > know if it is OK.
>> >
>> > My nodes started to do more compactions, but it looks like some
>> > compactions are really slow.
>> > IO is idle and CPU is quite OK (30%-40%). We set compactionthroughput
>> > to 999, but I do not see a difference.
>> >
>> > Can we check something more? Or do you have any method to monitor
>> > progress with small files?
>> >
>> > Regards
>> >
>> > On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar
>> > <ronibalthazar@gmail.com>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> Yes... I had the same issue and setting cold_reads_to_omit to 0.0 was
>> >> the solution...
>> >> The number of SSTables decreased from many thousands to fewer than
>> >> a hundred, and the SSTables are now much bigger (most of them are
>> >> several gigabytes).
>> >>
>> >> Cheers,
>> >>
>> >> Roni Balthazar
>> >>
>> >>
>> >>
>> >> On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam <ptrstpppp@gmail.com> wrote:
>> >> > After some diagnostics (we have not set cold_reads_to_omit yet):
>> >> > compactions are running, but VERY slowly, with "idle" IO.
>> >> >
>> >> > We have a lot of "Data files" in Cassandra. In DC_A there are about
>> >> > 120000 (counting only xxx-Data.db); DC_B has only ~4000.
>> >> >
>> >> > I don't know if this changes anything, but:
>> >> > 1) in DC_A the avg size of a Data.db file is ~13 mb. I have a few really
>> >> > big ones, but most are really small (almost 10000 files are less than 100mb).
>> >> > 2) in DC_B the avg size of Data.db is much bigger, ~260mb.
>> >> >
>> >> > Do you think that the above flag will help us?
>> >> >
>> >> >
>> >> > On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam <ptrstpppp@gmail.com> wrote:
>> >> >>
>> >> >> I set setcompactionthroughput to 999 permanently and it doesn't change
>> >> >> anything. IO is still the same. CPU is idle.
>> >> >>
>> >> >> On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar
>> >> >> <ronibalthazar@gmail.com>
>> >> >> wrote:
>> >> >>>
>> >> >>> Hi,
>> >> >>>
>> >> >>> You can run "nodetool compactionstats" to view statistics on
>> >> >>> compactions.
>> >> >>> Setting cold_reads_to_omit to 0.0 can help to reduce the number of
>> >> >>> SSTables when you use Size-Tiered compaction.
>> >> >>> You can also create a cron job to increase the value of
>> >> >>> setcompactionthroughput during the night or when your IO is not
>> >> >>> busy.
>> >> >>>
>> >> >>> From http://wiki.apache.org/cassandra/NodeTool:
>> >> >>> 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
>> >> >>> 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
>> >> >>>
>> >> >>> Cheers,
>> >> >>>
>> >> >>> Roni Balthazar
>> >> >>>
>> >> >>> On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam <ptrstpppp@gmail.com>
>> >> >>> wrote:
>> >> >>> > One thing I do not understand. In my case compaction is running
>> >> >>> > permanently.
>> >> >>> > Is there a way to check which compaction is pending? The only
>> >> >>> > information is about the total count.
>> >> >>> >
>> >> >>> >
>> >> >>> > On Monday, February 16, 2015, Ja Sam <ptrstpppp@gmail.com> wrote:
>> >> >>> >>
>> >> >>> >> Of course I made a mistake. I am using 2.1.2. Anyway, a nightly build
>> >> >>> >> is available from
>> >> >>> >> http://cassci.datastax.com/job/cassandra-2.1/
>> >> >>> >>
>> >> >>> >> I read about cold_reads_to_omit. It looks promising. Should I also
>> >> >>> >> set compaction throughput?
>> >> >>> >>
>> >> >>> >> p.s. I am really sad that I didn't read this before:
>> >> >>> >>
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
>> >> >>> >>
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> On Monday, February 16, 2015, Carlos Rolo <rolo@pythian.com>
>> >> >>> >> wrote:
>> >> >>> >>>
>> >> >>> >>> Hi, 100% in agreement with Roland.
>> >> >>> >>>
>> >> >>> >>> The 2.1.x series is a pain! I would never recommend the current
>> >> >>> >>> 2.1.x series for production.
>> >> >>> >>>
>> >> >>> >>> Clocks are a pain, and check your connectivity! Also check tpstats
>> >> >>> >>> to see if your thread pools are being overrun.
>> >> >>> >>>
>> >> >>> >>> Regards,
>> >> >>> >>>
>> >> >>> >>> Carlos Juzarte Rolo
>> >> >>> >>> Cassandra Consultant
>> >> >>> >>>
>> >> >>> >>> Pythian - Love your data
>> >> >>> >>>
>> >> >>> >>> rolo@pythian | Twitter: cjrolo | Linkedin:
>> >> >>> >>> linkedin.com/in/carlosjuzarterolo
>> >> >>> >>> Tel: 1649
>> >> >>> >>> www.pythian.com
>> >> >>> >>>
>> >> >>> >>> On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer
>> >> >>> >>> <r.etzenhammer@t-online.de> wrote:
>> >> >>> >>>>
>> >> >>> >>>> Hi,
>> >> >>> >>>>
>> >> >>> >>>> 1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested by
>> >> >>> >>>> Al Tobey from DataStax)
>> >> >>> >>>> 7) minimal reads (usually none, sometimes few)
>> >> >>> >>>>
>> >> >>> >>>> Those two points make me repeat an answer I got. First, where did you
>> >> >>> >>>> get 2.1.3 from? Maybe I missed it, I will have a look. But if it is
>> >> >>> >>>> 2.1.2, which is the latest released version, that version has many
>> >> >>> >>>> bugs, and most of them bit me while testing 2.1.2. I had many problems
>> >> >>> >>>> with compactions not being triggered on column families that are not
>> >> >>> >>>> being read, and with compactions and repairs not completing. See
>> >> >>> >>>>
>> >> >>> >>>>
>> >> >>> >>>>
>> >> >>> >>>>
>> >> >>> >>>>
>> >> >>> >>>> https://www.mail-archive.com/search?l=user@cassandra.apache.org&q=subject:%22Re%3A+Compaction+failing+to+trigger%22&o=newest&f=1
>> >> >>> >>>>
>> >> >>> >>>>
>> >> >>> >>>>
>> >> >>> >>>> https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html
>> >> >>> >>>>
>> >> >>> >>>> Apart from that, how are the two datacenters connected? Maybe
>> >> >>> >>>> there is a bottleneck.
>> >> >>> >>>>
>> >> >>> >>>> Also, do you have ntp up and running on all nodes to keep all clocks
>> >> >>> >>>> in tight sync?
>> >> >>> >>>>
>> >> >>> >>>> Note: I'm no expert (yet) - just sharing my 2 cents.
>> >> >>> >>>>
>> >> >>> >>>> Cheers,
>> >> >>> >>>> Roland
>> >> >>> >>>
>> >> >>> >>>
>> >> >>> >>>
>> >> >>> >>> --
>> >> >>> >>>
>> >> >>> >>>
>> >> >>> >>>
>> >> >>> >
>> >> >>
>> >> >>
>> >> >
>> >
>> >
>
>
