cassandra-user mailing list archives

From varun saluja <saluj...@gmail.com>
Subject Re: Long running compaction on huge hint table.
Date Wed, 17 May 2017 01:53:39 GMT
Hi,


nodetool truncatehints has been running on the nodes for more than 7 hours now.
Nothing about it appears in the system logs either.

And compactionstats reports an increase in the total hint bytes.

pending tasks: 1
   compaction type   keyspace   table     completed          total    unit   progress
        Compaction     system   hints   12152557998   869257869352   bytes      1.40%
Active compaction remaining time :   0h27m14s
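
For what it is worth, the two compactionstats samples in this thread can be
compared directly. Below is a minimal Python sketch (the class and helper
names are made up for illustration) using the figures above and the earlier
ones quoted below. It suggests that the uncompacted backlog grew between the
two samples, which would explain why the job never seems to finish. The
500 MiB/s rate is only an inference from these numbers, not a setting stated
anywhere in this thread. With that assumed rate the computed estimate lines up
with the reported remaining time, which hints that the estimate may be based
on a fixed rate rather than on observed progress.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One 'nodetool compactionstats' row for system.hints (values in bytes)."""
    completed: int
    total: int

    @property
    def remaining(self) -> int:
        return self.total - self.completed

    @property
    def progress_pct(self) -> float:
        return 100.0 * self.completed / self.total


def eta_hms(sample: Sample, mib_per_sec: int) -> str:
    """Remaining time if bytes were processed at a fixed (assumed) rate."""
    seconds = sample.remaining // (mib_per_sec * 1024 * 1024)
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h{minutes:02d}m{secs:02d}s"


# Figures copied from the two samples in this thread.
earlier = Sample(completed=11189118129, total=851658989612)
later = Sample(completed=12152557998, total=869257869352)

compacted = later.completed - earlier.completed       # bytes compacted in between
added = later.total - earlier.total                   # growth of the total
backlog_growth = later.remaining - earlier.remaining  # net change in work left

print(f"progress: {earlier.progress_pct:.2f}% -> {later.progress_pct:.2f}%")
print(f"compacted {compacted} bytes, total grew by {added} bytes")
print(f"backlog grew by {backlog_growth} bytes")

ASSUMED_RATE_MIB = 500  # hypothetical rate, inferred from the numbers only
print(eta_hms(earlier, ASSUMED_RATE_MIB), eta_hms(later, ASSUMED_RATE_MIB))
```

If the backlog really is growing faster than it is being compacted, the
progress percentage will barely move however long one waits.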

Can anything else be checked here? Would manually deleting the system.hints
files and restarting the node fix this?



Regards,
Varun Saluja

On 16 May 2017 at 23:29, varun saluja <saluja50@gmail.com> wrote:

> Hi Jeff,
>
> I ran nodetool truncatehints on all nodes. It has been running for more than
> 30 minutes now. compactionstats still reports the same status.
>
> pending tasks: 1
>    compaction type   keyspace   table     completed          total    unit   progress
>         Compaction     system   hints   11189118129   851658989612   bytes      1.31%
> Active compaction remaining time :   0h26m43s
>
> Does truncatehints take a long time to complete? I could not see anything
> related to truncatehints in the system logs.
>
> Please let me know if anything else can be checked here.
>
> Regards,
> Varun Saluja
>
>
>
> On 16 May 2017 at 20:58, varun saluja <saluja50@gmail.com> wrote:
>
>> Thanks a lot Jeff.
>>
>> You have explained it very well. We write with consistency LOCAL_QUORUM.
>> We will run truncatehints and then repair.
>>
>> I hope this brings the cluster to a stable state.
>>
>> Thanks again.
>>
>> Regards,
>> Varun Saluja
>>
>> Sent from my iPhone
>>
>> > On 16-May-2017, at 8:42 PM, Jeff Jirsa <jjirsa@apache.org> wrote:
>> >
>> >
>> > In Cassandra versions up to 3.0, hints are stored within a table, where
>> the partition key is the host ID of the server for which the hints are
>> stored.
>> >
>> > In such a data model, accumulating 800GB of hints is almost certain to
>> cause very wide rows, which will in turn cause GC pressure when you attempt
>> to read the hints for delivery. This will cause GC pauses, which will cause
>> hints to fail to be delivered, which will cause more hints to be stored.
>> This is bad.
>> >
>> > In 3.0, hints were rewritten to work around this design flaw. In 2.1,
>> your most likely corrective course is to use 'nodetool truncatehints' on
>> all servers, followed by 'nodetool repair' to deliver the data you lost by
>> truncating the hints.
>> >
>> > NOTE: this is ONLY safe if you wrote with a consistency level stronger
>> than CL:ANY. If you wrote this data with CL:ANY, you may lose data if you
>> truncate hints.
>> >
>> > - Jeff
>> >
>> >> On 2017-05-16 06:50 (-0700), varun saluja <saluja50@gmail.com> wrote:
>> >> Thanks for update.
>> >> I can see a lot of I/O wait. This is causing GC and mutation drops.
>> >> But as I mentioned, we do not have high load right now. Hint replays are
>> creating this high disk I/O.
>> >> compactionstats shows very high hint bytes, around 780 GB. Is this
>> normal?
>> >>
>> >> Just to mention, we are using flash disks.
>> >>
>> >> In that case, if I run truncatehints, will it remove or decrease the size
>> of the hint bytes in compactionstats? I can trigger a repair afterwards.
>> >> Please let me know if you have any recommendation.
>> >>
>> >> Also, the table that we dumped from Kafka, which created this many hints
>> and pending compactions, was also dropped today, because we have to re-dump
>> the table once the cluster is stable.
>> >>
>> >> Regards,
>> >> Varun
>> >>
>> >> Sent from my iPhone
>> >>
>> >>> On 16-May-2017, at 6:59 PM, Nitan Kainth <nitan@bamlabs.com> wrote:
>> >>>
>> >>> Yes, but it means the data has to be replicated using repair.
>> >>>
>> >>> Hints are the outcome of unhealthy nodes. Focus on finding why you have
>> mutation drops: is it the node, I/O, the network, etc.? Ideally you
>> shouldn't see hints increasing all the time.
>> >>>
>> >>> Sent from my iPhone
>> >>>
>> >>>> On May 16, 2017, at 7:58 AM, varun saluja <saluja50@gmail.com>
>> wrote:
>> >>>>
>> >>>> Hi Nitan,
>> >>>>
>> >>>> Thanks for response.
>> >>>>
>> >>>> Yes, I can see mutation drops and an increasing count in system.hints.
>> Is there any way I can proceed to truncate the hints, such as using nodetool
>> truncatehints?
>> >>>>
>> >>>>
>> >>>> Regards,
>> >>>> Varun Saluja
>> >>>>
>> >>>>> On 16 May 2017 at 17:52, Nitan Kainth <nitan@bamlabs.com> wrote:
>> >>>>> Do you see mutation drops?
>> >>>>> SELECT count(*) FROM system.hints; is it increasing?
>> >>>>>
>> >>>>> Sent from my iPhone
>> >>>>>
>> >>>>>> On May 16, 2017, at 5:52 AM, varun saluja <saluja50@gmail.com>
>> wrote:
>> >>>>>>
>> >>>>>> Hi Experts,
>> >>>>>>
>> >>>>>> We are facing an issue on a production cluster. Compaction on the
>> system.hints table has been running for the last 2 days.
>> >>>>>>
>> >>>>>>
>> >>>>>> pending tasks: 1
>> >>>>>>   compaction type   keyspace   table     completed          total    unit   progress
>> >>>>>>        Compaction     system   hints   20623021829   877874092407   bytes      2.35%
>> >>>>>> Active compaction remaining time :   0h27m15s
>> >>>>>>
>> >>>>>>
>> >>>>>> The active compaction remaining time is shown in minutes, but this
>> job seems to be running indefinitely.
>> >>>>>>
>> >>>>>> We have a 3-node cluster on version 2.1.7, and we ran a write-intensive
>> job last week on a particular table.
>> >>>>>> Compaction on this table finished, but the hints table keeps
>> growing.
>> >>>>>>
>> >>>>>> Can someone please help me?
>> >>>>>>
>> >>>>>>
>> >>>>>> Thanks & Regards,
>> >>>>>> Varun Saluja
>> >>>>>>
>> >>>>
>> >>
>> >
>>
>
>
