cassandra-user mailing list archives

From Ben Mills <...@bitbrew.com>
Subject Re: Repair Issues
Date Thu, 24 Oct 2019 20:59:03 GMT
Hi Sergio,

No, not at this time.

It was in use with this cluster previously, and while there were no
reaper-specific issues, it was removed to help simplify investigation of
the underlying repair issues I've described.

Thanks.

On Thu, Oct 24, 2019 at 4:21 PM Sergio <lapostadisergio@gmail.com> wrote:

> Are you using Cassandra reaper?
>
> On Thu, Oct 24, 2019, 12:31 PM Ben Mills <ben@bitbrew.com> wrote:
>
>> Greetings,
>>
>> Inherited a small Cassandra cluster with some repair issues and need some
>> advice on recommended next steps. Apologies in advance for a long email.
>>
>> Issue:
>>
>> Intermittent repair failures on two non-system keyspaces.
>>
>> - platform_users
>> - platform_management
>>
>> Repair Type:
>>
>> Full, parallel repairs are run on each of the three nodes every five days.
>>
>> Repair command output for a typical failure:
>>
>> [2019-10-18 00:22:09,109] Starting repair command #46, repairing keyspace
>> platform_users with repair options (parallelism: parallel, primary range:
>> false, incremental: false, job threads: 1, ColumnFamilies: [], dataCenters:
>> [], hosts: [], # of ranges: 12)
>> [2019-10-18 00:22:09,242] Repair session
>> 5282be70-f13d-11e9-9b4e-7f6db768ba9a for range
>> [(-1890954128429545684,2847510199483651721],
>> (8249813014782655320,-8746483007209345011],
>> (4299912178579297893,6811748355903297393],
>> (-8746483007209345011,-8628999431140554276],
>> (-5865769407232506956,-4746990901966533744],
>> (-4470950459111056725,-1890954128429545684],
>> (4001531392883953257,4299912178579297893],
>> (6811748355903297393,6878104809564599690],
>> (6878104809564599690,8249813014782655320],
>> (-4746990901966533744,-4470950459111056725],
>> (-8628999431140554276,-5865769407232506956],
>> (2847510199483651721,4001531392883953257]] failed with error [repair
>> #5282be70-f13d-11e9-9b4e-7f6db768ba9a on platform_users/access_tokens_v2,
>> [(-1890954128429545684,2847510199483651721],
>> (8249813014782655320,-8746483007209345011],
>> (4299912178579297893,6811748355903297393],
>> (-8746483007209345011,-8628999431140554276],
>> (-5865769407232506956,-4746990901966533744],
>> (-4470950459111056725,-1890954128429545684],
>> (4001531392883953257,4299912178579297893],
>> (6811748355903297393,6878104809564599690],
>> (6878104809564599690,8249813014782655320],
>> (-4746990901966533744,-4470950459111056725],
>> (-8628999431140554276,-5865769407232506956],
>> (2847510199483651721,4001531392883953257]]] Validation failed in /10.x.x.x
>> (progress: 26%)
>> [2019-10-18 00:22:09,246] Some repair failed
>> [2019-10-18 00:22:09,248] Repair command #46 finished in 0 seconds
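
For anyone triaging similar output, here is a minimal Python sketch (not part of the tooling described in this thread) that pulls the session id, keyspace, table and validating host out of a failed repair message like the one above. The function name summarize_repair_failures and the regex are assumptions based only on the message format shown here; other Cassandra versions may word the failure differently.

```python
import re

# Matches the "failed with error" portion of a repair message as quoted above.
# The token ranges between the table name and "Validation failed" are skipped.
FAILURE_RE = re.compile(
    r"repair #(?P<session>[0-9a-f-]+) on (?P<keyspace>\w+)/(?P<table>\w+),"
    r".*?Validation failed in /(?P<host>\S+)"
)


def summarize_repair_failures(output):
    """Return one dict per failed repair session found in the given text."""
    # Undo mail wrapping: drop leading quote markers, then collapse whitespace
    # so a message split across several lines becomes a single line again.
    unquoted = re.sub(r"^[> ]+", "", output, flags=re.MULTILINE)
    flat = " ".join(unquoted.split())
    return [match.groupdict() for match in FAILURE_RE.finditer(flat)]
```

Fed the output above, this should report platform_users/access_tokens_v2 and the /10.x.x.x host, which makes it easier to see whether failures cluster on one table or one node.
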
>>
>> Additional Notes:
>>
>> Repairs encounter the above failures more often than not. Sometimes on one
>> node only, though occasionally on two. Sometimes just one of the two
>> keyspaces, sometimes both. Apparently the previous repair schedule for
>> this cluster included incremental repairs (script alternated between
>> incremental and full repairs). After reading this TLP article:
>>
>>
>> https://thelastpickle.com/blog/2017/12/14/should-you-use-incremental-repair.html
>>
>> the repair script was replaced with cassandra-reaper (v1.4.0), which was
>> run with its default configs. Reaper was fine but only obscured the ongoing
>> issues (it did not resolve them) and complicated the debugging process and
>> so was then removed. The current repair schedule is as described above
>> under Repair Type.
>>
>> Attempts at Resolution:
>>
>> (1) nodetool scrub was attempted on the offending keyspaces/tables to no
>> effect.
>>
>> (2) sstablescrub has not been attempted due to the current design of the
>> Docker image that runs Cassandra in each Kubernetes pod - i.e. there is no
>> way to stop the server to run this utility without killing the only pid
>> running in the container.
>>
>> Related Error:
>>
>> Not sure if this is related, though sometimes, when either:
>>
>> (a) Running nodetool snapshot, or
>> (b) Rolling a pod that runs a Cassandra node, which calls nodetool drain
>> prior to shutdown,
>>
>> the following error is thrown:
>>
>> -- StackTrace --
>> java.lang.RuntimeException: Last written key
>> DecoratedKey(10df3ba1-6eb2-4c8e-bddd-c0c7af586bda,
>> 10df3ba16eb24c8ebdddc0c7af586bda) >= current key
>> DecoratedKey(00000000-0000-0000-0000-000000000000,
>> 17343121887f480c9ba87c0e32206b74) writing into
>> /cassandra_data/data/platform_management/device_by_tenant_v2-e91529202ccf11e7ab96d5693708c583/.device_by_tenant_tags_idx/mb-45-big-Data.db
>>             at org.apache.cassandra.io.sstable.format.big.BigTableWriter.beforeAppend(BigTableWriter.java:114)
>>             at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:153)
>>             at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.append(SimpleSSTableMultiWriter.java:48)
>>             at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:441)
>>             at org.apache.cassandra.db.Memtable$FlushRunnable.call(Memtable.java:477)
>>             at org.apache.cassandra.db.Memtable$FlushRunnable.call(Memtable.java:363)
>>             at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>             at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>             at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>             at java.lang.Thread.run(Thread.java:748)
>>
>> Here are some details on the environment and configs in the event that
>> something is relevant.
>>
>> Environment: Kubernetes
>> Environment Config: Stateful set of 3 replicas
>> Storage: Persistent Volumes
>> Storage Class: SSD
>> Node OS: Container-Optimized OS
>> Container OS: Ubuntu 16.04.3 LTS
>>
>> Version: Cassandra 3.7
>> Data Centers: 1
>> Racks: 3 (one per zone)
>> Nodes: 3
>> Tokens: 4
>> Replication Factor: 3
>> Replication Strategy: NetworkTopologyStrategy (all keyspaces)
>> Compaction Strategy: STCS (all tables)
>> Read/Write Requirements: Blend of both
>> Data Load: <1GB per node
>> gc_grace_seconds: default (10 days - all tables)
>>
>> Memory: 4Gi per node
>> CPU: 3.5 per node (3500m)
>>
>> Java Version: 1.8.0_144
>>
>> Heap Settings:
>>
>> -XX:+UnlockExperimentalVMOptions
>> -XX:+UseCGroupMemoryLimitForHeap
>> -XX:MaxRAMFraction=2
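
As a rough sanity check on these flags, here is a small Python sketch of the heap size they imply. It assumes the 4Gi figure is the container memory limit the JVM sees through cgroups and that nothing else (an explicit -Xmx, or MAX_HEAP_SIZE in cassandra-env.sh) overrides the default; neither is confirmed in this thread. The helper name default_max_heap_bytes is made up for illustration.

```python
GIB = 1024 ** 3


def default_max_heap_bytes(container_limit_bytes, max_ram_fraction):
    """Approximate default max heap when the JVM sizes itself from the cgroup limit."""
    # With UseCGroupMemoryLimitForHeap, the memory the JVM sees is divided
    # by MaxRAMFraction to get the default maximum heap.
    return container_limit_bytes // max_ram_fraction


limit = 4 * GIB  # assumed container memory limit
heap = default_max_heap_bytes(limit, 2)
print(f"{heap / GIB:.1f} GiB heap, {(limit - heap) / GIB:.1f} GiB for off-heap and everything else")
```

Under those assumptions the heap would be about half the container's memory, leaving the other half for off-heap structures and page cache.
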
>>
>> GC Settings: (CMS)
>>
>> -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC
>> -XX:+CMSParallelRemarkEnabled
>> -XX:SurvivorRatio=8
>> -XX:MaxTenuringThreshold=1
>> -XX:CMSInitiatingOccupancyFraction=75
>> -XX:+UseCMSInitiatingOccupancyOnly
>> -XX:CMSWaitDuration=30000
>> -XX:+CMSParallelInitialMarkEnabled
>> -XX:+CMSEdenChunksRecordAlways
>>
>> Any ideas are much appreciated.
>>
>
