kudu-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re:
Date Tue, 02 Jul 2019 22:02:01 GMT
On Tue, Jul 2, 2019 at 1:25 AM Дмитрий Павлов <dm.pavlov@inbox.ru> wrote:

>
> Hi guys
>
> I've encountered strange behaviour in the replication of 2 tablets in my
> table in a Kudu cluster.
> The table was in UNDER REPLICATED status, so I stopped all activity on the
> cluster to make it cold.
>

Do you have the original ksck output for this table while it was in
UNDER_REPLICATED state? Do you have the tserver and master logs from the
time during which it was under-replicated? If you grep for this tablet ID
you can hopefully find a reason why it was unable to re-replicate.
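
A minimal sketch of that grep in Python, in case it helps when the logs
are spread over several files. The function name and the directory layout
are assumptions for illustration only; it assumes the tserver and master
logs are plain-text (uncompressed) files sitting directly in one directory.

```python
from pathlib import Path


def find_tablet_lines(log_dir, tablet_id):
    """Yield (file name, line number, text) for each log line naming the tablet."""
    for path in sorted(Path(log_dir).iterdir()):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a log contains stray bytes.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if tablet_id in line:
                    yield path.name, number, line.rstrip("\n")
```

Looping over find_tablet_lines(log_dir, tablet_id) and printing each tuple
gives roughly the same output as grep -n run over that directory.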


> But even after 2 hours the table was still in UNDER REPLICATED state, so I
> checked the rows_updated/rows_inserted metrics and found that the
> replication process is very slow: 1-2 K rows per second. I checked the logs
> of the 2 tservers where the tablets were located and found the following
> errors:
>

Note that rows_updated and rows_inserted metrics are unrelated to
re-replication of under-replicated tablets. They only represent the number
of rows inserted/updated by end users of the tablet. Re-replication of a
missing tablet happens by physical data copies, not by row-level operations.


>
> W0701 12:10:07.124253    10 kernel_stack_watchdog.cc:198] Thread 141396 stuck at /tmp/apache-kudu-1.9
> Kernel stack:
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> User stack:
>     @     0x7f04971a45d0  (unknown)
>     @           0xb4b21c  kudu::consensus::LogCache::EvictSomeUnlocked()
>     @           0xb4bec6  kudu::consensus::LogCache::EvictThroughOp()
>     @           0xb47d2f  kudu::consensus::PeerMessageQueue::ResponseFromPeer()
>     @           0xb490b1  kudu::consensus::PeerMessageQueue::LocalPeerAppendFinished()
>     @           0xb4bbcc  kudu::consensus::LogCache::LogCallback()
>     @           0xb972d2  kudu::log::Log::AppendThread::HandleGroup()
>     @           0xb97c2d  kudu::log::Log::AppendThread::DoWork()
>     @          0x1e5bdff  kudu::ThreadPool::DispatchThread()
>     @          0x1e51634  kudu::Thread::SuperviseThread()
>     @     0x7f049719cdd5  start_thread
>     @     0x7f0495473ead  __clone
>

This is just a warning about a potential latency blip, and likely
completely unrelated to the problem you're reporting.
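
To separate warnings like this one from real errors when reading the logs,
the severity can be read off the first character of each line. The sketch
below assumes the logs use the standard glog prefix seen in the quoted
line (severity letter, MMDD, time, thread ID, file:line]); the names
LogRecord, parse_glog_line and errors_only are made up for illustration.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Severity letter, MMDD, HH:MM:SS.uuuuuu, thread ID, "file:line]", message.
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<source>[^:\s]+):(?P<line>\d+)\] (?P<message>.*)$"
)


class LogRecord(NamedTuple):
    severity: str  # "I", "W", "E" or "F"
    month: int
    day: int
    time: str
    thread: int
    source: str
    line: int
    message: str


def parse_glog_line(text: str) -> Optional[LogRecord]:
    """Parse one log line; return None for continuation lines such as stack frames."""
    match = GLOG_LINE.match(text)
    if match is None:
        return None
    return LogRecord(
        severity=match["severity"],
        month=int(match["month"]),
        day=int(match["day"]),
        time=match["time"],
        thread=int(match["thread"]),
        source=match["source"],
        line=int(match["line"]),
        message=match["message"],
    )


def errors_only(lines: Iterable[str]) -> Iterator[LogRecord]:
    """Yield only records logged at ERROR or FATAL severity."""
    for text in lines:
        record = parse_glog_line(text.rstrip("\n"))
        if record is not None and record.severity in ("E", "F"):
            yield record
```

The watchdog line quoted above parses with severity "W", so errors_only
would skip it.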

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera
