kudu-user mailing list archives

From David Alves <davidral...@gmail.com>
Subject Re: Data inconsistency after restart
Date Sat, 09 Dec 2017 06:13:22 GMT
Hi Petter

  Don't have answers yet, but I do have some more questions 
  (inline)

Petter von Dolwitz (Hem) writes:

> Hi David,
>
> In short to summarize:
>
> 1. I ingest data. Kudu's maintenance threads stop working (soft memory
> limit) and incoming data is throttled. There are no errors reported on
> the client side.
  What is the "client side"? Impala? Spark? Java/C++?

> 2. I stop ingestion and wait until I *think* Kudu is finished.
  The question above is pertinent. Impala will not return until a query
  is fully successful, though it may return an error and leave a query
  only half-way executed. If you're using the client APIs directly,
  though, are you checking for errors when inserting?
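
  For example, a minimal sketch with the Kudu Java client, flushing
  manually and checking each response for a row error (the master
  address, table name and column names below are made up for the
  illustration):

      import java.util.List;
      import org.apache.kudu.client.*;

      public class CheckedInsert {
        public static void main(String[] args) throws Exception {
          // Hypothetical master address and table/column names, for illustration only.
          KuduClient client = new KuduClient.KuduClientBuilder("kudu-master:7051").build();
          KuduTable table = client.openTable("events");
          KuduSession session = client.newSession();
          // Buffer writes locally and flush explicitly so errors can be inspected per batch.
          session.setFlushMode(SessionConfiguration.FlushMode.MANUAL_FLUSH);

          Insert insert = table.newInsert();
          insert.getRow().addLong("id", 42L);
          insert.getRow().addString("payload", "example");
          session.apply(insert);

          // flush() returns one response per buffered operation; check each for a row error.
          List<OperationResponse> responses = session.flush();
          for (OperationResponse resp : responses) {
            if (resp.hasRowError()) {
              System.err.println("write failed: " + resp.getRowError());
            }
          }
          session.close();
          client.close();
        }
      }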

> 3. I restart Kudu.
> 4. I validate the inserted data by doing count(*) on groups of 
> data in
> Kudu. For several groups, Kudu reports a lot of rows missing.
  Kudu's default scan mode is READ_LATEST. While this is the most
  performance-oriented mode, it's also the one with the fewest
  guarantees, so on startup it's possible that it reads from a stale
  replica, giving the _appearance_ that rows went missing. Things to try
  here:
  - Try the same query a few minutes later. Is the answer different?
  - If so, consider changing your scan mode to READ_AT_SNAPSHOT (see the
  sketch below). In this mode data is guaranteed not to be stale, though
  you might have to wait for all replicas to catch up.
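
  A minimal sketch of such a scan with the Java client (the master
  address and table name are placeholders):

      import org.apache.kudu.client.*;

      public class SnapshotCount {
        public static void main(String[] args) throws Exception {
          // Hypothetical master address and table name.
          KuduClient client = new KuduClient.KuduClientBuilder("kudu-master:7051").build();
          KuduTable table = client.openTable("events");

          // READ_AT_SNAPSHOT waits for the selected replica to catch up to the
          // snapshot timestamp, trading some latency for repeatable results
          // (the default mode is READ_LATEST).
          KuduScanner scanner = client.newScannerBuilder(table)
              .readMode(AsyncKuduScanner.ReadMode.READ_AT_SNAPSHOT)
              .build();

          long rows = 0;
          while (scanner.hasMoreRows()) {
            RowResultIterator batch = scanner.nextRows();
            rows += batch.getNumRows();
          }
          System.out.println("row count: " + rows);
          client.close();
        }
      }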

> 5. I ingest the same data again. Client reports that all rows are
> already present.
  This isn't surprising _if_ the problem is indeed from stale replicas.
> 6. Doing the count(*) exercise again now gives me the correct 
> number of
> rows.
>
> This tells me that the data was ingested into Kudu on the first 
> attempt but
> a scan did not find the data. Inserting the data again made it
> visible.
  Could it be that, by the time of the second scan, enough time had
  elapsed for all replicas to catch up? I'd say this is likely the case.
>
> Br,
> Petter
>
> 2017-12-07 21:39 GMT+01:00 David Alves <davidralves@gmail.com>:
>
>> Hi Petter
>>
>>    I'd like to clarify what exactly happened and what exactly you are
>> referring to as "inconsistency".
>>    From what I understand of the first error you observed, Kudu was
>> underprovisioned, memory-wise, and the ingest jobs/queries failed. Is
>> that right? Since Kudu doesn't have atomic multi-row writes, it's
>> currently expected in this case that you'll end up with partially
>> written data.
>>    If you tried the same job again and it succeeded, then for certain
>> types of operation (UPSERT, INSERT IGNORE) the remaining rows would be
>> written and all the data would be there as expected.
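
  As a rough, hypothetical sketch of that idempotent-retry idea with the
  Java client's UPSERT (table and column names are placeholders):

      import org.apache.kudu.client.*;

      public class IdempotentRetry {
        public static void main(String[] args) throws Exception {
          // Hypothetical master address and table/column names.
          KuduClient client = new KuduClient.KuduClientBuilder("kudu-master:7051").build();
          KuduTable table = client.openTable("events");
          KuduSession session = client.newSession();  // AUTO_FLUSH_SYNC by default

          // UPSERT inserts the row if it is absent and overwrites it if it already
          // exists, so re-running the same batch after a partial failure is safe.
          Upsert upsert = table.newUpsert();
          upsert.getRow().addLong("id", 42L);
          upsert.getRow().addString("payload", "example");
          session.apply(upsert);  // flushed immediately in AUTO_FLUSH_SYNC mode

          session.close();
          client.close();
        }
      }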
>>    I'd like to distinguish this lack of atomicity on multi-row
>> transactions from "inconsistency", which is what you might 
>> observe if an
>> operation didn't fail, but you couldn't see all the data. For 
>> this latter
>> case there are options you can choose to avoid any 
>> inconsistency.
>>
>> Best
>> David
>>
>>
>>
>> On Wed, Dec 6, 2017 at 4:26 AM, Petter von Dolwitz (Hem) <
>> petter.von.dolwitz@gmail.com> wrote:
>>
>>> Thanks for your reply Andrew!
>>>
>>> >How did you verify that all the data was inserted and how did 
>>> >you find
>>> some data missing?
>>> This was done using Impala. We counted the rows for groups 
>>> representing
>>> the chunks we inserted.
>>>
>>> >Following up on what I posted, take a look at
>>> https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans.
>>> It seems definitely possible that not
>>> all of the
>>> rows had finished inserting when counting, or that the scans 
>>> were sent to a
>>> stale replica.
>>> Before we shut down we could only see the following in the 
>>> logs. I.e., no
>>> sign that ingestion was still ongoing.
>>>
>>> kudu-tserver.ip-xx-yyy-z-nnn.root.log.INFO.20171201-065232.90314:I1201
>>> 07:27:35.010694 90793 maintenance_manager.cc:383] P
>>> a38902afefca4a85a5469d149df9b4cb: we have exceeded our soft 
>>> memory limit
>>> (current capacity is 67.52%).  However, there are no ops 
>>> currently runnable
>>> which would free memory.
>>>
>>> Also the (Cloudera) metric
>>> total_kudu_rows_inserted_rate_across_kudu_replicas
>>> showed zero.
>>>
>>> Still it seems like some data became inconsistent after 
>>> restart. But if
>>> the maintenance_manager performs important jobs that are 
>>> required to ensure
>>> that all data is inserted then I can understand why we ended 
>>> up with
>>> inconsistent data. But, if I understand you correctly, you are
>>> saying that
>>> these jobs are not critical for ingestion. In the link you 
>>> provided I read
>>> "Impala scans are currently performed as READ_LATEST and have 
>>> no
>>> consistency guarantees.". I would assume this means that it 
>>> does not
>>> guarantee consistency if new data is inserted but should give 
>>> valid (and
>>> same) results if no new data is inserted?
>>>
>>> I have not tried the ksck tool yet. Thank you for the reminder. I
>>> will have a look.
>>>
>>> Br,
>>> Petter
>>>
>>>
>>> 2017-12-06 1:31 GMT+01:00 Andrew Wong <awong@cloudera.com>:
>>>
>>>> How did you verify that all the data was inserted and how did 
>>>> you find
>>>>> some data missing? I'm wondering if it's possible that the 
>>>>> initial
>>>>> "missing" data was data that Kudu was still in the process 
>>>>> of inserting
>>>>> (albeit slowly, due to memory backpressure or somesuch).
>>>>>
>>>>
>>>> Following up on what I posted, take a look at
>>>> https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans.
>>>> It seems definitely possible that not
>>>> all of the
>>>> rows had finished inserting when counting, or that the scans 
>>>> were sent to a
>>>> stale replica.
>>>>
>>>> On Tue, Dec 5, 2017 at 4:18 PM, Andrew Wong 
>>>> <awong@cloudera.com> wrote:
>>>>
>>>>> Hi Petter,
>>>>>
>>>>> When we verified that all data was inserted we found that 
>>>>> some data was
>>>>>> missing. We added this missing data and on some chunks we 
>>>>>> got the
>>>>>> information that all rows were already present, i.e. Impala
>>>>>> says something
>>>>>> like Modified: 0 rows, nnnnnnn errors. Doing the 
>>>>>> verification again now
>>>>>> shows that the Kudu table is complete. So, even though we 
>>>>>> did not insert
>>>>>> any data on some chunks, a count(*) operation over these 
>>>>>> chunks now returns
>>>>>> a different value.
>>>>>
>>>>>
>>>>> How did you verify that all the data was inserted and how 
>>>>> did you find
>>>>> some data missing? I'm wondering if it's possible that the 
>>>>> initial
>>>>> "missing" data was data that Kudu was still in the process 
>>>>> of inserting
>>>>> (albeit slowly, due to memory backpressure or somesuch).
>>>>>
>>>>> Now to my question. Will data be inconsistent if we recycle 
>>>>> Kudu after
>>>>>> seeing soft memory limit warnings?
>>>>>
>>>>>
>>>>> Your data should be consistently written, even with those 
>>>>> warnings.
>>>>> AFAIK they would cause a bit of slowness, not incorrect 
>>>>> results.
>>>>>
>>>>> Is there a way to tell when it is safe to restart Kudu to 
>>>>> avoid these
>>>>>> issues? Should we use any special procedure when restarting 
>>>>>> (e.g. only
>>>>>> restart the tablet servers, only restart one tablet server 
>>>>>> at a time or
>>>>>> something like that)?
>>>>>
>>>>>
>>>>> In general, you can use the `ksck` tool to check the health 
>>>>> of your
>>>>> cluster. See 
>>>>> https://kudu.apache.org/docs/command_line_tools_reference.html#cluster-ksck
>>>>> for more details. For restarting a
>>>>> cluster, I
>>>>> would recommend taking down all tablet servers at once, 
>>>>> otherwise tablet
>>>>> replicas may try to replicate data from the server that was 
>>>>> taken down.
>>>>>
>>>>> Hope this helped,
>>>>> Andrew
>>>>>
>>>>> On Tue, Dec 5, 2017 at 10:42 AM, Petter von Dolwitz (Hem) <
>>>>> petter.von.dolwitz@gmail.com> wrote:
>>>>>
>>>>>> Hi Kudu users,
>>>>>>
>>>>>> We just started to use Kudu (1.4.0+cdh5.12.1). To make a baseline
>>>>>> for evaluation we ingested 3 months' worth of data. During
>>>>>> ingestion we were facing messages from the maintenance threads
>>>>>> that a soft memory limit was reached. It seems like the background
>>>>>> maintenance threads stopped performing their tasks at this point
>>>>>> in time. It also seems like the memory was never recovered even
>>>>>> after stopping ingestion, so I guess there was a large backlog
>>>>>> being built up. I guess the root cause here is that we were a bit
>>>>>> too conservative when giving Kudu memory. After a restart a lot of
>>>>>> maintenance tasks were started (e.g. compaction).
>>>>>>
>>>>>> When we verified that all data was inserted we found that 
>>>>>> some data
>>>>>> was missing. We added this missing data and on some chunks 
>>>>>> we got the
>>>>>> information that all rows were already present, i.e. Impala
>>>>>> says something
>>>>>> like Modified: 0 rows, nnnnnnn errors. Doing the 
>>>>>> verification again now
>>>>>> shows that the Kudu table is complete. So, even though we 
>>>>>> did not insert
>>>>>> any data on some chunks, a count(*) operation over these 
>>>>>> chunks now returns
>>>>>> a different value.
>>>>>>
>>>>>> Now to my question. Will data be inconsistent if we recycle 
>>>>>> Kudu after
>>>>>> seeing soft memory limit warnings?
>>>>>>
>>>>>> Is there a way to tell when it is safe to restart Kudu to 
>>>>>> avoid these
>>>>>> issues? Should we use any special procedure when restarting 
>>>>>> (e.g. only
>>>>>> restart the tablet servers, only restart one tablet server 
>>>>>> at a time or
>>>>>> something like that)?
>>>>>>
>>>>>> The table design uses 50 tablets per day (times 90 
>>>>>> days). It is 8 TB
>>>>>> of data after 3x replication over 5 tablet servers.
>>>>>>
>>>>>> Thanks,
>>>>>> Petter
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Andrew Wong
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Andrew Wong
>>>>
>>>
>>>
>>


--
David Alves
