kudu-user mailing list archives

From "Petter von Dolwitz (Hem)" <petter.von.dolw...@gmail.com>
Subject Re: Data inconsistency after restart
Date Fri, 08 Dec 2017 14:17:03 GMT
Hi David,

To summarize:

1. I ingest data. Kudu's maintenance threads stop working (soft memory
limit exceeded) and incoming data is throttled. No errors are reported on the
client side.
2. I stop ingestion and wait until I *think* Kudu is finished.
3. I restart Kudu.
4. I validate the inserted data by doing count(*) on groups of data in
Kudu. For several groups, Kudu reports a lot of rows missing.
5. I ingest the same data again. The client reports that all rows are already
present.
6. Doing the count(*) exercise again now gives me the correct number of
rows.

This tells me that the data was ingested into Kudu on the first attempt but
a scan did not find the data. Inserting the data again made it visible.
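
The count(*) validation in steps 4 and 6 can be sketched roughly as below.
This is only an illustration, not the actual job we ran: the function name
find_missing_groups, the group keys and the row counts are all made up, and
the observed counts are assumed to come from a grouped count(*) query.

```python
from typing import Dict


def find_missing_groups(expected: Dict[str, int],
                        observed: Dict[str, int]) -> Dict[str, int]:
    """Return {group: missing row count} for groups that came up short.

    `expected` holds the row counts the ingest job believes it wrote,
    `observed` holds the counts returned by a grouped count(*) query.
    A group absent from `observed` is treated as having zero rows.
    """
    missing = {}
    for group, want in expected.items():
        got = observed.get(group, 0)
        if got < want:
            missing[group] = want - got
    return missing


# Hypothetical numbers, for illustration only.
expected = {"2017-09-01": 1000, "2017-09-02": 1200, "2017-09-03": 900}
after_restart = {"2017-09-01": 1000, "2017-09-02": 750}

# Groups reported here are the ones that would be ingested again (step 5).
print(find_missing_groups(expected, after_restart))
```

Running the same check after the re-ingest corresponds to step 6, where an
empty result means every group has the expected number of rows.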

Br,
Petter

2017-12-07 21:39 GMT+01:00 David Alves <davidralves@gmail.com>:

> Hi Petter
>
>    I'd like to clarify what exactly happened and what exactly you are
> referring to as "inconsistency".
>    From what I understand of the first error you observed, Kudu was
> underprovisioned, memory-wise, and the ingest jobs/queries failed. Is that
> right? Since Kudu doesn't have atomic multi-row writes, it's currently
> expected in this case that you'll end up with partially written data.
>    If you tried the same job again, and it succeeded, for certain types of
> operation (UPSERT, INSERT IGNORE) then the remaining rows would be written
> and all the data would be there as expected.
>    I'd like to distinguish this lack of atomicity on multi-row
> transactions from "inconsistency", which is what you might observe if an
> operation didn't fail, but you couldn't see all the data. For this latter
> case there are options you can choose to avoid any inconsistency.
>
> Best
> David
>
>
>
> On Wed, Dec 6, 2017 at 4:26 AM, Petter von Dolwitz (Hem) <
> petter.von.dolwitz@gmail.com> wrote:
>
>> Thanks for your reply Andrew!
>>
>> >How did you verify that all the data was inserted and how did you find
>> some data missing?
>> This was done using Impala. We counted the rows for groups representing
>> the chunks we inserted.
>>
>> >Following up on what I posted, take a look at
>> https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans.
>> It seems definitely possible that not all of the
>> rows had finished inserting when counting, or that the scans were sent to a
>> stale replica.
>> Before we shut down we could only see the following in the logs. I.e., no
>> sign that ingestion was still ongoing.
>>
>> kudu-tserver.ip-xx-yyy-z-nnn.root.log.INFO.20171201-065232.90314:I1201
>> 07:27:35.010694 90793 maintenance_manager.cc:383] P
>> a38902afefca4a85a5469d149df9b4cb: we have exceeded our soft memory limit
>> (current capacity is 67.52%).  However, there are no ops currently runnable
>> which would free memory.
>>
>> Also the (cloudera) metric total_kudu_rows_inserted_rate_across_kudu_replicas
>> showed zero.
>>
>> Still it seems like some data became inconsistent after restart. But if
>> the maintenance_manager performs important jobs that are required to ensure
>> that all data is inserted then I can understand why we ended up with
>> inconsistent data. But, if I understand you correctly, you are saying that
>> these jobs are not critical for ingestion. In the link you provided I read
>> "Impala scans are currently performed as READ_LATEST and have no
>> consistency guarantees.". I would assume this means that it does not
>> guarantee consistency if new data is inserted but should give valid (and
>> same) results if no new data is inserted?
>>
>> I have not tried the ksck tool yet. Thank you for the reminder. I will have
>> a look.
>>
>> Br,
>> Petter
>>
>>
>> 2017-12-06 1:31 GMT+01:00 Andrew Wong <awong@cloudera.com>:
>>
>>>> How did you verify that all the data was inserted and how did you find
>>>> some data missing? I'm wondering if it's possible that the initial
>>>> "missing" data was data that Kudu was still in the process of inserting
>>>> (albeit slowly, due to memory backpressure or somesuch).
>>>>
>>>
>>> Following up on what I posted, take a look at
>>> https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans.
>>> It seems definitely possible that not all of the
>>> rows had finished inserting when counting, or that the scans were sent to a
>>> stale replica.
>>>
>>> On Tue, Dec 5, 2017 at 4:18 PM, Andrew Wong <awong@cloudera.com> wrote:
>>>
>>>> Hi Petter,
>>>>
>>>>> When we verified that all data was inserted we found that some data was
>>>>> missing. We added this missing data and on some chunks we got the
>>>>> information that all rows were already present, i.e., Impala says something
>>>>> like Modified: 0 rows, nnnnnnn errors. Doing the verification again now
>>>>> shows that the Kudu table is complete. So, even though we did not insert
>>>>> any data on some chunks, a count(*) operation over these chunks now returns
>>>>> a different value.
>>>>
>>>>
>>>> How did you verify that all the data was inserted and how did you find
>>>> some data missing? I'm wondering if it's possible that the initial
>>>> "missing" data was data that Kudu was still in the process of inserting
>>>> (albeit slowly, due to memory backpressure or somesuch).
>>>>
>>>>> Now to my question. Will data be inconsistent if we recycle Kudu after
>>>>> seeing soft memory limit warnings?
>>>>
>>>>
>>>> Your data should be consistently written, even with those warnings.
>>>> AFAIK they would cause a bit of slowness, not incorrect results.
>>>>
>>>>> Is there a way to tell when it is safe to restart Kudu to avoid these
>>>>> issues? Should we use any special procedure when restarting (e.g. only
>>>>> restart the tablet servers, only restart one tablet server at a time
>>>>> or something like that)?
>>>>
>>>>
>>>> In general, you can use the `ksck` tool to check the health of your
>>>> cluster. See
>>>> https://kudu.apache.org/docs/command_line_tools_reference.html#cluster-ksck
>>>> for more details. For restarting a cluster, I
>>>> would recommend taking down all tablet servers at once, otherwise tablet
>>>> replicas may try to replicate data from the server that was taken down.
>>>>
>>>> Hope this helped,
>>>> Andrew
>>>>
>>>> On Tue, Dec 5, 2017 at 10:42 AM, Petter von Dolwitz (Hem) <
>>>> petter.von.dolwitz@gmail.com> wrote:
>>>>
>>>>> Hi Kudu users,
>>>>>
>>>>> We just started to use Kudu (1.4.0+cdh5.12.1). To make a baseline for
>>>>> evaluation we ingested 3 months' worth of data. During ingestion we were
>>>>> facing messages from the maintenance threads that a soft memory limit
>>>>> was reached. It seems like the background maintenance threads stopped
>>>>> performing their tasks at this point in time. It also seems like the
>>>>> memory was never recovered even after stopping ingestion so I guess there
>>>>> was a large backlog being built up. I guess the root cause here is that
>>>>> we were a bit too conservative when giving Kudu memory. After a restart
>>>>> a lot of maintenance tasks were started (i.e. compaction).
>>>>>
>>>>> When we verified that all data was inserted we found that some data
>>>>> was missing. We added this missing data and on some chunks we got the
>>>>> information that all rows were already present, i.e., Impala says something
>>>>> like Modified: 0 rows, nnnnnnn errors. Doing the verification again now
>>>>> shows that the Kudu table is complete. So, even though we did not insert
>>>>> any data on some chunks, a count(*) operation over these chunks now returns
>>>>> a different value.
>>>>>
>>>>> Now to my question. Will data be inconsistent if we recycle Kudu after
>>>>> seeing soft memory limit warnings?
>>>>>
>>>>> Is there a way to tell when it is safe to restart Kudu to avoid these
>>>>> issues? Should we use any special procedure when restarting (e.g. only
>>>>> restart the tablet servers, only restart one tablet server at a time
>>>>> or something like that)?
>>>>>
>>>>> The table design uses 50 tablets per day (times 90 days). It is 8 TB
>>>>> of data after 3x replication over 5 tablet servers.
>>>>>
>>>>> Thanks,
>>>>> Petter
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Andrew Wong
>>>>
>>>
>>>
>>>
>>> --
>>> Andrew Wong
>>>
>>
>>
>
