accumulo-user mailing list archives

From Christopher <ctubb...@apache.org>
Subject Re: Values go to a wrong table during recovery.
Date Wed, 18 Feb 2015 23:43:45 GMT
Sorry, that link should be: https://issues.apache.org/jira/browse/ACCUMULO


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Wed, Feb 18, 2015 at 6:42 PM, Christopher <ctubbsii@apache.org> wrote:

> Hi Denis,
>
> This doesn't sound like a known bug to me. Your hypothesis is reasonable,
> since WALs use a surrogate ID, which maps to table ID/tablet information,
> when read back. It is possible that it incorrectly interprets this mapping
> and replays data into the wrong table. Given the amount of testing we do,
> my instinct is to think this is unlikely, but if we can confirm this bug,
> it would definitely be a very critical one.
>
> To rule out some scenarios, is it possible that your clients are writing
> to the wrong tables? Have you ever seen a failure affecting a table which
> does not exist (like what might happen if there's an off-by-one error in
> the WAL code)? Or affecting the metadata tables?
>
> Can you reproduce this error reliably, or can you share the relevant
> ingest code which can reproduce this failure? Also, what kind of tablet
> server failures are you experiencing when this happens?
>
> If you could file a bug report at
> https://issues.apache.org/browse/ACCUMULO with any details and/or
> attachments to help us address the issue, we would greatly appreciate it.
> This seems like something we'd want to fix pretty quickly.
>
> Thanks!
>
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
> On Wed, Feb 18, 2015 at 6:26 PM, Denis <denis@camfex.cz> wrote:
>
>> Hello.
>>
>> A few times I have noticed that some tables contain values they cannot
>> have, and those entries have timestamps close to a tabletserver failure
>> time. (I mean the wrong format: one table has msgpack values at least 10
>> bytes long and another table has 1-byte values, and after a failure I read
>> one or two 1-byte values in the table where I expect to read msgpack.)
>>
>> I suspect that during the recovery process, when the WAL is being read,
>> some entries are inserted into the wrong table.
>>
>> Maybe it is a known bug, as I am still using Accumulo 1.6.1.
>>
>
>
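
The symptom Denis describes (1-byte values showing up in a table that should only hold msgpack values of at least 10 bytes, with timestamps near a tablet server failure) could be checked for mechanically when gathering details for a bug report. The following is only a minimal sketch under stated assumptions: the Entry class is a hypothetical stand-in for scanned key/value pairs rather than an Accumulo type, the names findSuspects, minValueLength and windowMillis are invented for illustration, and timestamps are assumed to be milliseconds.

```java
import java.util.ArrayList;
import java.util.List;

class MisplacedValueCheck {

    /** Hypothetical stand-in for one scanned entry (not an Accumulo class). */
    static final class Entry {
        final String row;
        final long timestamp; // assumed to be milliseconds
        final byte[] value;

        Entry(String row, long timestamp, byte[] value) {
            this.row = row;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Returns the entries whose value is shorter than minValueLength and whose
     * timestamp lies within windowMillis of failureTime (both conditions must hold).
     */
    static List<Entry> findSuspects(Iterable<Entry> entries, int minValueLength,
                                    long failureTime, long windowMillis) {
        List<Entry> suspects = new ArrayList<>();
        for (Entry e : entries) {
            boolean tooShort = e.value.length < minValueLength;
            boolean nearFailure = Math.abs(e.timestamp - failureTime) <= windowMillis;
            if (tooShort && nearFailure) {
                suspects.add(e);
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        List<Entry> scanned = new ArrayList<>();
        scanned.add(new Entry("r1", 1000L, new byte[12])); // plausible msgpack length
        scanned.add(new Entry("r2", 1005L, new byte[1]));  // short value near the failure
        for (Entry e : findSuspects(scanned, 10, 1000L, 60L)) {
            System.out.println(e.row + " @ " + e.timestamp);
        }
    }
}
```

Against a real cluster, the entries would come from scanning the affected table, and the rows and timestamps of any suspects could be attached to the JIRA issue. A hit only narrows down where to look; it does not by itself distinguish a WAL replay problem from a client writing to the wrong table.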
