phoenix-dev mailing list archives

From "Geoffrey Jacoby (Jira)" <j...@apache.org>
Subject [jira] [Resolved] (PHOENIX-5604) Index rebuilds and read repairs should not skip WAL
Date Thu, 05 Dec 2019 21:55:00 GMT

     [ https://issues.apache.org/jira/browse/PHOENIX-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Jacoby resolved PHOENIX-5604.
--------------------------------------
    Resolution: Not A Problem

As [~kozdemir] suggested, the skip-WAL flag in the IndexRebuildRegionScanner doesn't appear
to get applied to the actual index updates, since the mutations in the IndexRebuildRegionScanner
are "virtual" mutations of the data table that get ignored by the IndexRegionObserver and
just trigger rebuilds.

Since these mutations aren't "real", skipping the WAL is appropriate (though I'm not sure it
actually "does" anything, since the commit never gets written to HBase).

> Index rebuilds and read repairs should not skip WAL
> ---------------------------------------------------
>
>                 Key: PHOENIX-5604
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5604
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Geoffrey Jacoby
>            Assignee: Geoffrey Jacoby
>            Priority: Major
>         Attachments: PHOENIX-5604-4.x-HBase-1.5.patch
>
>
> Currently both Index read repairs and IndexTool build/rebuilds in the new design continue
> to skip the WAL, following the same pattern the old Indexer used. However, there are key differences
> between the old and new logic that make this no longer the correct choice.
> First, recall that all HBase replication is based on tailing the WAL, and that any transaction
> that skips the WAL doesn't get replicated.
> In the old logic, the data table write (and WAL append) would be accompanied by an IndexedKeyValue
> which would contain enough information to reconstitute the index edit in the event of a failure
> before the index edit could be committed. So skipping the WAL during recovery was _potentially_ OK,
> because writing to the WAL would be redundant locally. (But that still seems to me wrong in
> a case with replication, since I don't believe IndexedKeyValues are replicated, since they
> use the "magic" METAFAMILY cf.)
> In the new logic, on a normal write, we write to the index first (which will go into
> a WAL), then the data table (into a potentially different RS's WAL), and lastly the verified
> flag flip into the Index, into the original index write's WAL. If something goes wrong with
> stage 2 or 3, read repair will fix it, but if the repair action – whether a put or delete
> – doesn't go into the WAL, a DR buddy of the index will be out of sync.
> This is even more important on an async initial build of an index, where if I understand
> right, there is no WAL append for the index write at all in the current UngroupedAggregateRegionObserver
> rebuild logic. The same would be the case of a rebuild of a new-style index in the event of
> non-Phoenix related corruption (such as HDFS or raw HBase level).
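
The replication concern in the description above can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyCluster, replicate and peerInSyncAfterRepair are hypothetical names invented for illustration, not HBase or Phoenix APIs, and "replication" is reduced to replaying the source's WAL onto a peer. It shows only the general point that an edit which skips the WAL never reaches a WAL-tailing replica; it does not model the "virtual" mutations discussed in the resolution.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these classes are HBase or Phoenix APIs.
final class WalReplicationSketch {

    /** A toy cluster: a row store plus a write-ahead log of edits. */
    static final class ToyCluster {
        final Map<String, String> rows = new HashMap<>();
        final List<String[]> wal = new ArrayList<>();

        /** Applies the edit locally; appends it to the WAL unless skipWal is set. */
        void put(String row, String value, boolean skipWal) {
            rows.put(row, value);
            if (!skipWal) {
                wal.add(new String[] {row, value});
            }
        }
    }

    /** Replication in this model simply replays the source WAL onto the peer. */
    static void replicate(ToyCluster source, ToyCluster peer) {
        for (String[] edit : source.wal) {
            peer.rows.put(edit[0], edit[1]);
        }
    }

    /** Returns true if the peer matches the source after a read repair. */
    static boolean peerInSyncAfterRepair(boolean repairSkipsWal) {
        ToyCluster source = new ToyCluster();
        ToyCluster peer = new ToyCluster();
        // Stage 1: the index row is written unverified, through the WAL.
        source.put("idx-row", "unverified", false);
        // Assume the verified-flag flip was lost; read repair rewrites the row.
        source.put("idx-row", "verified", repairSkipsWal);
        replicate(source, peer);
        return source.rows.equals(peer.rows);
    }

    public static void main(String[] args) {
        System.out.println("repair via WAL, peer in sync: " + peerInSyncAfterRepair(false));
        System.out.println("repair skips WAL, peer in sync: " + peerInSyncAfterRepair(true));
    }
}
```

In this model the peer stays in sync only when the repair goes through the WAL; with the skip flag set, the peer keeps the stale "unverified" row.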



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
