phoenix-dev mailing list archives

From "Geoffrey Jacoby (JIRA)" <j...@apache.org>
Subject [jira] [Created] (PHOENIX-5027) PhoenixIndexImportDirectMapper retried mappers can succeed without inserting all index data
Date Fri, 16 Nov 2018 21:16:00 GMT
Geoffrey Jacoby created PHOENIX-5027:
----------------------------------------

             Summary: PhoenixIndexImportDirectMapper retried mappers can succeed without inserting all index data
                 Key: PHOENIX-5027
                 URL: https://issues.apache.org/jira/browse/PHOENIX-5027
             Project: Phoenix
          Issue Type: Bug
            Reporter: Geoffrey Jacoby


On two recent occasions I've rebuilt a large global immutable index by doing a DROP/CREATE
and ended up with missing index data, though it doesn't happen every time. Here's what happened:

1. PhoenixMRJobSubmitter correctly detects that the index rebuild is necessary and invokes IndexTool.
2. IndexTool enqueues a MapReduce job using PhoenixIndexImportDirectMapper.
3. Some mappers fail because of timeouts due to heavy splitting on the new index table.
4. Those mappers are retried and succeed. The MR job as a whole completes successfully.
5. RowCounter and IndexScrutinyTool show that millions of rows are missing from the index, with
keys that imply they were part of the failed mappers.

Aside from the timestamp glitch I pointed out in PHOENIX-5018, the code in PhoenixIndexImportDirectMapper
_looks_ idempotent on a rerun, so I've been struggling to find the cause of the missing index
data. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
