phoenix-dev mailing list archives

From William <>
Subject Provide an option to infinite retry when updating index failed
Date Wed, 09 Aug 2017 03:09:56 GMT
Hi all,
To maintain consistency between a data table and its index tables, we have to do a transactional
update across regions on different region servers. For a non-transactional table, we cannot guarantee
this consistency for a mutable global secondary index. Here are the problems with the existing solutions:
1. Disable index writes
  a) Updating system.catalog to change the index status and set the timestamp may lead to chain failures
  b) A partial index rebuild may not be a good solution for a production environment, because:
     b1) it may run for a long time on a large table (several TBs)
     b2) there might be only a few inconsistent rows that need to catch up, but we have
to do a full-table, time-ranged scan over the data table
     b3) if there are deletes/updates and a major compaction has taken place, it will leave dirty
data in the index tables
  c) Selects that hit the disabled index degenerate to full table scans against the data
table, which may quickly exhaust the read capacity of the whole cluster

2. Disable data table writes
  a) Selects that hit the index still work
  b) Data table writes are not actually disabled; an exception is raised instead. So we still need to
rebuild the index tables when the index regions are back online, which has the same issues as 1.b
  c) As an index rebuild is needed, system.catalog still needs to be updated, so chain failures
may still happen.

What should be guaranteed:
1. absolutely no chain failures
2. absolutely no inconsistency, no matter what happens
3. selects that hit the index will not degenerate

New solution:
1. When an index update fails, retry forever until it succeeds
2. Do the same retry when replaying the WAL
3. No need to update the catalog table, which avoids potential chain failures
4. This index failure policy is an option that can be switched on/off
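
To make the four points above concrete, here is a minimal sketch of such a retry loop. It is
not Phoenix code: IndexRetrySketch, IndexWrite, writeIndex, the retryForever flag, and the backoff
parameters are all hypothetical names chosen for illustration, and the capped exponential backoff
is an assumption added here so that a long outage does not turn into a busy loop.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; none of these names come from Phoenix or HBase. */
final class IndexRetrySketch {

    /** Stand-in for "send one batch of index updates to the index regions". */
    interface IndexWrite {
        void apply() throws Exception;
    }

    /**
     * Applies the index write and returns the number of attempts it took.
     *
     * With retryForever == true, every failure is retried after a capped
     * exponential backoff, so the calling thread blocks until the write
     * succeeds (or the thread is interrupted).
     * With retryForever == false, the first failure is rethrown so the
     * caller can fall back to some other failure policy.
     */
    static int writeIndex(IndexWrite write, boolean retryForever,
                          long initialBackoffMs, long maxBackoffMs) throws Exception {
        long backoffMs = initialBackoffMs;
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                write.apply();
                return attempts;
            } catch (Exception e) {
                // Never swallow an interrupt, and honor the on/off switch.
                if (!retryForever || e instanceof InterruptedException) {
                    throw e;
                }
                // sleep() may itself throw InterruptedException, which ends the loop.
                TimeUnit.MILLISECONDS.sleep(backoffMs);
                backoffMs = Math.min(backoffMs * 2, maxBackoffMs);
            }
        }
    }
}
```

In this sketch the same writeIndex call would be used on both the normal write path and the WAL
replay path, and nothing in it touches the catalog table. The blocking while loop is the price
of the approach: the caller cannot make progress on writes until the index is reachable again.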

About this solution:
1. Simple
2. When an index update fails, we give up write availability to preserve consistency and read
availability. This is acceptable for a mutable global index, as its read ability is more important.
3. No need to rebuild the index afterwards; once the pending retries complete, the indexes will
be in sync.
4. In the worst case, some or all of the RSs will not be able to accept writes.
5. We cannot handle index update failures elegantly because we are not doing real transactions.
So this solution is a simple but effective way to achieve consistency without transactions,
though it comes at a price.

What does everybody think?
