phoenix-dev mailing list archives

From "Sergey Soldatov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PHOENIX-4537) RegionServer initiating compaction can trigger schema migration and deadlock the system
Date Thu, 18 Jan 2018 22:50:00 GMT

     [ https://issues.apache.org/jira/browse/PHOENIX-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Soldatov updated PHOENIX-4537:
-------------------------------------
    Description: 
[~sergey.soldatov] has been doing some great digging around a test failure we've been seeing
at $dayjob. The situation goes like this.

0. Run some arbitrary load
1. Stop HBase
2. Enable schema mapping ({{phoenix.schema.isNamespaceMappingEnabled=true}} and {{phoenix.schema.mapSystemTablesToNamespace=true}}
in hbase-site.xml)
3. Start HBase
4. By chance, have the SYSTEM.CATALOG table require a compaction before a client first
connects

When the RegionServer initiates the compaction, it will end up running {{UngroupedAggregateRegionObserver.clearTsOnDisabledIndexes}}
which opens a Phoenix connection. While the RegionServer won't upgrade system tables, it *will*
try to migrate them into the schema mapped variants (e.g. SYSTEM.CATALOG to SYSTEM:CATALOG).
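
To make the rename concrete, here is a minimal sketch of the name mapping the migration implies. The class and the {{mappedName}} helper are hypothetical and are not Phoenix APIs; the sketch only assumes that, with namespace mapping enabled, the schema separator changes from '.' to ':'.

```java
public final class NamespaceMappingSketch {

    /**
     * Hypothetical helper, not a Phoenix API: returns the namespace-mapped
     * form of a table name by swapping the first '.' for ':'. Names without
     * a schema part, or any name when mapping is off, are returned unchanged.
     */
    static String mappedName(String fullName, boolean namespaceMappingEnabled) {
        int dot = fullName.indexOf('.');
        if (!namespaceMappingEnabled || dot < 0) {
            return fullName;
        }
        return fullName.substring(0, dot) + ":" + fullName.substring(dot + 1);
    }

    public static void main(String[] args) {
        System.out.println(mappedName("SYSTEM.CATALOG", true));
        System.out.println(mappedName("SYSTEM.CATALOG", false));
    }
}
```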

However, one of the first steps in the schema migration is to disable the SYSTEM.CATALOG table.
The table can't be disabled until its region is CLOSED, and the region cannot be CLOSED until
the compaction is finished. The compaction, in turn, is waiting on the migration it triggered.
*deadlock*
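
The circular wait can be reproduced with a toy model. This is only a sketch under stated assumptions and is not HBase or Phoenix code: two latches stand in for "compaction finished" and "region CLOSED", and the class, method, and parameter names are all hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public final class CompactionDeadlockSketch {

    /**
     * Returns true if the simulated compaction finishes within the timeout,
     * false if it is still stuck (or the caller was interrupted).
     */
    static boolean compactionCompletes(boolean migrateFromCompaction, long timeoutMillis) {
        CountDownLatch compactionDone = new CountDownLatch(1);
        CountDownLatch regionClosed = new CountDownLatch(1);

        // Stand-in for the region close: it waits for the compaction to finish.
        Thread closer = new Thread(() -> {
            try {
                compactionDone.await();
                regionClosed.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Stand-in for the compaction: if it triggers the migration, the
        // "disable table" step waits for the region to be CLOSED first.
        Thread compaction = new Thread(() -> {
            try {
                if (migrateFromCompaction) {
                    regionClosed.await();
                }
                compactionDone.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        closer.setDaemon(true);
        compaction.setDaemon(true);
        closer.start();
        compaction.start();
        try {
            return compactionDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            // Release any thread still parked in the circular wait.
            closer.interrupt();
            compaction.interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("with migration:    " + compactionCompletes(true, 200));
        System.out.println("without migration: " + compactionCompletes(false, 2000));
    }
}
```

In this model the compaction only completes when it does not trigger the migration; otherwise each thread waits on a latch that only the other can release.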

The "obvious" fix is to prevent RegionServers from triggering system table migrations, but Sergey
and [~elserj] both think that this will end badly (RegionServers falling over because they
expect the tables to be migrated and they aren't).

Thoughts? [~ankit.singhal], [~jamestaylor], any others?

  was:
[~sergey.soldatov] has been doing some great digging around a test failure we've been seeing
at $dayjob. The situation goes like this.

0. Run some arbitrary load
1. Stop HBase
2. Enable schema mapping ({{phoenix.schema.isNamespaceMappingEnabled=true}} and {{phoenix.schema.mapSystemTablesToNamespace=true}}
in hbase-site.xml)
3. Start HBase
4. By chance, have the SYSTEM.CATALOG table require a compaction before a client first
connects

When the RegionServer initiates the compaction, it will end up running {{UngroupedAggregateRegionObserver.clearTsOnDisabledIndexes}}
which opens a Phoenix connection. While the RegionServer won't upgrade system tables, it *will*
try to migrate them into the schema mapped variants (e.g. SYSTEM.CATALOG to SYSTEM:CATALOG).

However, one of the first steps in the schema migration is to disable the SYSTEM.CATALOG table.
The table can't be disabled until its region is CLOSED, and the region cannot be CLOSED until
the compaction is finished. The compaction, in turn, is waiting on the migration it triggered.
*deadlock*

The "obvious" fix is to prevent RegionServers from triggering system table migrations, but Sergey
and I both think that this will end badly (RegionServers falling over because they expect
the tables to be migrated and they aren't).

Thoughts? [~ankit.singhal], [~jamestaylor], any others?


> RegionServer initiating compaction can trigger schema migration and deadlock the system
> ---------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4537
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4537
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Romil Choksi
>            Priority: Critical
>             Fix For: 5.0.0, 4.14.0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
