phoenix-dev mailing list archives

From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PHOENIX-4242) Fix Indexer post-compact hook logging of NPE and TableNotFound
Date Fri, 13 Oct 2017 05:57:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203061#comment-16203061 ]

James Taylor edited comment on PHOENIX-4242 at 10/13/17 5:56 AM:
-----------------------------------------------------------------

Here's some initial feedback, but would you mind reviewing this too, [~tdsilva]?
- You'll want to select COLUMN_NAME in addition to COLUMN_FAMILY, as it looks like we store
the tenant ID there. Then we'll need to create a new PhoenixConnection for the {{PhoenixRuntime.getTableNoCache()}}
call whenever the value of COLUMN_NAME changes (creating a connection is not an expensive
operation). You'd set the PhoenixRuntime.TenantId connection property so that you'd only be
looking for the tables for that tenant. You can collect as many different TABLE_SCHEM,
TABLE_NAME pairs as you like for the same TENANT_ID and query for them in a single shot
(i.e. WHERE (%s, %s, %s, %s) IN ((?,?,?,?), (?,?,?,?), ...)).
{code}
+    private static final String CHILD_LINK_QUERY =
+            String.format(
+                "SELECT " + COLUMN_FAMILY + " FROM " + SYSTEM_CATALOG
+                        + " WHERE (%s, %s, %s, %s) IN ((?,?,?,?))",
+                TENANT_ID, TABLE_SCHEM, TABLE_NAME, LINK_TYPE);
+
+    private static void getAllDescendantViewsRecursive(Connection conn, PName tenantId,
+            PName tableSchem, PName tableName, List<PTable> results) throws SQLException {
+          try (PreparedStatement ps = conn.prepareStatement(CHILD_LINK_QUERY)) {
+              ps.setString(1, tenantId == null ? null : tenantId.getString());
+              ps.setString(2, tableSchem == null ? null : tableSchem.getString());
+              ps.setString(3, tableName.getString());
+              ps.setByte(4, PTable.LinkType.CHILD_TABLE.getSerializedValue());
+              try (ResultSet rs = ps.executeQuery()) {
+                  while (rs.next()) {
+                      String childTableName = rs.getString(1);
+                      PTable childPTable = PhoenixRuntime.getTableNoCache(conn, childTableName);
+                      results.add(childPTable);
+                      getAllDescendantViewsRecursive(conn, childPTable.getTenantId(),
+                          childPTable.getSchemaName(), childPTable.getTableName(), results);
+                  }
+              }
+          }
+    }
{code}
- Create a test that requires the above. A good way would be to have a table with the same
name for different tenants. You should only pick up the correct children.
- If the number of children is large (which can be the case, as we'll have a child linking
row for every tenant), then we'll be churning the server-side LRU cache by loading the PTable
for every child. I think we can instead get only the index linking rows and in turn look up
the INDEX_DISABLED_TIMESTAMP for each index.
- An important one to follow up on is PHOENIX-4263 to see if our partial index rebuilder handles
rebuilding indexes on views.
- [~tdsilva] - did we consider encoding the tenant ID in the COLUMN_FAMILY column so we could
keep the COLUMN_NAME null to ensure linking rows would be grouped together? An alternative
would be to add another nullable column (like CHILD_TENANT_ID) at the end of the SYSTEM.CATALOG
row key. We may want to consider changing this because there can potentially be many, many
child linking rows. Maybe file a JIRA for further discussion?



> Fix Indexer post-compact hook logging of NPE and TableNotFound
> --------------------------------------------------------------
>
>                 Key: PHOENIX-4242
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4242
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0
>            Reporter: Vincent Poon
>            Assignee: Vincent Poon
>         Attachments: PHOENIX-4242.v2.master.patch, PHOENIX-4242.v3.master.patch, PHOENIX-4747.v1.master.patch
>
>
> The post-compact hook in the Indexer seems to log extraneous log messages indicating
> NPE or TableNotFound.  The TableNotFound exceptions seem to indicate actual table names prefixed
> with MERGE or RESTORE, and sometimes suffixed with a digit, so perhaps these are views or
> something similar.
> Examples:
> 2017-09-28 13:35:03,118 WARN  [ctions-1506410238599] index.Indexer - Unable to permanently disable indexes being partially rebuild for SYSTEM.SEQUENCE
> java.lang.NullPointerException
> 2017-09-28 10:20:56,406 WARN  [ctions-1506410238415] index.Indexer - Unable to permanently disable indexes being partially rebuild for MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2
> org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table undefined. tableName=MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
