phoenix-dev mailing list archives

From "Hudson (JIRA)" <>
Subject [jira] [Commented] (PHOENIX-4704) Presplit index tables when building asynchronously
Date Wed, 23 May 2018 19:37:01 GMT


Hudson commented on PHOENIX-4704:

FAILURE: Integrated in Jenkins build PreCommit-PHOENIX-Build #1885 (See [])
PHOENIX-4704 Presplit index tables when building asynchronously (vincentpoon: rev 6ab9b372f16f37b11e657b6803c6a60007815824)
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/
* (edit) phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/

> Presplit index tables when building asynchronously
> --------------------------------------------------
>                 Key: PHOENIX-4704
>                 URL:
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Vincent Poon
>            Assignee: Vincent Poon
>            Priority: Major
>             Fix For: 4.14.0, 5.0.0
>         Attachments: PHOENIX-4704.master.v1.patch, PHOENIX-4704.master.v2.patch
> For large data tables with many regions, if we build the index asynchronously using the
IndexTool, the index table will initially face a hotspot as all data region mappers attempt
to write to the sole new index region.  This can potentially lead to the index getting disabled
if writes to the index table time out during this hotspotting.
> We can add an optional step (or perhaps activate it based on the count of regions in
the data table) to the IndexTool to first run an MR job to gather stats on the indexed column
values, and then attempt to presplit the index table before we do the actual index build MR
job.
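
The presplit step described above can be illustrated with a minimal sketch. It is not taken from the attached patches: the class name SplitPointSketch and the method computeSplitPoints are hypothetical, plain strings stand in for byte[] index row keys, and the sample is assumed to have been collected already by the stats-gathering job. It only shows one plausible way to turn sampled indexed values into split keys, by picking evenly spaced quantiles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: not the IndexTool implementation.
public class SplitPointSketch {

    // Hypothetical helper: returns up to (numRegions - 1) split keys taken at
    // evenly spaced quantiles of the sampled index row keys.
    // Assumption: String ordering stands in for HBase's byte-wise key ordering.
    static List<String> computeSplitPoints(List<String> sampledKeys, int numRegions) {
        List<String> splits = new ArrayList<>();
        if (numRegions < 2 || sampledKeys.isEmpty()) {
            return splits; // a single region needs no split keys
        }
        List<String> sorted = new ArrayList<>(sampledKeys);
        Collections.sort(sorted);
        for (int i = 1; i < numRegions; i++) {
            // Index of the i-th quantile boundary in the sorted sample
            int idx = (int) ((long) i * sorted.size() / numRegions);
            String candidate = sorted.get(idx);
            // Keep split keys strictly increasing by skipping repeated values
            if (splits.isEmpty() || splits.get(splits.size() - 1).compareTo(candidate) < 0) {
                splits.add(candidate);
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        // Assumed sample of indexed column values from the stats job
        List<String> sample = Arrays.asList("m", "a", "t", "d", "x", "g", "p", "j");
        // Four target regions need three split keys
        System.out.println(computeSplitPoints(sample, 4)); // prints [g, m, t]
    }
}
```

In this sketch the resulting keys would be supplied as split points when the index table is created, so that the data region mappers spread their writes over several index regions from the start. Skewed samples with many repeated values yield fewer split keys than requested, which is why the result is "up to" numRegions - 1 keys.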

This message was sent by Atlassian JIRA
