phoenix-dev mailing list archives

From "Keren Gu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PHOENIX-2209) Building Local Index Asynchronously via IndexTool fails to populate index table
Date Wed, 26 Aug 2015 17:57:49 GMT

     [ https://issues.apache.org/jira/browse/PHOENIX-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keren Gu updated PHOENIX-2209:
------------------------------
    Summary: Building Local Index Asynchronously via IndexTool fails to populate index table
 (was: Building Local Index Asynchronously via IndexTool fails to create index)

> Building Local Index Asynchronously via IndexTool fails to populate index table
> -------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2209
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2209
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.5.0
>         Environment: CDH: 5.4.4
> HBase: 1.0.0
> Phoenix: 4.5.0 (https://github.com/SiftScience/phoenix/tree/4.5-HBase-1.0) with hacks for CDH compatibility.
>            Reporter: Keren Gu
>              Labels: IndexTool, LocalIndex, index
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> The asynchronous index population tool (IndexTool) was used to create a local index (on one column) on each of four tables, each with 10 columns, and with 65M, 250M, 340M, and 1.3B rows respectively.
> The table schema is as follows (with generic column names):
> {quote}
> CREATE TABLE PH_SOJU_SHORT (
> id INT PRIMARY KEY,
> c2 VARCHAR NULL,
> c3 VARCHAR NULL,
> c4 VARCHAR NULL,
> c5 VARCHAR NULL,
> c6 VARCHAR NULL,
> c7 DOUBLE NULL,
> c8 VARCHAR NULL,
> c9 VARCHAR NULL,
> c10 BIGINT NULL
> )
> {quote}
> Example command used (for 65M row table): 
> {quote}
> 0: jdbc:phoenix:localhost> create index LC_INDEX_SOJU_EVAL_FN on PH_SOJU_SHORT(C4) async;
> {quote}
> The MR job was then started with the command:
> {quote}
> $ hbase org.apache.phoenix.mapreduce.index.IndexTool --data-table PH_SOJU_SHORT --index-table LC_INDEX_SOJU_EVAL_FN --output-path LC_INDEX_SOJU_EVAL_FN_HFILE
> {quote}
> The IndexTool MR jobs finished in 18min, 77min, 77min, and 2hr 34min respectively, but all index tables were empty.
> For the table with 65M rows, IndexTool had 12 mappers and reducers. MR counters show Map input and output records = 65M and Reduce input and output records = 65M. PhoenixJobCounters input and output records are all 65M.
> IndexTool Reducer Log tail: 
> {quote}
> ...
> 2015-08-25 00:26:44,687 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 32 segments left of total size: 22805636866 bytes
> 2015-08-25 00:26:44,693 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1
> 2015-08-25 00:26:44,765 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
> 2015-08-25 00:26:44,908 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
> 2015-08-25 00:26:45,060 INFO [main] org.apache.hadoop.hbase.io.hfile.CacheConfig: CacheConfig:disabled
> 2015-08-25 00:36:43,880 INFO [main] org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2: Writer=hdfs://nameservice/user/ubuntu/LC_INDEX_SOJU_EVAL_FN/_LOCAL_IDX_PH_SOJU_EVAL/_temporary/1/_temporary/attempt_1440094483400_5974_r_000000_0/0/496b926ad624438fa08626ac213d0f92, wrote=10737418236
> 2015-08-25 00:36:45,967 INFO [main] org.apache.hadoop.hbase.io.hfile.CacheConfig: CacheConfig:disabled
> 2015-08-25 00:38:43,095 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1440094483400_5974_r_000000_0 is done. And is in the process of committing
> 2015-08-25 00:38:43,123 INFO [main] org.apache.hadoop.mapred.Task: Task attempt_1440094483400_5974_r_000000_0 is allowed to commit now
> 2015-08-25 00:38:43,132 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_1440094483400_5974_r_000000_0' to hdfs://nameservice/user/ubuntu/LC_INDEX_SOJU_EVAL_FN/_LOCAL_IDX_PH_SOJU_EVAL/_temporary/1/task_1440094483400_5974_r_000000
> 2015-08-25 00:38:43,158 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1440094483400_5974_r_000000_0' done.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
