hive-user mailing list archives

From Alexander Goryunov <a.goryu...@gmail.com>
Subject Distributed execution for UNION ALL
Date Fri, 04 May 2012 12:52:52 GMT
Hello,

I have a query like

SELECT * FROM (
  SELECT 1, concat(1_timestamp, ', ', 2_account_id )
  FROM table_1 WHERE 2_account_id = 1132576 LIMIT 1000000000
  UNION ALL
  SELECT 2, concat(1_timestamp, ', ', 2_account_id )
  FROM table_2 WHERE 2_account_id = 1132576 LIMIT 1000000000
  UNION ALL
  SELECT 3, concat(1_timestamp, ', ', 2_account_id )
  FROM table_3 WHERE 2_account_id = 1132576 LIMIT 1000000000
  UNION ALL
  .... // some hundred tables here
) res;

Parallel job execution is set to true in the Hive config, and it creates the
maximum number of MapReduce map tasks on the requested node.
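
For reference, a minimal sketch of the kind of session settings meant here, assuming the property in question is hive.exec.parallel. The thread-number line and its value are illustrative assumptions only, not something taken from the actual config:

```sql
-- Assumption: "parallel jobs" refers to this Hive property.
SET hive.exec.parallel=true;
-- Illustrative only: caps how many independent jobs Hive launches at once.
SET hive.exec.parallel.thread.number=8;
```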

What should be done to distribute these jobs across all the cluster nodes?

Thanks.
