hive-user mailing list archives

From Josh Ferguson
Subject Number of Mappers
Date Mon, 12 Jan 2009 04:45:31 GMT
If I'm running a query like this:

hive> SELECT TRANSFORM(actor_id) USING '/my/script' AS (actor_id,  
percentile, count) FROM activities;

It creates one mapper for each file. I need every row in the table to
be run through a single instance of the script, since certain parts of
it require global information about the whole list. Do I need to rework
this query to use a reducer, or can I change some configuration variable
so that all of the data from this table is loaded and run through
/my/script all at once?
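
To make the constraint concrete, here is a minimal sketch of the kind of script described above. It is not the actual /my/script: the function name rank_actors and the percentile definition (share of actors whose row count is at or below this actor's) are assumptions made purely for illustration. It relies only on the fact that TRANSFORM streams rows to the script as tab-separated lines on stdin and reads tab-separated lines back from stdout.

```python
import sys
from bisect import bisect_right
from collections import Counter


def rank_actors(lines):
    """Return (actor_id, percentile, count) for every actor seen in `lines`.

    Hypothetical stand-in for /my/script: the percentile here is the share
    of actors whose row count is <= this actor's count (an assumed
    definition, not taken from the original script).
    """
    # First pass: count rows per actor. This needs the complete input.
    counts = Counter(
        line.rstrip("\n").split("\t")[0] for line in lines if line.strip()
    )
    all_counts = sorted(counts.values())
    total = len(all_counts)

    rows = []
    for actor_id, count in sorted(counts.items()):
        # Number of actors with a count at or below this one.
        at_or_below = bisect_right(all_counts, count)
        rows.append((actor_id, 100.0 * at_or_below / total, count))
    return rows


def main():
    # Emit one tab-separated row per actor: actor_id, percentile, count.
    for actor_id, percentile, count in rank_actors(sys.stdin):
        print(f"{actor_id}\t{percentile:.1f}\t{count}")


if __name__ == "__main__":
    main()
```

Under these assumptions, an instance that only sees one file ranks each actor against the actors in that file alone, so the same actor_id can receive a different percentile and count from each mapper. That is why the whole table has to pass through one instance.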

Josh F.
