hive-dev mailing list archives

From Alexis De La Cruz Toledo
Subject Why does a GroupBy operator need two MapReduce jobs?
Date Thu, 12 Apr 2012 23:56:30 GMT
Hi! I have a question: why is a GroupBy operator executed
as two MapReduce jobs?
1. In the first job, the aggregation functions (sum(), count(), avg(), max(), etc.) are
only partially computed.
2. Then, in a second MapReduce job, the aggregation is finalized.
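
To make the two steps in the list above concrete, here is a minimal sketch in plain Python, not Hive code. The names partial_aggregate and final_aggregate and the sample rows are hypothetical and used only for illustration; the sketch shows the general partial/final aggregation pattern and makes no claim about how Hive implements it internally.

```python
from collections import defaultdict


def partial_aggregate(rows):
    """Step 1: build a partial (sum, count) state per key for one input split."""
    state = defaultdict(lambda: [0, 0])
    for key, value in rows:
        state[key][0] += value  # running sum
        state[key][1] += 1      # running count
    return {key: tuple(acc) for key, acc in state.items()}


def final_aggregate(partials):
    """Step 2: merge the partial states and finish sum, count, and avg."""
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (part_sum, part_count) in partial.items():
            merged[key][0] += part_sum
            merged[key][1] += part_count
    return {
        key: {"sum": total, "count": count, "avg": total / count}
        for key, (total, count) in merged.items()
    }


# Hypothetical input: two splits of (group key, value) rows.
split_a = [("x", 1), ("y", 10), ("x", 3)]
split_b = [("x", 5), ("y", 20)]

result = final_aggregate([partial_aggregate(split_a), partial_aggregate(split_b)])
print(result)
```

Note that avg() cannot be merged from partial averages alone, which is why the partial state here carries both the sum and the count.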


Ing. Alexis de la Cruz Toledo.
Av. Instituto Politécnico Nacional No. 2508 Col. San Pedro Zacatenco. México,
D.F, 07360
