hive-user mailing list archives

From Rajat Jain <>
Subject Question regarding HIVE-6888
Date Sat, 22 Aug 2015 06:30:28 GMT
This is regarding the fix that was incorporated in HIVE-6888
<> (commit

The fix was issued because MapWork objects were being leaked when there
were multiple AMs. However, there are cases where this fix clears gWorkMap
prematurely, after which it is populated (and cleared) again. One example is
when HiveInputFormat.getSplits() is called from HiveSplitGenerator.initialize().

Here, gWorkMap is cleared when getSplits() is called, and populated again
when splitGrouper.generateGroupedSplits() is called. gWorkMap is finally
cleared in the 'finally' block of HiveSplitGenerator.initialize().

In our codebase, we make some modifications to MapWork in the getSplits()
call, and those changes are lost when clearMapWork() is called inside
HiveInputFormat.getSplits(). Is this call really required?
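
To make the sequence concrete, here is a minimal stand-alone sketch of the behaviour described above. It is not Hive code: WorkCacheSketch, Work, cache, getWork() and clearWork() are hypothetical stand-ins for HiveSplitGenerator/HiveInputFormat, MapWork, gWorkMap, the cached plan lookup and clearMapWork(). It also assumes that a cache miss rebuilds the work object from the stored plan, so any in-memory change is dropped.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the caching behaviour; none of these are Hive's real classes.
class WorkCacheSketch {
    // Stand-in for MapWork: one mutable field is enough to show the effect.
    static final class Work {
        String note = "original";
    }

    // Stand-in for gWorkMap: plan path -> cached work object.
    static final Map<String, Work> cache = new HashMap<>();

    // Returns the cached instance, or rebuilds a fresh one on a cache miss
    // (assumption: a rebuild does not see earlier in-memory modifications).
    static Work getWork(String planPath) {
        return cache.computeIfAbsent(planPath, p -> new Work());
    }

    // Stand-in for clearMapWork().
    static void clearWork(String planPath) {
        cache.remove(planPath);
    }

    // Stand-in for getSplits(): modifies the work, then optionally clears the cache.
    static void getSplits(String planPath, boolean clearAtEnd) {
        Work work = getWork(planPath);
        work.note = "modified in getSplits";
        if (clearAtEnd) {
            clearWork(planPath);
        }
    }

    // Stand-in for initialize(): calls getSplits(), then a later step looks the
    // work up again, and the cache is cleared in the finally block either way.
    static String initialize(String planPath, boolean clearInGetSplits) {
        try {
            getSplits(planPath, clearInGetSplits);
            return getWork(planPath).note; // what the later (grouping) step sees
        } finally {
            clearWork(planPath);
        }
    }

    public static void main(String[] args) {
        System.out.println(initialize("plan-1", true));  // prints "original"
        System.out.println(initialize("plan-1", false)); // prints "modified in getSplits"
    }
}
```

In this model the cache ends up empty in both cases because of the finally block, so the early clear inside getSplits() only changes whether the later step sees the modification.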
