hadoop-common-dev mailing list archives

From Hong Tang <ht...@yahoo-inc.com>
Subject [VOTE] port HADOOP-6218 (Split TFile by Record Sequence Number) to hadoop 0.20/0.21
Date Mon, 12 Oct 2009 22:55:29 GMT
HADOOP-6218 exposed the internal "Location" object as a global Record
Sequence Number (RecNum). The feature is useful in several ways:
(1) it supports progress reporting for upper layers (object file, zebra);
(2) a secondary index can use RecNum as a cursor; (3) it supports aligned
splits across multiple parallel TFiles. Given that TFile is still at
an early stage of adoption, I suggest that we port the patch
back to hadoop 0.20/0.21 now.
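
To illustrate point (3), here is a minimal sketch of how aligned splits could be computed from record numbers alone. It assumes the parallel TFiles hold the same number of records in the same order, so one set of RecNum ranges applies to every file. The class and method names (RecNumSplits, split) are hypothetical and are not part of the TFile API; the sketch does not call Hadoop at all.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: divides a record-number space
// into contiguous ranges that can be reused across parallel files.
final class RecNumSplits {

    /**
     * Returns numSplits half-open ranges [begin, end) that together cover
     * record numbers 0 .. totalRecords-1. Sizes differ by at most one.
     */
    static List<long[]> split(long totalRecords, int numSplits) {
        if (totalRecords < 0 || numSplits <= 0) {
            throw new IllegalArgumentException("invalid split request");
        }
        List<long[]> ranges = new ArrayList<>();
        long base = totalRecords / numSplits;   // minimum records per split
        long extra = totalRecords % numSplits;  // first 'extra' splits get one more
        long begin = 0;
        for (int i = 0; i < numSplits; i++) {
            long end = begin + base + (i < extra ? 1 : 0);
            ranges.add(new long[] {begin, end});
            begin = end;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Assumed example: 10 records shared by every parallel file, 3 splits.
        for (long[] r : split(10, 3)) {
            System.out.println("[" + r[0] + ", " + r[1] + ")");
        }
    }
}
```

Each range would then be handed to a record-number-based scanner on every parallel file, so that split i reads the same rows from each file.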

