hive-user mailing list archives

From "Subramanian, Sanjay (HQP)" <sanjay.subraman...@roberthalf.com>
Subject Fast Data transfer to S3 for Hive processing
Date Fri, 18 Apr 2014 17:41:51 GMT
Hey guys

Slow data transfer to S3 has been a pain point for me ever since I moved from running Hadoop
clusters in-house behind the company firewall to Amazon EMR.

I recently ran some tests with Expedat S3, and it seems pretty good for large files compared
with the S3CMD tool.

Hope this helps folks facing similar issues.

Thanks
Warm Regards

Sanjay

GITHUB:https://github.com/sanjaysubramanian/big_data_latte
linkedin:http://www.linkedin.com/in/subramaniansanjay

Test results
============

11G / 151863 files
==================
EXPEDAT S3  51332 seconds
S3CMD tool  58507 seconds

11G / 1 file
============
EXPEDAT S3    558 seconds
S3CMD tool   3063 seconds
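
To put the timings above in perspective, here is a minimal Python sketch that turns them into approximate throughput and a speedup ratio. The seconds are copied from the results; reading "11G" as 11 GiB is an assumption (the post does not say which unit was meant), and the helper names are purely illustrative.

```python
# Timings copied from the test results above (seconds).
# Assumption: "11G" means 11 GiB, i.e. 11 * 1024 MiB; the post does not specify.
SIZE_MIB = 11 * 1024

results = {
    "11G / 151863 files": {"EXPEDAT S3": 51332, "S3CMD tool": 58507},
    "11G / 1 file": {"EXPEDAT S3": 558, "S3CMD tool": 3063},
}


def throughput_mib_s(seconds):
    """Average transfer rate in MiB/s for the whole data set."""
    return SIZE_MIB / seconds


def speedup(case):
    """How many times faster Expedat S3 was than S3CMD for one test case."""
    return case["S3CMD tool"] / case["EXPEDAT S3"]


for name, case in results.items():
    print(name)
    for tool, secs in case.items():
        print(f"  {tool}: {throughput_mib_s(secs):.2f} MiB/s")
    print(f"  Expedat S3 vs S3CMD: {speedup(case):.2f}x")
```

Under that assumption, the single large file works out to roughly 5.5x faster with Expedat S3, while the many-small-files case is only about 1.14x faster, which matches the observation that the gain is mainly for large files.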


