hadoop-common-user mailing list archives

From Mathias Herberts <mathias.herbe...@gmail.com>
Subject Re: One petabyte of data loading into HDFS with in 10 min.
Date Wed, 05 Sep 2012 15:12:57 GMT
It greatly depends on the form in which this PB is stored. If we're
talking about N files with N >> 1, then you might get better performance
by sharding the import job across multiple boxes.
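
As a rough illustration of the sharding idea, the sketch below splits a list of input files round-robin across a number of import boxes. The function name `shard_files` and the sample paths are hypothetical and not from the original message; each box would then load its own shard independently (for example with `hadoop fs -put`).

```python
def shard_files(paths, num_workers):
    """Split paths into num_workers lists, assigned round-robin.

    Hypothetical helper for illustration: it balances file counts only
    and ignores file sizes.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    shards = [[] for _ in range(num_workers)]
    # Sort first so the assignment is deterministic across runs.
    for index, path in enumerate(sorted(paths)):
        shards[index % num_workers].append(path)
    return shards


if __name__ == "__main__":
    # Made-up file names, for illustration only.
    files = ["part-%04d.dat" % i for i in range(10)]
    for box, shard in enumerate(shard_files(files, 3)):
        print("box", box, "loads", shard)
```

Round-robin is the simplest choice; if file sizes vary a lot, balancing shards by total bytes would probably spread the load more evenly.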

If it's a single 1 PB file, then Infiniband might be your best bet, but
it won't get you close to 10 minutes.
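
To see why 10 minutes is out of reach for a single link, a back-of-the-envelope calculation helps. The sketch below assumes a decimal petabyte (10^15 bytes) and a nominal link rate of 40 Gbit/s; that rate is an assumption chosen for illustration, not a figure from the original message, and real usable throughput would be lower. Under those assumptions, 1 PB in 10 minutes needs roughly 1.67 TB/s sustained, or more than 300 such links fully saturated in parallel.

```python
import math

PETABYTE = 10**15          # decimal petabyte in bytes (assumption)
TEN_MINUTES = 10 * 60      # target window in seconds
LINK_BITS_PER_SEC = 40e9   # assumed nominal link rate, for illustration only


def required_throughput(total_bytes, seconds):
    """Sustained bytes per second needed to move total_bytes in seconds."""
    return total_bytes / seconds


def links_needed(total_bytes, seconds, link_bits_per_sec):
    """Minimum number of fully saturated links of the given rate."""
    link_bytes_per_sec = link_bits_per_sec / 8
    return math.ceil(required_throughput(total_bytes, seconds) / link_bytes_per_sec)


def transfer_time(total_bytes, link_bits_per_sec, links=1):
    """Seconds needed to move total_bytes over the given number of links."""
    return total_bytes / (links * link_bits_per_sec / 8)


if __name__ == "__main__":
    print("required bytes/s:", required_throughput(PETABYTE, TEN_MINUTES))
    print("links needed:", links_needed(PETABYTE, TEN_MINUTES, LINK_BITS_PER_SEC))
    print("hours on one link:", transfer_time(PETABYTE, LINK_BITS_PER_SEC) / 3600)
```

This ignores HDFS replication, disk write speed, and protocol overhead, all of which would push the real numbers further from the 10 minute target.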
