hadoop-common-user mailing list archives

From 胡斐 <hufe...@gmail.com>
Subject Custom FileInputFormat.class
Date Mon, 01 Dec 2014 16:38:01 GMT
Hi,

I want to write a custom FileInputFormat. To determine which host a
specific part of a file belongs to, I need to open the file in HDFS and
read some information from it. Opening a file and getting the information
I need takes nearly 500 ms. I have thousands of files to process, so
handling them all one by one in this way would take a long time.

Is there a better way to reduce this time when the number of files is
large?
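
One direction I could imagine is reading the per-file information
concurrently instead of one file at a time. Below is a minimal sketch of
that idea, not a tested solution. readHostInfo is a hypothetical stand-in
for the "open the file in HDFS and read the information" step (it only
sleeps to simulate latency and is not a Hadoop API), and the class name,
paths, and pool size are illustrative assumptions as well.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelHostLookup {

    // Hypothetical stand-in for opening one HDFS file and reading its host info.
    // The sleep only simulates per-file latency; no Hadoop call is made here.
    static String readHostInfo(String path) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return "host-for-" + path;
    }

    // Submits one lookup per file to a fixed-size thread pool and collects
    // the results in the same order as the input list.
    static Map<String, String> lookupAll(List<String> paths, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String p : paths) {
                Callable<String> task = () -> readHostInfo(p);
                futures.add(pool.submit(task));
            }
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < paths.size(); i++) {
                // get() blocks until the lookup for this file has finished.
                result.put(paths.get(i), futures.get(i).get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            paths.add("/data/part-" + i);
        }
        long start = System.nanoTime();
        Map<String, String> hosts = lookupAll(paths, 10);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(hosts.size() + " files looked up in about " + elapsedMs + " ms");
    }
}
```

With a pool of N threads the wall-clock time should drop to roughly 1/N
of the sequential time, as long as the NameNode and DataNodes can handle
the concurrent opens. I have not measured whether that holds on a real
cluster.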

Thanks in advance!
Fei
