hadoop-mapreduce-user mailing list archives

From Jean <lagaru...@yahoo.fr>
Subject Fast grep on hdfs files
Date Fri, 19 Dec 2014 21:59:28 GMT
I want to be able to grep custom strings in a lot of files stored in HDFS.
The data to grep is at least 500 GB to 2 TB in total, split across roughly 50-200 files.

What would be the best way to get the fastest results for:
- the matching lines
- the names of the files containing the matched lines
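
To make the expected output concrete, here is a minimal sketch in Python of the two results listed above. It is only an illustration under stated assumptions: it reads plain local text files rather than HDFS, it runs on a single machine, and the function name `grep_files` is hypothetical. It says nothing about speed at the 500 GB to 2 TB scale.

```python
import re
from collections import defaultdict


def grep_files(paths, pattern):
    """Return {filename: [matching lines]} for a regular expression.

    Assumption: `paths` are local text files, not HDFS paths.
    """
    regex = re.compile(pattern)
    matches = defaultdict(list)
    for path in paths:
        # errors="replace" keeps the scan going on badly encoded bytes
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                line = line.rstrip("\n")
                if regex.search(line):
                    matches[path].append(line)
    # Only files with at least one match appear as keys
    return dict(matches)
```

The keys of the returned dictionary are the filenames containing matches, and the values are the matching lines from each file.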

I tested a MapReduce grep, but it is too slow for interactive use.

Do I need to index everything in Hive or Solr?
Would Spark be faster than MapReduce?

