hadoop-hdfs-user mailing list archives

From pathu...@yahoo.com
Subject Million docs and word count scenario
Date Fri, 29 Mar 2013 12:15:31 GMT
If there are 1 million docs in an enterprise and we need to perform a word count computation on
all of them, what is the first step? Should we extract the text of all the docs into a single
file and then put it into HDFS, or put each one into HDFS separately?
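
To make the first option concrete, here is a rough local sketch in plain Python, not Hadoop code. It assumes the text has already been extracted into UTF-8 .txt files; the function names merge_docs and word_count are made up for illustration. Each doc becomes one line of a single merged file, and a simple whitespace-based count runs over that file.

```python
from collections import Counter


def merge_docs(doc_paths, merged_path):
    """Write each document's text as one line of a single merged file.

    Assumption: every path in doc_paths is an already-extracted UTF-8 text file.
    """
    with open(merged_path, "w", encoding="utf-8") as out:
        for path in doc_paths:
            with open(path, encoding="utf-8") as doc:
                # Collapse all whitespace so each document occupies exactly one line.
                out.write(" ".join(doc.read().split()) + "\n")


def word_count(merged_path):
    """Count lowercased, whitespace-separated words across the merged file."""
    counts = Counter()
    with open(merged_path, encoding="utf-8") as merged:
        for line in merged:
            counts.update(line.lower().split())
    return counts
```

The merged file would then be the thing copied into HDFS; that step, and the actual distributed word count job, are not shown here.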
