hadoop-common-user mailing list archives

From chandravadana <Chandravadana.Selvach...@cognizant.com>
Subject Re: 1 file per record
Date Fri, 26 Sep 2008 09:33:32 GMT


I'm writing an application that computes over the entire contents of a file.
For that purpose I don't want the file to be split; the entire file should go
to a single map task.
I've been able to do this by overriding isSplitable(), and the file is no
longer being split.
I had to store the input values in an array (in the map function) and then
proceed with my computation. When I displayed that array, I found that it held
only the last line of the file. Does this mean that the data is read line by
line by the line reader, and not continuously?
If so, what should I do in order to read the complete contents of the file?

Thank you
Chandravadana S
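
A note on the symptom described above: with a line-oriented input format, the framework calls map() once per line, and the old org.apache.hadoop.mapred API reuses the same value object between calls. An array that is declared inside map(), or that stores references to the reused value, is therefore likely to show only the last line. The sketch below uses plain Java stand-ins (the class WholeFileAccumulator and its map()/close() methods are hypothetical and are not the Hadoop API) to illustrate one common pattern: copy each line into a field, then run the computation once after the last record, which in a real Mapper would happen in close().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a mapper; it only mimics the call pattern
// (map() once per line, close() once at the end), not the Hadoop types.
class WholeFileAccumulator {
    // A field, not a local variable, so it survives across map() calls.
    private final List<String> lines = new ArrayList<>();

    // Called once per input line.
    void map(long offset, CharSequence line) {
        // Copy the contents: the caller may reuse the same buffer next time.
        lines.add(line.toString());
    }

    // Called once after the last line; the whole file is available here.
    List<String> close() {
        return new ArrayList<>(lines);
    }

    public static void main(String[] args) {
        WholeFileAccumulator acc = new WholeFileAccumulator();
        StringBuilder reused = new StringBuilder(); // mimics a reused value object
        long offset = 0;
        for (String s : new String[] {"1 2 3", "4 5 6", "7 8 9"}) {
            reused.setLength(0);
            reused.append(s);
            acc.map(offset, reused);
            offset += s.length() + 1;
        }
        System.out.println(acc.close()); // prints [1 2 3, 4 5 6, 7 8 9]
    }
}
```

Storing the reused buffer itself instead of a copy would leave every list entry pointing at the last line, which matches the behaviour reported in the message.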

Enis Soztutar wrote:
> Yes, you can use MultiFileInputFormat.
> You can extend the MultiFileInputFormat to return a RecordReader, which 
> reads a record for each file in the MultiFileSplit.
> Enis
> chandra wrote:
>> Hi,
>> By making isSplitable() return false, we can send 1 file with n records to
>> 1 mapper.
>> Is there any way to treat 1 complete file as 1 record?
>> Thanks in advance
>> Chandravadana S
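
The suggestion quoted above (a RecordReader that produces one record per file) can be sketched in plain Java. This is only an illustration of the idea under stated assumptions: WholeFileReader, next() and getValue() are hypothetical names, the file is read from the local disk with java.nio rather than through Hadoop's FileSystem, and the real RecordReader interface has more methods than are shown here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a "one file = one record" reader.
class WholeFileReader {
    private final Path file;
    private boolean consumed = false;
    private String value;

    WholeFileReader(Path file) {
        this.file = file;
    }

    // Returns true exactly once, after loading the entire file as the value.
    boolean next() {
        if (consumed) {
            return false;
        }
        try {
            value = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        consumed = true;
        return true;
    }

    String getValue() {
        return value;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("whole-file", ".txt");
        Files.write(tmp, "line 1\nline 2\nline 3\n".getBytes(StandardCharsets.UTF_8));
        WholeFileReader reader = new WholeFileReader(tmp);
        while (reader.next()) {
            // The loop body runs once; the value holds every line of the file.
            System.out.println(reader.getValue().split("\n").length + " lines in one record");
        }
        Files.delete(tmp);
    }
}
```

With a reader of this shape, the map function would be invoked once per file and receive the full contents in a single value, at the cost of holding the whole file in memory.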

View this message in context: http://www.nabble.com/1-file-per-record-tp19644985p19685269.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
