hadoop-user mailing list archives

From Niels Basjes <Ni...@basjes.nl>
Subject Re: Why LineRecordWriter.write(..) is synchronized
Date Thu, 08 Aug 2013 11:00:40 GMT
I may be nitpicking here, but if "perhaps the answer is no", then I conclude:
Perhaps the other implementations of RecordWriter are a race condition or
file corruption waiting to happen.
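
To make the concern concrete, here is a minimal sketch of why a shared line
writer would need a synchronized write. It assumes a record is emitted as
several separate stream writes (key, separator, value, newline) and that
several threads share one writer, as they would under MultithreadedMapper.
SimpleLineWriter and runDemo are hypothetical names for illustration only;
this is not Hadoop's LineRecordWriter.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LineWriterDemo {

    /** Hypothetical stand-in for a line-oriented record writer. */
    static class SimpleLineWriter {
        private final DataOutputStream out;

        SimpleLineWriter(DataOutputStream out) {
            this.out = out;
        }

        // One record is four separate stream writes. Without 'synchronized',
        // two threads could interleave these writes and corrupt a line.
        synchronized void write(String key, String value) throws IOException {
            out.write(key.getBytes(StandardCharsets.UTF_8));
            out.write('\t');
            out.write(value.getBytes(StandardCharsets.UTF_8));
            out.write('\n');
        }
    }

    /** Writes records from several threads through one shared writer. */
    static List<String> runDemo(int threads, int recordsPerThread) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        SimpleLineWriter writer =
                new SimpleLineWriter(new DataOutputStream(buffer));
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                try {
                    for (int i = 0; i < recordsPerThread; i++) {
                        writer.write("key" + id, "value" + i);
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            workers.add(worker);
            worker.start();
        }
        try {
            // join() ensures all writes are visible before the buffer is read.
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        String text = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        return Arrays.asList(text.split("\n"));
    }

    public static void main(String[] args) {
        List<String> lines = runDemo(4, 1000);
        boolean wellFormed = lines.stream()
                .allMatch(line -> line.matches("key\\d+\\tvalue\\d+"));
        System.out.println(lines.size() + " lines, all well-formed: " + wellFormed);
    }
}
```

With the synchronized keyword in place, every line comes out as one intact
key/value pair. Removing it from write(..) allows the four writes of different
records to interleave, which is the kind of corruption I mean; whether it
actually shows up on a given run depends on thread timing.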


On Thu, Aug 8, 2013 at 12:50 PM, Harsh J <harsh@cloudera.com> wrote:

> While we don't fork by default, we do provide a MultithreadedMapper
> implementation that would require such synchronization. But if you are
> asking whether it is necessary, then perhaps the answer is no.
> On Aug 8, 2013 3:43 PM, "Azuryy Yu" <azuryyyu@gmail.com> wrote:
>
>> It's not about threads forked by Hadoop; we may create a line record
>> writer ourselves and then call this writer concurrently.
>> On Aug 8, 2013 4:00 PM, "Sathwik B P" <sathwik.bp@gmail.com> wrote:
>>
>>> Hi,
>>> Thanks for your reply.
>>> May I know where Hadoop forks multiple threads that use a single
>>> RecordWriter?
>>>
>>> regards,
>>> sathwik
>>>
>>> On Thu, Aug 8, 2013 at 7:06 AM, Azuryy Yu <azuryyyu@gmail.com> wrote:
>>>
>>>> Because multiple threads may write to a single file.
>>>> On Aug 8, 2013 2:54 PM, "Sathwik B P" <sathwik@apache.org> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> LineRecordWriter.write(..) is synchronized. I did not find any other
>>>>> RecordWriter implementation that defines write as synchronized.
>>>>> Is there any specific reason for this?
>>>>>
>>>>> regards,
>>>>> sathwik
>>>>>
>>>>
>>>


-- 
Best regards / Met vriendelijke groeten,

Niels Basjes
