hadoop-user mailing list archives

From Jay Vyas <jayunit...@gmail.com>
Subject Re: Why LineRecordWriter.write(..) is synchronized
Date Thu, 08 Aug 2013 11:53:07 GMT
Then is this a bug? Synchronization in the absence of any race condition is normally
considered "bad".

In any case, I'd like to know why this writer is synchronized whereas the other ones are not.
That is, I think, the point at issue: either the other writers should be synchronized or else
this one shouldn't be. Consistency across the write implementations is probably desirable,
so that changes to output formats or record writers don't lead to bugs in multithreaded
environments.
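
To make the trade-off concrete, here is a minimal sketch of the situation under discussion. It is not Hadoop's actual LineRecordWriter source: SketchLineWriter and SharedWriterDemo are hypothetical names, and the key/tab/value/newline layout is an assumption made for illustration. It shows one writer instance shared by several threads, where each record is emitted as several separate stream writes. With the method-level lock, every record lands on its own well-formed line; removing `synchronized` from write() would allow pieces of different records to interleave.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a line-oriented record writer; not Hadoop's class.
class SketchLineWriter {
    private final OutputStream out;

    SketchLineWriter(OutputStream out) {
        this.out = out;
    }

    // One record is emitted as four separate stream writes, so the
    // method-level lock keeps records from different threads from interleaving.
    synchronized void write(String key, String value) throws IOException {
        out.write(key.getBytes(StandardCharsets.UTF_8));
        out.write('\t');
        out.write(value.getBytes(StandardCharsets.UTF_8));
        out.write('\n');
    }
}

class SharedWriterDemo {
    // Starts `threads` workers that all write through the same writer instance,
    // waits for them to finish, and returns everything that was written.
    static String runDemo(int threads, int recordsPerThread) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        SketchLineWriter writer = new SketchLineWriter(sink);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                try {
                    for (int i = 0; i < recordsPerThread; i++) {
                        writer.write("k" + id, "v" + i);
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String output = runDemo(4, 1000);
        System.out.println(output.split("\n").length + " records written");
    }
}
```

If a writer is only ever called from a single thread, the lock is uncontended and adds little beyond noise, which is consistent with the "perhaps not necessary" reading in the quoted reply below.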

On Aug 8, 2013, at 6:50 AM, Harsh J <harsh@cloudera.com> wrote:

> While we don't fork by default, we do provide a MultithreadedMapper implementation that
> would require such synchronization. But if you are asking whether it is necessary, then
> perhaps the answer is no.
> 
> On Aug 8, 2013 3:43 PM, "Azuryy Yu" <azuryyyu@gmail.com> wrote:
>> It's not about Hadoop forking threads; we may create a line record writer ourselves and
>> then call this writer concurrently.
>> 
>> On Aug 8, 2013 4:00 PM, "Sathwik B P" <sathwik.bp@gmail.com> wrote:
>>> Hi,
>>> Thanks for your reply.
>>> May I know where Hadoop forks multiple threads that use a single RecordWriter?
>>> 
>>> regards,
>>> sathwik
>>> 
>>> On Thu, Aug 8, 2013 at 7:06 AM, Azuryy Yu <azuryyyu@gmail.com> wrote:
>>>> Because we may use multiple threads to write a single file.
>>>> 
>>>> On Aug 8, 2013 2:54 PM, "Sathwik B P" <sathwik@apache.org> wrote:
>>>>> Hi,
>>>>> 
>>>>> LineRecordWriter.write(..) is synchronized. I did not find any other
>>>>> RecordWriter implementations that define write as synchronized.
>>>>> Is there any specific reason for this?
>>>>> 
>>>>> regards,
>>>>> sathwik
