hadoop-common-dev mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2919) Create fewer copies of buffer data during sort/spill
Date Tue, 11 Mar 2008 00:30:47 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577260#action_12577260 ]

Chris Douglas commented on HADOOP-2919:

I haven't been able to reproduce this failure on Linux or MacOS. Looking at the console
output, the timeout looks related to HADOOP-2971. I'm seeing a handful of the following errors
from Hudson:

    [junit] 2008-03-10 23:22:51,803 INFO  dfs.DataNode (DataNode.java:run(1985)) - PacketResponder
blk_1646669170773132170 1 Exception java.net.SocketTimeoutException: 60000 millis timeout
while waiting for / (local: / to be ready for read
    [junit]   at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:188)
    [junit]   at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:135)
    [junit]   at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:121)
    [junit]   at java.io.DataInputStream.readFully(DataInputStream.java:176)
    [junit]   at java.io.DataInputStream.readLong(DataInputStream.java:380)
    [junit]   at org.apache.hadoop.dfs.DataNode$PacketResponder.run(DataNode.java:1957)
    [junit]   at java.lang.Thread.run(Thread.java:595)

Since the failure is coming from TestMiniMRDFSSort (code this patch certainly affects), this
result is not auspicious, but I suspect the issue is not related to this patch.

> Create fewer copies of buffer data during sort/spill
> ----------------------------------------------------
>                 Key: HADOOP-2919
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2919
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Chris Douglas
>            Assignee: Chris Douglas
>             Fix For: 0.17.0
>         Attachments: 2919-0.patch, 2919-1.patch, 2919-2.patch, 2919-3.patch
> Currently, the sort/spill works as follows:
> Let r be the number of partitions
> For each call to collect(K,V) from map:
> * If buffers do not exist, allocate a new DataOutputBuffer to collect K,V bytes, and allocate
> r buffers for collecting K,V offsets
> * Write K,V into buffer, noting offsets
> * Register offsets with the associated partition buffer, allocating/copying accounting buffers
> if necessary
> * Calculate the total memory usage for the buffer and all partition collectors by iterating
> over the collectors
> * If total memory usage is greater than half of io.sort.mb, then start a new thread to spill,
> blocking if another spill is in progress
> For each spill (assuming no combiner):
> * Save references to our K,V byte buffer and accounting data, setting the former to null
> (it will be recreated on the next call to collect(K,V))
> * Open a SequenceFile.Writer for this partition
> * Sort each partition separately (the current version of sort reuses indices, but still
> requires wrapping them in IntWritable objects)
> * Build a RawKeyValueIterator of sorted data for the partition
> * Deserialize each key and value and call SequenceFile::append(K,V) on the writer for
> this partition
> There are a number of opportunities to reduce the number of copies, allocations, and
> operations performed in this stage, particularly since growing many of the buffers involved
> requires copying the existing data into the newly sized allocation.
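The collect/spill flow described above can be sketched roughly as follows. This is a simplified, hypothetical model for illustration only; the class, field, and method names are invented and do not reflect Hadoop's actual MapOutputBuffer internals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified sketch of the collect-side buffering described above.
// All names are illustrative; this is NOT Hadoop's actual implementation.
class SortSpillSketch {
    static final int IO_SORT_MB = 4;                           // stand-in for io.sort.mb
    static final int SPILL_THRESHOLD = (IO_SORT_MB * 1024 * 1024) / 2;

    byte[] kvBytes = new byte[1024];                           // K,V byte buffer (DataOutputBuffer analogue)
    int kvUsed = 0;
    List<List<int[]>> partitionOffsets;                        // per-partition {keyOff, valOff, len} records

    SortSpillSketch(int numPartitions) {                       // r accounting buffers, one per partition
        partitionOffsets = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) partitionOffsets.add(new ArrayList<>());
    }

    // One collect(K,V): append serialized bytes, record offsets with the partition.
    void collect(byte[] key, byte[] val, int partition) {
        int keyOff = kvUsed;
        append(key);
        int valOff = kvUsed;
        append(val);
        partitionOffsets.get(partition).add(new int[]{keyOff, valOff, kvUsed - keyOff});
        if (memUsage() > SPILL_THRESHOLD) spill();             // real code hands this to a spill thread
    }

    // Growing the buffer copies all existing data to the new allocation --
    // exactly the copy cost this issue aims to reduce.
    void append(byte[] b) {
        while (kvUsed + b.length > kvBytes.length) {
            kvBytes = Arrays.copyOf(kvBytes, kvBytes.length * 2);  // full copy on every grow
        }
        System.arraycopy(b, 0, kvBytes, kvUsed, b.length);
        kvUsed += b.length;
    }

    // Total memory usage, computed by iterating over all partition collectors.
    int memUsage() {
        int total = kvUsed;
        for (List<int[]> p : partitionOffsets) total += p.size() * 3 * 4;  // 3 ints per record
        return total;
    }

    void spill() {
        // Detach the buffers (recreated on the next collect), then sort each
        // partition and write it out; the sort/write steps are elided here.
        kvBytes = new byte[1024];
        kvUsed = 0;
        for (List<int[]> p : partitionOffsets) p.clear();
    }
}
```

The sketch makes the pain points visible: every buffer grow is a full copy, and memory accounting walks every partition collector on every collect call.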

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
