hadoop-mapreduce-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-64) Map-side sort is hampered by io.sort.record.percent
Date Thu, 15 Oct 2009 09:34:31 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-64?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-64:
-----------------------------------

    Attachment: M64-2.patch

Patch for review. Added comments, changed some debug messages to info.

The fixed-size kvoffsets and kvindices arrays are replaced by per-record allocations into
an IntBuffer overlay of the serialization buffer. The logic for the interlaced buffers is
fundamentally the same as in the existing code. Once the soft limit is reached, collection
continues from the free space between the end of the metadata and the end of the serialization
data, at an offset proportional to the average record size. This eliminates io.sort.record.percent
and should use the space allocated by io.sort.mb more efficiently, without configuration or study.
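
To illustrate the overlay idea, here is a minimal sketch, not the code in M64-2.patch. It assumes
four ints of metadata per record and a non-circular buffer in which serialized bytes grow up from
the front while metadata grows down from the back. The class name, field names and the metadata
layout are hypothetical; there is no spill thread or soft limit here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

final class MetaOverlaySketch {
    // Hypothetical layout: four ints of metadata per record.
    static final int NMETA = 4;
    static final int KEYSTART = 0, VALSTART = 1, PARTITION = 2, VALLEN = 3;

    final byte[] kvbuffer;   // single backing store for data and metadata
    final IntBuffer kvmeta;  // int view over the same bytes
    int metaIndex;           // lowest metadata slot in use, in ints; grows downward
    int dataPos;             // next free serialization byte; grows upward

    MetaOverlaySketch(int capacityBytes) {
        // Round down so the buffer holds a whole number of metadata entries.
        kvbuffer = new byte[capacityBytes - capacityBytes % (NMETA * 4)];
        kvmeta = ByteBuffer.wrap(kvbuffer).order(ByteOrder.nativeOrder()).asIntBuffer();
        metaIndex = kvmeta.capacity();
        dataPos = 0;
    }

    /** Returns false when data and metadata would collide, i.e. a spill would be needed. */
    boolean collect(byte[] key, byte[] value, int partition) {
        int needed = key.length + value.length;
        int metaStartByte = (metaIndex - NMETA) * 4;
        if (dataPos + needed > metaStartByte) {
            return false;
        }
        int keyStart = dataPos;
        System.arraycopy(key, 0, kvbuffer, dataPos, key.length);
        dataPos += key.length;
        int valStart = dataPos;
        System.arraycopy(value, 0, kvbuffer, dataPos, value.length);
        dataPos += value.length;

        // Allocate this record's metadata from the other end of the same array.
        metaIndex -= NMETA;
        kvmeta.put(metaIndex + KEYSTART, keyStart);
        kvmeta.put(metaIndex + VALSTART, valStart);
        kvmeta.put(metaIndex + PARTITION, partition);
        kvmeta.put(metaIndex + VALLEN, value.length);
        return true;
    }

    int recordCount() {
        return (kvmeta.capacity() - metaIndex) / NMETA;
    }

    public static void main(String[] args) {
        MetaOverlaySketch s = new MetaOverlaySketch(64);
        while (s.collect(new byte[3], new byte[5], 0)) {
            // keep collecting until data and metadata meet
        }
        System.out.println("records collected: " + s.recordCount());
    }
}
```

Because both regions draw from one array, neither small nor large records strand space that
belongs to the other region, which is the effect the patch is after.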

Possibly controversial points:
* After each record is serialized, there is a zero-length write from collect. This ensures
that all the boundaries are checked and that the collection thread blocks when it is out of
space. While it would be possible to block in collect, keeping the logic separate was cleaner
and should impose no real penalty.
* Once the soft limit is reached, at least half the free space is left for serialization data,
but there is no similar accommodation for metadata. Occasional, extremely large records
may harm concurrency by skewing the average, as the average record size is based on the counters
(i.e. it covers the full task duration).
* Dropped {{final}} on several variables and methods to avoid penalties for auto-generated
accessors; also dropped {{volatile}} where unnecessary.
* Suppressed some findbugs warnings. The inconsistent sync warnings should be spurious, as
the variables are only referenced in the collection thread. The unreleased lock exception
warning appeared without modifying the spill thread, but I moved the check back outside that
thread anyway.
* The TestMapCollection unit test ported to the new API takes longer. The old test runs in
nearly the same time on my machine, so I suspect the delay is unrelated.
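
The second point above can be sketched as follows. This is an illustration under stated
assumptions, not the formula in the patch: it assumes 16 bytes of metadata per record and splits
the free space in the ratio of average record size to metadata size, while never leaving
serialization data less than half. The class and method names are hypothetical.

```java
final class EquatorSketch {
    static final int METASIZE = 16; // assumed bytes of metadata per record

    /** Average record size taken from cumulative task counters (bytes / records). */
    static int averageRecordSize(long outputBytes, long outputRecords) {
        return outputRecords == 0 ? 0 : (int) (outputBytes / outputRecords);
    }

    /**
     * Bytes of free space set aside for metadata. The remainder goes to
     * serialization data, which always keeps at least half the free space.
     */
    static int metadataShare(int freeBytes, int avgRecordBytes) {
        int expectedRecords = freeBytes / (METASIZE + avgRecordBytes);
        long wanted = (long) expectedRecords * METASIZE;
        return (int) Math.min(freeBytes / 2, wanted);
    }

    public static void main(String[] args) {
        int free = 101_600;

        // 1000 records of 100 bytes each.
        int typical = averageRecordSize(100_000, 1000);
        // 999 records of 100 bytes plus one record of 900,100 bytes.
        int skewed = averageRecordSize(1_000_000, 1000);

        System.out.println("avg=" + typical + " metadata share=" + metadataShare(free, typical));
        System.out.println("avg=" + skewed + " metadata share=" + metadataShare(free, skewed));
    }
}
```

In this sketch a single very large record raises the task-wide average tenfold, so the metadata
region is sized for far fewer records than the typical 100-byte records will actually produce.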

> Map-side sort is hampered by io.sort.record.percent
> ---------------------------------------------------
>
>                 Key: MAPREDUCE-64
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-64
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Arun C Murthy
>            Assignee: Chris Douglas
>         Attachments: M64-0.patch, M64-1.patch, M64-2.patch
>
>
> Currently io.sort.record.percent is a fairly obscure, per-job configurable, expert-level
> parameter which controls how much accounting space is available for records in the map-side
> sort buffer (io.sort.mb). Typical values for io.sort.mb (100) and io.sort.record.percent
> (0.05) imply that we can store ~350,000 records in the buffer before necessitating a sort/combine/spill.
> However, for many applications which deal with small records, e.g. the world-famous wordcount
> and its family, this implies we can only use 5-10% of io.sort.mb, i.e. 5-10M, before we spill,
> in spite of having _much_ more memory available in the sort buffer. Wordcount, for example,
> results in ~12 spills (given an HDFS block size of 64M). The presence of a combiner exacerbates
> the problem by piling on serialization/deserialization of records too...
> Sure, jobs can configure io.sort.record.percent, but it's tedious and obscure; we really
> can do better by getting the framework to automagically pick it by using all available memory
> (up to io.sort.mb) for either the data or accounting.
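
The arithmetic behind the description above can be checked with a small sketch. It assumes 16
bytes of accounting space per record, which is not stated in the message, and a hypothetical
20-byte serialized record standing in for wordcount-style output. The class and method names are
illustrative only.

```java
final class RecordPercentSketch {
    static final int RECSIZE = 16; // assumed accounting bytes per record

    /** Records that fit in the accounting space before a spill is forced. */
    static int maxRecords(int ioSortMb, double recordPercent) {
        long totalBytes = (long) ioSortMb << 20;
        long accountingBytes = (long) (totalBytes * recordPercent);
        return (int) (accountingBytes / RECSIZE);
    }

    /** Fraction of the data space filled when the accounting space runs out. */
    static double dataFractionUsed(int ioSortMb, double recordPercent, int recordBytes) {
        long totalBytes = (long) ioSortMb << 20;
        long accountingBytes = (long) (totalBytes * recordPercent);
        long dataBytes = totalBytes - accountingBytes;
        long filled = Math.min(dataBytes, (long) maxRecords(ioSortMb, recordPercent) * recordBytes);
        return (double) filled / dataBytes;
    }

    public static void main(String[] args) {
        System.out.println("max records: " + maxRecords(100, 0.05));
        System.out.println("data space used with 20-byte records: "
                + dataFractionUsed(100, 0.05, 20));
    }
}
```

Under these assumptions the record limit lands in the same range as the ~350,000 figure quoted
above, and small records exhaust the accounting space while most of the data space is still empty.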

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.