hadoop-mapreduce-issues mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-2454) Allow external sorter plugin for MR
Date Thu, 15 Nov 2012 14:58:18 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498062#comment-13498062 ]

Arun C Murthy commented on MAPREDUCE-2454:
------------------------------------------

Asokan - thanks again for being patient and working with me through this, and also for spending
time with me in person on the design. 
I think this is very close! Thanks again!

Also, Alejandro, thanks for taking a look.

Some comments:
# All the apis should be marked 'LimitedPrivate' (not Public), 'Unstable' to make it clear
that this is only for implementers.
# I'd prefer names such as MapOutputBuffer or MapOutputSortingOutput over PostMapProcessor,
to make it clear that this is the sort buffer. Similarly, should we just call the other one
ReduceInputMerger or some such?
# We shouldn't need to make TaskReporter public since we already have a public Reporter api,
correct?
# Perhaps one of my biggest concerns is MapOutput: I don't understand why it has a 'shuffle'
method and shuffles the output itself, since it is merely meant to be an abstraction of the
output. Can you please help me understand this?
# Since you've already made SpillRecord (java) public, we don't have to move IndexRecord
out, correct?
# In general, it would be useful if you could avoid making formatting changes in a large patch,
to keep it smaller - I see it's gone from 60K or so to 100+K again! :)
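
As an illustration of the first comment above, here is a minimal sketch of marking a plugin API 'LimitedPrivate' and 'Unstable'. The two annotations below are local stand-ins modeled on Hadoop's InterfaceAudience.LimitedPrivate and InterfaceStability.Unstable, declared only so the sketch compiles without Hadoop on the classpath. The collect method, the "MapReduce" audience value, and the AnnotationSketch helper are assumptions for illustration and are not taken from the patch.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Stand-in for Hadoop's InterfaceAudience.LimitedPrivate (assumption:
// a real patch would import the Hadoop annotation instead).
@Documented
@Retention(RetentionPolicy.RUNTIME)
@interface LimitedPrivate {
    String[] value();
}

// Stand-in for Hadoop's InterfaceStability.Unstable (same assumption).
@Documented
@Retention(RetentionPolicy.RUNTIME)
@interface Unstable {
}

// Hypothetical plugin interface. The name comes from the comment above;
// the method signature is invented for this sketch.
@LimitedPrivate({"MapReduce"})
@Unstable
interface MapOutputBuffer<K, V> {
    void collect(K key, V value, int partition);
}

// Small helper that reads the markers back via reflection, to show that
// tooling can tell implementer-only APIs apart from public ones.
class AnnotationSketch {
    static List<String> audienceOf(Class<?> api) {
        LimitedPrivate marker = api.getAnnotation(LimitedPrivate.class);
        if (marker == null) {
            return Collections.emptyList();
        }
        return Arrays.asList(marker.value());
    }

    static boolean isUnstable(Class<?> api) {
        return api.isAnnotationPresent(Unstable.class);
    }
}
```

With markers like these, an interface is documented as intended for implementers within the named projects only, and as subject to change between releases.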

Thanks again.
                
> Allow external sorter plugin for MR
> -----------------------------------
>
>                 Key: MAPREDUCE-2454
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 2.0.0-alpha, 3.0.0, 2.0.2-alpha
>            Reporter: Mariappan Asokan
>            Assignee: Mariappan Asokan
>            Priority: Minor
>              Labels: features, performance, plugin, sort
>         Attachments: HadoopSortPlugin.pdf, HadoopSortPlugin.pdf, KeyValueIterator.java,
MapOutputSorterAbstract.java, MapOutputSorter.java, mapreduce-2454.patch, mapreduce-2454.patch,
mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch,
mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch,
mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch,
mapreduce-2454.patch, mapreduce-2454.patch, mr-2454-on-mr-279-build82.patch.gz, MR-2454-trunkPatchPreview.gz,
ReduceInputSorter.java
>
>
> Define interfaces and some abstract classes in the Hadoop framework to facilitate external
> sorter plugins both on the Map and Reduce sides.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
