hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3473) io.sort.factor should default to 100 instead of 10
Date Fri, 30 May 2008 23:04:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601290#action_12601290 ]

Doug Cutting commented on HADOOP-3473:
--------------------------------------

Changing this has memory implications, no?  A buffer is allocated for each stream being merged.
Each buffer should be large enough that transfer time dominates seek time: at 10ms/seek and
100MB/s transfer, seek time equals transfer time at a 1MB buffer.  So for merging not to be
seek-bound with 100 buffers, the total buffer size needs to be substantially larger than 100MB,
which is currently the default for io.sort.mb.  So I can see increasing this to 50 w/o changing
the default for io.sort.mb.
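
The seek/transfer arithmetic above can be checked with a short sketch. It is only a
back-of-the-envelope model, not Hadoop code: the function names are hypothetical, the disk
figures (10ms/seek, 100MB/s) are the ones quoted in the comment, and it assumes the total
buffer is split evenly across the merged streams with one seek per buffer refill.

```python
def break_even_buffer_mb(seek_ms: float = 10.0, transfer_mb_per_s: float = 100.0) -> float:
    """Buffer size (MB) at which reading the buffer takes as long as one seek."""
    return (seek_ms / 1000.0) * transfer_mb_per_s


def seek_fraction(total_buffer_mb: float, streams: int,
                  seek_ms: float = 10.0, transfer_mb_per_s: float = 100.0) -> float:
    """Fraction of I/O time spent seeking, assuming one seek per buffer refill
    and the total buffer divided evenly among the streams (an assumption)."""
    per_stream_mb = total_buffer_mb / streams
    transfer_ms = per_stream_mb / transfer_mb_per_s * 1000.0
    return seek_ms / (seek_ms + transfer_ms)


if __name__ == "__main__":
    print("break-even buffer (MB):", break_even_buffer_mb())
    for streams in (10, 50, 100):
        frac = seek_fraction(total_buffer_mb=100.0, streams=streams)
        print(f"{streams:3d} streams, 100MB total: {frac:.0%} of I/O time is seeking")
```

Under these assumptions, 100 streams sharing 100MB sit exactly at the break-even point (half
the time seeking), while 50 streams get 2MB each and spend about a third of the time seeking.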

BTW, you've proposed a solution in the description rather than a problem.  The problem, I
assume, is that the sort-factor is non-optimal.  Perhaps a better solution to this problem
is to not specify the sort factor at all, but rather to have the sort code determine it automatically
based on io.sort.mb?  So if you increase io.sort.mb, you'd get a larger sort factor.  Of course,
then we'd have to make some assumptions about disk performance...
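
One way the automatic approach might look is sketched below. This is purely illustrative and
not how Hadoop determines the sort factor: `auto_sort_factor` and all of its parameters are
hypothetical, the disk figures are the assumed 10ms/seek and 100MB/s, and the choice of a 2x
transfer-to-seek ratio is an arbitrary assumption standing in for "substantially larger".

```python
def auto_sort_factor(sort_mb: float,
                     seek_ms: float = 10.0,
                     transfer_mb_per_s: float = 100.0,
                     transfer_to_seek_ratio: float = 2.0,
                     min_factor: int = 2) -> int:
    """Derive a merge fan-in from the sort buffer size (hypothetical helper).

    Each stream gets a buffer whose transfer time is `transfer_to_seek_ratio`
    times the seek time; the factor is how many such buffers fit in `sort_mb`.
    """
    break_even_mb = (seek_ms / 1000.0) * transfer_mb_per_s
    per_stream_mb = break_even_mb * transfer_to_seek_ratio
    # A merge needs at least two inputs, so never go below min_factor.
    return max(min_factor, int(sort_mb // per_stream_mb))


if __name__ == "__main__":
    for mb in (1, 100, 200):
        print(f"io.sort.mb={mb}: sort factor {auto_sort_factor(mb)}")
```

With these assumed numbers, a 100MB buffer yields a factor of 50 and doubling the buffer
doubles the factor; different disk assumptions would shift the result accordingly.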

> io.sort.factor should default to 100 instead of 10
> --------------------------------------------------
>
>                 Key: HADOOP-3473
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3473
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: conf
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.18.0
>
>
> 10 is *really* conservative and can make merges much much more expensive.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

