giraph-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-1160) Fix memory estimation in MemoryEstimatorOracle
Date Tue, 19 Sep 2017 20:24:01 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-1160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172284#comment-16172284 ]

ASF GitHub Bot commented on GIRAPH-1160:
----------------------------------------

GitHub user dlogothetis opened a pull request:

    https://github.com/apache/giraph/pull/49

    Fix bug in memory estimation

    Method MemoryEstimatorOracle.calculateRegression() exits if the number of valid columns
to use for the regression differs from the total number of columns. This is wrong: the
regression can still run on only the valid columns. Because of this early exit, memory
estimation is never used in practice, and out-of-core (OOC) starts spilling only when memory
usage gets very high.
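
A minimal sketch of the idea, not the actual Giraph code: fit the regression on the
valid columns only and report a zero coefficient for the rest. The class name, the
method names, and the validity rule (a column is valid if it is not constant) are
assumptions made for illustration, and a plain normal-equations solver stands in for
whatever regression library the real class uses.

```java
import java.util.ArrayList;
import java.util.List;

final class ValidColumnRegression {

  /** Hypothetical validity rule: a column is usable if it is not constant. */
  static List<Integer> validColumns(double[][] x) {
    List<Integer> valid = new ArrayList<>();
    for (int c = 0; c < x[0].length; c++) {
      for (int r = 1; r < x.length; r++) {
        if (x[r][c] != x[0][c]) {
          valid.add(c);
          break;
        }
      }
    }
    return valid;
  }

  /**
   * Least-squares fit of y on an intercept plus the valid columns of x.
   * Assumes x has at least one row. Returns [intercept, coef(col 0), ...,
   * coef(col n-1)]; columns that were skipped keep a coefficient of 0.
   */
  static double[] fit(double[][] x, double[] y) {
    List<Integer> valid = validColumns(x);
    int k = valid.size() + 1; // +1 for the intercept

    // Accumulate the augmented normal equations [A^T A | A^T y].
    double[][] m = new double[k][k + 1];
    double[] row = new double[k];
    for (int r = 0; r < x.length; r++) {
      row[0] = 1.0;
      for (int j = 1; j < k; j++) {
        row[j] = x[r][valid.get(j - 1)];
      }
      for (int i = 0; i < k; i++) {
        for (int j = 0; j < k; j++) {
          m[i][j] += row[i] * row[j];
        }
        m[i][k] += row[i] * y[r];
      }
    }

    // Gauss-Jordan elimination with partial pivoting.
    for (int p = 0; p < k; p++) {
      int best = p;
      for (int i = p + 1; i < k; i++) {
        if (Math.abs(m[i][p]) > Math.abs(m[best][p])) {
          best = i;
        }
      }
      double[] tmp = m[p];
      m[p] = m[best];
      m[best] = tmp;
      if (Math.abs(m[p][p]) < 1e-12) {
        throw new IllegalStateException("valid columns are linearly dependent");
      }
      for (int i = 0; i < k; i++) {
        if (i == p) {
          continue;
        }
        double f = m[i][p] / m[p][p];
        for (int j = p; j <= k; j++) {
          m[i][j] -= f * m[p][j];
        }
      }
    }

    // Map the solution back to the full column layout.
    double[] full = new double[x[0].length + 1];
    full[0] = m[0][k] / m[0][0];
    for (int j = 1; j < k; j++) {
      full[valid.get(j - 1) + 1] = m[j][k] / m[j][j];
    }
    return full;
  }
}
```

With this shape, an input that has one constant column still yields a usable model:
the constant column is simply left out of the fit instead of aborting the whole
estimation.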
    
    The same fix is included in https://github.com/apache/giraph/pull/34, but I want to make
these changes one at a time so that each can be tested in isolation.
    
    Tests:
    - mvn clean install
    - Snapshot tests, including snapshot test that uses OOC.
    - Ran 3 production jobs and verified that this reduces data spills and that jobs finish
faster. The max % spilled is reduced by more than 40%.
    
    JIRA: https://issues.apache.org/jira/browse/GIRAPH-1160
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dlogothetis/giraph fix_mem_est

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/giraph/pull/49.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #49
    
----
commit f5a124beef6b65bf8f9178120fefc1360566fda6
Author: Dionysios Logothetis <dionysios@fb.com>
Date:   2017-09-19T14:47:56Z

    Fix bug in memory estimation

----


> Fix memory estimation in MemoryEstimatorOracle
> ----------------------------------------------
>
>                 Key: GIRAPH-1160
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-1160
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Dionysios Logothetis
>
> Method MemoryEstimatorOracle.calculateRegression() exits if the number of valid columns
to use for the regression differs from the total number of columns. This is wrong: the
regression can run on only the valid columns. This causes the memory estimation to be far
off.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
