spark-issues mailing list archives

From "Sean Owen (JIRA)" <j...@apache.org>
Subject [jira] (SPARK-19371) Cannot spread cached partitions evenly across executors
Date Mon, 30 Jan 2017 13:12:43 GMT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml"> 
    <head> 
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 
        <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0"
/> <base href="https://issues.apache.org/jira" /> 
        <title>Message Title</title> 
    </head> 
    <body class="jira" style="color: #333; font-family: Arial, sans-serif; font-size: 14px;
line-height: 1.429"> 
        <table id="background-table" cellpadding="0" cellspacing="0" width="100%" style="border-collapse:
collapse; mso-table-lspace: 0pt; mso-table-rspace: 0pt; background-color: #f5f5f5; border-collapse:
collapse; mso-table-lspace: 0pt; mso-table-rspace: 0pt"> 
            <!-- header here --> 
            <tr> 
                <td id="header-pattern-container" style="padding: 0px; border-collapse:
collapse; padding: 10px 20px"> 
                    <table id="header-pattern" cellspacing="0" cellpadding="0" border="0"
style="border-collapse: collapse; mso-table-lspace: 0pt; mso-table-rspace: 0pt"> 
                        <tr> 
                            <td id="header-avatar-image-container" valign="top" style="padding:
0px; border-collapse: collapse; vertical-align: top; width: 32px; padding-right: 8px">
<img id="header-avatar-image" class="image_fix" src="cid:jira-generated-image-avatar-srowen-6f4d17d8-f8d9-4209-9b75-2c3d6d359dfe"
height="32" width="32" border="0" style="border-radius: 3px; vertical-align: top" /> 
                            </td> 
                            <td id="header-text-container" valign="middle" style="padding:
0px; border-collapse: collapse; vertical-align: middle; font-family: Arial, sans-serif; font-size:
14px; line-height: 20px; mso-line-height-rule: exactly; mso-text-raise: 1px"> <a class="user-hover"
rel="srowen" id="email_srowen" href="https://issues.apache.org/jira/secure/ViewProfile.jspa?name=srowen"
style="color:#3b73af;; color: #3b73af; text-decoration: none">Sean Owen</a> <strong>commented</strong>
on <a href="https://issues.apache.org/jira/browse/SPARK-19371" style="color: #3b73af; text-decoration:
none"><img src="cid:jira-generated-image-static-bug-cf24f270-9cab-4232-87a4-2af308a8bd54"
height="16" width="16" border="0" align="absmiddle" alt="Bug" /> SPARK-19371</a>

                            </td> 
                        </tr> 
                    </table> 
                </td> 
            </tr> 
            <tr> 
                <td id="email-content-container" style="padding: 0px; border-collapse:
collapse; padding: 0 20px"> 
                    <table id="email-content-table" cellspacing="0" cellpadding="0" border="0"
width="100%" style="border-collapse: collapse; mso-table-lspace: 0pt; mso-table-rspace: 0pt;
border-spacing: 0; border-collapse: separate"> 
                        <tr> 
                            <!-- there needs to be content in the cell for it to render
in some clients --> 
                            <td class="email-content-rounded-top mobile-expand" style="padding:
0px; border-collapse: collapse; color: #fff; padding: 0 15px 0 16px; height: 15px; background-color:
#fff; border-left: 1px solid #ccc; border-top: 1px solid #ccc; border-right: 1px solid #ccc;
border-bottom: 0; border-top-right-radius: 5px; border-top-left-radius: 5px; height: 10px;
line-height: 10px; padding: 0 15px 0 16px; mso-line-height-rule: exactly">
                                &nbsp;
                            </td> 
                        </tr> 
                        <tr> 
                            <td class="email-content-main mobile-expand " style="padding:
0px; border-collapse: collapse; border-left: 1px solid #ccc; border-right: 1px solid #ccc;
border-top: 0; border-bottom: 0; padding: 0 15px 0 16px; background-color: #fff"> 
                                <table class="page-title-pattern" cellspacing="0" cellpadding="0"
border="0" width="100%" style="border-collapse: collapse; mso-table-lspace: 0pt; mso-table-rspace:
0pt"> 
                                    <tr> 
                                        <td style="vertical-align: top;; padding: 0px;
border-collapse: collapse; padding-right: 5px; font-size: 20px; line-height: 30px; mso-line-height-rule:
exactly" class="page-title-pattern-header-container"> <span class="page-title-pattern-header"
style="font-family: Arial, sans-serif; padding: 0; font-size: 20px; line-height: 30px; mso-text-raise:
2px; mso-line-height-rule: exactly; vertical-align: middle"> <a href="https://issues.apache.org/jira/browse/SPARK-19371"
style="color: #3b73af; text-decoration: none">Re: Cannot spread cached partitions evenly
across executors</a> </span> 
                                        </td> 
                                    </tr> 
                                </table> 
                            </td> 
                        </tr> 
                        <tr> 
                            <td id="text-paragraph-pattern-top" class="email-content-main
mobile-expand  comment-top-pattern" style="padding: 0px; border-collapse: collapse; border-left:
1px solid #ccc; border-right: 1px solid #ccc; border-top: 0; border-bottom: 0; padding: 0
15px 0 16px; background-color: #fff; border-bottom: none; padding-bottom: 0"> 
                                <table class="text-paragraph-pattern" cellspacing="0" cellpadding="0"
border="0" width="100%" style="border-collapse: collapse; mso-table-lspace: 0pt; mso-table-rspace:
0pt; font-family: Arial, sans-serif; font-size: 14px; line-height: 20px; mso-line-height-rule:
exactly; mso-text-raise: 2px"> 
                                    <tr> 
                                        <td class="text-paragraph-pattern-container mobile-resize-text
" style="padding: 0px; border-collapse: collapse; padding: 0 0 10px 0"> 
                                            <p style="margin: 10px 0 0 0">In your case
you want to cache a copy of the data in memory, which makes the locality of the original data
irrelevant (good, because you can't always get locality, even on HDFS). Once the data is in memory,
being unbalanced across executors isn't a big deal: in theory, and mostly in practice,
one execution slot is as good as another, since each is a core operating on local memory. I
do believe Spark already prefers spreading tasks across executors, all else being equal. </p>

                                            <p style="margin: 10px 0 0 0">A shuffle
in front of this operation could be the right thing, because some copying is going to happen
anyway, so the extra cost isn't high. Coupled with not waiting for locality on the initial
read, I think this is indeed the right approach in Spark now.</p> 
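                                            <p style="margin: 10px 0 0 0">As a rough
illustration of the approach described above, here is a minimal sketch, assuming a Spark 2.x
application launched with spark-submit. The input path, the partition count, and the object name
are hypothetical placeholders and do not come from the ticket. It sets spark.locality.wait to zero
so the initial read does not wait for data-local slots, then puts a shuffle (repartition) in front
of the cache. This should give partitions of roughly equal size; how they end up distributed across
executors still depends on the scheduler.</p>

```scala
import org.apache.spark.sql.SparkSession

object EvenCacheSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("even-cache-sketch")
      // Do not hold tasks back waiting for a data-local slot on the initial read.
      .config("spark.locality.wait", "0s")
      .getOrCreate()

    // Hypothetical values, for illustration only.
    val inputPath = "hdfs:///path/to/input"
    val numPartitions = 200

    val cached = spark.read.parquet(inputPath)
      // repartition(n) performs a full shuffle, so the cached partitions
      // are no longer tied to where the source blocks were read.
      .repartition(numPartitions)
      .cache()

    // cache() is lazy; running an action materializes the cached partitions.
    println(cached.count())

    spark.stop()
  }
}
```

                                            <p style="margin: 10px 0 0 0">The partition
count is a tuning choice; a multiple of the total number of executor cores is a common starting
point, but that is an assumption to validate against the actual cluster.</p>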
                                        </td> 
                                    </tr> 
                                </table> 
                            </td> 
                        </tr> 
                        <!-- there needs to be content in the cell for it to render in
some clients --> 
                        <tr> 
                            <td class="email-content-rounded-bottom mobile-expand" style="padding:
0px; border-collapse: collapse; color: #fff; padding: 0 15px 0 16px; height: 5px; line-height:
5px; background-color: #fff; border-top: 0; border-left: 1px solid #ccc; border-bottom: 1px
solid #ccc; border-right: 1px solid #ccc; border-bottom-right-radius: 5px; border-bottom-left-radius:
5px; mso-line-height-rule: exactly">
                                &nbsp;
                            </td> 
                        </tr> 
                    </table> 
                </td> 
            </tr> 
            <tr> 
                <td id="footer-pattern" style="padding: 0px; border-collapse: collapse;
padding: 12px 20px"> 
                    <table id="footer-pattern-container" cellspacing="0" cellpadding="0"
border="0" style="border-collapse: collapse; mso-table-lspace: 0pt; mso-table-rspace: 0pt">

                        <tr> 
                            <td id="footer-pattern-text" class="mobile-resize-text" width="100%"
style="padding: 0px; border-collapse: collapse; color: #999; font-size: 12px; line-height:
18px; font-family: Arial, sans-serif; mso-line-height-rule: exactly; mso-text-raise: 2px">
                                 This message was sent by Atlassian JIRA <span id="footer-build-information">(v6.3.15#6346-<span
title="dbc023dd75cecacf443c4b235f66124b15f5c5fe" data-commit-id="dbc023dd75cecacf443c4b235f66124b15f5c5fe}">sha1:dbc023d</span>)</span>

                            </td> 
                            <td id="footer-pattern-logo-desktop-container" valign="top"
style="padding: 0px; border-collapse: collapse; padding-left: 20px; vertical-align: top">

                                <table style="border-collapse: collapse; mso-table-lspace:
0pt; mso-table-rspace: 0pt"> 
                                    <tr> 
                                        <td id="footer-pattern-logo-desktop-padding" style="padding:
0px; border-collapse: collapse; padding-top: 3px"> <img id="footer-pattern-logo-desktop"
src="cid:jira-generated-image-static-footer-desktop-logo-49da3a6d-b796-49b9-a764-139b84fce7b1"
alt="Atlassian logo" title="Atlassian logo" width="169" height="36" class="image_fix" />

                                        </td> 
                                    </tr> 
                                </table> 
                            </td> 
                        </tr> 
                    </table> 
                </td> 
            </tr> 
        </table>   
    </body>
</html>