asterixdb-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Taewoo Kim (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ASTERIXDB-1733) Hash Table used by hash-join doesn't conform to the budget.
Date Sat, 03 Dec 2016 22:09:58 GMT

    [ https://issues.apache.org/jira/browse/ASTERIXDB-1733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15718817#comment-15718817
] 

Taewoo Kim commented on ASTERIXDB-1733:
---------------------------------------

Like the ASTERIXDB-1566 case, 

When the build phase of optimize hybrid hash join starts, since we don't have the estimation
on input size, we use the default size (140GB). The result is that we will have so many partitions
that is similar to the number of frames that were allocated to a join. For example, if we
use 192MB as join memory, the number of partitions will be 972. As a result, this will generate
a lot of small chunks that is smaller than 1MB. Rather than having several disk I/Os of smaller
chunks, more bigger chunk would be preferable. 

I think for the first input (build), like we do to estimate the hash table size (cardinality)
for hash group-by using MIN(# of possible hash entries, # of possible tuples for the given
memory budget), the same logic can be applied to here. 

> Hash Table used by hash-join doesn't conform to the budget.
> -----------------------------------------------------------
>
>                 Key: ASTERIXDB-1733
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1733
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>
> The hash table in the hash join doesn't conform to the frame limit (join.memory parameter
in the configuration file). This is related to ASTERIXDB-1556.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message