hive-issues mailing list archives

From "Prasanth Jayachandran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-17411) LLAP IO may incorrectly release a refcount in some rare cases
Date Wed, 30 Aug 2017 01:07:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146439#comment-16146439 ]

Prasanth Jayachandran commented on HIVE-17411:
----------------------------------------------

Not sure if there is a repro for this issue. If there is, can this be tested by not projecting
the large dictionary column?
Looks good otherwise, +1.

> LLAP IO may incorrectly release a refcount in some rare cases
> -------------------------------------------------------------
>
>                 Key: HIVE-17411
>                 URL: https://issues.apache.org/jira/browse/HIVE-17411
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-17411.patch
>
>
> In a large stream whose buffers are not reused, and that is separated into many CBs (e.g.
due to a small ORC compression buffer size), it may happen that some, but not all, of the buffers
that are read together as a unit are evicted from cache.
> If CacheBuffer follows BufferChunk in the buffer list when a stream like this is read,
the latter will be converted to ProcCacheChunk; it is possible for the early refcount release
logic from the former to release the refcount (for a dictionary stream, the initial refCount
is always released early), and then backtrack to the latter to see if we can unlock more buffers.
It would then try to decref an uninitialized MemoryBuffer in ProcCacheChunk, because ProcCacheChunk
looks like a CacheChunk. PCC initial refcounts are released separately, after the data is uncompressed.
> I'm assuming this would almost never happen with non-stripe-level streams, because one
would need a large RG that spans 2+ CBs, no overlap with the next/previous RGs in 2+ buffers for
the early release to kick in, and an unfortunate eviction order. However, it's possible with
large-ish dictionaries.
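
The failure mode described above can be illustrated with a minimal Java sketch. This is not the code from HIVE-17411.patch or the actual LLAP classes: SketchBuffer, SketchCacheChunk, SketchProcCacheChunk and EarlyRelease are hypothetical stand-ins, and skipping not-yet-decompressed chunks during the backtrack is an assumed guard, shown only to make the "ProcCacheChunk looks like a CacheChunk" hazard concrete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a cache memory buffer with an initial refcount of 1.
class SketchBuffer {
    private final AtomicInteger refCount = new AtomicInteger(1);

    int decRef() {
        int remaining = refCount.decrementAndGet();
        if (remaining < 0) {
            throw new IllegalStateException("refcount underflow");
        }
        return remaining;
    }

    int refCount() {
        return refCount.get();
    }
}

// Hypothetical stand-in for a chunk whose data is already cached.
class SketchCacheChunk {
    protected SketchBuffer buffer;

    SketchCacheChunk(SketchBuffer buffer) {
        this.buffer = buffer;
    }

    SketchBuffer getBuffer() {
        return buffer;
    }
}

// Hypothetical stand-in for a chunk still waiting for decompression.
// It is a subclass of the cached chunk type, so it "looks like" one,
// but its buffer is not initialized until decompression finishes.
class SketchProcCacheChunk extends SketchCacheChunk {
    SketchProcCacheChunk() {
        super(null);
    }

    void finishDecompression(SketchBuffer decompressed) {
        this.buffer = decompressed;
    }
}

final class EarlyRelease {
    private EarlyRelease() {
    }

    // Walks backwards from fromIndex and releases one reference per cached chunk.
    // Assumed guard: chunks pending decompression are skipped, because their
    // references are released separately once the data is uncompressed.
    static int releaseBackwards(List<SketchCacheChunk> chunks, int fromIndex) {
        int released = 0;
        for (int i = fromIndex; i >= 0; i--) {
            SketchCacheChunk chunk = chunks.get(i);
            if (chunk instanceof SketchProcCacheChunk) {
                continue; // not a plain cached chunk; leave it alone
            }
            SketchBuffer buffer = chunk.getBuffer();
            if (buffer != null) {
                buffer.decRef();
                released++;
            }
        }
        return released;
    }
}
```

Without the instanceof check, the backtracking loop would treat the pending chunk as an ordinary cached chunk and attempt to release a reference on a buffer that has not been set up yet, which is the situation the description warns about.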



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
