Date: Fri, 9 Sep 2016 11:45:20 +0000 (UTC)
From: "marco polo (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Subject: [jira] [Commented] (ACCUMULO-4391) Source deepcopies cannot be used safely in separate threads in tserver

    [ https://issues.apache.org/jira/browse/ACCUMULO-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15476871#comment-15476871 ]

marco polo commented on ACCUMULO-4391:
--------------------------------------

Why is that decompressor being shared?
Why isn't the thread being given access to its own decompressor on its own block read?

> Source deepcopies cannot be used safely in separate threads in tserver
> ----------------------------------------------------------------------
>
>                 Key: ACCUMULO-4391
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4391
>             Project: Accumulo
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.6.5
>            Reporter: Ivan Bella
>            Assignee: Ivan Bella
>             Fix For: 1.6.6, 1.7.3, 1.8.1, 2.0.0
>
>   Original Estimate: 24h
>          Time Spent: 12.5h
>  Remaining Estimate: 11.5h
>
> We have iterators that create deep copies of the source and use them in separate threads. As it turns out, this is not safe, and we end up with many exceptions, mostly down in the ZlibDecompressor library. Curiously, if you turn on the data cache for the table being scanned, the errors disappear.
> After much hunting, it turns out that the real bug is in the BoundedRangeFileInputStream. The read() method therein appropriately synchronizes on the underlying FSDataInputStream; however, the available() method does not. Adding similar synchronization on that stream fixes the issues. On a side note, the available() call is only invoked within the Hadoop CompressionInputStream for use in the getPos() call. That call does not appear to actually be used, at least in this context.
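
The fix described in the issue (having available() take the same lock on the shared underlying stream that read() already takes) might look roughly like the sketch below. This is not the Accumulo source: `SharedSeekableStream` is a hypothetical in-memory stand-in for Hadoop's FSDataInputStream, `BoundedRangeStream` is a simplified stand-in for BoundedRangeFileInputStream, and the seek inside available() is an assumption made to keep the example deterministic.

```java
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class BoundedRangeSketch {

    /** Hypothetical stand-in for a shared, seekable file stream. */
    static final class SharedSeekableStream {
        private final byte[] data;
        private int pos;

        SharedSeekableStream(byte[] data) { this.data = data; }

        void seek(int newPos) { pos = newPos; }

        int available() { return data.length - pos; }

        int read(byte[] buf, int off, int len) {
            if (pos >= data.length) return -1;
            int n = Math.min(len, data.length - pos);
            System.arraycopy(data, pos, buf, off, n);
            pos += n;
            return n;
        }
    }

    /** Simplified bounded view over [start, end) of the shared stream. */
    static final class BoundedRangeStream extends InputStream {
        private final SharedSeekableStream in;
        private final int end;
        private int pos;

        BoundedRangeStream(SharedSeekableStream in, int start, int end) {
            this.in = in;
            this.pos = start;
            this.end = end;
        }

        @Override
        public int read() {
            byte[] one = new byte[1];
            return read(one, 0, 1) == -1 ? -1 : (one[0] & 0xff);
        }

        @Override
        public int read(byte[] b, int off, int len) {
            if (len == 0) return 0;
            int n = Math.min(len, end - pos);
            if (n <= 0) return -1;
            int got;
            synchronized (in) {          // lock the shared stream, as read() already did
                in.seek(pos);
                got = in.read(b, off, n);
            }
            if (got > 0) pos += got;
            return got;
        }

        @Override
        public int available() {
            synchronized (in) {          // the change: same lock as read()
                in.seek(pos);            // assumption: reposition before asking
                return Math.min(in.available(), end - pos);
            }
        }
    }

    /** Drains a bounded stream into a String (ASCII data assumed). */
    static String readAll(BoundedRangeStream s) {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = s.read()) != -1) sb.append((char) c);
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        SharedSeekableStream shared =
            new SharedSeekableStream("AAAABBBB".getBytes(StandardCharsets.US_ASCII));
        BoundedRangeStream a = new BoundedRangeStream(shared, 0, 4);
        BoundedRangeStream b = new BoundedRangeStream(shared, 4, 8);
        String[] out = new String[2];
        Thread ta = new Thread(() -> { a.available(); out[0] = readAll(a); });
        Thread tb = new Thread(() -> { b.available(); out[1] = readAll(b); });
        ta.start(); tb.start();
        ta.join(); tb.join();
        System.out.println(out[0] + " " + out[1]);
    }
}
```

Because every access to the shared stream, including the one made by available(), happens inside `synchronized (in)`, a seek issued on behalf of one reader cannot interleave with a read issued on behalf of another. Whether each thread should instead get its own stream and decompressor, as the comment above asks, is a separate design question that this sketch does not address.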