Date: Thu, 30 Nov 2017 20:58:00 +0000 (UTC)
From: "Keith Turner (JIRA)"
Reply-To: jira@apache.org
To: notifications@accumulo.apache.org
Subject: [jira] [Assigned] (ACCUMULO-4744) Using RFile API with cache and multiple files hides data


     [ https://issues.apache.org/jira/browse/ACCUMULO-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Turner reassigned ACCUMULO-4744:
--------------------------------------

    Assignee: Keith Turner

> Using RFile API with cache and multiple files hides data
> --------------------------------------------------------
>
>                 Key: ACCUMULO-4744
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4744
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.8.0, 1.8.1
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.8.2, 2.0.0
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Noticed this bug in source code while working on ACCUMULO-4641. When using the RFile API introduced in 1.8 to read from multiple files with cache enabled, not all data may be seen. This happens because internally the code gives all input sources the same cache id. Therefore index and data blocks from multiple files collide in the cache.
> This bug does not happen when reading data through tserver, only the RFile API.
> {code:java}
>   Scanner scanner =
>       RFile.newScanner()
>           .from(file1, file2, file3) //multiple input files
>           .withFileSystem(localFs)
>           .withIndexCache(1000000) //enabled cache
>           .withDataCache(10000000) //enabled cache
>           .build();
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
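
The collision described in the issue can be illustrated with a small standalone sketch. This is not Accumulo's code: the class name, the readBlock helper, the key layout ("cacheId:block-N"), and the HashMap standing in for the block cache are all assumptions made for illustration. It only shows why giving every input source the same cache id can make a block from one file appear in place of the same-numbered block from another file.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Accumulo.
class CacheIdCollisionSketch {

    // Stand-in for a shared block cache, keyed by "<cacheId>:block-<index>".
    static final Map<String, String> cache = new HashMap<>();

    // Returns the cached block for (cacheId, blockIndex), loading it from
    // the given file only when the key is not already present.
    static String readBlock(String cacheId, String fileName, int blockIndex) {
        String key = cacheId + ":block-" + blockIndex;
        return cache.computeIfAbsent(key, k -> fileName + " block " + blockIndex);
    }

    public static void main(String[] args) {
        // Same cache id for every source: file2's block 0 maps to the key
        // already filled by file1, so file2's data is hidden.
        String first = readBlock("source", "file1", 0);
        String second = readBlock("source", "file2", 0);
        System.out.println(first);   // file1 block 0
        System.out.println(second);  // file1 block 0 (file2's block is never read)

        // A distinct cache id per source keeps the keys apart.
        cache.clear();
        String third = readBlock("source-0", "file1", 0);
        String fourth = readBlock("source-1", "file2", 0);
        System.out.println(third);   // file1 block 0
        System.out.println(fourth);  // file2 block 0
    }
}
```

In this sketch the remedy is simply a unique id per input source. Whether the actual fix for this issue takes that form is not stated in this message.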