From: "Till Westmann"
To: dev@asterixdb.apache.org
Subject: Re: Temporary Datasets
Date: Mon, 20 Nov 2017 13:10:57 -0800

+1

Till

On 20 Nov 2017, at 12:05, Mike Carey wrote:

> +1 to remove them and then re-create them later - based on the state
> of the AsterixDB storage world and cluster dynamics at that time.
> (I think we'll have a better chance of getting them perfect if we re-do
> them then - I don't remember that it took Yingyi very long to do them
> the first time, so I think the re-do path will beat the fix-up path if
> we want them again.) As far as I know, since we don't document them,
> nobody is using them - and I think the engineering cost of maintaining
> orphaned code is too high (not worth it).
>
> Any thoughts to the contrary?
>
> Cheers,
>
> Mike
>
>
> On 11/20/17 10:52 AM, Murtadha Hubail wrote:
>> Hi all,
>>
>> As you might be aware, we have a feature in AsterixDB to create
>> temporary datasets that differ from regular datasets in some ways,
>> such as:
>>
>> - Their existence is not persisted in metadata, but only in the CC
>>   metadata cache.
>> - They don’t generate any transaction logs.
>> - Their files are deleted on NC restart.
>> - If they are not accessed for some period of time, their metadata
>>   records are removed from the CC metadata cache.
>>
>> Temporary datasets were originally introduced to serve as a staging
>> area between AsterixDB and external systems such as Pregelix. However,
>> as the system evolved over the years, the assumptions they were built
>> on no longer hold, and they can lead to undesired consequences
>> such as leaking files after a CC restart or an inability to access the
>> dataset files on a restarted NC. Therefore, I’m proposing to remove
>> support for the current temporary datasets; we may add the
>> feature back with a careful design at a later stage.
>>
>> Any thoughts or concerns on removing temporary datasets?
>>
>> Cheers,
>>
>> Murtadha