Date: Sun, 14 Feb 2016 12:47:21 +0530
Subject: external data set support
From: Sandeep Joshi
To: dev@asterixdb.incubator.apache.org

Can someone describe the level of support for external data sets and the future roadmap? Let me divide the question into four broad issues:

1) Schema catalog: One would have to implement IMetadataProvider, IDataSource, IDataSourceIndex and other related classes. Is any functionality missing from the current schema implementation for external data sets? One of the papers says that one should add comparators and hash functions for any new data types introduced by the external data set. Which interface does one have to implement for that?

2) Query optimization: There is no cost-based optimizer yet within Algebricks, and therefore no API to support retrieval and use of table statistics from an external data source. Is something planned in this regard?

3) Data fetch and update: The VLDB'14 paper states that external data sets are read-only, static and without indices, but the current codebase has support for IExternalIndex and IIndexibleExternalDataSource, so presumably I can fetch records from an external data source (via a base table scan as well as an index). Can I write to an external data source?

4) Hyracks runtime: For data retrieval, is it sufficient to implement the interfaces within asterix.external.api, or does one also have to add some Hyracks operators which are constructed via contributeRuntimeOperator?

-Sandeep
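
To make question 1 concrete, here is a minimal sketch of what "adding a comparator and a hash function for a new data type" could look like. Everything in it is an assumption made for illustration: the two interfaces, the class name, and the 4-byte big-endian address type are hypothetical stand-ins, not the AsterixDB or Hyracks interfaces, whose actual names and signatures would need to be checked in the codebase.

```java
public class ExternalTypeSketch {

    // Hypothetical stand-in for a byte-range comparator interface.
    // NOT an AsterixDB/Hyracks type; defined here only for illustration.
    interface ByteRangeComparator {
        int compare(byte[] a, int aStart, int aLen, byte[] b, int bStart, int bLen);
    }

    // Hypothetical stand-in for a byte-range hash function interface.
    interface ByteRangeHash {
        int hash(byte[] bytes, int start, int len);
    }

    // Orders values of a hypothetical 4-byte big-endian address type by
    // comparing bytes as unsigned integers, left to right.
    static final ByteRangeComparator ADDRESS_COMPARATOR =
        (a, aStart, aLen, b, bStart, bLen) -> {
            int n = Math.min(aLen, bLen);
            for (int i = 0; i < n; i++) {
                int x = a[aStart + i] & 0xFF;
                int y = b[bStart + i] & 0xFF;
                if (x != y) {
                    return Integer.compare(x, y);
                }
            }
            // All shared bytes are equal: the shorter range sorts first.
            return Integer.compare(aLen, bLen);
        };

    // Hashes exactly the bytes the comparator inspects, so two values that
    // compare as equal always receive the same hash.
    static final ByteRangeHash ADDRESS_HASH = (bytes, start, len) -> {
        int h = 1;
        for (int i = 0; i < len; i++) {
            h = 31 * h + bytes[start + i];
        }
        return h;
    };

    public static void main(String[] args) {
        byte[] low = {10, 0, 0, 1};
        byte[] high = {(byte) 192, (byte) 168, 0, 1};
        int order = ADDRESS_COMPARATOR.compare(low, 0, 4, high, 0, 4);
        System.out.println("low sorts before high: " + (order < 0));
        System.out.println("hash of low: " + ADDRESS_HASH.hash(low, 0, 4));
    }
}
```

The one design point the sketch tries to show is that the two functions must agree: any pair of values the comparator treats as equal has to hash identically, otherwise hash-based joins and grouping would disagree with sort-based ones.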