Mailing-List: contact issues-help@hive.apache.org; run by ezmlm
Reply-To: dev@hive.apache.org
Delivered-To: mailing list issues@hive.apache.org
Date: Wed, 5 Oct 2016 21:03:21 +0000 (UTC)
From: "Sergey Shelukhin (JIRA)"
To: issues@hive.apache.org
Subject: [jira] [Comment Edited] (HIVE-14870) OracleStore: RawStore implementation optimized for Oracle
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Wed, 05 Oct 2016 21:03:23 -0000

    [ https://issues.apache.org/jira/browse/HIVE-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549918#comment-15549918 ]

Sergey Shelukhin edited comment on HIVE-14870 at 10/5/16 9:03 PM:
------------------------------------------------------------------

We only need very
limited functionality compared to DN. A layer like this already exists in ACID, so I don't see why it cannot be reused and augmented. The only change needed would be the ability to replace some parts to optimize for Oracle (or other DBs) via some sort of plugin option (or even a switch statement), which will not be pretty but is imho preferable to the alternatives.

As I see it, I would be merely -0 on the thing in itself: it's bad enough to have 2.5 SQL "engines" (ORM, the one in ACID, and directsql) without adding a third, and then another federation thing that is not hidden at a lower level like the direct SQL one. The direct SQL one has caused (and will probably keep causing ;)) a few problems and special cases, simple as it is, plus the confusion with failures-that-are-not-really-failures, failure to fall back, sudden unexplained slowdowns when the fallback is successful, etc.

There are probably all kinds of other issues; e.g., off the top of my head, how does this work with upgrade scripts? Would we need to create and maintain another set? Would the scripts to switch the schema between the old and the new always be the same, or would there eventually need to be a back-and-forth script for every version (I don't think one would ever need that, but it is a possibility)? Etc.

However, my main meta concern is about the approach: what do we do if someone wants an optimized MySqlEngine, MsSqlEngine, AzureEngine, etc.? They would totally c/p the Oracle one, rewrite a few critical SQL queries, and submit a patch. That can quickly turn into a maintenance nightmare.

It appears to me that the existing custom-SQL layer in ACID could be reused, if desired (or used as inspiration), to make this store ANSI-ish (does it have any significant limitations currently?). That way we can keep query optimizations in a plugin (or even a switch statement if need be).
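To make the switch-statement option concrete, here is a minimal Java sketch. It assumes a hypothetical `DbKind` enum and `DialectSqlSketch` helper; neither name is taken from the Hive codebase, and the table and column in the example are placeholders. The idea it illustrates is a single shared query path in which only the database-specific fragment (here, row limiting) varies per backend:

```java
// Hypothetical sketch: these names are illustrative and do not come from Hive.
enum DbKind { ORACLE, MYSQL, POSTGRES, MSSQL }

final class DialectSqlSketch {
    private DialectSqlSketch() {}

    /**
     * Builds a SELECT that returns at most maxRows rows, using the
     * row-limiting syntax of the given database.
     * columnsAndFrom is everything after SELECT, e.g. "TBL_ID from TBLS".
     */
    static String selectWithLimit(DbKind db, String columnsAndFrom, int maxRows) {
        if (maxRows < 0) {
            throw new IllegalArgumentException("maxRows must be >= 0");
        }
        switch (db) {
            case ORACLE:
                // Oracle before 12c has no LIMIT clause; filter on ROWNUM in an outer query.
                return "select * from (select " + columnsAndFrom + ") where ROWNUM <= " + maxRows;
            case MSSQL:
                // SQL Server puts the limit right after SELECT.
                return "select TOP(" + maxRows + ") " + columnsAndFrom;
            case MYSQL:
            case POSTGRES:
                // MySQL and PostgreSQL both accept a trailing LIMIT.
                return "select " + columnsAndFrom + " limit " + maxRows;
            default:
                throw new IllegalStateException("Unsupported database: " + db);
        }
    }

    public static void main(String[] args) {
        // Print the variant generated for each backend.
        for (DbKind db : DbKind.values()) {
            System.out.println(db + ": " + selectWithLimit(db, "TBL_ID from TBLS", 10));
        }
    }
}
```

With this shape, supporting another database means adding one enum constant and one `case`, rather than copying a whole store implementation and rewriting its queries.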
Reusing that layer also has the additional advantage of letting us deprecate and then ditch ORM altogether, which would simplify things instead of making them more complex. Another alternative path (that could be pursued in parallel) is making RawStore pluggable, so that such specific implementations could be used without being a supported part of the Hive codebase.


was (Author: sershe):
We only need very limited functionality compared to DN. A layer like this already exists in ACID, so I don't see why it cannot be reused and augmented. The only change needed would be the ability to replace some parts to optimize for Oracle (or other DBs), which will not be pretty but is imho preferable to the alternatives.

As I see it, I would be merely -0 on the thing in itself: it's bad enough to have 2.5 SQL "engines" (ORM, the one in ACID, and directsql) without adding a third, and then another federation thing that is not hidden at a lower level like the direct SQL one. The direct SQL one has caused (and will probably keep causing ;)) a few problems and special cases, simple as it is, plus the confusion with failures-that-are-not-really-failures, failure to fall back, sudden unexplained slowdowns when the fallback is successful, etc.

There are probably all kinds of other issues; e.g., off the top of my head, how does this work with upgrade scripts? Would we need to create and maintain another set? Would the scripts to switch the schema between the old and the new always be the same, or would there eventually need to be a back-and-forth script for every version (I don't think one would ever need that, but it is a possibility)? Etc.

However, my main meta concern is about the approach: what do we do if someone wants an optimized MySqlEngine, MsSqlEngine, AzureEngine, etc.? They would totally c/p the Oracle one, rewrite a few critical SQL queries, and submit a patch. That can quickly turn into a maintenance nightmare.

It appears to me that the existing custom-SQL layer in ACID could be reused, if desired (or used as inspiration), to make this store ANSI-ish (does it have any significant limitations currently?). That way we can keep query optimizations in a plugin (or even a switch statement if need be). This also has the additional advantage of letting us deprecate and then ditch ORM altogether, which would simplify things instead of making them more complex. Another alternative path (that could be pursued in parallel) is making RawStore pluggable, so that such specific implementations could be used without being a supported part of the Hive codebase.

> OracleStore: RawStore implementation optimized for Oracle
> ---------------------------------------------------------
>
>                 Key: HIVE-14870
>                 URL: https://issues.apache.org/jira/browse/HIVE-14870
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>            Reporter: Chris Drome
>            Assignee: Chris Drome
>         Attachments: OracleStoreDesignProposal.pdf
>
>
> The attached document is a proposal for a RawStore implementation which is optimized for Oracle and replaces DataNucleus. The document outlines schema changes, OracleStore implementation details, and performance tests against ObjectStore, ObjectStore+DirectSQL, and OracleStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)