From: Joel D
Date: Mon, 16 Apr 2018 21:15:56 +0000
Subject: Re: Business Rules Engine for Hive
To: user@hive.apache.org

The business rules we have here are currently embedded in Hive code. They range from basic standardization using case blocks to complex multi-column validation.

Thanks.

On Mon, Apr 16, 2018 at 5:03 PM Jörn Franke wrote:

> The question is what do your rules do? Do you need to maintain a factbase
> or do they just check data quality within certain tables?
>
> On 16. Apr 2018, at 22:28, Joel D wrote:
>
> Ok.
>
> Rough ideas:
> To keep the business logic outside code, I was thinking to give a custom
> UI.
>
> Next read from UI data and build UDFs using the rules defined outside the
> UDF.
>
> 1 UDF per data object.
>
> Not sure, these are just thoughts.
>
> On Mon, Apr 16, 2018 at 1:40 PM Jörn Franke wrote:
>
>> I would not use Drools with Spark, it does not scale to the distributed
>> setting.
>>
>> You could translate the rules to hive queries but this would not be
>> exactly the same thing.
>>
>> > On 16. Apr 2018, at 17:59, Joel D wrote:
>> >
>> > Hi,
>> >
>> > Any suggestions on how to implement Business Rules Engine with Hive ETLs?
>> >
>> > For spark based Etl jobs, I was exploring Drools but not sure about Hive.
>> >
>> > Thanks.
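One way to picture the "translate the rules to hive queries" suggestion from the thread, for rules of the kind described (standardization via case blocks, multi-column validation), is to keep the rules as plain data and render them into CASE expressions. This is only a minimal sketch under stated assumptions: the rule format, the table name `customers`, and the columns `country`, `start_date` and `end_date` are all hypothetical and not taken from the thread, and, as noted above, this would not be exactly the same thing as a rules engine.

```python
# Hypothetical rule definitions kept outside the ETL code
# (they could come from a UI, a config file, or a metadata table).
RULES = [
    {
        # Basic standardization using a case block.
        "target": "country_std",
        "cases": [
            ("upper(trim(country)) IN ('US', 'USA')", "'US'"),
            ("upper(trim(country)) IN ('UK', 'GB')", "'GB'"),
        ],
        "default": "'UNKNOWN'",
    },
    {
        # Multi-column validation: both dates present and correctly ordered.
        "target": "is_valid_range",
        "cases": [
            ("start_date IS NOT NULL AND end_date IS NOT NULL "
             "AND start_date <= end_date", "true"),
        ],
        "default": "false",
    },
]


def rule_to_case(rule):
    """Render one rule as a HiveQL CASE expression aliased to its target column."""
    whens = " ".join(f"WHEN {cond} THEN {value}" for cond, value in rule["cases"])
    return f"CASE {whens} ELSE {rule['default']} END AS {rule['target']}"


def build_query(table, rules):
    """Build a SELECT that keeps all source columns and appends one column per rule."""
    exprs = ",\n  ".join(rule_to_case(r) for r in rules)
    return f"SELECT t.*,\n  {exprs}\nFROM {table} t"


if __name__ == "__main__":
    # Print the generated HiveQL; running it against Hive is left to the ETL job.
    print(build_query("customers", RULES))
```

The generated text can be submitted like any other Hive query, so changing a rule means editing the data rather than the ETL code. The conditions are pasted into the query verbatim, so a real setup would need to validate whatever the UI lets users enter.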