atlas-dev mailing list archives

From "Nigel Jones (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ATLAS-1955) Validation for Attributes
Date Wed, 19 Jul 2017 15:11:00 GMT

    [ https://issues.apache.org/jira/browse/ATLAS-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093242#comment-16093242 ]

Nigel Jones commented on ATLAS-1955:
------------------------------------

I believe what you are modeling here is the fact that an email has to follow a certain
format, so that metadata should be captured in Atlas.
However, the actual validation of instances of this data, i.e. a customer record being stored
in a DB, would typically happen outside Atlas. In addition, the validation may differ, as it
would be specific to the data processing system being used: an ETL engine, HBase, a
filesystem, different languages.
So I think the model is more similar to that of policies and rules, and how Ranger works.
In Atlas we have a business-centric definition of a policy, but the actual implementation
sits at the enforcement point (in this case a Ranger rule).
I'm interested in being able to capture metadata from Ranger so we can then tie the rule
implementation back to the policy, to aid compliance checks and reporting, as well as to
allow Ranger to query Atlas for policies when a security admin is creating a rule.
So I wonder if the same pattern applies here with validation?
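
To make the pattern concrete, here is a minimal Python sketch of tying system-specific implementations back to a business-level definition and reporting which definitions nobody implements yet. All class, field, and function names are hypothetical and chosen for illustration only; none of this is an Atlas or Ranger API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationDefinition:
    """Business-level definition, as it might be held in the catalog (hypothetical)."""
    name: str
    description: str


@dataclass(frozen=True)
class ValidationImplementation:
    """Concrete check living in a processing system, i.e. the enforcement point."""
    system: str      # e.g. "etl" or "hbase"
    rule: str        # system-specific rule text
    implements: str  # name of the ValidationDefinition this rule realises


def coverage_report(definitions, implementations):
    """Map each definition name to the list of systems that implement it."""
    report = {d.name: [] for d in definitions}
    for impl in implementations:
        if impl.implements in report:
            report[impl.implements].append(impl.system)
    return report


definitions = [
    ValidationDefinition("email_format", "Email must follow the agreed format"),
    ValidationDefinition("country_name", "Country must come from the reference list"),
]
implementations = [
    ValidationImplementation("etl", r"regex: [0-9a-z]+@[0-9a-z]+\.[0-9a-z]+", "email_format"),
    ValidationImplementation("hbase", "row-level check on the email column", "email_format"),
]

report = coverage_report(definitions, implementations)
# Definitions with no implementation anywhere are candidates for a compliance finding.
unimplemented = [name for name, systems in report.items() if not systems]
```

The definition stays system-neutral; each enforcement point keeps its own rule text and only records which definition it realises.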

> Validation for Attributes
> -------------------------
>
>                 Key: ATLAS-1955
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1955
>             Project: Atlas
>          Issue Type: New Feature
>          Components:  atlas-core
>    Affects Versions: 0.9-incubating
>            Reporter: Israel Varea
>             Fix For: 0.9-incubating
>
>
> It would be very nice if the Atlas model could contain a way to represent attribute validation.
>
> A simple example is that we would like to model a Person, with attributes Name, Email
> and Country. Now we would like to specify that Email has to follow a specific regular
> expression, so it would be nice if we could set Email -> hasValidation -> EmailRegex,
> with EmailRegex having:
> Name: Email Regular Expression
> Expression: /[0-9a-z]+@[0-9a-z]+\.[0-9a-z]+/
> For more complex types of validation, e.g. checking card number validity, an external
> validator function/service could be added.
> Name: Credit Card Number Validator
> Validator: org.apache.atlas.validators.creditcard or https://host:port/creditCardValidator
> For validations from a reference table, for example a country name, it could be:
> Name: Country Name Ref Validator
> Reference Column: <country_name_column>
> where <country_name_column> would be an instance of type Hive_Column or HBase_Column.
> Since this is a kind of standardization, it could be placed in [Area 5|https://cwiki.apache.org/confluence/display/ATLAS/Area+5+-+Standards].
> A similar approach is followed in [Kylo|https://github.com/Teradata/kylo/tree/master/integrations/spark/spark-validate-cleanse].
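
The three kinds of validator described in the issue (regular expression, external function/service, reference column) could be represented roughly as follows. This is a sketch under assumptions, not an Atlas type definition: the class names are hypothetical, a Luhn checksum stands in for the external credit card validator, a fixed set of names stands in for the Hive/HBase reference column, and the dot in the email expression is escaped on the assumption that a literal dot was intended.

```python
import re
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class RegexValidator:
    name: str
    expression: str

    def is_valid(self, value: str) -> bool:
        # fullmatch: the whole value must match, not just a substring
        return re.fullmatch(self.expression, value) is not None


@dataclass(frozen=True)
class ExternalValidator:
    name: str
    check: Callable[[str], bool]  # stand-in for a validator function or service call

    def is_valid(self, value: str) -> bool:
        return self.check(value)


@dataclass(frozen=True)
class ReferenceValidator:
    name: str
    allowed_values: FrozenSet[str]  # stand-in for the values of a reference column

    def is_valid(self, value: str) -> bool:
        return value in self.allowed_values


def luhn_ok(number: str) -> bool:
    """Luhn checksum, used here only as an example of an external check."""
    digits = [int(c) for c in number if c.isdigit()]
    if not digits:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


validators = {
    "Email": RegexValidator("Email Regular Expression", r"[0-9a-z]+@[0-9a-z]+\.[0-9a-z]+"),
    "CardNumber": ExternalValidator("Credit Card Number Validator", luhn_ok),
    "Country": ReferenceValidator("Country Name Ref Validator", frozenset({"Spain", "France"})),
}


def failed_attributes(record, validators):
    """Return the attribute names present in the record whose value fails validation."""
    return [attr for attr, v in validators.items()
            if attr in record and not v.is_valid(record[attr])]


person = {"Name": "Ana", "Email": "ana@example.org", "Country": "Atlantis"}
failures = failed_attributes(person, validators)
```

Attributes without a registered validator (Name here) and validators without a matching attribute (CardNumber here) are simply skipped, so only Country is reported for this record.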



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
