sentry-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sergio Peña (JIRA) <j...@apache.org>
Subject [jira] [Commented] (SENTRY-2024) Specify Char Set for AUTHZ_OBJ_NAME
Date Tue, 21 Nov 2017 19:05:00 GMT

    [ https://issues.apache.org/jira/browse/SENTRY-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261307#comment-16261307
] 

Sergio Peña commented on SENTRY-2024:
-------------------------------------

[~LinaAtAustin] I understand that indexes are useful when we DO too many queries but they
are HARMFUL when the table has many inserts and updates. In this case, Sentry is querying
the table before doing an insert or update, so doesn't this index on the authz_obj cause more
overhead than performance benefits? 

Also, authz obj names will be unique and there won't be any conflict to keep a unique index
in the table because Sentry already solves duplicates in a different way, do we still need
a unique index here?

Is there any other reason to keep the index?

> Specify Char Set for AUTHZ_OBJ_NAME
> -----------------------------------
>
>                 Key: SENTRY-2024
>                 URL: https://issues.apache.org/jira/browse/SENTRY-2024
>             Project: Sentry
>          Issue Type: Bug
>          Components: Sentry
>    Affects Versions: 2.0.0
>            Reporter: Na Li
>            Assignee: Na Li
>            Priority: Critical
>         Attachments: SENTRY-2024.001.patch, SENTRY-2024.001.patch, SENTRY-2024.001.patch,
SENTRY-2024.002.patch, SENTRY-2024.002.patch
>
>
> AUTHZ_OBJ_NAME has (384) chars. It is constructed by sentry from notification event as
{color:red}DB_NAME + "." + TBL_NAME{color}. To be consistent with hive, sentry should use
the same char set as what's used by DB_NAME  and TBL_NAME
> Besides, if its table char set is utf8, the constrain AUTHZOBJNAME will have error  "Specified
key was too long; max key length is 767 bytes" for mysql.
> The solution is to specify the char set for this field, so it works for mysql regardless
the char set of its table or DB.
> Reference:
> 1) CREATE TABLE AUTHZ_PATHS_MAPPING
>  (
>      AUTHZ_OBJ_ID BIGINT NOT NULL generated always as identity (start with 1),
>      AUTHZ_OBJ_NAME VARCHAR({color:red}384{color}),
>      CREATE_TIME_MS BIGINT NOT NULL
> );
> CREATE UNIQUE INDEX AUTHZOBJNAME ON AUTHZ_PATHS_MAPPING ({color:red}AUTHZ_OBJ_NAME{color});
> 2) AUTHZ_OBJ_NAME  is constructed by sentry from notification event as DB_NAME + "."
+ TBL_NAME. To be consistent with hive, sentry should use the same char set as what's used
by DB_NAME  and TBL_NAME in NotificationProcessor.processAlterTable().
> 3) hive uses latin1 as default charset for notification event.
> CREATE TABLE IF NOT EXISTS `NOTIFICATION_LOG`
> (
>     `NL_ID` BIGINT(20) NOT NULL,
>     `EVENT_ID` BIGINT(20) NOT NULL,
>     `EVENT_TIME` INT(11) NOT NULL,
>     `EVENT_TYPE` varchar(32) NOT NULL,
>     `DB_NAME` varchar(128),
>     `TBL_NAME` varchar(128),
>     `MESSAGE` mediumtext,
>     PRIMARY KEY (`NL_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
> 4) actual installation shows both fields are in latin1 char set.
> SHOW FULL COLUMNS FROM NOTIFICATION_LOG;
> +------------+--------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
> | Field      | Type         | Collation         | Null | Key | Default | Extra | Privileges
                     | Comment |
> +------------+--------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
> | NL_ID      | bigint(20)   | NULL              | NO   | PRI | NULL    |       | select,insert,update,references
|         |
> | EVENT_ID   | bigint(20)   | NULL              | NO   |     | NULL    |       | select,insert,update,references
|         |
> | EVENT_TIME | int(11)      | NULL              | NO   |     | NULL    |       | select,insert,update,references
|         |
> | EVENT_TYPE | varchar(32)  | latin1_swedish_ci | NO   |     | NULL    |       | select,insert,update,references
|         |
> | DB_NAME    | varchar(128) | latin1_swedish_ci | YES  |     | NULL    |       | select,insert,update,references
|         |
> | TBL_NAME   | varchar(128) | latin1_swedish_ci | YES  |     | NULL    |       | select,insert,update,references
|         |
> | MESSAGE    | mediumtext   | latin1_swedish_ci | YES  |     | NULL    |       | select,insert,update,references
|         |
> +------------+--------------+-------------------+------+-----+---------+-------+---------------------------------+---------+



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message