phoenix-issues mailing list archives

From "Chinmay Kulkarni (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-374) Enable access to dynamic columns in * or cf.* selection
Date Tue, 18 Dec 2018 01:57:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16723581#comment-16723581
] 

Chinmay Kulkarni commented on PHOENIX-374:
------------------------------------------

[~tdsilva] A couple of points:
 * When upserting data for dynamic columns, we will need to embed the data type of each dynamic
column. We can achieve this with a scan attribute that stores information about the dynamic
columns and is resolved on the server side in a RegionObserver coprocessor, in the
_doPostScannerOpen_ method.
 * When selecting wildcards or CF wildcards, we currently set the column family on the scan.
However, that also sets the "columns" that can be iterated over in the scan. Based on a config
(defaulting to false), we can either set or not set column families on the scanner for
wildcard queries. On top of this, we would need to stop projecting columns in the ResultSet
and add APIs to PhoenixResultSet to determine: the number of dynamic columns,
the data type of each dynamic column, and getValue for a dynamic column (this last one is basically
a combination of getting the data type and then coercing the ResultSet value to that type,
as with getInt, getBoolean, etc.).
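
To make the two points above concrete, here is a minimal, self-contained Java sketch. It is not Phoenix code. The names DynamicColumnsSketch, DynCol, ColType, encode, decode and getValue are all hypothetical, and the byte encodings (big-endian int, single-byte boolean, UTF-8 string) are assumptions for illustration rather than Phoenix's PDataType encodings. It only shows the general shape: serialize (family, name, type) descriptors into one byte[] payload of the kind that could travel as an attribute, decode that payload on the other side, and use the declared type to coerce a raw cell value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DynamicColumnsSketch {

    // Hypothetical type tags; Phoenix's own PDataType is not modeled here.
    enum ColType { INTEGER, BOOLEAN, VARCHAR }

    // Hypothetical descriptor of one dynamic column.
    static final class DynCol {
        final String family;
        final String name;
        final ColType type;

        DynCol(String family, String name, ColType type) {
            this.family = family;
            this.name = name;
            this.type = type;
        }
    }

    // Serialize the descriptors into a single byte[] payload.
    static byte[] encode(List<DynCol> cols) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(cols.size());
            for (DynCol c : cols) {
                out.writeUTF(c.family);
                out.writeUTF(c.name);
                out.writeByte(c.type.ordinal());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuild the descriptors from the payload (the receiving side).
    static List<DynCol> decode(byte[] payload) {
        List<DynCol> cols = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String family = in.readUTF();
                String name = in.readUTF();
                ColType type = ColType.values()[in.readByte()];
                cols.add(new DynCol(family, name, type));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cols;
    }

    // Look up the declared type, then coerce the raw bytes to that type.
    // The encodings used here are assumptions, not Phoenix's on-disk formats.
    static Object getValue(DynCol col, byte[] raw) {
        switch (col.type) {
            case INTEGER:
                return ByteBuffer.wrap(raw).getInt();
            case BOOLEAN:
                return raw[0] != 0;
            default:
                return new String(raw, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        List<DynCol> cols = Arrays.asList(
                new DynCol("cf", "age", ColType.INTEGER),
                new DynCol("cf", "nickname", ColType.VARCHAR));
        List<DynCol> decoded = decode(encode(cols));
        System.out.println("dynamic column count: " + decoded.size());
        byte[] rawAge = ByteBuffer.allocate(4).putInt(42).array();
        System.out.println(decoded.get(0).name + " = " + getValue(decoded.get(0), rawAge));
    }
}
```

The three operations proposed for PhoenixResultSet map onto this sketch as the size of the decoded list, the type field of each descriptor, and getValue.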

Let me know what you think.

> Enable access to dynamic columns in * or cf.* selection
> -------------------------------------------------------
>
>                 Key: PHOENIX-374
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-374
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: nicolas maillard
>            Assignee: Chinmay Kulkarni
>            Priority: Critical
>
> As of recent work we can now read and write columns that are not in the schema, AKA dynamic
> columns. The Select and Upsert statements allow dynamic columns to be specified.
> I think two additions are still needed.
> - Alter dynamically: in the Upsert and/or Select statement, the ability to add the specified
> dynamic column to the schema. Say Upsert into Table (key, cf.dynColumn varchar SCHEMAADD) values
> (..)
> and for select:
>      - select key, cf.dynColumn varchar from T would only read
>      - select key from T(cf.dynColumn varchar ) would read and write to the schema
> - Select a complete column family: more complex, accessing a whole column family with
> all columns, known in the schema or not.
>  select cf.* from T
> Today this works for known columns; it would be nice to have this for all columns of a
> family, in the schema or not. I'm trying right now to extend this to columns unknown to the schema.
> However, every new row can have a lot of very different unknown columns. The defined ones will be
> first but the unknown ones will be appended at the end.
> This means the metadata might need to be updated at every row to account for all new
> columns discovered.
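
The metadata concern in the issue description above (defined columns first, unknown ones appended as rows reveal them) can be illustrated with a small Java sketch. This is not Phoenix code. GrowingProjectionSketch and its methods are hypothetical names, and column qualifiers are simplified to plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class GrowingProjectionSketch {

    // Insertion-ordered set: declared columns first, discovered ones after.
    private final Set<String> columns = new LinkedHashSet<>();
    private final int declaredCount;

    GrowingProjectionSketch(List<String> declaredColumns) {
        columns.addAll(declaredColumns);
        declaredCount = columns.size();
    }

    // Record the qualifiers seen in one row; returns how many were new,
    // i.e. how many entries the result metadata had to grow by.
    int observeRow(List<String> qualifiersInRow) {
        int before = columns.size();
        columns.addAll(qualifiersInRow);
        return columns.size() - before;
    }

    // Current column order: declared columns, then discovered columns
    // in the order they were first seen.
    List<String> columnOrder() {
        return new ArrayList<>(columns);
    }

    int declaredCount() {
        return declaredCount;
    }

    public static void main(String[] args) {
        GrowingProjectionSketch projection =
                new GrowingProjectionSketch(Arrays.asList("cf.a", "cf.b"));
        System.out.println("new in row 1: " + projection.observeRow(Arrays.asList("cf.b", "cf.x")));
        System.out.println("new in row 2: " + projection.observeRow(Arrays.asList("cf.y", "cf.a")));
        System.out.println("order: " + projection.columnOrder());
    }
}
```

Any row for which observeRow returns a nonzero count is a row at which the metadata would have to be updated, which is the per-row cost the description worries about.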



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
