beam-commits mailing list archives

From xumingming <...@git.apache.org>
Subject [GitHub] beam pull request #4041: [BEAM-2528] BeamSQL DDL :: CreateTable
Date Thu, 26 Oct 2017 12:29:02 GMT
GitHub user xumingming opened a pull request:

    https://github.com/apache/beam/pull/4041

    [BEAM-2528] BeamSQL DDL :: CreateTable

    I started this PR as an initial attempt to implement the BeamSQL `create table` statement.
 The implementation may not be mature yet, but I hope this PR can serve as a place to discuss
`create table` in more depth. I will introduce the PR in three parts:
    
    * MetaStore
    * TableProvider
    * Grammar
    
    ## MetaStore
    
    The MetaStore is responsible for the CRUD operations on tables during a session, e.g. creating
a table, listing all tables, or looking up a table by name. When a table is created,
its meta info can be persisted by the MetaStore. The default `InMemoryMetaStore`
keeps the meta info in memory only, so it is NOT persisted, but users can implement
the `MetaStore` interface to provide a persistent implementation.
    
    The table names in MetaStore need to be unique.
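    
    As a rough illustration of the contract described above, here is a minimal sketch of an in-memory store that rejects duplicate table names. All names in it (`TableMeta`, `MetaStoreSketch`, `InMemoryMetaStoreSketch` and their methods) are hypothetical and chosen for this example; they are not the interfaces in this PR.
    
    ```java
    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    // Hypothetical table descriptor; the real meta info would also carry columns, comment, etc.
    final class TableMeta {
      final String name;
      final String location;

      TableMeta(String name, String location) {
        this.name = name;
        this.location = location;
      }
    }

    // Hypothetical interface; method names are illustrative, not the PR's API.
    interface MetaStoreSketch {
      void createTable(TableMeta table);

      TableMeta getTable(String name);

      List<TableMeta> listTables();
    }

    // Keeps meta info in a map only, so nothing survives the end of the session.
    final class InMemoryMetaStoreSketch implements MetaStoreSketch {
      private final Map<String, TableMeta> tables = new LinkedHashMap<>();

      @Override
      public void createTable(TableMeta table) {
        // Table names must be unique within the store.
        if (tables.containsKey(table.name)) {
          throw new IllegalArgumentException("Duplicate table name: " + table.name);
        }
        tables.put(table.name, table);
      }

      @Override
      public TableMeta getTable(String name) {
        return tables.get(name); // null when the table is unknown
      }

      @Override
      public List<TableMeta> listTables() {
        return new ArrayList<>(tables.values());
      }

      public static void main(String[] args) {
        InMemoryMetaStoreSketch store = new InMemoryMetaStoreSketch();
        store.createTable(new TableMeta("ORDERS", "text://home/admin/orders"));
        System.out.println(store.listTables().size());
      }
    }
    ```
    
    A persistent variant would implement the same interface and write each `TableMeta` to durable storage instead of a map.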
    
    ## TableProvider
    
    The tables in a MetaStore can come from many different sources. Constructing a usable
table is the responsibility of a `TableProvider`. A `TableProvider` has an interface similar
to that of `MetaStore`, but it only handles one specific type of table, e.g. `TextTableProvider` only
handles text tables, while `KafkaTableProvider` only handles Kafka tables.
    
    In this PR, only `TextTableProvider` and `KafkaTableProvider` are implemented, as examples.
    
    ## Grammar
    
    The grammar for creating a TEXT table is:
    
    ```sql
    CREATE TABLE ORDERS(
       ID INT PRIMARY KEY COMMENT 'this is the primary key',
       NAME VARCHAR(127) COMMENT 'this is the name'
    )
    COMMENT 'this is the table orders'
    LOCATION 'text://home/admin/orders'
    TBLPROPERTIES '{"format": "Excel"}'
    ```
    
    `LOCATION` specifies where the table's data is stored. The scheme of the `LOCATION`
URI determines the table type: in the example above the table type is `text`, and that
type is used to look up the corresponding `TextTableProvider` through the ServiceLoader mechanism.
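    
    The lookup from `LOCATION` to a provider could be sketched as follows. This is only an illustration under assumptions: the interface `TableProviderSketch`, its `getTableType()` method, and the helper class are hypothetical names, not the PR's code. The scheme is taken with `java.net.URI`, and `java.util.ServiceLoader` is usable here because it implements `Iterable`.
    
    ```java
    import java.net.URI;
    import java.util.Arrays;
    import java.util.List;
    import java.util.Locale;
    import java.util.ServiceLoader;

    // Hypothetical provider interface: each implementation handles exactly one table type.
    interface TableProviderSketch {
      String getTableType();
    }

    final class TextProviderSketch implements TableProviderSketch {
      @Override
      public String getTableType() {
        return "text";
      }
    }

    final class KafkaProviderSketch implements TableProviderSketch {
      @Override
      public String getTableType() {
        return "kafka";
      }
    }

    final class ProviderLookupSketch {
      // The table type is the URI scheme of the LOCATION string.
      static String tableType(String location) {
        String scheme = URI.create(location).getScheme();
        if (scheme == null) {
          throw new IllegalArgumentException("LOCATION has no scheme: " + location);
        }
        return scheme.toLowerCase(Locale.ROOT);
      }

      // Returns the first provider whose table type matches the LOCATION scheme.
      static TableProviderSketch find(Iterable<TableProviderSketch> providers, String location) {
        String type = tableType(location);
        for (TableProviderSketch provider : providers) {
          if (provider.getTableType().equals(type)) {
            return provider;
          }
        }
        throw new IllegalArgumentException("No provider for table type: " + type);
      }

      // Discovers providers declared under META-INF/services on the classpath.
      static TableProviderSketch findRegistered(String location) {
        return find(ServiceLoader.load(TableProviderSketch.class), location);
      }

      public static void main(String[] args) {
        List<TableProviderSketch> providers =
            Arrays.asList(new TextProviderSketch(), new KafkaProviderSketch());
        System.out.println(find(providers, "text://home/admin/orders").getTableType());
      }
    }
    ```
    
    Passing the providers in as an `Iterable` keeps the matching logic testable without a `META-INF/services` registration; `findRegistered` shows where ServiceLoader discovery would plug in.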
    
    `TBLPROPERTIES` specifies additional properties of the table. In the example above,
it sets the format of each line of the text file to `Excel` (a variant of the CSV format).
    
    The grammar for creating a KAFKA table is:
    
    ```sql
     CREATE TABLE ORDERS(
       ID INT PRIMARY KEY COMMENT 'this is the primary key',
       NAME VARCHAR(127) COMMENT 'this is the name'
     )
     COMMENT 'this is the table orders'
     LOCATION 'kafka://localhost:2181/brokers?topic=test'
     TBLPROPERTIES '{"bootstrap.servers":"localhost:9092", "topics": ["topic1", "topic2"]}'
    ```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xumingming/beam BEAM-2528-create-table-from-master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/4041.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4041
    
----
commit ab5ee759116c7e8139d7d17aca631f018a48fb40
Author: James Xu <xumingmingv@gmail.com>
Date:   2017-09-13T12:36:37Z

    [BEAM-2528] create table

commit 813b9a7a5e7232d3028bc6b859d96c0d856c1517
Author: James Xu <xumingmingv@gmail.com>
Date:   2017-10-26T12:17:09Z

    minor

----