phoenix-dev mailing list archives

From "Gary Horen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-6) Support ON DUPLICATE KEY construct
Date Mon, 07 Nov 2016 22:46:58 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15645719#comment-15645719 ]

Gary Horen commented on PHOENIX-6:
----------------------------------

In the near term this will probably be mostly single rows per commit. As time goes on, other
use cases might present larger batches.

I would not expect a single row to be updated many times in a single commit. That would be
rare for us.

Peak arrival rate (occurring a handful of times per day) would be on the order of a handful per
second, for now. Modal arrival rate during the day will probably be several per minute.

The current scenario is counting views for feed items. Some feed items will be very popular,
others will be viewed seldom. My wild guess would be that the ratio of popular to unpopular
will be 10:1 with a gentle downward asymptote between them.


> Support ON DUPLICATE KEY construct
> ----------------------------------
>
>                 Key: PHOENIX-6
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: James Taylor
>            Assignee: James Taylor
>             Fix For: 4.9.0
>
>         Attachments: PHOENIX-6.patch, PHOENIX-6_4.x-HBase-0.98.patch, PHOENIX-6_v2.patch,
> PHOENIX-6_v3.patch, PHOENIX-6_v4.patch, PHOENIX-6_v5.patch, PHOENIX-6_wip1.patch, PHOENIX-6_wip2.patch,
> PHOENIX-6_wip3.patch, PHOENIX-6_wip4.patch
>
>
> To support inserting a new row only if it doesn't already exist, we should support the
> "on duplicate key" construct for UPSERT. With this construct, the UPSERT VALUES statement
> would run atomically and would thus require a read before write, which would obviously have
> a negative impact on performance. For an example of similar syntax, see the MySQL documentation
> at http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
> See this discussion for more detail: https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J.
> A related discussion is on PHOENIX-2909.
> Initially we'd support the following:
> # This would prevent the setting of VAL to 0 if the row already exists:
> {code}
> UPSERT INTO T (PK, VAL) VALUES ('a',0) 
> ON DUPLICATE KEY IGNORE;
> {code}
> # This would increment the values of COUNTER1 and COUNTER2 if the row already exists
> and otherwise initialize them to 0:
> {code}
> UPSERT INTO T (PK, COUNTER1, COUNTER2) VALUES ('a',0,0) 
> ON DUPLICATE KEY UPDATE COUNTER1 = COUNTER1 + 1, COUNTER2 = COUNTER2 + 1;
> {code}
> So the general form is:
> {code}
> UPSERT ... VALUES ... [ ON DUPLICATE KEY [IGNORE | UPDATE <column>=<expression>, ...] ]
> {code}
> The following restrictions will apply:
> * The <column> may not be part of the primary key constraint - only KeyValue columns
> will be allowed.
> * This new clause cannot be used with
> ** Immutable tables since the whole point is to atomically update a row in place which
> isn't allowed for immutable tables.
> ** Transactional tables because these use optimistic concurrency as their mechanism for
> consistency and isolation.
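
The semantics described in the quoted issue can be illustrated with a small toy model. This is
only a sketch under stated assumptions, not Phoenix code: a plain Python dict stands in for table
T, the function name upsert and its parameters are hypothetical, and the model is single-threaded,
so it shows the read-before-write step but not the atomicity the real feature would need. It also
assumes every UPDATE expression is evaluated against the row as it was read, which the issue text
does not specify.

```python
from typing import Callable, Dict, Optional

# Toy stand-ins (assumptions): a row maps column name -> int,
# and a table maps primary key -> row.
Row = Dict[str, int]
Table = Dict[str, Row]


def upsert(table: Table, pk: str, values: Row,
           on_duplicate: Optional[str] = None,
           updates: Optional[Dict[str, Callable[[Row], int]]] = None) -> None:
    """Model one UPSERT ... VALUES with an optional ON DUPLICATE KEY clause."""
    existing = table.get(pk)  # the read before the write
    if existing is None:
        # No row with this key yet: insert the VALUES as given.
        table[pk] = dict(values)
    elif on_duplicate is None:
        # Plain UPSERT: overwrite the listed columns.
        existing.update(values)
    elif on_duplicate == "IGNORE":
        # ON DUPLICATE KEY IGNORE: leave the existing row untouched.
        return
    elif on_duplicate == "UPDATE":
        # ON DUPLICATE KEY UPDATE: evaluate every expression against the
        # row as read, then assign the results (an assumption of this sketch).
        new_values = {col: expr(existing) for col, expr in (updates or {}).items()}
        existing.update(new_values)
    else:
        raise ValueError("unknown ON DUPLICATE KEY action: " + on_duplicate)


if __name__ == "__main__":
    t: Table = {}
    increment = {
        "COUNTER1": lambda row: row["COUNTER1"] + 1,
        "COUNTER2": lambda row: row["COUNTER2"] + 1,
    }
    # The first call initializes both counters to 0; later calls increment them.
    for _ in range(3):
        upsert(t, "a", {"COUNTER1": 0, "COUNTER2": 0}, "UPDATE", increment)
    print(t)
```

Because the primary key is passed separately from the column values, the updates mapping can only
name non-key columns, which loosely mirrors the first restriction in the list above.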



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
