phoenix-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-5090) Discuss: Allow transactional writes without buffering the entire transaction on the client.
Date Tue, 09 Apr 2019 08:17:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813110#comment-16813110
] 

Lars Hofhansl commented on PHOENIX-5090:
----------------------------------------

Something is not right. I see partial data sometimes (i.e. when I commit a larger transaction
- 50-60k rows in one session - I sometime see some rows first and then the rest of the rows).
Only when issuing a commit with autocommit=off.
When I see it it's always off by a very small number. I doubt anything is wrong with Omid,
so there's a problem with this change in Phoenix.

I don't quite understand that, yet, since Phoenix does the very same if you have autocommit
off and you do a query after an upsert. In that case it sends all uncommitted data to the
server.

> Discuss: Allow transactional writes without buffering the entire transaction on the client.
> -------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5090
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5090
>             Project: Phoenix
>          Issue Type: Wish
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Major
>         Attachments: 5090-looksee.txt, 5090-v1.txt, 5090-v2.txt, 5090-v3.txt
>
>
> Currently it is not possible execute transactions in Phoenix that are too large to be
buffered entirely on the client.
> Both Tephra and Omid support writing uncommitted data to HBase immediately and at full
speed. The client still needs to keep tracks of the rows changes for:
> # Conflict detection
> # (for Omid) writing the shadow cells
> I'd like to do some brainstorming here.
> * It should *always* be enough to only hold on to the changed rows (and columns?) only
for _conflict resolution_ and free the rest from the client as soon as the uncommitted data
is written to HBase.
> * For the shadows cells we need only keep the rows changed, right?
> * There are situations where we can avoid the client site buffering entirely (perhaps
only for Tephra) when we declare a table or upsert not to participate in conflict resolution.
> [~tdsilva], [~ohads], [~yonigo], [~jamestaylor], [~vincentpoon], more, better ideas?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message