phoenix-issues mailing list archives

From "Ohad Shacham (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-5090) Discuss: Allow transactional writes without buffering the entire transaction on the client.
Date Mon, 14 Jan 2019 15:08:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742178#comment-16742178
] 

Ohad Shacham commented on PHOENIX-5090:
---------------------------------------

>  Is it fair to say that Omid is designed for very many, small transactions, and not
for extremely large transactions?

An extremely large transaction requires storing a lot of metadata, so I assume
the answer is yes.

 

> What would need to change in HBase so you can use those?

If we could ask HBase to delete all of a row's columns whose version is equal to (not smaller
than) the transaction version, then we could keep only the row information (with row-level
conflict analysis) and use this function to delete in case of aborts.

 

>   Is that done on the server in the context of scan operations? 

Assuming the filtering is done on the server side (as we do for Phoenix), then yes :(.

 

> Perhaps good to take offline and have a quick chat?

Sounds great. We are 10 hours ahead, so meeting at 11am your time would be a good match for
us. Would Tuesday or Wednesday work?

 

Overall, for row-level conflict analysis it would be great to add a row-level shadow cell.
This would significantly increase scalability for large transactions and would solve the
issues above. Adding an HBase delete that removes all columns with the exact version would let
Omid delete efficiently in case of an abort. Otherwise, removing these cells during regular
GC is also possible.
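
To make the requested delete semantics concrete, here is a minimal in-memory sketch in Java.
It does not use the HBase client API. RowModel, deleteExactVersion and latest are hypothetical
names introduced only for illustration, and the row is modeled as a plain map from column name
to versioned values. The point is only that an abort removes the cells written at exactly the
transaction version and leaves older and unrelated versions alone.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of one row: column name -> (version -> value). */
final class RowModel {
    private final Map<String, TreeMap<Long, String>> columns = new TreeMap<>();

    /** Writes a cell for the given column at the given version. */
    void put(String column, long version, String value) {
        columns.computeIfAbsent(column, c -> new TreeMap<>()).put(version, value);
    }

    /**
     * Assumed semantics of the proposed delete: in every column of the row,
     * remove only the cell whose version equals txVersion (not smaller ones).
     * Returns the number of cells removed.
     */
    int deleteExactVersion(long txVersion) {
        int removed = 0;
        Iterator<Map.Entry<String, TreeMap<Long, String>>> it = columns.entrySet().iterator();
        while (it.hasNext()) {
            TreeMap<Long, String> versions = it.next().getValue();
            if (versions.remove(txVersion) != null) {
                removed++;
            }
            if (versions.isEmpty()) {
                it.remove(); // drop columns that have no versions left
            }
        }
        return removed;
    }

    /** Returns the value with the highest version for the column, or null if absent. */
    String latest(String column) {
        TreeMap<Long, String> versions = columns.get(column);
        return versions == null ? null : versions.lastEntry().getValue();
    }
}
```

With such an operation the client would only need to remember which rows a transaction
touched, not which columns, in order to clean up after an abort.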

 

> Discuss: Allow transactional writes without buffering the entire transaction on the client.
> -------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5090
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5090
>             Project: Phoenix
>          Issue Type: Wish
>            Reporter: Lars Hofhansl
>            Priority: Major
>
> Currently it is not possible to execute transactions in Phoenix that are too large to be
> buffered entirely on the client.
> Both Tephra and Omid support writing uncommitted data to HBase immediately and at full
> speed. The client still needs to keep track of the changed rows for:
> # Conflict detection
> # (for Omid) writing the shadow cells
> I'd like to do some brainstorming here.
> * It should *always* be enough to hold on to the changed rows (and columns?) only
> for _conflict resolution_ and free the rest from the client as soon as the uncommitted data
> is written to HBase.
> * For the shadow cells we need only keep the rows changed, right?
> * There are situations where we can avoid the client-side buffering entirely (perhaps
> only for Tephra) when we declare a table or upsert not to participate in conflict resolution.
> [~tdsilva], [~ohads], [~yonigo], [~jamestaylor], [~vincentpoon], more, better ideas?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
