tephra-dev mailing list archives

From "Patrick Xiaoman Huang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (TEPHRA-247) Avoid encoding the transaction multiple times
Date Mon, 22 Jan 2018 07:19:00 GMT

    [ https://issues.apache.org/jira/browse/TEPHRA-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16330075#comment-16330075
] 

Patrick Xiaoman Huang edited comment on TEPHRA-247 at 1/22/18 7:18 AM:
-----------------------------------------------------------------------

I tried Parquet's DeltaBinaryPackingWriter, which stores integers using delta encoding and
binary packing (its algorithm and format are inspired by D. Lemire's paper:
http://lemire.me/blog/archives/2012/09/12/fast-integer-compression-decoding-billions-of-integers-per-second/),
to compress roughly 500 long values of writerPointer. The result was between 800 and 999 bytes,
where the uncompressed size would be 500*8=4000 bytes, so it compresses by about 77%.
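
To illustrate why sorted write pointers compress so well, here is a minimal Python sketch of
delta encoding followed by bit packing. It is not Parquet's actual DELTA_BINARY_PACKED layout
(which splits values into blocks and miniblocks and uses varint headers); it packs everything
as a single block. The function names, the header layout, and the sample pointer values are
all made up for this example.

```python
import struct
from itertools import accumulate

HEADER = "<IqqB"  # hypothetical header: count, first value, min delta, bit width


def delta_bitpack(values):
    """Encode a non-empty list of signed 64-bit ints as header + bit-packed deltas."""
    deltas = [b - a for a, b in zip(values, values[1:])]
    min_delta = min(deltas, default=0)
    adjusted = [d - min_delta for d in deltas]  # subtract the minimum so all are >= 0
    width = max((d.bit_length() for d in adjusted), default=0)
    packed = 0
    for i, d in enumerate(adjusted):
        packed |= d << (i * width)  # each delta occupies exactly `width` bits
    body = packed.to_bytes((len(adjusted) * width + 7) // 8, "little")
    return struct.pack(HEADER, len(values), values[0], min_delta, width) + body


def delta_unpack(data):
    """Inverse of delta_bitpack."""
    count, first, min_delta, width = struct.unpack_from(HEADER, data)
    packed = int.from_bytes(data[struct.calcsize(HEADER):], "little")
    mask = (1 << width) - 1
    values = [first]
    for i in range(count - 1):
        delta = ((packed >> (i * width)) & mask) + min_delta
        values.append(values[-1] + delta)
    return values


# Synthetic data: 500 increasing pointers with gaps between 1 and 4096.
start = 1_516_000_000_000_000_000  # made-up starting write pointer
gaps = [(i * 7919) % 4096 + 1 for i in range(499)]
pointers = list(accumulate(gaps, initial=start))

encoded = delta_bitpack(pointers)
print(len(pointers) * 8, "bytes raw,", len(encoded), "bytes packed")
```

With these synthetic gaps every delta fits in 12 bits, so the payload is about 1.5 bytes per
value instead of 8. Real write pointers will have a different gap distribution, so the ratio
will differ from both this sketch and the Parquet measurement above.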

Would it be possible to adapt this code to compress the inProgress array?

Since invalids is a system-wide invalid list that normally grows but does not change very
often, could it be pushed to the region servers and cached there?



> Avoid encoding the transaction multiple times
> ---------------------------------------------
>
>                 Key: TEPHRA-247
>                 URL: https://issues.apache.org/jira/browse/TEPHRA-247
>             Project: Tephra
>          Issue Type: Improvement
>          Components: core, manager
>    Affects Versions: 0.12.0-incubating
>            Reporter: Andreas Neumann
>            Assignee: Andreas Neumann
>            Priority: Major
>         Attachments: design.jpg
>
>
> Currently, the same transaction object is encoded again and again for every Get performed
> in HBase. It would be better to cache the encoded transaction for the duration of the transaction
> and reuse it, 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
