phoenix-issues mailing list archives

From "Bin Shi (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PHOENIX-5085) Disentangle BaseResultIterators from the backing Guidepost Data structure
Date Mon, 07 Jan 2019 20:01:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736138#comment-16736138 ]

Bin Shi edited comment on PHOENIX-5085 at 1/7/19 8:00 PM:
----------------------------------------------------------

[~dbwong] & [~karanmehta93], we're almost on the same page. 

At the current phase, to address this JIRA, GuidePostsInfo can provide two create functions
(Factory Method): one returns a List<GuidePost> interface and the other returns an
ArrayList<GuidePost> interface (the implementation can be left for the future), where GuidePost
is a data structure that contains the data from one row of the stats table. Define a
SequenceAccessor factory that provides a method to get a SequenceAccessor (interface) over the
guideposts. A concrete class implements this SequenceAccessor interface and encapsulates the
current implementation of the guideposts data structure, which uses prefix encoding.
BaseResultIterators.getParallelScans() then uses List<GuidePost> and/or ArrayList<GuidePost>.
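
To make the shape of this proposal concrete, here is a minimal sketch in Java. Everything in it is an assumption for illustration, not existing Phoenix code: the GuidePost fields, the SequenceAccessor methods, and the PrefixEncodedGuidePosts class with its toy prefix encoding (each key is stored as a shared-prefix length plus a suffix) are hypothetical and stand in for the real GuidePostsInfo encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

final class SequenceAccessorSketch {

    /** Hypothetical value type: the data from one row of the stats table. */
    static final class GuidePost {
        final byte[] key;
        final long rowCount;
        final long byteCount;

        GuidePost(byte[] key, long rowCount, long byteCount) {
            this.key = key;
            this.rowCount = rowCount;
            this.byteCount = byteCount;
        }
    }

    /** Hypothetical forward-only view; callers never see the encoding behind it. */
    interface SequenceAccessor {
        boolean hasNext();
        GuidePost next();
    }

    /** Toy prefix-encoded store (an assumption, not the real Phoenix format). */
    static final class PrefixEncodedGuidePosts {
        private final List<Integer> sharedLengths = new ArrayList<>();
        private final List<byte[]> suffixes = new ArrayList<>();
        private final List<long[]> counts = new ArrayList<>();
        private byte[] lastKey = new byte[0];

        /** Stores only the part of the key that differs from the previous key. */
        void add(GuidePost gp) {
            int shared = 0;
            int max = Math.min(lastKey.length, gp.key.length);
            while (shared < max && lastKey[shared] == gp.key[shared]) {
                shared++;
            }
            sharedLengths.add(shared);
            suffixes.add(Arrays.copyOfRange(gp.key, shared, gp.key.length));
            counts.add(new long[] {gp.rowCount, gp.byteCount});
            lastKey = gp.key.clone();
        }

        /** Factory method: decodes lazily, one guidepost per call to next(). */
        SequenceAccessor sequenceAccessor() {
            return new SequenceAccessor() {
                private int index = 0;
                private byte[] previous = new byte[0];

                @Override
                public boolean hasNext() {
                    return index < suffixes.size();
                }

                @Override
                public GuidePost next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    int shared = sharedLengths.get(index);
                    byte[] suffix = suffixes.get(index);
                    // Keep the shared prefix of the previous key, then append the suffix.
                    byte[] key = Arrays.copyOf(previous, shared + suffix.length);
                    System.arraycopy(suffix, 0, key, shared, suffix.length);
                    long[] c = counts.get(index);
                    index++;
                    previous = key;
                    return new GuidePost(key, c[0], c[1]);
                }
            };
        }

        /** Factory method: materializes the decoded guideposts as a List<GuidePost>. */
        List<GuidePost> toList() {
            List<GuidePost> result = new ArrayList<>();
            SequenceAccessor accessor = sequenceAccessor();
            while (accessor.hasNext()) {
                result.add(accessor.next());
            }
            return result;
        }
    }

    public static void main(String[] args) {
        PrefixEncodedGuidePosts store = new PrefixEncodedGuidePosts();
        store.add(new GuidePost("row100".getBytes(StandardCharsets.UTF_8), 10, 1000));
        store.add(new GuidePost("row150".getBytes(StandardCharsets.UTF_8), 20, 2000));
        for (GuidePost gp : store.toList()) {
            System.out.println(new String(gp.key, StandardCharsets.UTF_8) + " rows=" + gp.rowCount);
        }
    }
}
```

With this split, a caller such as getParallelScans() would depend only on List<GuidePost> or on the accessor interface, so the encoding could later be swapped without touching the caller.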

At the next phase, we'll define a RandomAccessor factory and a RandomAccessor interface, and
implement a different guideposts data structure (Segment Tree). We'll also add more APIs and
helper functions, such as those described in the ["Phoenix deep dive" slides|https://docs.google.com/presentation/d/1G_CcAhk2xSC09mqG3MNt1i2QbgqfWbg9OM966_ucSGQ]:
 # Use a [Segment Tree|https://www.geeksforgeeks.org/segment-tree-set-1-sum-of-given-range/]
(plus some characteristics from the B+ Tree) (PHOENIX-4925)
 # Disentangle the granularity of guideposts from that of the cached guideposts (PHOENIX-4927)
 # Mount/unmount guideposts for a particular tenant or key range
 # A guideposts chunk is always encoded/decoded as a whole, so we can choose different compression
algorithms depending on the data.
 # Support range scans. Given <start key, end key>, return a decompressed List<GuidePost>
on which we can perform binary search, or return the estimated number of rows and bytes
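
As a rough illustration of the range-scan estimate in the last item, here is a minimal Java sketch of a sum segment tree over guideposts. The class name, the use of String keys (Phoenix keys are byte arrays), the assumption that keys are sorted and unique, and the rule of counting every guidepost whose key falls in [start key, end key) are all simplifications made for this example, not the design from the slides.

```java
import java.util.Arrays;

final class GuidePostSegmentTreeSketch {
    private final String[] keys;    // sorted, unique guidepost keys (assumption)
    private final long[] rowTree;   // segment tree over per-guidepost row counts
    private final long[] byteTree;  // segment tree over per-guidepost byte counts
    private final int n;

    GuidePostSegmentTreeSketch(String[] sortedKeys, long[] rows, long[] bytes) {
        this.n = sortedKeys.length;
        this.keys = sortedKeys.clone();
        this.rowTree = build(rows);
        this.byteTree = build(bytes);
    }

    /** Bottom-up build: leaves live at [n, 2n), node i holds the sum of its two children. */
    private static long[] build(long[] leaves) {
        int size = leaves.length;
        long[] tree = new long[2 * size];
        System.arraycopy(leaves, 0, tree, size, size);
        for (int i = size - 1; i >= 1; i--) {
            tree[i] = tree[2 * i] + tree[2 * i + 1];
        }
        return tree;
    }

    /** Sum of the leaves in the half-open index range [from, to). */
    private static long sum(long[] tree, int size, int from, int to) {
        long total = 0;
        for (int l = from + size, r = to + size; l < r; l >>= 1, r >>= 1) {
            if ((l & 1) == 1) {
                total += tree[l++];
            }
            if ((r & 1) == 1) {
                total += tree[--r];
            }
        }
        return total;
    }

    /** Index of the first guidepost whose key is >= the given key. */
    private int lowerBound(String key) {
        int pos = Arrays.binarySearch(keys, key);
        return pos >= 0 ? pos : -pos - 1;
    }

    /** Returns {estimated rows, estimated bytes} for guideposts in [startKey, endKey). */
    long[] estimate(String startKey, String endKey) {
        int from = lowerBound(startKey);
        int to = lowerBound(endKey);
        if (from >= to) {
            return new long[] {0L, 0L};
        }
        return new long[] {sum(rowTree, n, from, to), sum(byteTree, n, from, to)};
    }

    public static void main(String[] args) {
        GuidePostSegmentTreeSketch tree = new GuidePostSegmentTreeSketch(
                new String[] {"a", "c", "e", "g", "i"},
                new long[] {10, 20, 30, 40, 50},
                new long[] {100, 200, 300, 400, 500});
        long[] estimate = tree.estimate("b", "h");
        System.out.println("rows=" + estimate[0] + " bytes=" + estimate[1]);
    }
}
```

For guideposts that never change, plain prefix sums would give the same answer; a segment tree becomes useful once individual guideposts are updated, since each update then touches only O(log n) nodes.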


> Disentangle BaseResultIterators from the backing Guidepost Data structure
> -------------------------------------------------------------------------
>
>                 Key: PHOENIX-5085
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5085
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Daniel Wong
>            Assignee: Daniel Wong
>            Priority: Major
>              Labels: Statistics, StatsImprovement
>
> Disentangle BaseResultIterators.getParallelScans from the backing Guidepost Data structure.
> This will provide the abstraction for possible new stats data structures in https://issues.apache.org/jira/browse/PHOENIX-4925
> Will heavily affect changes in https://issues.apache.org/jira/browse/PHOENIX-4926 and
> https://issues.apache.org/jira/browse/PHOENIX-4594.  [~Bin Shi] [~karanmehta93]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
