phoenix-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-3744) Support snapshot scanners for MR-based queries
Date Wed, 24 May 2017 17:26:04 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023277#comment-16023277 ]

ASF GitHub Bot commented on PHOENIX-3744:
-----------------------------------------

Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/239#discussion_r118314328
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/mapreduce/util/PhoenixConfigurationUtil.java ---
    @@ -192,6 +196,18 @@ public static void setOutputTableName(final Configuration configuration, final S
         public static void setUpsertColumnNames(final Configuration configuration,final String[] columns) {
             setValues(configuration, columns, MAPREDUCE_UPSERT_COLUMN_COUNT, MAPREDUCE_UPSERT_COLUMN_VALUE_PREFIX);
         }
    +
    +    public static void setSnapshotNameKey(final Configuration configuration, final String snapshotName) {
    +        Preconditions.checkNotNull(configuration);
    +        Preconditions.checkNotNull(snapshotName);
    +        configuration.set(SNAPSHOT_NAME_KEY, snapshotName);
    --- End diff --
    
    Storing the snapshot name in the configuration is not going to translate well when we
    want to expose snapshot reads for queries in general, because the configuration is a
    global object.
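
To illustrate the concern, here is a minimal, self-contained sketch. It is not Phoenix code: a plain `Map` stands in for the shared configuration, and the key string, the `QueryContext` class, and the snapshot and table names are all hypothetical. It only shows that a value stored under one global key is overwritten by the next writer, whereas a value carried on a per-query object is not.

```java
import java.util.HashMap;
import java.util.Map;

class SnapshotConfigSketch {
    // Hypothetical key; stands in for SNAPSHOT_NAME_KEY in the diff.
    static final String SNAPSHOT_NAME_KEY = "sketch.snapshot.name";

    // Hypothetical per-query holder: each query carries its own snapshot name.
    static final class QueryContext {
        final String tableName;
        final String snapshotName; // null would mean "read the live table"

        QueryContext(String tableName, String snapshotName) {
            this.tableName = tableName;
            this.snapshotName = snapshotName;
        }
    }

    // Two queries register their snapshot in the same shared map.
    // The second write replaces the first, so both queries now read
    // whatever the last writer stored.
    static String snapshotSeenAfterTwoQueries(Map<String, String> shared,
                                              String firstSnapshot,
                                              String secondSnapshot) {
        shared.put(SNAPSHOT_NAME_KEY, firstSnapshot);
        shared.put(SNAPSHOT_NAME_KEY, secondSnapshot);
        return shared.get(SNAPSHOT_NAME_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        String seen = snapshotSeenAfterTwoQueries(shared, "snapshot_a", "snapshot_b");
        System.out.println("shared map, first query now sees: " + seen);

        // With per-query state, each query keeps the name it was given.
        QueryContext q1 = new QueryContext("T1", "snapshot_a");
        QueryContext q2 = new QueryContext("T2", "snapshot_b");
        System.out.println("per-query, first query sees: " + q1.snapshotName);
        System.out.println("per-query, second query sees: " + q2.snapshotName);
    }
}
```

For a single MapReduce job with its own configuration, the shared key is harmless; the collision only matters once several queries share one configuration object, which is the general case the comment has in mind.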


> Support snapshot scanners for MR-based queries
> ----------------------------------------------
>
>                 Key: PHOENIX-3744
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3744
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: Akshita Malhotra
>         Attachments: PHOENIX-3744.patch
>
>
> HBase supports scanning over snapshots, with a SnapshotScanner that accesses the region directly in HDFS. We should make sure that Phoenix can support that.
> Not sure how we'd want to decide when to run a query over a snapshot. Some ideas:
> - if there's an SCN set (i.e. the query is running at a point in time in the past)
> - if the memstore is empty
> - if the query is being run at a timestamp earlier than any memstore data
> - as a config option on the table
> - as a query hint
> - based on some kind of optimizer rule (i.e. based on estimated # of bytes that will be scanned)
> Phoenix typically runs a query at the timestamp at which it was compiled. Any data committed after this time should not be seen while a query is running.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
