phoenix-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-2715) Query Log
Date Sun, 08 Apr 2018 17:15:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429814#comment-16429814 ]

Andrew Purtell commented on PHOENIX-2715:
-----------------------------------------

Random thoughts on trying to use this in a production setting
 * Cool to have a LogWriter that puts the log into a table, so the log itself can be queried.
Powerful. How about a LogWriter that just emits to Java logging as well? HBase+Phoenix systems
throw off a ton of this type of logging, so we already need a solution for managing it, for
which query log would just be a new subset. Many places may want their log search solution
to be based on something else (Splunk, Elastic, Solr, etc.).
 * If not an alternate implementation of LogWriter, at least a better factoring. Make LogWriter
abstract or an interface. That should be quickly accomplished.
 * What happens if query logging becomes too expensive? Right now we can only turn it all the
way on or all the way off. Can we have a knob for probabilistic sampling? This is really easy to implement.
Add one config parameter, a float or double, one that can ideally be changed dynamically.
Call it something like QUERY_LOG_SAMPLE_RATE (not a great name but whatever). In the code where
you go to do the query logging, add a conditional {{if (ThreadLocalRandom.current().nextDouble()
< getConfig(QUERY_LOG_SAMPLE_RATE))}}. Easy. So if logging 100% of queries is too expensive
(at QUERY_LOG_SAMPLE_RATE = 1.0), we can try logging 50% of them (at QUERY_LOG_SAMPLE_RATE
= 0.5), or 10% of them (at QUERY_LOG_SAMPLE_RATE = 0.1), or 1% of them (at QUERY_LOG_SAMPLE_RATE
= 0.01). 
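
The suggestions above (LogWriter as an interface, a writer that emits to Java logging, and a
sample rate knob) could be sketched roughly as follows. This is only an illustration under
assumptions: QueryLogWriter, JulQueryLogWriter and SamplingQueryLogWriter are hypothetical
names, not the classes in the attached patches, and the sample rate is passed in directly
rather than read from a real Phoenix config property.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.logging.Logger;

// Hypothetical interface; the LogWriter in the actual patch may look different.
interface QueryLogWriter {
    void write(String query);
}

// Hypothetical writer that emits each query to java.util.logging,
// so an existing log pipeline (Splunk, Elastic, Solr, ...) can pick it up.
class JulQueryLogWriter implements QueryLogWriter {
    private static final Logger LOG = Logger.getLogger("query.log");

    @Override
    public void write(String query) {
        LOG.info(query);
    }
}

// Wraps any writer and forwards only a sampled fraction of the queries.
class SamplingQueryLogWriter implements QueryLogWriter {
    private final QueryLogWriter delegate;
    // volatile so the rate can be changed while queries are running
    private volatile double sampleRate;

    SamplingQueryLogWriter(QueryLogWriter delegate, double sampleRate) {
        this.delegate = delegate;
        setSampleRate(sampleRate);
    }

    void setSampleRate(double sampleRate) {
        // The negated form also rejects NaN.
        if (!(sampleRate >= 0.0 && sampleRate <= 1.0)) {
            throw new IllegalArgumentException("sample rate must be in [0, 1]: " + sampleRate);
        }
        this.sampleRate = sampleRate;
    }

    @Override
    public void write(String query) {
        // nextDouble() is uniform in [0, 1), so a strict '<' logs nothing
        // at rate 0.0 and everything at rate 1.0.
        if (ThreadLocalRandom.current().nextDouble() < sampleRate) {
            delegate.write(query);
        }
    }
}
```

With this shape, new SamplingQueryLogWriter(new JulQueryLogWriter(), 0.1) would log roughly
10% of queries, and setSampleRate(0.01) would drop that to roughly 1% without a restart.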

> Query Log
> ---------
>
>                 Key: PHOENIX-2715
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2715
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Nick Dimiduk
>            Assignee: Ankit Singhal
>            Priority: Major
>         Attachments: PHOENIX-2715.patch, PHOENIX-2715_master.patch, PHOENIX-2715_master_V1.patch
>
>
> One useful feature of other database systems is the query log. It allows the DBA to review
> the queries run, who's run them, time taken, &c. This serves both as an audit and also
> as a source of "ground truth" for performance optimization. For instance, which columns should
> be indexed. It may also serve as the foundation for automated performance recommendations/actions.
> What queries are being run is the first piece. Having this data tied into tracing results
> and perhaps client-side metrics (PHOENIX-1819) becomes very useful.
> This might take the form of clients writing data to a new system table, but other implementation
> suggestions are welcome.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
