cassandra-commits mailing list archives

From "Murukesh Mohanan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-13000) slow query log analysis tool
Date Mon, 20 Feb 2017 06:49:44 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-13000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Murukesh Mohanan updated CASSANDRA-13000:
-----------------------------------------
    Attachment: csqldumpslow.py

I have written a Python script that mimics some of the functionality of {{mysqldumpslow}}.
Quoting its help text:

{code}
usage: csqldumpslow.py [-h] [-s TYPE] [-r] [-t N] [-j] [-o FILE]
                       [FILE [FILE ...]]

Provide a summary of the slow queries listed in Cassandra debug logs.
Multiple log files can be provided, in which case the logs are combined.

positional arguments:
  FILE                  Input files. Standard input is -. Default: logs/debug.log

optional arguments:
  -h, --help            show this help message and exit
  -s TYPE, --sort TYPE  Sort the input by TYPE
  -r, --reverse         Reverse the sort order
  -t N, --top N         Print only the top N queries
  -j, --json            Assume JSON-encoded input
  -o FILE, --output FILE
                        Save output to FILE

Sorting types:
	t	- total time
	at	- average time
	c	- count
{code}
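
To make the three sorting types concrete, here is a minimal sketch of the aggregation involved. It is not the attached script: the function name {{summarize}}, the (query, milliseconds) input pairs, and the largest-first default order are all assumptions made for illustration.

```python
from collections import defaultdict

# Maps the sort types from the help text to the summary field they order by.
SORT_KEYS = {"t": "total", "at": "average", "c": "count"}


def summarize(entries, sort="t", reverse=False, top=None):
    """Aggregate (query, duration_ms) pairs into per-query statistics.

    `entries` is a hypothetical, already-parsed form of the slow query log;
    parsing the debug log itself is out of scope for this sketch.
    """
    stats = defaultdict(lambda: [0, 0])  # query -> [count, total_ms]
    for query, duration_ms in entries:
        stats[query][0] += 1
        stats[query][1] += duration_ms

    rows = [
        {"query": q, "count": c, "total": t, "average": t / c}
        for q, (c, t) in stats.items()
    ]
    # Assumption: largest values come first unless reverse is requested.
    rows.sort(key=lambda r: r[SORT_KEYS[sort]], reverse=not reverse)
    return rows if top is None else rows[:top]
```

For example, {{summarize(entries, sort="at", top=5)}} would return the five query strings with the highest average time.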

Some of the information available in MySQL's logs is not available (or applicable) here.
Accordingly, I haven't tried to implement the {{mysqldumpslow}} options that rely on it.

I thought about implementing the {{-g}} option, but it seems the query string printed out
doesn't always match the actual query, so I don't know how useful it would be.

With input from [Code Review Stack Exchange|http://codereview.stackexchange.com/questions/155563/cassandra-slow-query-log-analysis-tool].

> slow query log analysis tool
> ----------------------------
>
>                 Key: CASSANDRA-13000
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13000
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Observability
>            Reporter: Jon Haddad
>         Attachments: csqldumpslow.py
>
>
> As a follow-up to CASSANDRA-12403, it would be very helpful to have a tool to process
> the slow queries that are logged.  In the MySQL world, there's a tool called {{mysqldumpslow}},
> which processes a slow query log, abstracts the parameters to prepared statements, and shows
> the queries which are causing problems based on frequency.  The {{mysqldumpslow}} utility
> shows aggregated count and time statistics for slow queries.  For instance:
> {code}shell> mysqldumpslow
> Reading mysql slow query log from /usr/local/mysql/data/mysqld51-apple-slow.log
> Count: 1  Time=4.32s (4s)  Lock=0.00s (0s)  Rows=0.0 (0), root[root]@localhost
>  insert into t2 select * from t1
> Count: 3  Time=2.53s (7s)  Lock=0.00s (0s)  Rows=0.0 (0), root[root]@localhost
>  insert into t2 select * from t1 limit N
> Count: 3  Time=2.13s (6s)  Lock=0.00s (0s)  Rows=0.0 (0), root[root]@localhost
>  insert into t1 select * from t1{code}
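
For reference, the parameter abstraction described in the issue (the {{limit N}} in the output above) can be approximated with two regular expressions, following the {{mysqldumpslow}} convention of {{N}} for numbers and {{'S'}} for strings. This is only a sketch: {{abstract_query}} is a hypothetical helper, not part of {{mysqldumpslow}} or the attached script, and it makes no attempt to handle other CQL literals such as UUIDs.

```python
import re

# Single-quoted string literal; a doubled quote ('') is an escaped quote.
STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
# Standalone integer or decimal; digits inside identifiers such as t1 are kept.
NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")


def abstract_query(query):
    """Replace literal values so that similar queries group together."""
    # Strings go first so that digits inside them are not rewritten separately.
    query = STRING_LITERAL.sub("'S'", query)
    return NUMBER_LITERAL.sub("N", query)
```

Queries that differ only in their literal values then collapse to the same string, which can be used as the grouping key when counting.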



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
