NAME

mk-query-digest - Parses logs and more. Analyze, transform, filter, review and report on queries.

SYNOPSIS

Analyze and report on a slow log:

 mk-query-digest /path/to/slow.log

Review a slow log, saving results to the test.query_review table in a MySQL server running on host1. See --review for more on reviewing queries:

 mk-query-digest --review h=host1,D=test,t=query_review /path/to/slow.log

Watch a server's SHOW FULL PROCESSLIST and analyze the queries as if they were from a slow query log:

 mk-query-digest --processlist h=host1

Watch a server's SHOW FULL PROCESSLIST, filter out everything but SELECT queries, and replay the queries against another server, then use the timings from replaying them to analyze their performance:

 mk-query-digest --processlist h=host1 --execute h=another_server \
   --filter '$event->{fingerprint} =~ m/^select/'

DESCRIPTION

This tool was formerly known as mk-log-parser.

mk-query-digest is a framework for doing things with events from a query source such as the slow query log or PROCESSLIST. By default it acts as a very sophisticated log analysis tool. You can group and sort queries in many different ways simultaneously and find the most expensive queries, or create a timeline of queries in the log, for example. It can also do a "query review," which means to save a sample of each type of query into a MySQL table so you can easily see whether you've reviewed and analyzed a query before. The benefit of this is that you can keep track of changes to your server's queries and avoid repeated work. You can also save other information with the queries, such as comments, issue numbers in your ticketing system, and so on.

Note that this is a work in *very* active progress and you should expect incompatible changes in the future.

ATTRIBUTES

mk-query-digest works on events, which are a collection of key/value pairs called attributes. You'll recognize most of the attributes right away: Query_time, Lock_time, and so on. You can just look at a slow log and see them. However, there are some that don't exist in the slow log, and slow logs may actually include different kinds of attributes (for example, you may have a server with the Percona patches). A full list is at http://code.google.com/p/maatkit/wiki/EventAttributes. These attributes show up in lots of different places.

OUTPUT

The default output is a query analysis report. There is one paragraph for each class of query analyzed. A "class" of queries all have the same fingerprint, which is an abstracted version of the query text with literals removed, whitespace collapsed, and so forth. The report is formatted so it's easy to paste into emails without wrapping, and all non-query lines begin with a comment, so you can save it to a .sql file and open it in your favorite syntax-highlighting text editor. There is a response-time profile at the very end.

The report begins with one paragraph about the entire analysis run. The information is very similar to what you'll see for each class of queries in the log, but it doesn't have some information that would be too expensive to keep globally for the analysis. It also has some statistics about the code's excution itself, such as the CPU and memory usage.

Following this, each query then appears in a paragraph. Here's a sample, slightly reformatted so 'perldoc' will not wrap lines in a terminal. The following will all be one paragraph, but we'll break it up for commentary.

 # Query 2: 0.01 QPS, 0.02x conc, ID 0xFDEA8D2993C9CAF3 at byte 160665

This line identifies the sequential number of the query in the sort order specified by --order-by. Then there's the queries per second, and the approximate concurrency for this query (calculated as a function of the timespan and total Query_time). Next there's a query ID. This ID is a hex version of the query's checksum in the database, if you're using --review. You can select the reviewed query's details from the database with a query like SELECT .... WHERE checksum=0xFDEA8D2993C9CAF3.

Finally, in case you want to find a sample of the query in the log file, there's the byte offset where you can look. (This is not always accurate, due to some silly anomalies in the slow-log format, but it's usually right.) The position refers to the worst sample, which we'll see more about below.

Next is the table of metrics about this class of queries.

 #           pct   total    min    max     avg     95%  stddev  median
 # Count       0       2
 # Exec time  13   1105s   552s   554s    553s    554s      2s    553s
 # Lock time   0   216us   99us  117us   108us   117us    12us   108us
 # Rows sent  20   6.26M  3.13M  3.13M   3.13M   3.13M   12.73   3.13M
 # Rows exam   0   6.26M  3.13M  3.13M   3.13M   3.13M   12.73   3.13M

The first line is column headers for the table. The percentage is the percent of the total for the whole analysis run, and the total is the actual value of the specified metric. For example, in this case we can see that the query executed 2 times, which is 13% of the total number of queries in the file. The min, max and avg columns are self-explanatory. The 95% column shows the 95th percentile; 95% of the values are less than or equal to this value. The standard deviation shows you how tightly grouped the values are. The standard deviation and median are both calculated from the 95th percentile, discarding the extremely large values.

The stddev, median and 95th percentile statistics are approximate. Exact statistics require keeping every value seen, sorting, and doing some calculations on them. This uses a lot of memory. To avoid this, we keep 1000 buckets, each of them 5% bigger than the one before, ranging from .000001 up to a very big number. When we see a value we increment the bucket into which it falls. Thus we have fixed memory per class of queries. The drawback is the imprecision, which typically falls in the 5 percent range.

Next we have statistics on the users, databases and time range for the query.

 # Users       1   user1
 # Databases   2     db1(1), db2(1)
 # Time range 2008-11-26 04:55:18 to 2008-11-27 00:15:15

The users and databases are shown as a count of distinct values, followed by the values. If there's only one, it's shown alone; if there are many, we show each of the most frequent ones, followed by the number of times it appears.

 # Query_time distribution
 #   1us
 #  10us
 # 100us
 #   1ms
 #  10ms
 # 100ms
 #    1s
 #  10s+  #############################################################

The execution times show a logarithmic chart of time clustering. Each query goes into one of the "buckets" and is counted up. The buckets are powers of ten. The first bucket is all values in the "single microsecond range" -- that is, less than 10us. The second is "tens of microseconds," which is from 10us up to (but not including) 100us; and so on.

 # Tables
 #    SHOW TABLE STATUS LIKE 'table1'\G
 #    SHOW CREATE TABLE `table1`\G
 # EXPLAIN
 SELECT * FROM table1\G

This section is a convenience: if you're trying to optimize the queries you see in the slow log, you probably want to examine the table structure and size. These are copy-and-paste-ready commands to do that.

Finally, we see a sample of the queries in this class of query. This is not a random sample. It is the query that performed the worst, according to the sort order given by --order-by. You will normally see a commented # EXPLAIN line just before it, so you can copy-paste the query to examine its EXPLAIN plan. But for non-SELECT queries that isn't possible to do, so the tool tries to transform the query into a roughly equivalent SELECT query, and adds that below.

If you want to find this sample event in the log, use the offset mentioned above, and something like the following:

  tail -c +<offset> /path/to/file | head

The final bit of output is the response-time profile over the events. This is a summarized view of what we've already seen. Here is a sample:

  # Rank Query ID           Response time    Calls   R/Call     Item
  # ==== ================== ================ ======= ========== ====
  #    1 0x31DA25F95494CA95     0.1494 99.9%       1   0.149435 SHOW
  #    2 0x3AEAAD0E15D725B5     0.0001  0.1%       2   0.000041 SET
  #    3 0x813031B8BBC3B329     0.0000  0.0%       1   0.000015 COMMIT

The columns should be fairly self-explanatory: rank, query ID, response time sum and percentage of total; number of calls and response time per call; and the distilled query (see "distill" for more detail on this).

QUERY REVIEWS

A "query review" is the process of storing all the query fingerprints analyzed. This has several benefits:

To use this feature, you run mk-query-digest with the --review option. It will store the fingerprints and other information into the table you specify. Next time you run it with the same option, it will do the following:

FINGERPRINTS

A query fingerprint is the abstracted form of a query, which makes it possible to group similar queries together. Abstracting a query removes literal values, normalizes whitespace, and so on. For example, these two queries:

  SELECT name, password FROM user WHERE id='12823';
  select name,   password from user
     where id=5;

Both of those queries will fingerprint to

  select name, password from user where id=?

Once the query's fingerprint is known, we can then talk about a query as though it represents all similar queries.

What mk-query-digest does is analogous to a GROUP BY statement in SQL. If your command-line looks like this,

  mk-query-digest /path/to/slow.log --select Rows_read,Rows_sent \
      --group-by fingerprint --order-by Query_time:sum --limit 10

The corresponding pseudo-SQL looks like this:

  SELECT WORST(query BY Query_time), SUM(Query_time), ...
  FROM /path/to/slow.log
  GROUP BY FINGERPRINT(query)
  ORDER BY SUM(Query_time) DESC
  LIMIT 10

You can also use the value distill, which is a kind of super-fingerprint. See --group-by for more.

OPTIONS

DSN values in --review-history default to values in --review if COPY is yes.

--attribute-aliases

type: array; default: db|Schema

List of attribute|alias,etc.

Certain attributes have multiple names, like db and Schema. If an event does not have the priamry attribute, mk-query-digest looks for an alias attribute. If it finds an alias, it creates the primary attribute with the alias attribute's value and removes the alias attribute.

If the event has the primary attribute, all alias attributes are deleted.

This helps to simplify event attributes so that, for example, there will not be report lines for both db and Schema.

--attribute-limit

type: int; default: 4294967296

A sanity limit for attribute values.

This option deals with bugs in slow-logging functionality that causes large values for attributes. If the attribute's value is bigger than this, the last-seen value for that class of query is used instead.

--config

type: Array

Read this comma-separated list of config files; if specified, this must be the first option on the command line.

--continue-on-error
Continue parsing even if there is an error.
--create-review-history-table

Create the --review-history table if it does not exist.

This option causes the table specified by --review-history to be created with the default structure shown in the documentation for that option.

--create-review-table

Create the --review table if it does not exist.

This option causes the table specified by --review to be created with the default structure shown in the documentation for that option.

--daemonize
Fork to the background and detach from the shell. POSIX operating systems only.
--embedded-attributes

type: array

Two Perl regex patterns to capture pseudo-attributes embedded in queries.

Embedded attributes might be special attribute-value pairs that you've hidden in comments. The first regex should match the entire set of attributes (in case there are multiple). The second regex should match and capture attribute-value pairs from the first regex.

For example, suppose your query looks like the following:

  SELECT * from users -- file: /login.php, line: 493;

You might run mk-query-digest with the following option:

  mk-query-digest --embedded-attributes ' -- .*','(\w+): ([^\,]+)'

The first regular expression captures the whole comment:

  " -- file: /login.php, line: 493;"

The second one splits it into attribute-value pairs and adds them to the event:

   ATTRIBUTE  VALUE
   =========  ==========
   file       /login.php
   line       493

NOTE: All commas in the regex patterns must be escaped with \ otherwise the pattern will break.

--execute

type: DSN

Execute queries on this DSN.

Adds a callback into the chain, after filters but before --report. Events are executed on this DSN. If they are successful, the time they take to execute overwrites the event's Query_time attribute. If unsuccessful, the callback returns false and terminates the chain.

If the connection fails, mk-query-digest tries to reconnect once per second. See also --mirror.

--expected-range

type: array; default: 5,10

Explain items when there are more or fewer than expected.

Defines the number of items expected to be seen in the report given by --report, as controlled by --limit and --outliers. If there are more or fewer items in the report, each one will explain why it was included.

--explain

type: DSN

Run EXPLAIN for the sample query with this DSN and print results.

This works only when --report includes fingerprint. It causes mk-query-digest to run EXPLAIN and include the output into the report. For safety, queries that appear to have a subquery that EXPLAIN will execute won't be EXPLAINed. Those are typically "derived table" queries of the form

  select ... from ( select .... ) der;
--filter

type: string

Discard events for which this Perl code doesn't return true.

This option is a string of Perl code that gets compiled into a subroutine with one argument: $event. This is a hashref. If the code returns true, the chain of callbacks continues; otherwise it ends. The code is the last statement in the subroutine other than return $event.

An example filter that discards everything but SELECT statements:

  --filter '$event->{arg} =~ m/^select/i'

This is compiled into a subroutine like the following:

  sub { $event = shift; $event->{arg} =~ m/^select/i && return $event; }

It is permissible for the code to have side effects (to alter $event).

--fingerprints
Add query fingerprints to the --report. This is mostly useful for debugging purposes.
--[no]for-explain

default: yes

Print extra information to make analysis easy.

This option adds code snippets to make it easy to run SHOW CREATE TABLE and SHOW TABLE STATUS for the query's tables. It also rewrites non-SELECT queries into a SELECT that might be helpful for determining the non-SELECT statement's index usage.

--group-by

type: Array

Which attribute of the events to group by.

This option is a comma-separated list of items that defaults to the same items as --report but may contain extras.

In general, you can group queries into classes based on any attribute of the query, such as user or db, which will by default show you which users and which databases get the most Query_time.

Every value must have a corresponding value in the same position in --order-by. However, adding values to --group-by will automatically add values to --order-by, for your convenience. And, since --report automatically adds values to --group-by, these cascade through to --order-by. So you generally don't need to think about anything except --report.

There are several magical values that cause some extra data mining to happen before the grouping takes place:

fingerprint
This causes events to be fingerprinted to abstract queries into a canonical form, which is then used to group events together into a class. See "FINGERPRINTS" for more about fingerprinting.
tables
This causes events to be inspected for what appear to be tables, and then aggregated by that. Note that a query that contains two or more tables will be counted as many times as there are tables; so a join against two tables will count the Query_time against both tables.
distill
This is a sort of super-fingerprint that collapses queries down into a suggestion of what they do, such as INSERT SELECT table1 table2.
--help
Show help and exit.
--ignore-attributes

type: array; default: arg,cmd,insert_id,ip,port,Thread_id,timestamp

Do not aggregate these attributes when auto-detecting --select.

If you do not specify --select then mk-query-digest auto-detects and aggregates every attribute that it finds in the slow log. Some attributes, however, should not be aggregated. This option allows you to specify a list of attributes to ignore. This only works when no explicit --select is given.

--interval

type: float; default: .1

How frequently to poll the processlist, in seconds.

--iterations

type: int; default: 1

How many times to iterate through the collect-and-report cycle. If 0, iterate to infinity. See also --run-time.

--limit

type: Array; default: 95%:20

Limit output to the given percentage or count.

If the argument is an integer, report only the top N worst queries. If the argument is an integer followed by the % sign, report that percentage of the worst queries. If the percentage is followed by a colon and another integer, report the top percentage or the number specified by that integer, whichever comes first.

The value is actually a comma-separated array of values, one for each item in --report. If you don't specify a value for any of those items, the default is the top 95%.

See also --outliers.

--log

type: string

Print all output to this file when daemonized.

--mirror

type: float

How often to check whether connections should be moved, depending on read_only. Requires --processlist and --execute.

This option causes mk-query-digest to check every N seconds whether it is reading from a read-write server and executing against a read-only server, which is a sensible way to set up two servers if you're doing something like master-master replication. The http://code.google.com/p/mysql-master-master/ master-master toolkit does this. The aim is to keep the passive server ready for failover, which is impossible without putting it under a realistic workload.

--order-by

type: Array; default: Query_time:sum

Sort events by this attribute and aggregate function.

This is a comma-separated list of order-by expressions, one for each attribute you wish to report by (see --report). The syntax is attribute:aggregate. Valid aggregates are sum, min, max, cnt. See the default value for an example.

This option is automatically altered by --group-by. If you specify an attribute that doesn't exist in the events that are parsed and aggregated, mk-query-digest falls back to the default with a notice just at the beginning of the per-item report.

--outliers

type: array; default: Query_time:1:10

Report outliers by attribute:percentile:count.

The syntax of this option is a comma-separated list of colon-delimited strings. The first field is the attribute by which an outlier is defined. The second is a number that is compared to the attribute's 95th percentile. The third is optional, and is compared to the attribute's cnt aggregate. Queries that pass this specification are added to the report, regardless of any limits you specified in --limit.

For example, to report queries whose 95th percentile Query_time is at least 60 seconds and which are seen at least 5 times, use the following argument:

  --outliers Query_time:60:5

You can specify an --outliers option for each value in --report.

--pid

type: string

Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits.

--print
Print log events to STDOUT in standard slow-query-log format.
--processlist

type: DSN

Poll this DSN's processlist for queries, with --interval sleep between.

If the connection fails, mk-query-digest tries to reopen it once per second. See also --mirror.

--report

type: Array; default: fingerprint

Print out reports on the aggregate results from --group-by.

This is the standard slow-log analysis functionality. See "OUTPUT" for the description of what this does and what the results look like.

Any value you specify here automatically adds a corresponding value to --group-by.

--report-all
Include all queries, even if they have already been reviewed.
--report-format

type: Hash; default: rusage,header,query_report,profile

Print these elements in the query analysis report. Valid element names are rusage, header, query_report, and profile. The rusage is information about CPU times and memory usage. The header is information about the entire analysis run. The query_report is detailed information on each query in the report. The profile is a compact table of queries for an at-a-glance view of the report. Order does not matter; you cannot use this parameter to influence the order of items in the report, only the presence or absence of items.

See "OUTPUT" for more information on the various parts of the query report.

--review

type: DSN

Store a sample of each class of query in this DSN.

The argument specifies a table to store all unique query fingerprints in. The table must have at least the following columns, but you can add any more columns you wish. The following CREATE TABLE definition is also used for --create-review-table. MAGIC_create_review:

  CREATE TABLE query_review (
     checksum     BIGINT UNSIGNED NOT NULL PRIMARY KEY,
     fingerprint  TEXT NOT NULL,
     sample       TEXT NOT NULL,
     first_seen   DATETIME,
     last_seen    DATETIME,
     reviewed_by  VARCHAR(20),
     reviewed_on  DATETIME,
     comments     TEXT
  )

The columns are as follows:

  COLUMN       MEANING
  ===========  ===============
  checksum     A 64-bit checksum of the query fingerprint
  fingerprint  The abstracted version of the query; its primary key
  sample       The query text of a sample of the class of queries
  first_seen   The smallest timestamp of this class of queries
  last_seen    The largest timestamp of this class of queries
  reviewed_by  Initially NULL; if set, query is skipped thereafter
  reviewed_on  Initially NULL; not assigned any special meaning
  comments     Initially NULL; not assigned any special meaning

Note that the fingerprint column is the true primary key for a class of queries. The checksum is just a cryptographic hash of this value, which provides a shorter value that is very likely to also be unique.

After the tool generates the fingerprint report for --report, it uses the aggregated values from the report to fill the review table. When you're done, your table should contain a row for each fingerprint. This option depends on --report fingerprint. It will not work otherwise.

--review-history

type: DSN

The table in which to store historical values for review trend analysis.

Each time you review queries with --review, mk-query-digest will save information into this table so you can see how classes of queries have changed over time.

This DSN inherits unspecified values from --review. It should mention a table in which to store statistics about each class of queries. mk-query-digest verifies the existence of the table, and your privileges to insert, delete and update on that table.

mk-query-digest then inspects the columns in the table. The table must have at least the following columns:

  CREATE TABLE query_review_history (
    checksum     BIGINT UNSIGNED NOT NULL,
    sample       TEXT NOT NULL
  );

Any columns not mentioned above are inspected to see if they follow a certain naming convention. The column is special if the name ends with an underscore followed by any of these MAGIC_history_cols values:

  pct|avt|cnt|sum|min|max|pct_95|stddev|median|rank

If the column ends with one of those values, then the prefix is interpreted as the event attribute to store in that column, and the suffix is interpreted as the metric to be stored. For example, a column named Query_time_min will be used to store the minimum Query_time for the class of events. The presence of this column will also add Query_time to the --select list.

The table should also have a primary key, but that is up to you, depending on how you want to store the historical data. We suggest adding ts_min and ts_max columns and making them part of the primary key along with the checksum. But you could also just add a ts_min column and make it a DATE type, so you'd get one row per class of queries per day.

The default table structure follows. The following MAGIC_create_review_history table definition is used for --create-review-history-table:

 CREATE TABLE query_review_history (
   checksum             BIGINT UNSIGNED NOT NULL,
   sample               TEXT NOT NULL,
   ts_min               DATETIME,
   ts_max               DATETIME,
   ts_cnt               FLOAT,
   Query_time_sum       FLOAT,
   Query_time_min       FLOAT,
   Query_time_max       FLOAT,
   Query_time_pct_95    FLOAT,
   Query_time_stddev    FLOAT,
   Query_time_median    FLOAT,
   Lock_time_sum        FLOAT,
   Lock_time_min        FLOAT,
   Lock_time_max        FLOAT,
   Lock_time_pct_95     FLOAT,
   Lock_time_stddev     FLOAT,
   Lock_time_median     FLOAT,
   Rows_sent_sum        FLOAT,
   Rows_sent_min        FLOAT,
   Rows_sent_max        FLOAT,
   Rows_sent_pct_95     FLOAT,
   Rows_sent_stddev     FLOAT,
   Rows_sent_median     FLOAT,
   Rows_examined_sum    FLOAT,
   Rows_examined_min    FLOAT,
   Rows_examined_max    FLOAT,
   Rows_examined_pct_95 FLOAT,
   Rows_examined_stddev FLOAT,
   Rows_examined_median FLOAT,
   PRIMARY KEY(checksum, ts_min, ts_max)
 );

Note that we store the count (cnt) for the ts attribute only; it will be redundant to store this for other attributes.

--run-time

type: time

How long to run before exiting. The default is to run forever (you can interrupt with CTRL-C). See also --iterations.

--sample

type: int

Filter out all but the first N occurrences of each query. The queries are filtered on the first value in --group-by, so by default, this will filter by query fingerprint. For example, --sample 2 will permit two sample queries for each fingerprint. Useful in conjunction with --print to print out the queries. You probably want to set --report to the empty string to avoid the overhead of aggregating and reporting, if you're just using this to print out samples of queries. A complete example:

  mk-query-digest --sample 2 --report '' --group-by fingerprint --print slow.log
--select

type: Array

Compute aggregate statistics for these attributes.

By default mk-query-digest auto-detects, aggregates and prints metrics for every query attribute that it finds in the slow query log. This option specifies a list of only the attributes that you want. You can specify an alternative attribute with a colon. For example, db:Schema uses db if it's available, and Schema if it's not.

Previously, mk-query-digest only aggregated these attributes:

  Query_time,Lock_time,Rows_sent,Rows_examined,user,db:Schema,ts

Attributes specified in the --review-history table will always be selected even if you do not specify --select.

See also --ignore-attributes and "ATTRIBUTES".

--shorten

type: int; default: 1024

Shorten long statements in reports.

Shortens long statements, replacing the omitted portion with a /*... omitted ...*/ comment. This applies only to the output in reports, not to information stored for --review or other places. It prevents a large statement from causing difficulty in a report. The argument is the preferred length of the shortened statement. Not all statements can be shortened, but very large INSERT and similar statements often can; and so can IN() lists, although only the first such list in the statement will be shortened.

If it shortens something beyond recognition, you can find the original statement in the log, at the offset shown in the report header (see "OUTPUT").

--tcpdump-errors

type: string

Write the tcpdump data to this file on error. If mk-query-digest doesn't parse the stream correctly for some reason, the session's packets since the last query event will be written out to create a usable test case. If this happens, mk-query-digest will not raise an error; it will just discard the session's saved state and permit the tool to continue working.

--timeline

type: Array

Show a timeline of events (disables --report).

This option makes mk-query-digest aggregate events into a timeline by the specified attribute. Each event is compared to the previous one, and if the specified attribute is the same, they are aggregated together. At the end, the report prints out the timestamp, interval, count and value of each aggregated group of events.

The setting cascades to --order-by and --group-by just like --report does. At this time, --timeline and --report are mutually exclusive, and since --report has a default value, setting --timeline will merely unset --report.

Example:

  mk-query-digest /path/to/log --timeline distill
--type

type: string

The type of input to parse (default slowlog). The permitted types are

slowlog
Parse a log file in any varation of MySQL slow-log format.
tcpdump

mk-query-digest does not actually watch the network (i.e. it does NOT "sniff packets"). Instead, it's just parsing the output of tcpdump. You are responsible for generating this output; mk-query-digest does not do it for you. Then you send this to mk-query-digest as you would any log file: as files on the command line or to STDIN.

The parser expects the input to be formatted with the following options: -x -n -q -tttt. For example, if you want to capture output from your local machine, you can do something like

  tcpdump -i eth0 port 3306 -s 65535 -c 1000 -x -n -q -tttt \
    | mk-query-digest --type tcpdump

The other tcpdump parameters, such as -s, -c, and -i, are up to you. Just make sure the output looks like this:

  2009-04-12 09:50:16.804849 IP 127.0.0.1.42167 > 127.0.0.1.3306: tcp 37
      0x0000:  4508 0059 6eb2 4000 4006 cde2 7f00 0001
      0x0010:  ....

Remember tcpdump has a handy -c option to stop after it captures some number of packets! That's very useful for testing your tcpdump command.

If you're analyzing traffic to a non-standard port, see --watch-server.

--version
Show version and exit.
--watch-server

type: string

Which IP address and port is the server when parsing tcpdump output. This option tells mk-query-digest whether a given TCP packet is from client to server, or server to client. If you don't specify it, mk-query-digest tries to guess the host and port of the server by looking for port 3306 or port "mysql". If you're watching a server with a non-standard port, this won't work, and you'll need to specify the IP address and port to watch. If mk-query-digest doesn't know what to watch, nothing works right.

--[no]zero-admin

default: yes

Zero out the Rows_XXX properties for administrator command events.

DOWNLOADING

You can download Maatkit from Google Code at http://code.google.com/p/maatkit/, or you can get any of the tools easily with a command like the following:

   wget http://www.maatkit.org/get/toolname
   or
   wget http://www.maatkit.org/trunk/toolname

Where toolname can be replaced with the name (or fragment of a name) of any of the Maatkit tools. Once downloaded, they're ready to run; no installation is needed. The first URL gets the latest released version of the tool, and the second gets the latest trunk code from Subversion.

SYSTEM REQUIREMENTS

You need Perl and some core packages that ought to be installed in any reasonably new version of Perl.

ENVIRONMENT

The environment variable MKDEBUG enables verbose debugging output in all of the Maatkit tools:

   MKDEBUG=1 mk-....

BUGS

Please use Google Code Issues and Groups to report bugs or request support: http://code.google.com/p/maatkit/. You can also join #maatkit on Freenode to discuss Maatkit.

Please include the complete command-line used to reproduce the problem you are seeing, the version of all MySQL servers involved, the complete output of the tool when run with --version, and if possible, debugging output produced by running with the MKDEBUG=1 environment variable.

COPYRIGHT, LICENSE AND WARRANTY

This program is copyright 2007-2009 Baron Schwartz. Feedback and improvements are welcome.

THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue `man perlgpl' or `man perlartistic' to read these licenses.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

AUTHOR

Baron Schwartz, Daniel Nichter

VERSION

This manual page documents Ver 0.9.7 Distrib 4047 $Revision: 4043 $.