pt-mongodb-stalk¶
NAME¶
pt-mongodb-stalk - Collect forensic data about MongoDB when problems occur.
SYNOPSIS¶
Usage¶
pt-mongodb-stalk [OPTIONS]
DESCRIPTION¶
pt-mongodb-stalk watches a MongoDB server for a trigger condition and collects
diagnostic data when that trigger occurs. It follows the same basic operating
model as pt-stalk, but uses MongoDB administration commands instead of
MySQL commands.
The default trigger watches serverStatus.connections.current. You can also
watch currentOp, host CPU usage, host memory usage, queued writers, and
replica set replication lag.
OPTIONS¶
- --ask-pass¶
Prompt for a password when connecting to MongoDB.
- --authenticationDatabase¶
type: string; default: admin
Authentication database for the MongoDB shell connection.
- --collect¶
default: yes; negatable: yes
Collect diagnostic data when the trigger occurs.
- --config¶
type: string
Read this comma-separated list of config files. If specified, this must be the first option on the command line.
- --cycles¶
type: int; default: 5
How many times --variable must be greater than --threshold before triggering --collect.
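The relationship between --cycles and --threshold can be pictured as a counter of consecutive matching checks. The sketch below is illustrative Python, not the tool's own code. The function name first_trigger_index is hypothetical, and it assumes that a check matches only when the value is strictly greater than the threshold and that any non-matching check resets the counter.

```python
def first_trigger_index(samples, threshold=100.0, cycles=5):
    """Return the index of the check at which collection would start, or None.

    Assumption: the counter resets whenever a check does not exceed
    the threshold, so only consecutive matches count.
    """
    consecutive = 0
    for index, value in enumerate(samples):
        if value > threshold:
            consecutive += 1
            if consecutive >= cycles:
                return index
        else:
            consecutive = 0
    return None


# Three matches in a row are required; the dip to 40 resets the count.
print(first_trigger_index([60, 70, 40, 55, 65, 75], threshold=50, cycles=3))  # prints 5
```

With --interval 1, each element of the list corresponds to one check per second.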
- --dest¶
type: string; default: /var/lib/pt-mongodb-stalk
Where to save diagnostic data.
- --disk-bytes-free¶
type: size; default: 100M
Do not collect if the disk has less than this much free space.
- --disk-pct-free¶
type: int; default: 5
Do not collect if the disk has less than this percent free space.
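The two disk-space limits can be read as a pair of conditions that must both hold before a collection starts. The following Python sketch shows one way to express that check. The helper names are hypothetical, and the 1024-based meaning of the k, M, G, and T suffixes is an assumption about the size type rather than something stated here.

```python
import shutil

# Assumed suffix multipliers for "type: size" values such as 100M.
UNITS = {"k": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a size such as '100M' to a number of bytes."""
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def enough_disk(free_bytes, total_bytes, min_bytes_free, min_pct_free):
    """Return True when both free-space limits are satisfied."""
    pct_free = 100.0 * free_bytes / total_bytes
    return free_bytes >= min_bytes_free and pct_free >= min_pct_free


# Check the filesystem that holds the current directory against the defaults.
usage = shutil.disk_usage(".")
ok = enough_disk(usage.free, usage.total, parse_size("100M"), 5)
print("collect" if ok else "skip collection")
```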
- --function¶
type: string; default: status
Trigger source. Valid built-in values are status, currentop, cpu, memory, writewait, and repllag.
With status, --variable is a dot-separated path inside serverStatus. Example:
--function status --variable connections.current --threshold 200
With currentop, --variable is a dot-separated field path inside each currentOp.inprog document and --match is a regex. The trigger value is the number of matching operations. Example:
--function currentop --variable command.aggregate --match '^orders$' --threshold 10
With cpu, the trigger value is host CPU busy percentage from /proc/stat. With memory, the trigger value is host memory used percentage from /proc/meminfo. mem is accepted as an alias for memory. Examples:
--function cpu --threshold 85
--function memory --threshold 90
With writewait, the trigger value is serverStatus.globalLock.currentQueue.writers. Example:
--function writewait --threshold 5
With repllag, the trigger value is the maximum replica set member lag behind the primary, in seconds, from replSetGetStatus. replicationlag and waitForReplicationLag are accepted as aliases. Example:
--function repllag --threshold 30
You can also specify a file that defines trg_plugin.
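For the cpu and memory triggers, the percentages can be derived from /proc/stat and /proc/meminfo roughly as follows. This is a hedged Python sketch, not the tool's implementation. The function names are hypothetical. Counting iowait as idle time and computing used memory as MemTotal minus MemAvailable are both assumptions, since the text above does not say which counters are used.

```python
def cpu_busy_pct(stat_before, stat_after):
    """Busy percentage between two aggregate 'cpu ...' lines from /proc/stat."""
    before = [int(x) for x in stat_before.split()[1:]]
    after = [int(x) for x in stat_after.split()[1:]]
    # Guest time is already included in user time, so only the first
    # eight counters (user .. steal) are summed.
    deltas = [b - a for a, b in zip(before, after)][:8]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    # Index 3 is idle and index 4 is iowait (assumed to count as not busy).
    idle = deltas[3] + (deltas[4] if len(deltas) > 4 else 0)
    return 100.0 * (total - idle) / total


def memory_used_pct(meminfo_text):
    """Used percentage computed from MemTotal and MemAvailable."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    total = values["MemTotal"]
    return 100.0 * (total - values["MemAvailable"]) / total


before = "cpu 100 0 100 800 0 0 0 0 0 0"
after = "cpu 150 0 150 900 0 0 0 0 0 0"
print(cpu_busy_pct(before, after))  # prints 50.0

meminfo = "MemTotal: 1000 kB\nMemAvailable: 250 kB\n"
print(memory_used_pct(meminfo))  # prints 75.0
```

The CPU figure needs two samples because /proc/stat exposes cumulative counters; a single read only gives totals since boot.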
- --help¶
Print help and exit.
- --host¶
short form: -h; type: string; default: localhost
Host to connect to.
- --interval¶
type: int; default: 1
How often to check the trigger, in seconds.
- --iterations¶
type: int
How many collections to perform before exiting.
- --log¶
type: string; default: /var/log/pt-mongodb-stalk.log
Print all output to this file when daemonized.
- --match¶
type: string
Regex pattern used with --function currentop.
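How --variable and --match combine for the currentop trigger can be sketched as: resolve the dotted path in each in-progress operation, then count the values that match the regex. The Python below is illustrative only. The helper names and sample documents are hypothetical, and it assumes the regex is applied with search semantics to the string form of the value.

```python
import re


def get_path(doc, dotted):
    """Follow a dot-separated path through nested dicts; None if absent."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def count_matching_ops(inprog, variable, pattern):
    """Count operations whose field at `variable` matches the regex."""
    regex = re.compile(pattern)
    count = 0
    for op in inprog:
        value = get_path(op, variable)
        if value is not None and regex.search(str(value)):
            count += 1
    return count


# Hypothetical excerpt of currentOp.inprog documents.
inprog = [
    {"op": "command", "command": {"aggregate": "orders"}},
    {"op": "command", "command": {"aggregate": "orders_archive"}},
    {"op": "query", "command": {"find": "orders"}},
]
matches = count_matching_ops(inprog, "command.aggregate", r"^orders$")
print(matches)  # prints 1
```

The same get_path helper would resolve a status path such as connections.current inside a serverStatus document.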
- --password¶
short form: -p; type: string
Password to use when connecting.
- --pid¶
type: string; default: /var/run/pt-mongodb-stalk.pid
Create the given PID file.
- --plugin¶
type: string
Load a plugin that defines any of the standard before_* or after_* hooks.
- --port¶
short form: -P; type: int; default: 27017
Port number to use for connection.
- --prefix¶
type: string
Filename prefix for diagnostic samples.
- --retention-count¶
type: int; default: 0
Keep data for the last N runs.
- --retention-time¶
type: int; default: 30
Number of days to retain collected samples.
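A retention window in days can be applied by comparing each file's modification time with a cutoff. The sketch below only lists the files that would fall outside the window and deletes nothing. The function name is hypothetical, and using the modification time as the age of a sample is an assumption.

```python
import os
import tempfile
import time


def expired_files(directory, retention_days, now=None):
    """List regular files whose modification time is older than the window."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400
    old = []
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False) and entry.stat().st_mtime < cutoff:
            old.append(entry.path)
    return sorted(old)


# Demonstration in a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as dest:
    old_path = os.path.join(dest, "old-sample.txt")
    new_path = os.path.join(dest, "new-sample.txt")
    for path in (old_path, new_path):
        with open(path, "w") as handle:
            handle.write("sample\n")
    # Backdate one file by 31 days so it falls outside a 30-day window.
    backdated = time.time() - 31 * 86400
    os.utime(old_path, (backdated, backdated))
    expired = [os.path.basename(p) for p in expired_files(dest, 30)]

print(expired)  # prints ['old-sample.txt']
```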
- --run-time¶
type: int; default: 30
How long interval collectors should run when the trigger occurs, in seconds.
- --sleep¶
type: int; default: 300
How long to sleep after collecting, in seconds.
- --sleep-collect¶
type: int; default: 1
Polling interval for interval collectors, in seconds.
- --stalk¶
default: yes; negatable: yes
Watch the server and wait for the trigger to occur. Specify --no-stalk to collect immediately.
- --threshold¶
type: float; default: 100
Collection is triggered when --variable is greater than this value.
- --tls¶
default: ; negatable: yes
Enable TLS for the MongoDB shell connection.
- --sslCAFile¶
type: string
Path to the TLS CA file.
- --sslPEMKeyFile¶
type: string
Path to the TLS client certificate and key file.
- --uri¶
type: string
Full MongoDB URI to connect with.
- --user¶
short form: -u; type: string
User for login.
- --variable¶
type: string; default: connections.current
Variable to watch inside serverStatus or currentOp. This option is ignored by cpu, memory, writewait, and repllag.
- --verbose¶
type: int; default: 3
Level of detail to print. Values: 1 = errors; 2 = matching triggers and collection info; 3 = non-matching triggers.
- --version¶
Print version and exit.
EXAMPLES¶
Run in stalking mode and collect twice when the trigger is met:
pt-mongodb-stalk \
--host localhost --port 30001 --user admin --password admin --authenticationDatabase admin \
--function status --variable connections.current --threshold 50 --cycles 3 --interval 1 --iterations 2 \
--dest /tmp/pt-mongodb-stalk
Run immediately without stalking and collect one short sample:
pt-mongodb-stalk \
--host localhost --port 30004 --user admin --password admin --authenticationDatabase admin \
--no-stalk --iterations 1 --run-time 6 --sleep-collect 1 \
--dest /tmp/pt-mongodb-stalk
Run immediately without stalking and collect multiple short runs:
pt-mongodb-stalk \
--host localhost --port 30000 --user admin --password admin --authenticationDatabase admin \
--no-stalk --iterations 3 --run-time 6 --sleep-collect 1 --sleep 1 \
--dest /tmp/pt-mongodb-stalk
Run immediately without stalking and collect fewer, more widely spaced samples:
pt-mongodb-stalk \
--host localhost --port 27000 --user admin --password admin --authenticationDatabase admin \
--no-stalk --iterations 1 --run-time 10 --sleep-collect 2 \
--dest /tmp/pt-mongodb-stalk
Run in stalking mode using currentOp matches instead of a serverStatus metric:
pt-mongodb-stalk \
--host localhost --port 30001 --user admin --password admin --authenticationDatabase admin \
--function currentop --variable command.aggregate --match '^orders$' --threshold 10 --cycles 2 --interval 1 --iterations 1 \
--dest /tmp/pt-mongodb-stalk
Run in stalking mode when host CPU is above 85 percent:
pt-mongodb-stalk \
--host localhost --port 30001 --user admin --password admin --authenticationDatabase admin \
--function cpu --threshold 85 --cycles 3 --interval 1 --iterations 1 \
--dest /tmp/pt-mongodb-stalk
Run in stalking mode when host memory is above 90 percent:
pt-mongodb-stalk \
--host localhost --port 30001 --user admin --password admin --authenticationDatabase admin \
--function memory --threshold 90 --cycles 3 --interval 1 --iterations 1 \
--dest /tmp/pt-mongodb-stalk
Run in stalking mode when replica set lag is above 30 seconds:
pt-mongodb-stalk \
--host localhost --port 30001 --user admin --password admin --authenticationDatabase admin \
--function repllag --threshold 30 --cycles 2 --interval 1 --iterations 1 \
--dest /tmp/pt-mongodb-stalk
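The value that the repllag trigger compares against --threshold can be pictured as the largest gap between the primary's last applied operation time and that of any other member. The Python below is a sketch under assumptions, not the tool's code: it takes a replSetGetStatus-style document whose members carry stateStr and optimeDate fields as datetime objects, considers only members in SECONDARY state, and clamps negative gaps to zero. The function name and sample data are hypothetical.

```python
from datetime import datetime, timedelta


def max_replication_lag(status):
    """Largest lag in seconds of any secondary behind the primary."""
    members = status.get("members", [])
    primaries = [m for m in members if m.get("stateStr") == "PRIMARY"]
    if not primaries:
        return None  # no primary visible, so lag is undefined
    primary_time = primaries[0]["optimeDate"]
    lags = [
        (primary_time - m["optimeDate"]).total_seconds()
        for m in members
        if m.get("stateStr") == "SECONDARY"
    ]
    # The leading 0.0 covers sets with no secondaries and clamps negative gaps.
    return max([0.0] + lags)


t0 = datetime(2026, 4, 24, 10, 0, 0)
status = {
    "members": [
        {"name": "db1:30001", "stateStr": "PRIMARY", "optimeDate": t0},
        {"name": "db2:30002", "stateStr": "SECONDARY",
         "optimeDate": t0 - timedelta(seconds=5)},
        {"name": "db3:30003", "stateStr": "SECONDARY",
         "optimeDate": t0 - timedelta(seconds=42)},
    ]
}
lag = max_replication_lag(status)
print(lag)  # prints 42.0
```

With --threshold 30, this sample status would count as one matching check.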
OUTPUT¶
When the trigger condition is met for the configured number of consecutive
cycles, the tool collects into --dest. Snapshot commands run once per
collection iteration and are stored as timestamped files. For example:
2026_04_24_10_00_01-serverStatus.json
2026_04_24_10_00_01-currentOp.json
2026_04_24_10_00_01-ps.txt
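When post-processing a destination directory, the timestamp in front of each file name can be used to group the files that belong to one collection. The following Python sketch assumes the YYYY_MM_DD_HH_MM_SS-name pattern shown in the example above and that all snapshot files from one collection share the same stamp. The helper names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

STAMP_FORMAT = "%Y_%m_%d_%H_%M_%S"


def split_sample_name(filename):
    """Return (timestamp, sample name), or None for files without a stamp."""
    stamp, sep, name = filename.partition("-")
    if not sep:
        return None
    try:
        return datetime.strptime(stamp, STAMP_FORMAT), name
    except ValueError:
        return None


def group_by_collection(filenames):
    """Map each collection timestamp to the sample names recorded under it."""
    groups = defaultdict(list)
    for filename in filenames:
        parsed = split_sample_name(filename)
        if parsed is not None:
            when, name = parsed
            groups[when].append(name)
    return dict(groups)


files = [
    "2026_04_24_10_00_01-serverStatus.json",
    "2026_04_24_10_00_01-currentOp.json",
    "2026_04_24_10_00_01-ps.txt",
    "heartbeat",  # fixed file without a timestamp, skipped by the parser
]
groups = group_by_collection(files)
for when, names in groups.items():
    print(when.isoformat(), names)
```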
Interval commands run once per collection iteration using --sleep-collect
as their polling interval and a count derived from --run-time. For
example, --run-time 5 --sleep-collect 1 runs commands like vmstat 1 5
and stores the result in one timestamped file.
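The count passed to an interval tool follows from the two options. A minimal Python sketch of that arithmetic is shown below. The function name is hypothetical, and both the integer division and the minimum of one sample are assumptions about how uneven values are handled.

```python
def interval_command(tool, run_time, sleep_collect):
    """Build an argv list of the form: tool <interval> <count>."""
    # Assumption: the count is rounded down, but never below one sample.
    count = max(1, run_time // sleep_collect)
    return [tool, str(sleep_collect), str(count)]


print(" ".join(interval_command("vmstat", 5, 1)))   # prints vmstat 1 5
print(" ".join(interval_command("vmstat", 10, 2)))  # prints vmstat 2 5
```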
The collection window is capped by --run-time. After a collection
finishes, the tool waits --sleep seconds before the next trigger check or
collection iteration. Collections do not overlap.
The collector also writes these fixed files in the destination directory:
heartbeat
log
trigger
COLLECTED DATA¶
MongoDB data collected once per collection iteration:
serverStatus
currentOp
MongoDB interval tools collected once per collection iteration:
mongostat
mongotop
MongoDB JSON output from shell commands is cleaned before writing: ok,
$clusterTime, and operationTime are removed.
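The cleanup step described above amounts to dropping three keys from the command reply before it is written. A small Python sketch follows. It assumes the keys are removed at the top level of the document only, and the function name and sample reply are hypothetical.

```python
import json

# Keys named in the text above as removed before writing.
VOLATILE_KEYS = ("ok", "$clusterTime", "operationTime")


def clean_output(doc):
    """Return a copy of the reply without the volatile top-level keys."""
    return {key: value for key, value in doc.items() if key not in VOLATILE_KEYS}


raw = {
    "connections": {"current": 12},
    "ok": 1.0,
    "operationTime": {"t": 1777024801, "i": 1},
    "$clusterTime": {"clusterTime": {"t": 1777024801, "i": 1}},
}
cleaned = clean_output(raw)
print(json.dumps(cleaned))  # prints {"connections": {"current": 12}}
```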
System data collected once per collection iteration, depending on tool availability:
ps faux
pidstat -d
pidstat -u
pidstat -urdwt for mongod or mongos
System interval tools collected once per collection iteration, depending on tool availability:
vmstat
iostat
mpstat
top
The process-specific pidstat file is named pidstat_mongod.txt or
pidstat_mongos.txt according to the detected MongoDB process type. Topology
is detected internally for this purpose, but no topology summary file is
written.
If /var/log/messages exists, the tool also copies it to messages.out.
OUTPUT CLEANUP¶
At the end of a run, zero-byte .err files are removed and non-empty .err
files are kept. The collector enforces disk-space safety checks, but uses
internal temporary files instead of writing disk-space snapshots into
--dest.
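The .err cleanup rule can be illustrated with a short Python sketch that removes only zero-byte .err files and leaves the rest in place. The function name and the file names in the demonstration are hypothetical; this is not the tool's own code.

```python
import os
import tempfile


def remove_empty_err_files(directory):
    """Delete zero-byte .err files and return the names that were removed."""
    removed = []
    # Materialize the listing first so deletion does not disturb iteration.
    for entry in list(os.scandir(directory)):
        if entry.is_file() and entry.name.endswith(".err") and entry.stat().st_size == 0:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as dest:
    open(os.path.join(dest, "vmstat.err"), "w").close()  # zero bytes
    with open(os.path.join(dest, "iostat.err"), "w") as handle:
        handle.write("iostat: command not found\n")
    removed = remove_empty_err_files(dest)
    remaining = sorted(os.listdir(dest))

print(removed, remaining)  # prints ['vmstat.err'] ['iostat.err']
```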
NOTES¶
This tool is intended for Linux systems. A separate summary-style tool should collect broader one-time server and MongoDB metadata; pt-mongodb-stalk is focused on runtime sampling around a trigger event.
COPYRIGHT, LICENSE, AND WARRANTY¶
This program is copyright 2011-2026 Percona LLC and/or its affiliates.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.