Troubleshooting
Some advanced troubleshooting options are provided below. These are generally very low-level and are not normally required.
Gearman Jobs
Connecting to Gearman allows you to see whether any Zuul components appear not to be accepting requests correctly.
For unencrypted Gearman connections, you can use telnet to connect and check which Zuul components are online:
telnet <gearman_ip> 4730
For encrypted connections, you will need to provide suitable keys, e.g.:
openssl s_client -connect localhost:4730 -cert /etc/zuul/ssl/client.pem -key /etc/zuul/ssl/client.key
Commands available are discussed in the Gearman administrative protocol. Useful commands are workers and status, which you can run by typing them once connected to Gearman.
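The same check can be scripted instead of typed into telnet. The following is a minimal sketch for the unencrypted case only, assuming the server terminates each workers or status reply with a line containing a single "." as the Gearman administrative protocol describes. The helper names gearman_admin and read_admin_response are hypothetical and are not part of Zuul.

```python
import socket


def read_admin_response(sock):
    """Read a reply from sock until the lone "." line that ends it."""
    buf = b""
    while not (buf == b".\n" or buf.endswith(b"\n.\n")):
        chunk = sock.recv(4096)
        if not chunk:  # peer closed the connection before the terminator
            break
        buf += chunk
    lines = buf.decode("utf-8", errors="replace").splitlines()
    # Drop the terminator line; everything else is payload.
    return [line for line in lines if line != "."]


def gearman_admin(host, command, port=4730, timeout=5.0):
    """Send one admin command (e.g. "workers" or "status") over plain TCP.

    Hypothetical helper: unencrypted connections only.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii") + b"\n")
        return read_admin_response(sock)
```

For example, gearman_admin("<gearman_ip>", "status") would return the reply as a list of lines, one per registered function.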
For status, you will see output for internal Zuul functions in the form FUNCTION\tTOTAL\tRUNNING\tAVAILABLE_WORKERS:
...
executor:resume:ze06.openstack.org 0 0 1
zuul:config_errors_list 0 0 1
zuul:status_get 0 0 1
executor:stop:ze11.openstack.org 0 0 1
zuul:job_list 0 0 1
zuul:tenant_sql_connection 0 0 1
executor:resume:ze09.openstack.org 0 0 1
...
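Output in this form is easy to post-process. The sketch below parses status lines into named fields and lists functions that currently have no available workers, which may indicate a component that is offline. The names FunctionStatus, parse_status and without_workers are illustrative assumptions, and treating zero workers as a problem is a heuristic rather than a rule stated by Zuul.

```python
from typing import NamedTuple


class FunctionStatus(NamedTuple):
    function: str
    total: int
    running: int
    available_workers: int


def parse_status(lines):
    """Parse lines of FUNCTION, TOTAL, RUNNING, AVAILABLE_WORKERS."""
    result = []
    for line in lines:
        line = line.strip()
        # Skip blanks, the protocol terminator, and elision markers.
        if not line or line in (".", "..."):
            continue
        # Split from the right so tabs or spaces both work as separators.
        name, total, running, workers = line.rsplit(None, 3)
        result.append(FunctionStatus(name, int(total), int(running), int(workers)))
    return result


def without_workers(statuses):
    """Return the names of functions that no worker is registered for."""
    return [s.function for s in statuses if s.available_workers == 0]
```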
Thread Dumps and Profiling
If you send a SIGUSR2 to one of the daemon processes, it will dump a stack trace for each running thread into its debug log. It is written under the log bucket zuul.stack_dump. This is useful for tracking down deadlocked or otherwise slow threads:
sudo kill -USR2 `cat /var/run/zuul/executor.pid`
view /var/log/zuul/executor-debug.log +/zuul.stack_dump
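To illustrate the mechanism, here is a minimal sketch of how a Python daemon can dump per-thread stack traces on SIGUSR2 using only the standard library. It is not Zuul's actual handler; the function names are hypothetical, and only the logger name zuul.stack_dump is taken from the text above.

```python
import logging
import signal
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text dump of the current stack of every running thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for thread_id, frame in sys._current_frames().items():
        chunks.append("Thread: %s (%s)" % (names.get(thread_id, "unknown"), thread_id))
        chunks.append("".join(traceback.format_stack(frame)))
    return "\n".join(chunks)


def stack_dump_handler(signum, frame):
    # Log at debug level so the dump lands in the debug log.
    logging.getLogger("zuul.stack_dump").debug(format_thread_stacks())


def install_handler():
    """Register the handler; SIGUSR2 is only available on Unix-like systems."""
    signal.signal(signal.SIGUSR2, stack_dump_handler)
```

Python runs signal handlers in the main thread, so the dump is still produced when worker threads are deadlocked, provided the main thread is able to execute Python code.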
When yappi (Yet Another Python Profiler) is available, additional function and thread statistics are emitted as well. The first SIGUSR2 enables yappi; the second dumps the information collected, resets all yappi state, and stops profiling. This minimizes the impact of yappi on a running system.
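The start-then-dump toggle can be sketched as a single function. This is an assumption-laden illustration rather than Zuul's code: toggle_profiler is a hypothetical name, and the profiler argument is expected to offer is_running, start, stop, get_func_stats, get_thread_stats and clear_stats, which the yappi module provides as module-level functions. The sketch stops the profiler before collecting its statistics.

```python
import io


def toggle_profiler(profiler):
    """First call starts profiling; the next returns a report and resets."""
    if not profiler.is_running():
        profiler.start()
        return None
    profiler.stop()
    out = io.StringIO()
    # Collect function statistics, then per-thread statistics.
    profiler.get_func_stats().print_all(out=out)
    profiler.get_thread_stats().print_all(out=out)
    # Reset state so the next profiling run starts from scratch.
    profiler.clear_stats()
    return out.getvalue()
```

With yappi installed, calling toggle_profiler(yappi) from a SIGUSR2 handler would start profiling on the first signal and return the collected report on the second.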