I mentioned before that I'm using XMLRPC to debug a multi-threaded Python application.
One of the things I'd like to get from this new debugging tool is a stack trace for each of the threads my application is running. Sometimes the application appears to get stuck, and I'd like to know what it's doing. A deadlock of some sort is a likely cause, but I'd really like to know where that deadlock is happening, and the log output isn't as detailed as I'd like. Adding more logging just to tell me where I am requires frequent restarts, which is a major drag.
It seems, however, that there is no way in Python (as of 2.3, at least) to get, from one thread, a stack trace for another thread. Listing the threads is less of a problem: threading.enumerate() returns the live Thread objects, and in my program I'm creating a set of worker threads and keeping track of them already, so I don't have to ask for a list of all of the threads.
The tricky part is then to get stack traces from these threads. Lacking the ability to inspect another thread directly, I found that Python's tracing hooks, which the Python debugging and profiling tools are built on, can be used to get that information. Each worker thread calls sys.settrace(self._trace) in its run() method, and implements this tracing method:
    def _trace(self, frame, event, arg):
        self._frame = frame
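To put that method in context, here is a minimal sketch of a worker thread that installs it, written for current Python 3 rather than the Python 2.3 used in this post. The TracedWorker class, its work() method, and the done event are made up for illustration; only sys.settrace() and the _trace signature come from the text above. Because _trace returns None, the interpreter calls it when a new function is entered rather than on every line.

```python
import sys
import threading


class TracedWorker(threading.Thread):
    """Hypothetical worker that remembers the most recent frame it entered."""

    def __init__(self, name):
        super().__init__(name=name)
        self._frame = None               # last frame seen by the trace function
        self.done = threading.Event()    # illustrative stop signal

    def _trace(self, frame, event, arg):
        # Invoked with a 'call' event whenever this thread enters a new
        # Python function. Returning None (implicitly) means the lines
        # inside that function are not traced one by one.
        self._frame = frame

    def work(self):
        # Stand-in for real work: block until someone sets the event.
        self.done.wait()

    def run(self):
        # Install the trace function from inside the worker thread itself.
        sys.settrace(self._trace)
        self.work()
```

Note that sys.settrace() applies only to the thread that calls it, which is why the call sits in run() and not in the constructor, which executes in whichever thread creates the worker.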
In the main thread, a status() method returns a trace for each thread:
    def status(self):
        import inspect
        self._lock.acquire()
        try:
            status = "Available threads:\n"
            for worker in self._available_threads:
                status = status + " " + worker.getName() + ":\n"
                frame = worker._frame
                if frame:
                    status = status + " stack:\n"
                    for frame, filename, line, function_name, context, index in inspect.getouterframes(frame):
                        status = status + " " + function_name + " @ " + filename + " # " + str(line) + "\n"
                status = status + "\n"
            return status
        finally:
            self._lock.release()
I'm building up a string here because it's a return value that can be sent over XMLRPC and printed. Structured data may be more useful, but hopefully you get the idea. Now I can not only inspect my program's state remotely, but also see exactly what it's doing:
[bluntman:~] wsanchez% python
Python 2.3 (#1, Sep 13 2003, 00:49:11)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1495)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import xmlrpclib
>>> p = xmlrpclib.ServerProxy("http://localhost:8001", allow_none = 1)
>>> p.status()
Available threads:
thread01:
stack:
_release_save@/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/threading.py#181
wait@/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/threading.py#223
wait@/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/threading.py#350
run@/Users/wsanchez/Python/test/Manager.py#20
__bootstrap@/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/threading.py#436
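If structured data is preferable to a preformatted string, the same information can be returned as lists and dictionaries, which XML-RPC can marshal directly. The following is a rough sketch in current Python 3, not the code behind the session above: stack_records() and status_records() are hypothetical names, workers are assumed to expose a name attribute and the _frame attribute set by the trace method, and the stack is walked through each frame's f_back link instead of inspect.getouterframes().

```python
def stack_records(frame):
    """Return the stack ending at `frame` as a list of plain dicts.

    The innermost frame comes first, matching the order of the
    string-based status() output.
    """
    records = []
    while frame is not None:
        records.append({
            "function": frame.f_code.co_name,
            "filename": frame.f_code.co_filename,
            "line": frame.f_lineno,
        })
        frame = frame.f_back  # step outward to the caller
    return records


def status_records(workers):
    """Map each worker's name to its stack, or to [] if no frame was recorded."""
    return {
        worker.name: stack_records(getattr(worker, "_frame", None))
        for worker in workers
    }
```

The result contains only strings, integers, lists, and dictionaries with string keys, so a client can filter or reformat it however it likes after the call returns.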
One downside to this approach is the overhead that tracing adds. Because _trace returns nothing, it isn't run for every line, but every function call a worker thread makes now also invokes the trace method and stores the frame. If performance is a concern, this can be problematic. However, it's easy enough to comment out the sys.settrace() call and uncomment it only when this level of debugging is desired.
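Rather than editing the source each time, the call could be guarded by a switch. This is only a sketch of one way to do it, in current Python 3: the DEBUG_TRACE environment variable, the Worker class, and its work() method are all invented for the example.

```python
import os
import sys
import threading

# Hypothetical switch: run with DEBUG_TRACE=1 in the environment to enable tracing.
TRACE_ENABLED = os.environ.get("DEBUG_TRACE") == "1"


class Worker(threading.Thread):
    def __init__(self, name, trace=TRACE_ENABLED):
        super().__init__(name=name)
        self._trace_enabled = trace
        self._frame = None  # stays None when tracing is off

    def _trace(self, frame, event, arg):
        # Remember the most recently entered frame for later inspection.
        self._frame = frame

    def work(self):
        # Placeholder for the thread's real job.
        pass

    def run(self):
        # Pay the tracing overhead only when it was asked for.
        if self._trace_enabled:
            sys.settrace(self._trace)
        self.work()
```

With tracing off, a status() method like the one above would simply find no frame for the worker and skip its stack.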