At work I was debugging and issue where I had to dig into the internals of the Python runtime running on Windows. Along the way I learned a few tips and tricks that might come in handy for someone else. So I figured I’d do a quick little write up to share the knowledge. To start you’ll want to debug python with the windows system debugger: WinDbg. Debugging runtime like Python is convenient in Visual Studio, but in the end it’s not as powerful and features available in WinDbg will make things a lot easier.
The first thing you’ll want is symbols for your python release. There are multiple ways to get Python symbols:
- You can download them manually using this helpful guide.
- You can use Sean Cline’s symbol server, and the debugger will download symbols for you.
The symbol server support seems to work well, and is easy, so I would recommend that. To add the symbol server to your WinDbg symbol path, run:
1:044> .sympath+ http://pythonsymbols.sdcline.com/symbols
Now that we have symbols, next we need to get the source to line up with the code you are debugging and stepping through. First we need to use the debugger to find out what version of python we are debugging:
1:044> !filever python38.dll filever python38.dll GetFileVersionInfo Version: 22.214.171.1243 0000000f`02ca0000 0000000f`030c9000 python38 Loaded symbol image file: python38.dll Image path: python38.dll Image name: python38.dll Browse all global symbols functions data Timestamp: Mon Oct 14 12:38:17 2019 (5DA4CEA9) CheckSum: 0040A466 ImageSize: 00429000 Product version: 126.96.36.1993 File flags: 0 (Mask 3F) File OS: 4 Unknown Win32 File type: 2.0 Dll File date: 00000000.00000000 Translations: 0000.04b0 Information from resource tables: CompanyName: Python Software Foundation ProductName: Python InternalName: Python DLL OriginalFilename: python38.dll ProductVersion: 3.8.0 FileVersion: 3.8.0
The version I’m debugging appears to be based on Python 3.8.0, using that information we can clone a copy of the cpython repository, and checkout the correct branch:
c:\> git clone https://github.com/python/cpython.git c:\> pushd cpython c:\cpython> git tag -l | findstr 3.8.0 v3.8.0 ... c:\cpython> git checkout v3.8.0
Finally we can fix up the source path in our debugger to point to our python source code:
1:044> .srcpath+ c:\cpython
Another trick that might need, is the ability to inject dynamic breakpoints into the python code in order to narrow down a bug reproduction, or catch the system in a certain state. Unfortunately there is no way to write inline assembly in python. However we can take advantage of the Python ctypes library. It exposes the windll API with support for kernel32.dll exported functions. Using windll we can call into the windows implementation of DebugBreak() via the foreign function wrapper. This will trigger a breakpoint exception in the process, and the debugger will break into handle it.
Here’s an example of using it to instrument some code:
# Call DebugBreak export from kernel32 from ctypes import windll windll.kernel32.DebugBreak() thing_you_are_trying_to_debug()
The PyExt WinDbg Extension
Discovering the PyExt WinDbg extension allowed me to start being productive and narrowing down my issue. It’s very useful for gleaning bits of information about the python runtime while debugging and issue or analyzing a crash dump.
To install, download the latest version from the PyExt Release Page. Extract the zip file somewhere on disk. Then you can load the extension into the debugger with the the following command.
The extension exposes a few useful commands:
1:044> !pyext.help Commands for C:\PyExt-x64-Release\x64\Release\pyext.dll: !help - Displays information on available extension commands !pyobj - Prints information about a Python object !pystack - Prints the Python stack for the current thread, or starting at a provided PyFrameObject !pysymfix - Adds python symbols to the symbol path.
As a short cut to the steps listed above the extension provides a command to quickly
setup the python symbol server:
1:044> !pysymfix Current symbol path: srv* Adding symbol server to path... ************* Path validation summary ************** Response Time (ms) Location Deferred http://pythonsymbols.sdcline.com/symbols New symbol path: http://pythonsymbols.sdcline.com/symbols Loading symbols...
The extension also allows you to traverse the runtime stack frame object and display a
stack trace of the executing python code, using the
!pystack we’ll need to find a python stack frame pointer somewhere on on
the native call stack. Lets look around and try to find one:
1:044> .f+ 06 0000000f`0044bfe0 0000000f`02ce9770 python38!_PyEval_EvalFrameDefault+0xaba 1:044> dv prv param @r12 struct _frame * f = 0x0000000f`0886e800
We simply pass the address of the _frame object to the
!pystack command and let it work:
1:045> !pystack 0x0000000f0886e800 Stack trace starting at (PyFrameObject*)(00000008`85b46130): File "C:\python-3.8.0.amd64\lib\site-packages\pip\__main__.py", line 26, in <module> [Locals] [Globals] File "C:\python-3.8.0.amd64\lib\runpy.py", line 85, in _run_code [Globals] File "C:\python-3.8.0.amd64\lib\runpy.py", line 192, in _run_module_as_main [Globals]
!pyobj extension allows you dump python runtime variables.
Objects in the CPython implementation are of type
!pyobj operates on.
Lets find an example in our test case, and try dumping it’s value:
1:049> g Breakpoint 11 hit python38!_io_FileIO___init___impl: 0000000f`02d3652c 48895c2410 mov qword ptr [rsp+10h],rbx ss:0000000f`0044aff8=0000000000000002 1:049> dv nameobj prv param @rdx @rdx struct _object * nameobj = 0x0000000f`04cc2f10 1:049> !pyobj 0xf04cc2f10 PyASCIIObject at address: 0000000f`04cc2f10 RefCount: 81 Type: str Repr: 'C:\\python-3.8.0.amd64\\lib\\tempfile.py'
These resources, tips, and tricks helped me dig into the problem I was working on and figure out what was going on. I hope they might help out someone else out there debugging python on windows. Thanks for reading!