01 - Compiling CPython

CPython directory structure

The top-level folders in the source tree have roughly the following meanings:

cpython/

├── Doc ← Source for the documentation
├── Grammar ← The computer-readable language definition
├── Include ← The C header files
├── Lib ← Standard library modules written in Python
├── Mac ← macOS support files
├── Misc ← Miscellaneous files
├── Modules ← Standard Library Modules written in C
├── Objects ← Core types and the object model
├── Parser ← The Python parser source code
├── PC ← Windows-specific source and support files
├── PCbuild ← Windows build files (Visual Studio solution and projects)
├── Programs ← Source code for the python executable and other binaries
├── Python ← The CPython interpreter source code
└── Tools ← Standalone tools useful for building or extending Python
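  • Grammar holds the definition of Python's syntax, such as the token and grammar files. CPython has dedicated code that reads these files and generates the lexer and parser code from them.
  • Objects contains Python's core built-in types and is closely tied to Python's object model.
  • Parser contains the front end of the Python language, that is, lexical analysis and parsing.
  • Programs contains the entry point of the Python executable and its related implementation.
  • Python contains the implementation of the Python interpreter.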
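To make the front-end stages handled by Parser more concrete, here is a minimal sketch using the standard-library tokenize and ast modules. Note the assumptions: tokenize is a pure-Python tokenizer that lives in Lib, not the C tokenizer in Parser, so it only approximates what the C code does, and the sample source string is purely illustrative.

```python
import ast
import io
import tokenize

SOURCE = "x = 1 + 2\n"  # hypothetical sample input, not from the CPython tree


def token_names(source):
    """Lexing stage: return (token type name, token text) pairs."""
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


def top_level_nodes(source):
    """Parsing stage: return the AST node type of each top-level statement."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in tree.body]


if __name__ == "__main__":
    for name, text in token_names(SOURCE):
        print(name, repr(text))
    print(top_level_nodes(SOURCE))
```

The first function shows the token stream a lexer produces, and the second shows the tree a parser builds on top of it; the C code in Parser performs the same two steps inside the interpreter.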

Compiling CPython

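First, run the configure script in the CPython directory. The flags -g -O0 keep debug symbols and turn off optimization, which makes the binary easier to step through in a debugger: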

./configure CFLAGS="-g -O0"

Then build with make:

make -j4

After a successful build, there is a python executable in the source directory.
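One way to check that the build picked up the intended flags is to ask the interpreter itself. The sketch below is an assumption-laden helper, not part of CPython: the function name build_config is hypothetical, and it relies on sysconfig exposing the CONFIG_ARGS and CFLAGS variables, which is the case on Unix-like builds (on Windows they may print as None).

```python
import subprocess
import sys


def build_config(python_path=sys.executable):
    """Ask an interpreter for its configure arguments and CFLAGS.

    python_path is the interpreter to query; it defaults to the one
    running this script. Returns a (config_args, cflags) tuple of strings.
    """
    code = (
        "import sysconfig;"
        "print(sysconfig.get_config_var('CONFIG_ARGS'));"
        "print(sysconfig.get_config_var('CFLAGS'))"
    )
    result = subprocess.run(
        [python_path, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    config_args, cflags = result.stdout.splitlines()[:2]
    return config_args, cflags


if __name__ == "__main__":
    args, cflags = build_config()
    print("configure arguments:", args)
    print("CFLAGS:", cflags)
```

Calling build_config("./python") from the source directory queries the freshly built binary; for the configure line used here, one would expect the reported values to mention -g and -O0.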

CPython2 - Making GDB easier to use with the GEF plugin

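Run gdb ./python to debug Python. Use the list command to view source code and the disassemble command to view the disassembly, which is printed in AT&T syntax by default. Use x to examine memory and the info command (for example, info registers) to view registers. The -tui option starts GDB with its text user interface: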

gdb -tui ./python
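If the AT&T syntax is inconvenient, GDB can switch to Intel syntax on x86 with set disassembly-flavor intel. The sketch below builds a non-interactive GDB command line that does this and disassembles one function. The helper name gdb_batch_command and its defaults are hypothetical; only the -batch and -ex options and the two GDB commands come from GDB itself.

```python
import shlex


def gdb_batch_command(binary="./python", function="main", intel_syntax=True):
    """Build a gdb argv that disassembles one function and then exits."""
    argv = ["gdb", "-batch"]  # -batch: run the given commands, then quit
    if intel_syntax:
        # Switch from the default AT&T syntax to Intel syntax (x86 only).
        argv += ["-ex", "set disassembly-flavor intel"]
    argv += ["-ex", f"disassemble {function}", binary]
    return argv


if __name__ == "__main__":
    # Print a shell-quoted command that can be pasted into a terminal.
    print(shlex.join(gdb_batch_command()))
```

Returning an argument list rather than a single string keeps the command safe to pass directly to subprocess.run without shell quoting issues.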

GEF https://github.com/hugsy/gef

Installation

bash -c "$(curl -fsSL https://gef.blah.cat/sh)"

With GEF loaded, the hexdump command is available, for example to dump the bytes at the address of argc:

hexdump byte &argc
