Your Name is My Name: Attacking and Defending Programs from Name Collisions

Open Access
- Author:
- Basu, Aditya
- Graduate Program:
- Computer Science and Engineering
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- February 12, 2025
- Committee Members:
- Chitaranjan Das, Program Head/Chair
Trent Jaeger, Chair & Dissertation Advisor
Peng Liu, Outside Unit & Field Member
Ruslan Nikolaev, Major Field Member
John Sampson, Major Field Member
Abutalib Aghayev, Major Field Member
- Keywords:
- file system
systems security
UTF-8
canonicalization
computer security
case collision
name collision
computer science
Unicode
copying programs
software
attack
vulnerability
Linux
Extended Berkeley Packet Filter (eBPF or BPF)
case preserving
case insensitive
defense
- Abstract:
- File name confusion attacks, such as malicious symlinks and file squatting, have long been studied as sources of security vulnerabilities. However, a more recently emerged class, case- and encoding-induced name collisions, has not been scrutinized. The addition of per-directory case-insensitivity to Ext4 has widened the avenues for such name collisions on Linux, where most mainstream file systems have historically been case-sensitive. This dissertation investigates the security implications of such collisions, particularly in copy programs, and proposes a robust defense mechanism to mitigate the associated risks. The research systematically analyzes name collisions in real-world programs by introducing the concept of create-use pairs to identify vulnerabilities in copy utilities. Our findings reveal that name collisions can enable classical file system attacks, such as file squatting and symbolic link traversal, even in programs designed to prevent them. We subsequently extend the create-use pair model to encompass use-use pairs, demonstrating additional security vulnerabilities. To evaluate existing defenses (or the lack thereof), a testing suite was developed that automatically drives common Linux utilities to identify unsafe responses to name collisions. A name tweaking technique was also developed to systematically generate variations of case and encoding in names. These efforts led to the discovery of critical zero-day vulnerabilities in the widely used programs Git and Mercurial, allowing arbitrary code execution when cloning repositories on case-insensitive file systems, such as Ext4 and ZFS. Security flaws were also identified in multiple other programs, including dpkg, Apache httpd, rsync, restic, and xcopy (on Windows). Additionally, multiple functional issues were discovered when the programs were subjected to name collisions.
To mitigate name collision attacks, this research classifies risks based on resource ownership, distinguishing between system-protected and program-protected resources, and proposes a comprehensive defense strategy that integrates security enforcement directly within the kernel by leveraging its path resolution subsystem. This approach eliminates the need for user-space name canonicalization, reducing inconsistencies and ensuring robust protection against name collision attacks. By utilizing Linux Security Module (LSM) hooks, the defense guarantees complete mediation across all file system operations without requiring modifications to individual programs, ensuring portability and effectiveness. Micro-benchmarks show that the defense adds a base overhead of ~36 nanoseconds across all system calls of the monitored process. For system calls that require path traversal, the overhead grows linearly with the number of path elements traversed. In real-world scenarios, this translates to a 3.29% overhead in Apache httpd and a 4.8% overhead in Git, making it an effective and practical defense for programs that need to interact with case-diverse file systems.
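As a rough illustration of the case- and encoding-induced collisions the abstract describes, the following minimal sketch groups names that would map to the same entry on a case-insensitive, normalizing file system. It is not the dissertation's tooling: the helper names `collision_key` and `find_collisions` are hypothetical, and Python's `str.casefold` combined with NFD normalization only approximates the folding rules a real file system applies.

```python
import unicodedata
from collections import defaultdict


def collision_key(name: str) -> str:
    """Approximate the lookup key of a case-insensitive, normalizing file system.

    Assumption: NFD normalization plus Unicode case folding. Real file
    systems pin a specific Unicode version and may fold differently.
    """
    decomposed = unicodedata.normalize("NFD", name)
    # Case folding can produce non-normalized output, so normalize again.
    return unicodedata.normalize("NFD", decomposed.casefold())


def find_collisions(names):
    """Return groups of distinct names that share the same collision key."""
    groups = defaultdict(list)
    for name in names:
        groups[collision_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


# "caf\u00e9" uses a precomposed e-acute; "cafe\u0301" uses "e" plus a
# combining acute accent. Both render identically but differ byte-wise.
names = ["Makefile", "makefile", "caf\u00e9", "cafe\u0301", "README"]
collisions = find_collisions(names)
```

Here `collisions` holds two groups: the case variants of `Makefile` and the two encodings of "café". A copy program that writes both members of a group into a case-insensitive directory would have the second write land on the file created by the first, which is the kind of create-use pattern the abstract refers to.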