
File Upload Vulnerabilities: A Practical Defense Playbook


File upload bypass techniques have outpaced naive extension blocklists for a decade. Here are the attack patterns we see on real engagements and the layered controls that hold up against polyglots and parser exploits.

By Arjun Raghavan, Security & Systems Lead, BIPI · January 8, 2025 · 8 min read

#file-upload #web-security #pentest

Upload endpoints are a perennial source of high-severity bugs because the validation surface is wide: extension, MIME type, magic bytes, content parsing, storage path, serving headers, and antivirus pipeline all need to be correct. Miss one and you have RCE or stored XSS.

How to test for it

Capture an upload in Burp and start with the obvious bypasses, then escalate. Always confirm where the file is served from (same origin, CDN, separate sandbox domain) because that determines exploitability.

Extension bypass: shell.php.jpg, shell.phP, shell.phtml, shell.pht, shell.php5, shell.php7, shell.phar, shell.inc.
Content-Type bypass: send Content-Type: image/jpeg with a PHP source body when the server trusts the client header.
Magic byte prefix: start the file with GIF89a; followed by <?php ... to fool getimagesize() while keeping the file executable as PHP.
Polyglot: a file that is both a valid JPEG and a valid PHP script via Ange Albertini's mitra tool.
Path traversal in filename: ../../../var/www/html/shell.php as the multipart filename field.
Null byte legacy: shell.php%00.jpg on PHP under 5.3.4.
SVG XSS: upload an SVG with a script tag, then access it directly to execute JavaScript on the upload origin.
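
The filename-based entries in the list above can be generated rather than typed by hand. The following is a minimal sketch, not a complete payload set: the function name filename_payloads, the default stem, and the choice to apply the double-extension and null-byte variants to every PHP extension are assumptions made for illustration.

```python
# Alternate PHP-handled extensions, taken from the bypass list above.
PHP_EXTENSIONS = ["php", "phP", "phtml", "pht", "php5", "php7", "phar", "inc"]


def filename_payloads(stem: str = "shell", allowed_ext: str = "jpg") -> list[str]:
    """Return candidate multipart filenames for an upload bypass test run."""
    names = []
    for ext in PHP_EXTENSIONS:
        names.append(f"{stem}.{ext}")                    # alternate extension
        names.append(f"{stem}.{ext}.{allowed_ext}")      # double extension
        names.append(f"{stem}.{ext}%00.{allowed_ext}")   # legacy null byte
    # Path traversal in the filename field, as in the list above.
    names.append(f"../../../var/www/html/{stem}.php")
    return names


if __name__ == "__main__":
    for name in filename_payloads():
        print(name)
```

Each generated name would be placed in the multipart filename field of the captured request and replayed, with the response and the final served location recorded for each attempt.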

For image parser exploits, feed malformed PNGs from libpng-fuzz corpora into the image pipeline and watch for crashes or out-of-bounds reads in ImageMagick or libvips. ImageTragick (CVE-2016-3714) and Ghostscript exploit chains still appear on legacy stacks. Run nuclei with -tags upload against the discovered upload endpoints.

Less-obvious surfaces

Look beyond the main upload form: profile pictures, document imports, CSV imports (where formula injection lands), ZIP imports vulnerable to zip-slip, video transcoding pipelines, and avatar URL fetchers (which loop back to SSRF) all accept attacker-controlled files. XML-based formats like DOCX and SVG carry XXE risk. Imported CSV data that is later opened in Excel executes stored formulas against the analysts who open it.
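
For the ZIP import case, a zip-slip check can be sketched as below. This is an illustrative pre-extraction inspection using only the Python standard library. The function name unsafe_zip_members is hypothetical, and the check covers path escapes only, not symlink entries or decompression bombs.

```python
import os
import zipfile


def unsafe_zip_members(archive_path: str, dest_dir: str) -> list[str]:
    """Return member names that would resolve outside dest_dir if extracted."""
    dest_root = os.path.realpath(dest_dir)
    bad = []
    with zipfile.ZipFile(archive_path) as zf:
        for name in zf.namelist():
            # Resolve where a naive extractor would write this member.
            target = os.path.realpath(os.path.join(dest_root, name))
            # Anything whose common prefix is not dest_root escapes the directory.
            if os.path.commonpath([dest_root, target]) != dest_root:
                bad.append(name)
    return bad
```

An import handler could reject the whole archive when the returned list is non-empty, rather than skipping the offending entries and extracting the rest.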

Detection

Log every upload with hash, MIME, claimed extension, detected magic bytes, and final storage path. Alert on mismatches between claimed and detected types, on uploads with multiple extensions, and on filenames containing dot-dot or null bytes. Run ClamAV or YARA rules on the storage bucket as a defense-in-depth layer with telemetry to your SIEM.
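
One way to shape that log entry is sketched below. The function name upload_audit_record, the field names, and the alert labels are assumptions for illustration. The detected MIME type is expected to come from whatever magic-byte detector the stack already uses.

```python
import hashlib
import posixpath


def upload_audit_record(filename: str, claimed_mime: str, detected_mime: str,
                        data: bytes, storage_path: str) -> dict:
    """Build one log record per upload, with alert flags for suspicious traits."""
    normalized = filename.replace("\\", "/")
    base = posixpath.basename(normalized)
    alerts = []
    if claimed_mime != detected_mime:
        alerts.append("type_mismatch")          # claimed vs detected type differ
    if base.count(".") > 1:
        alerts.append("multiple_extensions")    # e.g. shell.php.jpg
    if ".." in normalized.split("/"):
        alerts.append("dot_dot")                # traversal attempt in filename
    if "\x00" in filename or "%00" in filename:
        alerts.append("null_byte")              # raw or URL-encoded null byte
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "claimed_mime": claimed_mime,
        "claimed_extension": posixpath.splitext(base)[1].lower(),
        "detected_mime": detected_mime,
        "storage_path": storage_path,
        "alerts": alerts,
    }
```

The record is a plain dictionary, so it can be serialized as JSON and shipped to the SIEM, with alerting rules keyed on a non-empty alerts list.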

Remediation

Validate file type by magic bytes, not extension or Content-Type. Use libmagic, file-type for Node, or python-magic.
Generate a fresh server-side filename like a UUID. Never trust client-supplied filenames for storage paths.
Store uploads outside the web root, ideally in object storage like S3 with a non-public bucket, then serve via signed URLs.
Serve user uploads from a separate origin like usercontent.example.com with Content-Disposition: attachment and a strict Content-Security-Policy: sandbox header.
Re-encode images server-side. A roundtrip through Sharp or Pillow strips most polyglots and removes EXIF-embedded payloads.
Run uploads through a sandboxed antivirus pipeline. Quarantine on detection, and never deliver files whose scan is still pending.
For DOCX, XLSX, and SVG, parse with hardened libraries that disable external entities and macros.
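
The first two items above (magic-byte validation and server-generated filenames) can be sketched together. This is a minimal standard-library illustration with a hand-written signature table. A production setup would more likely use libmagic or python-magic as noted above. The names ALLOWED_SIGNATURES, RejectedUpload, and accept_upload are hypothetical.

```python
import uuid

# Illustrative allowlist: leading signature -> (MIME type, extension to store under).
ALLOWED_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": ("image/png", ".png"),
    b"\xff\xd8\xff": ("image/jpeg", ".jpg"),
    b"GIF87a": ("image/gif", ".gif"),
    b"GIF89a": ("image/gif", ".gif"),
}


class RejectedUpload(ValueError):
    """Raised when the content does not match any allowed signature."""


def accept_upload(data: bytes) -> tuple[str, str]:
    """Return (storage_key, mime) for allowed content, or raise RejectedUpload.

    The client-supplied filename and Content-Type are deliberately not
    parameters: neither influences the stored name or the recorded type.
    """
    for signature, (mime, ext) in ALLOWED_SIGNATURES.items():
        if data.startswith(signature):
            return f"{uuid.uuid4().hex}{ext}", mime
    raise RejectedUpload("content does not match an allowed file signature")
```

A signature check alone still accepts a GIF89a; <?php payload as a GIF. What makes it inert here is that the stored name carries a server-chosen .gif extension, and the re-encoding step from the list would then discard the trailing PHP.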

Validation

Re-run the bypass list and confirm each upload is rejected or rendered inert. Upload a benign HTML file and confirm it downloads with attachment disposition rather than rendering. Run nuclei file-upload templates and verify clean output. BIPI engagements include a permutation matrix of 80-plus upload payloads and a remediation patch tuned to your storage and serving stack.
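
The header part of that check can be automated. The sketch below assumes the response headers of a served upload have already been fetched into a dictionary. The function name inert_serving_problems and the wording of the findings are illustrative.

```python
def inert_serving_problems(headers: dict[str, str]) -> list[str]:
    """Return findings for a served upload's response headers (empty means OK)."""
    # Header names are case-insensitive, so normalize before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    problems = []

    disposition = h.get("content-disposition", "")
    if not disposition.strip().lower().startswith("attachment"):
        problems.append("Content-Disposition is not attachment")

    csp = h.get("content-security-policy", "")
    # Take the directive name (first token) of each ;-separated directive.
    directives = [d.split()[0].lower() for d in csp.split(";") if d.strip()]
    if "sandbox" not in directives:
        problems.append("Content-Security-Policy lacks sandbox")

    return problems
```

Running this against the URL of the benign HTML test upload after each deployment gives a quick regression check that the serving configuration has not drifted.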
