Before diving into wordlist creation, I want you to understand the concept of tech stacking & how it helps in selecting & customizing the best wordlist for our target.

Workflow for today's blog:

[Tech Stack Tools] → [Wordlist Generation] → [Fuzzing Tools]  
  (Wappalyzer)         (altdns + cewl)         (ffuf/massdns)  
                      ↙                 ↘  
            [Subdomain Permutations]  [Content Discovery Paths]  

Tech Stacking

Tech stacking is a crucial step that many hackers skip. It tells you which technologies your target uses behind the scenes, so you can make informed guesses about what is happening at the backend.

Passively

  1. Wappalyzer

Simply type "Wappalyzer extension" in your browser's search engine or extension store & install it. Pin it to the toolbar.

Now, whenever you visit your target website, just click the icon to the right of the address bar to see the technologies it detected on your target.

2. BuiltWith Technology Profiler

Also a browser extension; install it the same way as Wappalyzer.

The free version provides only limited information.

Actively

  1. Headers — Server detection

Check whether the response headers include Server or X-Powered-By (-s hides curl's progress meter):

curl -sI https://target.com | grep -iE '^(server|x-powered-by):'

2. NMAP — Server version & WAF

nmap -sV --script=http-enum,http-waf-detect target.com

3. Full Tech Audit

Install WhatWeb from its official GitHub repository, then run it at aggression level 3:

whatweb -a 3 https://target.com

4. Based on File Extensions

Use dirsearch to look for pages with telling extensions such as php, asp, aspx & jsp. Hits reveal the server-side language and often surface a few internal pages:

python3 dirsearch.py -u https://target.com -e php,asp,aspx,jsp

Tips

Look for the following indicators in response headers & URLs:

  • X-Powered-By: PHP/8.1.2 → PHP
  • Set-Cookie: JSESSIONID → Java
  • /wp-content/ → WordPress
  • /_api/ → SharePoint
  • .ashx → ASP.NET
  • node_modules → Node.js
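The indicators above can be turned into a small lookup. This is only a minimal sketch: the function name guess_tech and the idea of feeding it headers and URLs on stdin are my own illustration, not part of any tool mentioned here.

```shell
#!/usr/bin/env bash
# guess_tech (hypothetical helper): read response headers and URLs on stdin,
# print the technologies suggested by the indicators listed above.
guess_tech() {
  # Lowercase everything first so header names match regardless of case.
  tr '[:upper:]' '[:lower:]' | while IFS= read -r line; do
    case "$line" in
      x-powered-by:*php*) echo "PHP" ;;
      *jsessionid*)       echo "Java" ;;
      */wp-content/*)     echo "WordPress" ;;
      */_api/*)           echo "SharePoint" ;;
      *.ashx*)            echo "ASP.NET" ;;
      *node_modules*)     echo "Node.js" ;;
    esac
  done | sort -u
}
```

You could feed it live headers with something like curl -sI https://target.com | guess_tech, or a file of URLs collected while browsing. Treat the output as a hint, not proof; headers can be stripped or faked.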

By performing the above steps you can identify most of the technologies used by your target, which will now help us in wordlist creation.

BUT HOW?

Suppose you find out that your target is using JSP. You would then look for a JSP-specific wordlist; fuzzing with it is far more likely to uncover critical & internal pages that reveal sensitive information or lead to other vulnerabilities.

Build Advanced Wordlist for Content Discovery

A. Harvest Tech-Specific Terms

Research the technology your target uses: read POCs for previously found vulnerabilities and add the endpoints they mention to your wordlist.

A few examples with crucial endpoints:

WordPress:

grep -rEi 'wp-admin|wp-includes|xmlrpc' ~/SecLists/Discovery/Web-Content/

Spring Boot:

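Do include these crucial endpoints in your wordlist (on Spring Boot 2.x and later the actuator endpoints usually sit under /actuator, for example /actuator/env and /actuator/heapdump):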

/actuator
/heapdump
/env

AWS:

Do include these keywords in your wordlist (they are naming patterns rather than endpoints):

s3-buckets
aws-lambda
cloudfront

APIs:

Do include these crucial endpoints in your wordlist:

openapi.json
graphql
/v1/swagger
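To keep these per-technology seeds in one place, you could wrap them in a function. This is a sketch under my own assumptions: tech_seeds and its technology names are hypothetical, the entries are based on the examples above (xmlrpc.php being the actual WordPress file name), and the list is a starting point rather than a complete one.

```shell
#!/usr/bin/env bash
# tech_seeds (hypothetical helper): print starter wordlist entries
# for one detected technology. Extend each list from your own research.
tech_seeds() {
  case "$1" in
    wordpress)  printf '%s\n' wp-admin wp-includes xmlrpc.php ;;
    springboot) printf '%s\n' actuator heapdump env actuator/heapdump actuator/env ;;
    aws)        printf '%s\n' s3-buckets aws-lambda cloudfront ;;
    api)        printf '%s\n' openapi.json graphql v1/swagger ;;
    *)          echo "unknown technology: $1" >&2; return 1 ;;
  esac
}
```

For example, for t in wordpress api; do tech_seeds "$t"; done >> custom.txt would append both sets to your custom wordlist.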

B. Generate Custom Wordlist

Example: suppose your target runs PHP/WordPress.

cat ~/SecLists/Discovery/Web-Content/raft-large-words.txt > custom.txt
curl -s https://raw.githubusercontent.com/random-robbie/wordlists/master/wp.txt >> custom.txt
echo -e "debug.php\nphpinfo.php\ncron.php" >> custom.txt  # PHP-specific

Create a wordlist from the target's own content (crawl depth 4, minimum word length 5):

cewl -d 4 -m 5 -w cewl_words.txt https://target.com

Merge & deduplicate all collected endpoints & keywords into one file:

sort -u custom.txt cewl_words.txt > final_wordlist.txt
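Merged lists from different sources tend to mix leading slashes, Windows line endings, blank lines and comment lines, so the same path can appear twice in slightly different forms. Here is a minimal clean-up sketch; normalize_wordlist is a name I made up, and stripping the leading slash assumes your fuzzing URL already ends in /FUZZ as in this post.

```shell
#!/usr/bin/env bash
# normalize_wordlist (hypothetical helper): read a raw wordlist on stdin,
# print a cleaned, deduplicated list.
normalize_wordlist() {
  tr -d '\r' | awk '
    { gsub(/^[ \t]+|[ \t]+$/, "") }   # trim surrounding whitespace
    { sub(/^\/+/, "") }               # drop leading slashes (URL already has /FUZZ)
    $0 != "" && $0 !~ /^#/            # keep non-empty, non-comment lines
  ' | sort -u
}
```

One way to use it: cat custom.txt cewl_words.txt | normalize_wordlist > final_wordlist.txt.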

Use this advanced wordlist while fuzzing the target:

ffuf -w final_wordlist.txt -u https://target.com/FUZZ -t 100 -mc 200,403

Subdomain Bruteforce with Custom Wordlist

Seed List Creation

  1. Extract target's name variations:

# Replace "target" with your actual target's name
echo "target" > base.txt
printf 'tgt\nprod-target\ntarget-api\ncorp-target\n' >> base.txt  # Common patterns

2. Extract from SSL Cert:

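This may fail if openssl is missing or improperly installed, but I hope you can fix that on your own (a basic skill for a hacker). Redirecting stdin from /dev/null stops s_client from waiting for input, -servername sends SNI, and grep -o puts each name on its own line:

openssl s_client -connect target.com:443 -servername target.com </dev/null 2>/dev/null | openssl x509 -noout -text | grep -o 'DNS:[^,]*' | sed 's/^DNS://' >> ssl_subs.txt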
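Certificate names come back as full hostnames and often include wildcards and the apex domain, while the seed lists in this post hold bare labels. This sketch shows one way to reduce them; san_labels is a hypothetical helper of mine that expects openssl x509 -text output on stdin and the target domain as its only argument.

```shell
#!/usr/bin/env bash
# san_labels DOMAIN (hypothetical helper): read `openssl x509 -text` output on
# stdin and print the subdomain labels from its DNS: entries, dropping
# wildcards, the apex domain itself, and names outside DOMAIN.
san_labels() {
  local domain=$1
  grep -o 'DNS:[^,[:space:]]*' \
    | sed -e 's/^DNS://' -e 's/^\*\.//' \
    | awk -v d="$domain" '
        $0 == d { next }                              # skip the apex itself
        {
          suffix = "." d
          n = length($0) - length(suffix)
          if (n > 0 && substr($0, n + 1) == suffix)   # only names under the target
            print substr($0, 1, n)
        }' \
    | sort -u
}
```

For example, you could end the openssl pipeline with | openssl x509 -noout -text | san_labels target.com >> ssl_subs.txt.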

3. Industry-Specific Terms:

Suppose our target is a fintech organization. We can then add finance & banking related keywords (such as payments, card, api-gateway, vault, merchant):

printf 'payments\ncard\napi-gateway\nvault\nmerchant\n' >> industry.txt

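4. Combine with massive lists:

Merge the seed files from the previous steps with a large public list, so nothing collected so far is left out:

cat base.txt ssl_subs.txt industry.txt ~/SecLists/Discovery/DNS/subdomains-top1million-5000.txt | sort -u > all_subs.txt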

Generate Smart Permutations

  1. Use altdns for mutation patterns:

Adds prefixes/suffixes:

altdns -i all_subs.txt -o permutations.txt -w ~/altdns-words.txt

# Replace altdns-words.txt with whichever wordlist you prefer for mutation.

2. Custom mutations:

i) Create a subdomain list: save the subdomains in a file called all_subs.txt. If you have none yet, run tools like sublist3r or assetfinder first to get a basic set.

ii) Let's take cloud & SRE related prefixes:

sed 's/^/dev./' all_subs.txt >> permutations.txt
sed 's/^/staging./' all_subs.txt >> permutations.txt

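iii) Result: permutations.txt now holds the modified subdomains. Use this list with tools like massdns or httpx to check whether they exist. These tools resolve full hostnames, so if your entries are bare labels such as dev.api, append the target domain (for example dev.api.target.com) first.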

Pro Tip: Add more common prefixes like test., prod., or uat. using the same pattern:

sed 's/^/test./' all_subs.txt >> permutations.txt
sed 's/^/prod./' all_subs.txt >> permutations.txt

What this step actually does:

Original Subdomains       After Adding Prefixes
--------------------     ------------------------
api                       dev.api, staging.api
internal                  dev.internal, staging.internal
assets                    dev.assets, staging.assets

Bruteforce with DNS Overkill

massdns -r /path/to/resolvers.txt -t A -o S -w live_subs.txt permutations.txt
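With -o S, massdns writes one record per line in the form name. TYPE value, so live_subs.txt needs a little parsing before it is a clean host list. Here is a sketch that assumes that output format; live_names is a hypothetical helper.

```shell
#!/usr/bin/env bash
# live_names (hypothetical helper): read massdns simple output (-o S) on stdin
# and print the unique hostnames that returned an A record, without the
# trailing dot.
live_names() {
  awk '$2 == "A" { sub(/\.$/, "", $1); print $1 }' | sort -u
}
```

For example: live_names < live_subs.txt > live_hosts.txt, which you can then hand to httpx or ffuf.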

Pro Tips for Maximum Results

  1. API Endpoint Mining: Find openapi.yaml → Extract the path names with yq '.paths | keys | .[]' openapi.yaml → Add to wordlist.
  2. GitHub Recon: Search the target's repos: gh api -X GET search/code -f q="org:target filename:env" (use the target's GitHub organization name, not its domain) → Find leaked endpoints.
  3. WAF-Bypass Wordlist: If Cloudflare is detected, add %2e%2e/ (URL-encoded path traversal) and /?id=1<svg/onload=alert(1)> (XSS probes).
  4. Dynamic Updates: Pipe live subdomains back into the wordlist. massdns -o S writes lines like "api.target.com. A 1.2.3.4", so take the first column and strip the domain: awk '{print $1}' live_subs.txt | sed 's/\.target\.com\.$//' >> permutations.txt.

In the next write-up I'll try to share resources for the best wordlists available for these processes.

I look forward to sharing what I've learned while exploring the ever-evolving world of cybersecurity and bug bounties. Let's hunt some bugs!

Thank you for reading the blog!!!
