Web Reconnaissance — Complete Guide

01 What Is Reconnaissance? 🕐 5 min

🔍 Definition

Reconnaissance (recon) is the first and most critical phase of a penetration test. The goal is to collect as much information about the target as possible before attempting exploitation. The more you know about the target, the better your chances of finding a vulnerability.

Analogy: imagine you are an architect asked to evaluate the security of a building. Before going in, you would walk around it and note the doors, windows, CCTV cameras, fences, and guard patrol patterns. That is recon: observe before you act.

📊 Two Main Types

Aspect	Passive Recon	Active Recon
Interaction with target	Indirect	Direct
Detection by target	Generally not detectable	Can be detected (logs, IDS/IPS)
Data sources	Public (internet, caches, registries)	The target itself (scanning, crawling)
Legal risk	Low	Higher (requires permission)
Examples	WHOIS, Google Dorking, Shodan	Nmap, Nikto, directory brute-forcing

1 Passive Recon → 2 OSINT → 3 Active Recon → 4 Enumeration → 5 Mapping

02 Passive Reconnaissance 🕐 15 min

👁️ Concept

Passive recon means gathering information without interacting directly with the target. The data comes from public sources already available on the internet, so the target generally cannot tell that it is being profiled.

📋 What to Look For

WHOIS Data: domain owner, registrar, registration date, nameservers, contact email/phone
DNS Records: A, AAAA, MX, TXT, NS, CNAME, SOA, which together reveal the target's infrastructure
SSL/TLS Certificates: Common Name, Subject Alternative Names (SANs), issuer, expiry
Historical Data: old versions of the website via the Wayback Machine, DNS history changes
Cached Content: Google Cache, pages that were deleted but are still indexed
Public Code Repositories: the target organization's GitHub/GitLab, leaked credentials
Job Postings: reveal the tech stack (e.g. "We need an engineer with Go + PostgreSQL + AWS")
Social Media: employee information, email patterns, leaked details about internal infrastructure

🛠️ Passive Recon Tools

Passive

WHOIS Lookup

Domain Intelligence

Queries domain registration data: owner, registrar, nameservers, expiry date.

whois target.com

Passive

dig / nslookup

DNS Query

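Queries DNS records (A, MX, TXT, NS, CNAME) through public nameservers. Many servers now refuse or minimize ANY queries (RFC 8482), so request each record type explicitly.

dig target.com MX +noall +answer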
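
The answer section printed by `dig +noall +answer` is plain whitespace-separated text, so it is easy to post-process. Below is a minimal Python sketch, assuming the usual five-column layout (name, TTL, class, type, data). `parse_dig_answer` is a hypothetical helper written for this guide, not part of dig, and the sample records are made up.

```python
from collections import defaultdict


def parse_dig_answer(text):
    """Group `dig +noall +answer` lines by record type.

    Assumes each line looks like: name TTL class type data...
    """
    records = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and dig comment lines, which start with ';'
        if not line or line.startswith(";"):
            continue
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue  # not a complete resource record
        _name, _ttl, _rclass, rtype, data = parts
        records[rtype].append(data)
    return dict(records)


# Made-up sample output for illustration
sample = """\
target.com.  300  IN  A   203.0.113.10
target.com.  300  IN  MX  10 mail.target.com.
"""
print(parse_dig_answer(sample))
```

Grouping by type makes it easy to feed MX or NS hostnames into the next recon step.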

Passive

Shodan

Internet-wide Scanner

A search engine for internet-connected devices. Finds open ports, services, and banners without scanning the target yourself.

shodan search hostname:target.com

Passive

Censys

Internet Intelligence

Similar to Shodan: searches hosts, certificates, and services across the internet.

https://search.censys.io

Passive

crt.sh

Certificate Transparency

Searches Certificate Transparency logs for the SSL/TLS certificates issued for a domain. Very useful for finding hidden subdomains.

curl "https://crt.sh/?q=%25.target.com&output=json"
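
The JSON returned by that query can be reduced to a clean subdomain list. Here is a minimal Python sketch, assuming each entry carries a `name_value` field that may hold several newline-separated names (which is how crt.sh responds at the time of writing). `subdomains_from_crtsh` is a hypothetical helper, and the sample data is invented.

```python
import json


def subdomains_from_crtsh(json_text, domain):
    """Return sorted unique hostnames under `domain` from crt.sh JSON output."""
    domain = domain.lower()
    suffix = "." + domain
    found = set()
    for entry in json.loads(json_text):
        # name_value may hold several names separated by newlines
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name == domain or name.endswith(suffix):
                found.add(name)
    return sorted(found)


# Invented sample shaped like a crt.sh response
sample = json.dumps([
    {"name_value": "www.target.com\n*.dev.target.com"},
    {"name_value": "target.com"},
])
print(subdomains_from_crtsh(sample, "target.com"))
```

The suffix check keeps look-alike names such as `evil-target.com` out of the result.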

Passive

Wayback Machine

Historical Archive

Browse historical snapshots of a website to find deleted pages, old endpoints, and sensitive information.

https://web.archive.org/web/*/target.com

Passive

theHarvester

Email & Subdomain OSINT

Collects emails, names, subdomains, and IPs from many public sources at once.

theHarvester -d target.com -b all

Passive

SecurityTrails

DNS & Domain History

API and dashboard for DNS history, subdomains, associated domains, and WHOIS history.

https://securitytrails.com

03 Active Reconnaissance 🕐 15 min

⚡ Concept

Active recon involves direct interaction with the target: sending packets, crawling the website, scanning ports. It leaves entries in the target's server logs and can be detected by an IDS/IPS/WAF. Always run active recon after passive recon, and only with written permission.

🛠️ Active Recon Tools

Active

Nmap

Port & Service Scanner

The most popular scanning tool: open-port discovery, service version detection, OS detection, and script scanning (NSE).

nmap -sC -sV -O -oN scan.txt target.com

Active

Masscan

Fast Port Scanner

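One of the fastest port scanners available, reported to be able to scan the entire internet in under 6 minutes at very high packet rates. Suited to large-scale scans. Masscan takes IP addresses or ranges rather than hostnames, so resolve the target first.

masscan -p1-65535 <TARGET_IP> --rate=1000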

Active

Nikto

Web Server Scanner

Web server scanner that checks for dangerous files, misconfigurations, outdated versions, and more than 6,700 other items.

nikto -h https://target.com

Active

WhatWeb

Web Fingerprinter

Detects the CMS, framework, web server, JavaScript libraries, and other technologies a website uses.

whatweb target.com -v

Active

Nuclei

Vulnerability Scanner

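Fast template-based scanner with thousands of community templates for detecting CVEs, misconfigurations, exposed panels, and more. Template directory names have changed between nuclei-templates releases, so selecting by tag is the more portable option.

nuclei -u https://target.com -tags cve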

Active

Burp Suite

Web Proxy & Scanner

The Swiss Army knife of web pentesting: intercepting proxy, scanner, repeater, intruder, and much more.

GUI Tool — Community & Pro edition

04 OSINT (Open Source Intelligence) 🕐 20 min

🌐 Concept

OSINT is the collection and analysis of information from open, publicly available sources. In web recon, OSINT is used to find data about the organization, its employees, its infrastructure, and the technologies it uses without touching the target directly.

🛠️ OSINT Tools

OSINT

Maltego

Visual Link Analysis

Visual OSINT platform that maps relationships between domains, IPs, emails, people, companies, and infrastructure.

GUI Tool — Community & Pro edition

OSINT

SpiderFoot

Automated OSINT

Automatically collects OSINT from 200+ sources: IPs, domains, emails, names, phone numbers, leaked data.

spiderfoot -s target.com -u all

OSINT

Recon-ng

Recon Framework

Modular recon framework with modules for WHOIS, Shodan, VirusTotal, GitHub, and many more.

recon-ng → marketplace search <keyword> → marketplace install <module> → modules load <module>

OSINT

OSINT Framework

Tool Directory

A collection of links to hundreds of free OSINT tools, categorized by data type.

https://osintframework.com

05 Subdomain Enumeration 🕐 20 min

🗺️ Why It Matters

Subdomains are often the main entry point, for several reasons: dev/staging subdomains tend to be less protected, admin panels hide on subdomains, internal services (Jenkins, Grafana, phpMyAdmin) are often exposed through them, and old, forgotten subdomains may carry unpatched vulnerabilities.
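
Once a subdomain list exists, a quick triage pass can surface the kinds of hosts described above. The following Python sketch is only a keyword heuristic: the keyword list is an assumption drawn from the examples in this section, `flag_interesting` is a hypothetical helper, and it assumes a two-label registrable domain such as `target.com`.

```python
# Hypothetical keyword list, based on the examples in this section
KEYWORDS = ("dev", "staging", "test", "admin", "jenkins", "grafana", "phpmyadmin", "old")


def flag_interesting(subdomains, keywords=KEYWORDS):
    """Map each subdomain to the keywords found in its subdomain labels."""
    flagged = {}
    for host in subdomains:
        # Ignore the last two labels (assumed to be the domain and TLD)
        labels = host.lower().split(".")[:-2]
        hits = [k for k in keywords if any(k in label for label in labels)]
        if hits:
            flagged[host] = hits
    return flagged


print(flag_interesting(["dev.target.com", "www.target.com", "admin-old.target.com"]))
```

Substring matching will produce false positives (for example `devices.target.com`), so treat the output as a review queue rather than a finding.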

🛠️ Subdomain Enumeration Tools

Semi-Passive

Subfinder

Passive Subdomain Discovery

Finds subdomains from certificate logs, DNS datasets, search engines, and third-party APIs.

subfinder -d target.com -o subs.txt

Semi-Passive

Amass

Attack Surface Mapping

One of the most complete subdomain enumeration tools: combines passive sources, DNS brute force, and ASN discovery.

amass enum -d target.com -o amass.txt

Active

gobuster dns

DNS Bruteforce

Brute-forces subdomains with a wordlist, trying each word as a subdomain.

gobuster dns -d target.com -w wordlist.txt

Active

dnsx

DNS Resolver

Fast DNS resolver: validates discovered subdomains, resolves them to IPs, and filters wildcards.

cat subs.txt | dnsx -a -resp

Active

httpx

HTTP Prober

Probes for live subdomains and reports status code, title, tech, and content length. Use it to filter down to the hosts that are actually up.

cat subs.txt | httpx -status-code -title

Passive

Assetfinder

Subdomain Finder

A simple tool by tomnomnom that finds subdomains from several passive sources.

assetfinder --subs-only target.com

💡 Pro Tip — Chaining Tools

Combine several tools for the best results: subfinder -d target.com -silent | dnsx -silent | httpx -title -status-code -tech-detect. This finds subdomains → resolves DNS → checks which hosts are alive.
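
The middle stage of that chain (resolve each name, drop the ones that do not exist) can be sketched in Python with only the standard library. This is a simplified illustration, not how dnsx works internally: it does no wildcard filtering, and `resolve_all` is a hypothetical helper. The resolver is passed in as a parameter so the logic can be exercised without network access.

```python
import socket


def resolve_all(hostnames, resolver=socket.gethostbyname):
    """Return {hostname: ip} for names that resolve; drop the rest."""
    cleaned = {h.strip().lower() for h in hostnames if h.strip()}
    resolved = {}
    for host in sorted(cleaned):
        try:
            resolved[host] = resolver(host)
        except OSError:
            # socket.gaierror is a subclass of OSError: the name did not resolve
            continue
    return resolved


def fake_resolver(host):
    """Stand-in resolver with made-up data, used here instead of real DNS."""
    table = {"www.target.com": "203.0.113.10"}
    if host not in table:
        raise socket.gaierror("name not found")
    return table[host]


print(resolve_all(["WWW.target.com", "gone.target.com"], resolver=fake_resolver))
```

Calling `resolve_all` with the default resolver sends real DNS queries, so run it only against in-scope names.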

06 Port Scanning & Service Enumeration 🕐 20 min

🚪 Concept

Every open port is a potential "way in". Port scanning identifies which ports are open, which services are running, and which versions they run. That information is then used to look up CVEs or misconfigurations in those services.

📋 Important Ports for Web Recon

Port	Service	Notes
`80`	HTTP	Standard web server
`443`	HTTPS	Web server + SSL/TLS
`8080`	HTTP Alt	Often used by proxies, dev servers, Tomcat
`8443`	HTTPS Alt	Alternative HTTPS
`3306`	MySQL	A serious problem if open to the public
`5432`	PostgreSQL	Exposed database
`27017`	MongoDB	Often unauthenticated in default setups
`6379`	Redis	Often has no password
`22`	SSH	Remote access; check version & auth method
`21`	FTP	Check for anonymous login
`9200`	Elasticsearch	Often has no authentication
`8888`	Jupyter Notebook	RCE if no password is set

🛠️ Nmap Cheatsheet

Goal	Command
Quick scan of the top 1000 ports	`nmap target.com`
Full port scan	`nmap -p- target.com`
Service version detection	`nmap -sV target.com`
OS detection	`nmap -O target.com`
Aggressive scan (OS, versions, scripts, traceroute)	`nmap -A target.com`
Script scan (vuln check)	`nmap --script=vuln target.com`
UDP scan	`nmap -sU --top-ports 100 target.com`
Output in all formats	`nmap -oA output target.com`
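
The `-oA` option writes a grepable `.gnmap` file alongside the other formats, and that file is convenient to post-process. A minimal Python sketch follows, assuming the usual `Host: ... Ports: port/state/proto/owner/service/...` line layout; `open_ports` and the `RISKY_PORTS` notes (condensed from the port table in this section) are hypothetical helpers, and the sample line is invented.

```python
import re

# Notes condensed from the port table in this section
RISKY_PORTS = {
    3306: "MySQL exposed",
    5432: "PostgreSQL exposed",
    27017: "MongoDB, check for missing auth",
    6379: "Redis, check for missing password",
    9200: "Elasticsearch, check for missing auth",
    8888: "Jupyter Notebook, check for missing password",
}


def open_ports(gnmap_text):
    """Yield (host, port, service) for every open port in grepable output."""
    for line in gnmap_text.splitlines():
        match = re.match(r"Host: (\S+).*?Ports: (.*?)(?:\t|$)", line)
        if not match:
            continue
        host, ports = match.groups()
        # Entries are comma-separated; split only where a new port entry starts
        for entry in re.split(r",\s*(?=\d+/)", ports):
            fields = entry.strip().split("/")
            # fields: port, state, protocol, owner, service, ...
            if len(fields) >= 5 and fields[1] == "open":
                yield host, int(fields[0]), fields[4]


# Invented sample line for illustration
sample = "Host: 203.0.113.10 (target.com)\tPorts: 22/open/tcp//ssh///, 3306/open/tcp//mysql///\n"
for host, port, service in open_ports(sample):
    print(host, port, service, RISKY_PORTS.get(port, ""))
```

For anything beyond quick triage, the XML output is the more robust format to parse.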

07 Technology Fingerprinting 🕐 15 min

🏷️ What to Look For

Knowing which technologies the target uses helps you choose relevant attack vectors. Look for:

Web Server: Apache, Nginx, IIS, Caddy, and their versions
CMS: WordPress, Drupal, Joomla, Shopify, etc.
Framework: Laravel, Django, Rails, Express, Spring, etc.
JavaScript Library: React, Vue, Angular, specific jQuery versions
WAF (Web Application Firewall): Cloudflare, Akamai, AWS WAF
CDN: Cloudflare, Fastly, AWS CloudFront
Programming language: PHP, Python, Java, Node.js, Go
Database: MySQL, PostgreSQL, MongoDB (from error messages)
Hosting/Cloud: AWS, GCP, Azure, DigitalOcean
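
Several of these items leak through ordinary HTTP response headers. The Python sketch below shows the idea with a handful of well-known signals (`Server`, `X-Powered-By`, Cloudflare's `CF-RAY`, default session cookie names). It is a toy illustration, far less thorough than a dedicated fingerprinting tool; `fingerprint_headers` is a hypothetical helper and the sample headers are invented.

```python
def fingerprint_headers(headers):
    """Return a list of technology hints found in HTTP response headers."""
    # Header names are case-insensitive, so normalise them first
    h = {k.lower(): v for k, v in headers.items()}
    hints = []
    if "server" in h:
        hints.append("web server: " + h["server"])
    if "x-powered-by" in h:
        hints.append("runtime/framework: " + h["x-powered-by"])
    if "cf-ray" in h:
        hints.append("CDN/WAF: Cloudflare")
    cookies = h.get("set-cookie", "")
    if "PHPSESSID" in cookies:
        hints.append("language: PHP")
    if "JSESSIONID" in cookies:
        hints.append("language: Java")
    return hints


# Invented sample headers for illustration
print(fingerprint_headers({
    "Server": "nginx/1.24.0",
    "CF-RAY": "abc123",
    "Set-Cookie": "PHPSESSID=x; path=/",
}))
```

Headers can be stripped or spoofed, so treat each hint as a lead to confirm with other evidence.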

🛠️ Fingerprinting Tools

Passive

Wappalyzer

Browser Extension

Browser extension that detects a website's technologies in real time: CMS, framework, analytics, payment.

Browser extension (Chrome/Firefox)

Passive

BuiltWith

Technology Lookup

Web service that shows a website's full tech stack, including its change history.

https://builtwith.com/target.com

Active

WhatWeb

CLI Fingerprinter

Command-line tool that detects 1800+ web technologies from response headers, cookies, HTML, and JS.

whatweb -a 3 target.com

Active

wafw00f

WAF Detector

Detects whether a website is protected by a WAF and identifies the WAF vendor.

wafw00f https://target.com

Active

WPScan

WordPress Scanner

WordPress-specific scanner: themes, plugins, user enumeration, and known vulnerabilities.

wpscan --url target.com --enumerate ap,at,u

Active

CMSeeK

CMS Scanner

Detects 170+ CMSs and known exploits for each of them.

cmseek -u target.com

08 Content Discovery 🕐 20 min

📂 Concept

Content discovery (directory/file brute-forcing) aims to find hidden pages, files, directories, and endpoints that are not linked from the website but are still reachable. It often uncovers admin panels, backup files, config files, API docs, and more.
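
At its core, a directory brute-forcer combines a base URL with every word in a wordlist (and optionally a set of file extensions), then requests each candidate. The Python sketch below covers only the candidate-generation step and sends no requests; `candidate_urls` is a hypothetical helper written for this guide.

```python
from urllib.parse import quote


def candidate_urls(base_url, words, extensions=("",)):
    """Yield one URL per word/extension pair."""
    base = base_url.rstrip("/")
    for word in words:
        word = word.strip().lstrip("/")
        if not word or word.startswith("#"):
            continue  # skip blanks and wordlist comment lines
        for ext in extensions:
            # quote() percent-encodes characters that are unsafe in a path
            yield f"{base}/{quote(word)}{ext}"


for url in candidate_urls("https://target.com/", ["admin", "backup"], ("", ".zip")):
    print(url)
```

Real tools add concurrency, response filtering (status code, size), and recursion on top of this loop.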

🎯 Commonly Found Targets

/admin, /wp-admin, /dashboard: admin panels
/backup, /db.sql, /backup.zip: database dumps & backups
/.env, /config.php, /wp-config.php.bak: config files containing credentials
/.git, /.svn: exposed version control (source code can be downloaded)
/api/docs, /swagger, /graphql: API documentation
/phpinfo.php, /server-status: server information disclosure
/robots.txt, /sitemap.xml: often reveal sensitive paths
/.well-known/: security.txt, OpenID configuration
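
Since robots.txt often lists the very paths an operator wants hidden, it is worth parsing rather than skimming. Here is a minimal Python sketch that pulls out `Disallow`/`Allow` paths and `Sitemap` URLs; `robots_paths` is a hypothetical helper and the sample file is invented.

```python
def robots_paths(robots_txt):
    """Return (paths, sitemaps) found in the text of a robots.txt file."""
    paths, sitemaps = [], []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if not value:
            continue  # e.g. an empty "Disallow:" line
        if field in ("disallow", "allow"):
            if value not in paths:
                paths.append(value)
        elif field == "sitemap":
            if value not in sitemaps:
                sitemaps.append(value)
    return paths, sitemaps


# Invented sample file for illustration
sample = """\
User-agent: *
Disallow: /admin/   # internal panel
Allow: /public
Sitemap: https://target.com/sitemap.xml
"""
print(robots_paths(sample))
```

The extracted paths make a good seed list for the brute-forcing tools in this section.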

🛠️ Content Discovery Tools

Active

ffuf

Fast Web Fuzzer

A very fast fuzzer written in Go: directory brute force, parameter fuzzing, vhost discovery.

ffuf -u https://target.com/FUZZ -w wordlist.txt -mc 200,301,302,403

Active

gobuster

Directory/File Bruteforcer

Go tool for brute-forcing directories, files, subdomains, and vhosts.

gobuster dir -u https://target.com -w /usr/share/wordlists/dirb/common.txt

Active

feroxbuster

Recursive Content Discovery

Similar to gobuster but recursive: automatically scans the subdirectories it finds.

feroxbuster -u https://target.com -w wordlist.txt --depth 3

Active

dirsearch

Directory Scanner

Python-based directory scanner that ships with a solid built-in wordlist.

dirsearch -u https://target.com -e php,html,js,txt

Passive

gau (GetAllURLs)

URL Harvester

Fetches known URLs from the Wayback Machine, Common Crawl, AlienVault OTX, and URLScan.

gau target.com | sort -u

Passive

waybackurls

Wayback URL Extractor

Pulls URLs from the Wayback Machine; often turns up old endpoints that are still live.

waybackurls target.com | sort -u
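
The same archive can be queried directly through the Wayback Machine's CDX API. The Python sketch below builds a query URL and parses a JSON response, assuming the API's documented `url`, `output`, `fl`, and `collapse` parameters and its convention that the first JSON row is a header. `cdx_query_url` and `urls_from_cdx` are hypothetical helpers, and the sample response is invented; nothing is fetched here.

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain):
    """Build a CDX API query for archived URLs under `domain`."""
    params = {
        "url": f"{domain}/*",
        "output": "json",
        "fl": "original",      # return only the original URL column
        "collapse": "urlkey",  # one row per distinct URL
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def urls_from_cdx(json_text):
    """Extract unique URLs from a CDX JSON response; the first row is the header."""
    rows = json.loads(json_text) if json_text.strip() else []
    return sorted({row[0] for row in rows[1:]})


print(cdx_query_url("target.com"))
# Invented sample response for illustration
print(urls_from_cdx('[["original"], ["http://target.com/old-login"], ["http://target.com/api/v1"]]'))
```

Fetching the built URL with any HTTP client and passing the body to `urls_from_cdx` gives a result comparable to the tools above.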

💡 Recommended Wordlists

SecLists by Daniel Miessler is one of the most complete wordlist collections: /usr/share/seclists/Discovery/Web-Content/. Popular wordlists: raft-medium-directories.txt, common.txt, big.txt. For APIs: api-endpoints.txt.

09 Parameter & Endpoint Discovery 🕐 15 min

🔗 Concept

Beyond finding pages, it is just as important to find hidden parameters (query string, POST body, headers) and API endpoints. Hidden parameters are often the way in for IDOR, SQLi, XSS, and mass assignment.
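
A cheap first pass at parameter discovery is to count the query-string parameter names that already appear in URLs you have collected. The Python sketch below uses only the standard library; `count_parameters` is a hypothetical helper and the URLs are invented.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def count_parameters(urls):
    """Count how often each query-string parameter name appears."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url.strip()).query
        # keep_blank_values so that `?debug=` still counts as a parameter
        for name, _value in parse_qsl(query, keep_blank_values=True):
            counts[name] += 1
    return counts


# Invented sample URLs for illustration
urls = [
    "https://target.com/item?id=1&debug=",
    "https://target.com/item?id=2",
    "https://target.com/about",
]
print(count_parameters(urls).most_common())
```

Rare parameter names (such as a lone `debug`) are often the more interesting ones to test.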

🛠️ Tools

Active

Arjun

Parameter Discovery

Finds hidden HTTP parameters in the query string and POST body.

arjun -u https://target.com/page

Passive

ParamSpider

Parameter Mining

Mines web archives for parameters the target has used in the past.

paramspider -d target.com

Active

Katana

Web Crawler

Next-generation crawler from ProjectDiscovery: crawls a website and extracts endpoints, forms, and JS files.

katana -u https://target.com -d 3 -jc

Passive

LinkFinder

JS Endpoint Extractor

Analyzes JavaScript files to find API endpoints and hidden URLs.

linkfinder -i https://target.com/app.js -o results.html
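
The underlying idea is to scan JavaScript source for quoted strings that look like URLs or paths. The Python sketch below uses a deliberately simple regular expression, much cruder than LinkFinder's own pattern, to illustrate the technique; `ENDPOINT_RE` and `extract_endpoints` are hypothetical names and the sample script is invented.

```python
import re

# Quoted strings that look like absolute URLs or root-relative paths
ENDPOINT_RE = re.compile(r"""["'`]((?:https?://|/)[^"'`\s<>]+)["'`]""")


def extract_endpoints(js_source):
    """Return sorted unique URL-like strings found in JavaScript source."""
    return sorted(set(ENDPOINT_RE.findall(js_source)))


# Invented sample script for illustration
js = """
fetch("/api/v1/users");
const gql = 'https://api.target.com/graphql';
const label = "hello";
"""
print(extract_endpoints(js))
```

A pattern this loose will also pick up asset paths and other noise, so expect to filter the results by hand.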

Passive

JSFScan / SecretFinder

JS Secret Scanner

Scans JS files for embedded API keys, tokens, credentials, and other secrets.

secretfinder -i https://target.com/app.js -o results

Active

Kiterunner

API Endpoint Discovery

Brute-forces API endpoints contextually (not just the path, but also the HTTP method and parameters).

kr scan https://target.com -w routes.kite

10 Google Dorking 🕐 20 min

🔎 Concept

Google Dorking uses Google's advanced search operators to find sensitive information that has been unintentionally exposed to the public. It is a highly effective form of passive recon.

📋 Key Operators & Dorks

Dork	Purpose
`site:target.com`	All pages Google has indexed for the target domain
`site:target.com filetype:pdf`	Finds PDF files (may contain sensitive information)
`site:target.com filetype:sql`	Exposed database dumps
`site:target.com filetype:env`	Environment files containing credentials
`site:target.com filetype:log`	Log files that may contain sensitive information
`site:target.com inurl:admin`	Admin pages
`site:target.com inurl:login`	Login pages
`site:target.com intitle:"index of"`	Open directory listings
`site:target.com intext:"sql syntax"`	Pages showing SQL errors (potential SQLi)
`site:target.com ext:php inurl:config`	PHP configuration files
`site:target.com "password" filetype:xlsx`	Spreadsheets containing passwords
`"target.com" site:github.com`	Mentions on GitHub (API keys, source code)
`"target.com" site:pastebin.com`	Leaked data on Pastebin
`site:target.com -www`	Finds subdomains (excludes www)
`inurl:target.com intext:@gmail.com`	Exposed email addresses
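
Because every dork in the table follows the same shape, a list for a new target can be generated from templates. The Python sketch below fills a subset of the dorks above with a domain and builds a Google search URL for each; `DORK_TEMPLATES`, `build_dorks`, and `search_url` are hypothetical names used only for this illustration.

```python
from urllib.parse import urlencode

# A subset of the dorks from the table above, with the domain as a placeholder
DORK_TEMPLATES = [
    "site:{domain} filetype:pdf",
    "site:{domain} filetype:sql",
    "site:{domain} filetype:env",
    "site:{domain} inurl:admin",
    'site:{domain} intitle:"index of"',
    '"{domain}" site:github.com',
]


def build_dorks(domain, templates=DORK_TEMPLATES):
    """Fill each dork template with the target domain."""
    return [t.format(domain=domain) for t in templates]


def search_url(dork):
    """Return a Google search URL for a single dork."""
    return "https://www.google.com/search?" + urlencode({"q": dork})


for dork in build_dorks("target.com"):
    print(search_url(dork))
```

Opening the URLs by hand in a browser is the safest workflow, since automated querying tends to trigger CAPTCHAs.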

🛠️ Google Dorking Tools

Passive

Google Hacking Database (GHDB)

Dork Collection

Database of thousands of categorized Google dorks, maintained by Exploit-DB / Offensive Security.

https://www.exploit-db.com/google-hacking-database

Passive

DorkSearch

Dork Generator

Online tool for quickly generating and running Google dorks.

https://dorksearch.com

11 Email & People Reconnaissance 🕐 15 min

👤 Concept

Collecting employee emails and personal information supports social engineering assessments and password spraying, and helps you understand the target's organizational structure.

🛠️ Tools

OSINT

Hunter.io

Email Finder

Finds employee emails by domain and shows the organization's email pattern.

https://hunter.io/search/target.com

OSINT

Phonebook.cz

Email & Domain Search

Searches for emails, domains, and URLs associated with an organization.

https://phonebook.cz

OSINT

LinkedIn (Manual OSINT)

People Intelligence

Find employees, job titles, skills, and the tech stack in use from the profiles of the target's employees.

LinkedIn search: "Company Name"

OSINT

Have I Been Pwned

Breach Database

Checks whether an email has appeared in a data breach; useful for understanding exposure.

https://haveibeenpwned.com

OSINT

Dehashed

Breach Search Engine

Search engine for breach data: search by email, username, IP, domain, name, and more.

https://dehashed.com

OSINT

CrossLinked

LinkedIn Scraper

Extracts employee names from LinkedIn through search-engine results and generates an email list based on an email pattern.

crosslinked -f '{first}.{last}@target.com' "Target Company"
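
The pattern-expansion step is straightforward to reproduce. The Python sketch below applies a format string to a list of names; it is an illustration of the idea rather than CrossLinked's implementation, `generate_emails` is a hypothetical helper, the placeholders it supports are `{first}`, `{last}`, `{f}`, and `{l}`, and the names are invented.

```python
def generate_emails(names, pattern="{first}.{last}@target.com"):
    """Apply an email pattern to 'First Last' style names.

    Placeholders supported by this sketch: {first}, {last}, {f}, {l}.
    """
    emails = []
    for full_name in names:
        parts = full_name.lower().split()
        if len(parts) < 2:
            continue  # need at least a first and a last name
        first, last = parts[0], parts[-1]
        email = pattern.format(first=first, last=last, f=first[0], l=last[0])
        if email not in emails:
            emails.append(email)
    return emails


# Invented names for illustration
print(generate_emails(["Jane Doe", "John Q. Public"]))
print(generate_emails(["Jane Doe"], "{f}{last}@target.com"))
```

Middle names and initials are ignored here; real-world name data usually needs more cleanup than this.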

12 Complete Recon Workflow & Checklist 🕐 30 min

🗺️ Full Recon Workflow

1 Scope & Rules → 2 WHOIS & DNS → 3 Subdomain Enum → 4 Port Scan → 5 Tech Stack → 6 Content Discovery → 7 Parameter Mining → 8 JS Analysis → 9 Google Dorking → 10 Report

✅ Master Checklist

☐ Define the scope & obtain written permission (scope of engagement)
☐ WHOIS lookup: domain owner, registrar, nameservers
☐ DNS enumeration: A, AAAA, MX, TXT, NS, CNAME, SOA records
☐ Reverse DNS lookup & ASN enumeration
☐ SSL/TLS certificate analysis via crt.sh
☐ Subdomain enumeration (subfinder + amass + brute force)
☐ Filter live subdomains (httpx / httprobe)
☐ Screenshot all subdomains (gowitness / aquatone)
☐ Port scanning (full nmap scan of the primary target)
☐ Service & version detection
☐ Technology fingerprinting (Wappalyzer / WhatWeb)
☐ WAF detection (wafw00f)
☐ CMS scanning (WPScan if WordPress)
☐ Directory & file brute-forcing (ffuf / gobuster)
☐ Check robots.txt, sitemap.xml, security.txt
☐ Check for .git, .svn, .env, .DS_Store, backup files
☐ URL harvesting (gau / waybackurls)
☐ JavaScript analysis (LinkFinder / SecretFinder)
☐ Parameter discovery (Arjun / ParamSpider)
☐ API endpoint discovery (Kiterunner)
☐ Google dorking for sensitive files & information disclosure
☐ GitHub/GitLab dorking for leaked credentials
☐ Email harvesting (Hunter.io / theHarvester)
☐ Breach data check (HIBP / Dehashed)
☐ Shodan / Censys lookup
☐ Wayback Machine analysis for old endpoints
☐ Vulnerability scanning (Nuclei templates)
☐ Document all findings in a report

⚡ Automation — One-Liner Combo

Here are a few one-liner chains that combine tools for efficiency:

Full subdomain → live → screenshot:

subfinder -d target.com -silent | httpx -silent | gowitness file -f -

Find all URLs → extract parameters → scan XSS:

gau target.com | uro | gf xss | httpx -silent | dalfox pipe

Subdomain → port scan → vuln scan:

subfinder -d target.com -silent | dnsx -silent | naabu -silent | nuclei -tags cve

JS file crawl → extract secrets:

katana -u https://target.com -jc -d 3 | grep -E "\.js(\?|$)" | nuclei -tags exposure

🧰 Recon Frameworks (All-in-One)

Reconftw

Automated Recon Pipeline

Bash script that runs the whole recon pipeline automatically (subdomains, port scan, content discovery, vuln scan) in a single command.

./reconftw.sh -d target.com -r

FinalRecon

All-in-One Recon

Python tool that covers WHOIS, DNS, header analysis, SSL, crawling, and Wayback in one place.

finalrecon --full --url https://target.com

Raccoon

Offensive Recon

High-performance offensive recon tool — DNS, WHOIS, TLS, port scan, directory brute, web app scan.

raccoon target.com

Photon

Fast Crawler & Extractor

Crawler that extracts URLs, emails, files, secrets, and subdomains from the target website.

photon -u https://target.com -o output/

🔒 Ethics & Legality

Always make sure you have written permission before doing any active recon. For practice, use legal platforms such as HackTheBox, TryHackMe, PortSwigger Web Security Academy, or official bug bounty programs (HackerOne, Bugcrowd, Intigriti). Passive recon on public data is generally…
