A structured method to understand a website and audit its performance, SEO, accessibility and defensive security practices.

Website analysis: method, tools and best practices


Goal and scope

Analyzing a website is not just viewing source code. A modern site combines HTML, CSS, JavaScript, media, API calls, cookies, local storage, third-party scripts, analytics and sometimes authentication.

The goal can be understanding how it works, auditing your own site, improving performance, checking SEO, reviewing accessibility or identifying defensive security issues.

This guide stays legal and ethical. It is not about bypassing authentication, exploiting vulnerabilities, scraping a site aggressively, accessing private data or disrupting a service.

Define the scope

Before starting, document the analysis:

Analyzed site:
Goal: learning / internal audit / performance / SEO / accessibility / security
Owner:
Authorization: yes / no
Pages covered:
Allowed actions:
Forbidden actions:
Test dates:
Technical contact:

Inspecting HTML/CSS/JS loaded in your browser, using DevTools, measuring performance, checking accessibility, reading HTTP headers or viewing robots.txt and sitemap.xml is usually acceptable. Bypassing authentication, accessing private endpoints, running intrusive scans or collecting personal data requires clear authorization.

Main analysis areas

A website can be reviewed from several angles:

  • front-end: HTML, CSS, JavaScript, DOM, components, frameworks;
  • network: requests, APIs, third-party scripts, CDN, HTTP codes, loading time;
  • performance: resource weight, initial rendering, cache, Core Web Vitals;
  • SEO: title, meta description, headings, URLs, internal links, sitemap, indexability;
  • accessibility: contrast, keyboard navigation, alternative text, labels, ARIA;
  • defensive security: headers, cookies, CORS, CSP, visible errors, dependencies.

The goal is to understand the site, not to push it outside normal use.

Prepare the environment

A modern browser already gives you a lot. Useful tools include:

  • Chrome DevTools or Firefox DevTools;
  • Lighthouse, PageSpeed Insights and WebPageTest;
  • Wappalyzer or BuiltWith;
  • curl, wget or httpie;
  • axe DevTools for accessibility;
  • Screaming Frog SEO Spider for SEO;
  • Playwright or Puppeteer for checks on your own site;
  • mitmproxy or Burp Suite only for authorized audits.

Use a dedicated browser profile, disable unnecessary extensions, test cold and warm cache, check mobile and desktop, and record date, browser version and tested pages.

First inspection with DevTools

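Open DevTools with F12, or right-click the page and choose Inspect. In Chrome, the most useful panels are Elements, Console, Network, Sources, Application, Performance and Lighthouse. Firefox offers equivalents under slightly different names (Inspector, Debugger, Storage) and has no built-in Lighthouse panel.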

Start by reloading the page with Network open and "Preserve log" enabled.

Analyze HTML

Elements shows the rendered DOM. Check:

[ ] One main h1
[ ] Logical heading order
[ ] Semantic tags
[ ] Relevant image alt text
[ ] Form labels
[ ] Links understandable out of context
[ ] No important content only in images

Semantic HTML helps accessibility, SEO and maintenance.
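The first two checks on the list lend themselves to a small script. The sketch below is one possible approach, not a standard tool: `auditHeadings` is a hypothetical helper that takes heading levels in document order and reports duplicate h1 elements and skipped levels.

```javascript
// Hypothetical helper: audits heading levels given in document order,
// e.g. [1, 2, 3, 2] for h1, h2, h3, h2.
function auditHeadings(levels) {
  const issues = [];
  const h1Count = levels.filter((level) => level === 1).length;
  if (h1Count === 0) issues.push('no h1 found');
  if (h1Count > 1) issues.push(`${h1Count} h1 elements found, expected one`);

  // A heading should go deeper by one level at most (h2 -> h3, not h2 -> h4).
  for (let i = 1; i < levels.length; i += 1) {
    if (levels[i] - levels[i - 1] > 1) {
      issues.push(`h${levels[i - 1]} is followed by h${levels[i]} (skipped level)`);
    }
  }
  return issues;
}

// In the DevTools console, the levels could be collected like this:
// const levels = [...document.querySelectorAll('h1,h2,h3,h4,h5,h6')]
//   .map((el) => Number(el.tagName[1]));

console.log(auditHeadings([1, 1, 2, 4]));
```

An empty result only means these two rules pass. Whether the headings actually describe the content still needs a human reading.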

Analyze CSS

CSS reveals visual architecture: frameworks, utility classes, variables, media queries, animations, inline styles and responsive behavior.

In DevTools, select an element to inspect applied rules, overridden rules, box model, margins, padding and source files.

Positive signs include well-named CSS variables, few inline styles, reusable components, clear responsive behavior, limited overrides and visual consistency.

:root {
  --color-primary: #1f6feb;
  --spacing-md: 1rem;
  --radius-md: 0.5rem;
}

Analyze JavaScript

JavaScript can reveal the stack, front-end logic, API calls and third-party dependencies. Inspect bundles, console errors, frameworks, source maps and external scripts.

Common clues:

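React   → __reactFiber$… keys on DOM nodes, React root (__REACT_DEVTOOLS_GLOBAL_HOOK__ is injected by the React DevTools extension, so it only helps when the extension is installed)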
Vue     → __VUE__, data-v-xxxx
Angular → ng-version
Next.js → __NEXT_DATA__, /_next/static/
Nuxt    → __NUXT__, /_nuxt/

Source maps may expose more information than expected: file names, comments, internal paths or business logic. An exposed source map is not always a vulnerability, but publishing it in production should be a deliberate choice, so it deserves review.
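Several of the clues above appear directly in the page markup, so they can be matched against the HTML source. The following is a rough heuristic sketch. `detectFrameworks` and `FRAMEWORK_CLUES` are illustrative names, and the Vue pattern assumes scoped-style attributes of the form data-v-<hex>.

```javascript
// Clues taken from the list above; matching them in raw HTML is a heuristic,
// not proof. React is left out because its clues live on JavaScript objects
// rather than in the markup.
const FRAMEWORK_CLUES = {
  'Next.js': [/__NEXT_DATA__/, /\/_next\/static\//],
  Nuxt: [/__NUXT__/, /\/_nuxt\//],
  Angular: [/\bng-version=/],
  Vue: [/\bdata-v-[0-9a-f]+/],
};

function detectFrameworks(html) {
  return Object.entries(FRAMEWORK_CLUES)
    .filter(([, patterns]) => patterns.some((pattern) => pattern.test(html)))
    .map(([name]) => name);
}

console.log(detectFrameworks('<script src="/_next/static/chunks/main.js"></script>'));
```

Nuxt is built on Vue, so a Nuxt page may match both entries. Treat the result as a hint to confirm in DevTools.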

Analyze Network

Network shows requests, order, duration, size, headers, responses, errors and redirects.

Simple workflow:

  1. Open DevTools.
  2. Go to Network.
  3. Enable "Preserve log".
  4. Reload.
  5. Filter by JS, CSS, Img, Fetch/XHR, Doc.
  6. Identify heavy, slow or failing resources.

Common status codes:

200 OK
301 / 302 Redirect
304 Not Modified
401 Unauthorized
403 Forbidden
404 Not Found
429 Too Many Requests
500 Server Error

Documenting contacted domains, slow resources, heavy images, 404s, external dependencies, missing cache and unnecessary redirects is useful and low-risk.
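Step 6 of the workflow can be partly scripted once the request data has been exported or copied by hand. This is a minimal sketch under stated assumptions: the entry shape ({ url, status, bytes, ms }) and the thresholds in `LIMITS` are hypothetical and should follow the site's own performance budget.

```javascript
// Hypothetical thresholds; tune them to the site's own performance budget.
const LIMITS = { maxBytes: 500 * 1024, maxMs: 1000 };

// Each entry is a simplified record: { url, status, bytes, ms }.
function flagResources(entries, limits = LIMITS) {
  return entries
    .map((entry) => {
      const flags = [];
      if (entry.status >= 400) flags.push('failing');
      if (entry.bytes > limits.maxBytes) flags.push('heavy');
      if (entry.ms > limits.maxMs) flags.push('slow');
      return { url: entry.url, flags };
    })
    .filter((result) => result.flags.length > 0);
}

console.log(flagResources([
  { url: '/hero.jpg', status: 200, bytes: 3.4 * 1024 * 1024, ms: 1800 },
  { url: '/style.css', status: 200, bytes: 180 * 1024, ms: 120 },
  { url: '/old.js', status: 404, bytes: 0, ms: 80 },
]));
```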

Read HTTP headers

Use curl to fetch only the response headers (the -I flag sends a HEAD request):

curl -I https://example.com

Interesting headers:

Strict-Transport-Security
Content-Security-Policy
X-Frame-Options
X-Content-Type-Options
Referrer-Policy
Permissions-Policy
Cache-Control

These headers do not prove a site is secure, but missing ones often reveal improvement opportunities.
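To turn the list above into a repeatable check, the response headers can be compared against the expected names. This sketch assumes the headers are already available as a plain object. `missingSecurityHeaders` is an illustrative name, and Cache-Control is left out because it concerns caching rather than security.

```javascript
// Security-related headers from the list above, in lowercase.
const EXPECTED_HEADERS = [
  'strict-transport-security',
  'content-security-policy',
  'x-frame-options',
  'x-content-type-options',
  'referrer-policy',
  'permissions-policy',
];

// `headers` is a plain object of response headers; header names are
// case-insensitive, so they are normalized before comparison.
function missingSecurityHeaders(headers) {
  const present = new Set(Object.keys(headers).map((name) => name.toLowerCase()));
  return EXPECTED_HEADERS.filter((name) => !present.has(name));
}

// With Node 18+ and a site you are allowed to query, the object could come from:
// const response = await fetch('https://example.com', { method: 'HEAD' });
// missingSecurityHeaders(Object.fromEntries(response.headers));

console.log(missingSecurityHeaders({
  'Strict-Transport-Security': 'max-age=31536000',
  'X-Content-Type-Options': 'nosniff',
}));
```

A header being present says nothing about the quality of its value. A very permissive Content-Security-Policy would still pass this check.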

Cookies and local storage

In DevTools, open Application then Cookies, Local Storage, Session Storage and IndexedDB.

For cookies, check the domain, expiry, path, the HttpOnly, Secure and SameSite attributes, and whether the value contains readable sensitive data.

Set-Cookie: session=abc; HttpOnly; Secure; SameSite=Lax; Path=/

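Avoid storing sensitive tokens in localStorage when safer alternatives exist. Any script running on the page can read localStorage, so an XSS flaw exposes those tokens, whereas an HttpOnly cookie cannot be read from JavaScript.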
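The cookie attributes mentioned above can also be checked from a raw Set-Cookie value, for example one copied from the Network panel. The parser below is a deliberately simple sketch (`auditSetCookie` is a hypothetical name). It only looks at whether the three flags are present and does not validate the full cookie syntax.

```javascript
// Splits a Set-Cookie value into its name and attributes, then reports
// which of the three protective attributes are absent.
function auditSetCookie(headerValue) {
  const [nameValue, ...attributes] = headerValue.split(';').map((part) => part.trim());
  const name = nameValue.split('=')[0];
  // Attribute names are case-insensitive; only the part before "=" matters here.
  const attributeNames = new Set(
    attributes.map((attribute) => attribute.split('=')[0].toLowerCase()),
  );

  const issues = [];
  if (!attributeNames.has('httponly')) issues.push('missing HttpOnly');
  if (!attributeNames.has('secure')) issues.push('missing Secure');
  if (!attributeNames.has('samesite')) issues.push('missing SameSite');
  return { name, issues };
}

console.log(auditSetCookie('session=abc; HttpOnly; Secure; SameSite=Lax; Path=/'));
console.log(auditSetCookie('prefs=dark; Path=/'));
```

Not every cookie needs all three attributes. A purely cosmetic preference cookie read by client-side code cannot be HttpOnly, so the result is a prompt for review rather than a verdict.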

Forms

Forms are sensitive areas. Inspect the method (GET or POST), the target endpoint, client-side validation, error messages, autocomplete attributes, labels, anti-spam measures, CSRF protection for authenticated actions and the behavior on failure.

Client-side validation helps UX, but server-side validation is required for security.

Technical SEO

Check the basics:

<title>Clear page title</title>
<meta name="description" content="Useful concise description.">
<link rel="canonical" href="https://example.com/page">

Review heading hierarchy, URLs, internal linking, JSON-LD structured data, robots.txt and sitemap.xml.
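For a quick look at the three basic tags across several pages, they can be pulled out of the HTML source. The sketch below uses regular expressions, which are fragile on real-world markup. It assumes double-quoted attributes, and `extractSeoBasics` is an illustrative name. In the browser, document.title and document.querySelector give the same information more reliably.

```javascript
// Regex-based extraction: fine for a quick check, not a substitute for a DOM parser.
// Assumes attributes are written with double quotes.
function extractSeoBasics(html) {
  const attr = (tag, name) =>
    tag?.match(new RegExp(`\\b${name}\\s*=\\s*"([^"]*)"`, 'i'))?.[1] ?? null;

  const title = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i)?.[1].trim() ?? null;
  const descriptionTag = html.match(/<meta\b[^>]*\bname\s*=\s*"description"[^>]*>/i)?.[0];
  const canonicalTag = html.match(/<link\b[^>]*\brel\s*=\s*"canonical"[^>]*>/i)?.[0];

  return {
    title,
    description: attr(descriptionTag, 'content'),
    canonical: attr(canonicalTag, 'href'),
  };
}

console.log(extractSeoBasics(`
  <title>Clear page title</title>
  <meta name="description" content="Useful concise description.">
  <link rel="canonical" href="https://example.com/page">
`));
```

A null field means the tag was not found in the source. On JavaScript-rendered sites the tag may only exist in the rendered DOM, so check there before reporting it as missing.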

Important: robots.txt is not a security boundary. It only asks well-behaved crawlers not to visit certain paths. The file itself is public, and the URLs it lists stay reachable by anyone unless the server protects them.

Performance

Measure with Lighthouse, PageSpeed Insights, WebPageTest and the DevTools Performance and Network panels.

Important metrics (LCP, INP and CLS are the Core Web Vitals):

LCP  → Largest Contentful Paint
INP  → Interaction to Next Paint
CLS  → Cumulative Layout Shift
TTFB → Time To First Byte
FCP  → First Contentful Paint

Common causes of slowness include heavy images, excessive JavaScript, render-blocking CSS, missing cache, poorly loaded fonts, too many third-party scripts, slow servers and unnecessary redirects.

Frequent optimizations: compress images, use WebP/AVIF, lazy-load non-critical images, minify, remove unused code, enable HTTP cache and defer non-critical scripts.
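Raw metric values are easier to report once they are sorted into "good", "needs improvement" and "poor". The thresholds in this sketch are the ones Google's web.dev guidance publishes, to the best of my knowledge. Treat them as assumptions and check the current documentation before quoting them in a report. `rateMetric` is an illustrative name.

```javascript
// Assumed thresholds (web.dev guidance at the time of writing).
// LCP, INP, TTFB and FCP are in milliseconds; CLS is a unitless score.
const THRESHOLDS = {
  LCP: { good: 2500, poor: 4000 },
  INP: { good: 200, poor: 500 },
  CLS: { good: 0.1, poor: 0.25 },
  TTFB: { good: 800, poor: 1800 },
  FCP: { good: 1800, poor: 3000 },
};

function rateMetric(name, value) {
  const limits = THRESHOLDS[name];
  if (!limits) throw new Error(`Unknown metric: ${name}`);
  if (value <= limits.good) return 'good';
  if (value <= limits.poor) return 'needs improvement';
  return 'poor';
}

console.log(rateMetric('LCP', 3100), rateMetric('CLS', 0.05));
```

A single lab measurement is only a snapshot. Field data collected over many visits gives a more reliable picture.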

Accessibility

Quick checks:

[ ] Keyboard-only navigation
[ ] Visible focus
[ ] Sufficient contrast
[ ] Image alt text
[ ] Form labels
[ ] Heading structure
[ ] Screen reader test if possible

Common issues: icon buttons without accessible names, missing alt, poor contrast, keyboard-trapping modals, invisible focus, vague "click here" links and inconsistent headings.
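"Sufficient contrast" can be measured instead of judged by eye. The sketch below follows the relative luminance and contrast ratio definitions from WCAG 2.x. It assumes colors written as six-digit hex values with a leading #, and the function names are illustrative.

```javascript
// Relative luminance of an sRGB color, per the WCAG 2.x definition.
// Assumes a six-digit hex string with a leading "#", e.g. "#1f6feb".
function relativeLuminance(hex) {
  const channels = [1, 3, 5].map((i) => parseInt(hex.slice(i, i + 2), 16) / 255);
  const [r, g, b] = channels.map((c) =>
    (c <= 0.03928 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4));
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors: (lighter + 0.05) / (darker + 0.05).
function contrastRatio(foreground, background) {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  const [lighter, darker] = a >= b ? [a, b] : [b, a];
  return (lighter + 0.05) / (darker + 0.05);
}

// WCAG AA asks for at least 4.5:1 for normal-size text.
const ratio = contrastRatio('#1f6feb', '#ffffff');
console.log(ratio.toFixed(2), ratio >= 4.5 ? 'meets AA' : 'below AA');
```

Tools such as axe DevTools apply the same formula across a whole page. This version is useful for checking one color pair from a stylesheet.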

Technologies and third-party scripts

Some clues:

WordPress → /wp-content/, /wp-json/
Shopify   → cdn.shopify.com
Next.js   → /_next/static/
Nuxt      → /_nuxt/
Webflow   → webflow.js
Wix       → static.wixstatic.com

List third-party scripts too: analytics, tag managers, pixels, chat, payment, reCAPTCHA, marketing widgets and JavaScript CDNs. Too many third parties can hurt performance, complicate cookie consent and increase exposure.
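Listing third parties by hand gets tedious on busy pages. One possible shortcut, sketched here with the hypothetical helper `listThirdPartyHosts`, is to group request URLs by hostname and keep the hosts that do not belong to the site.

```javascript
// Counts requests per hostname and keeps hosts outside the site's own domain.
function listThirdPartyHosts(pageUrl, requestUrls) {
  const siteHost = new URL(pageUrl).hostname;
  const counts = new Map();
  for (const requestUrl of requestUrls) {
    // Relative URLs are resolved against the page URL.
    const host = new URL(requestUrl, pageUrl).hostname;
    // Simplification: the page's host and its subdomains count as first-party.
    const firstParty = host === siteHost || host.endsWith(`.${siteHost}`);
    if (!firstParty) counts.set(host, (counts.get(host) ?? 0) + 1);
  }
  return Object.fromEntries(counts);
}

// In the DevTools console, the URLs could come from:
// performance.getEntriesByType('resource').map((entry) => entry.name)

console.log(listThirdPartyHosts('https://example.com/', [
  '/main.js',
  'https://static.example.com/style.css',
  'https://cdn.shopify.com/a.js',
  'https://cdn.shopify.com/b.js',
]));
```

The first-party test is intentionally naive. A site served from www.example.com with assets on cdn.example.com would need a comparison on the registrable domain instead.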

Lighthouse and automation

Lighthouse gives Performance, Accessibility, Best Practices and SEO scores. It is a helper, not a final verdict. Results vary with machine, network, cache, location, extensions and server state.

For your own site, Playwright can automate checks:

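import { chromium } from 'playwright';

// Top-level await requires an ES module (a .mjs file or "type": "module").
const browser = await chromium.launch();
try {
  const page = await browser.newPage();
  await page.goto('https://example.com');
  console.log(await page.title());
} finally {
  // Close the browser even if navigation fails.
  await browser.close();
}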

Avoid aggressive automated browsing on third-party sites without permission.

Fictional mini-analysis

For https://example-demo.local, Network shows:

main.js        900 KB
hero.jpg       3.4 MB
style.css      180 KB
analytics.js   third-party

Likely issues: heavy hero image and large JavaScript bundle.
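The figures above make the priorities easy to quantify. This small sketch computes each resource's share of the listed weight, assuming 1 MB = 1000 KB. analytics.js is left out because the listing gives no size for it.

```javascript
// Sizes from the fictional Network listing, in KB (assuming 1 MB = 1000 KB).
const resources = { 'main.js': 900, 'hero.jpg': 3400, 'style.css': 180 };

// Returns the total weight and each resource's share as a percentage,
// rounded to one decimal place.
function weightShares(sizes) {
  const total = Object.values(sizes).reduce((sum, size) => sum + size, 0);
  const shares = Object.fromEntries(
    Object.entries(sizes).map(([name, size]) => [
      name,
      Number(((size / total) * 100).toFixed(1)),
    ]),
  );
  return { total, shares };
}

console.log(weightShares(resources));
// total: 4480 KB; hero.jpg about 75.9 %, main.js about 20.1 %, style.css about 4 %
```

With roughly three quarters of the listed weight in one image, compressing or resizing hero.jpg is the obvious first recommendation.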

HTML shows:

<h1>Welcome</h1>
<h1>Our services</h1>

Improvement: keep one main h1 and use h2 for sections.

A well-configured session cookie:

session_id
Secure: true
HttpOnly: true
SameSite: Lax

Quick checklist

[ ] Goal and scope defined
[ ] Main pages identified
[ ] HTML inspected
[ ] CSS inspected
[ ] JavaScript inspected
[ ] Console errors recorded
[ ] Network requests analyzed
[ ] HTTP headers checked
[ ] Cookies checked
[ ] localStorage/sessionStorage inspected
[ ] robots.txt reviewed
[ ] sitemap.xml reviewed
[ ] Performance measured
[ ] Technical SEO checked
[ ] Accessibility checked
[ ] Third-party scripts listed
[ ] Report written

Report template

# Web analysis report

## General information
Site:
Date:
Browser:
Pages analyzed:
Goal:

## Summary
Main findings.

## Technologies
CMS/framework:
JS libraries:
Hosting/CDN:
Analytics:

## Performance / SEO / Accessibility / Defensive security
Findings:
Impact:
Recommendations:

## Prioritization
Critical:
High:
Medium:
Low:

What to avoid

Without explicit authorization, avoid aggressive scans, brute force, exploitation, authentication bypass, mass data extraction, destructive tests, sensitive parameter tampering, paywall bypass or unauthorized reuse of protected content.

Responsible analysis is about improving, understanding and documenting, not abusing a service.

Conclusion

The method is simple: define scope, inspect HTML/CSS/JS, observe network traffic, check headers, cookies and storage, measure performance, SEO and accessibility, identify technologies, document findings and suggest concrete recommendations.

With only a browser and DevTools, you can already learn a lot. The boundary remains clear: observe and understand, yes; bypass, exploit or disrupt, no.