Title Tag & Meta Description
The title tag is the single most impactful on-page SEO element. It appears as the clickable blue headline in every search result. Google may rewrite it if it finds yours misleading, too short, or too long, so an accurate, well-sized title is the best way to keep your own wording in the results.
Title tag rules
| Rule | Correct | Wrong |
|---|---|---|
| Length | 50–60 characters | 80+ chars (truncated in SERP) |
| Keyword position | Primary keyword first | Brand name first, keyword last |
| Uniqueness | Unique per page | Same title on all pages |
| Content match | Accurately describes the page | Clickbait that misleads users |
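The rules in the table can be checked mechanically before publishing. The sketch below is one possible way to do that in Python; the function names `check_title` and `find_duplicate_titles` are hypothetical, and the 50–60 character window is simply the range from the table, not a hard limit enforced by any search engine.

```python
def check_title(title: str, keyword: str, min_len: int = 50, max_len: int = 60) -> list[str]:
    """Return a list of rule violations; an empty list means the title passes."""
    problems = []
    length = len(title)
    if length < min_len:
        problems.append(f"too short: {length} chars (aim for {min_len}-{max_len})")
    elif length > max_len:
        problems.append(f"too long: {length} chars (may be truncated in the SERP)")
    # "Primary keyword first" is interpreted here as a case-insensitive prefix match.
    if not title.lower().startswith(keyword.lower()):
        problems.append("primary keyword is not at the start")
    return problems


def find_duplicate_titles(titles_by_url: dict[str, str]) -> dict[str, list[str]]:
    """Map each title used on more than one URL to the URLs that share it."""
    seen: dict[str, list[str]] = {}
    for url, title in titles_by_url.items():
        seen.setdefault(title.strip().lower(), []).append(url)
    return {title: urls for title, urls in seen.items() if len(urls) > 1}
```

Running `check_title` over every page in a build step, and `find_duplicate_titles` over the whole site, covers the length, keyword-position, and uniqueness rows; the content-match row still needs a human reader.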
Meta description rules
<!-- GOOD — descriptive, keyword-aware, has implicit CTA -->
<meta name="description"
content="A complete, practical guide to SEO fundamentals —
meta tags, structured data, robots.txt, sitemaps,
and more — all implemented live on this very page.">
<!-- BAD — vague, no keywords, no reason to click -->
<meta name="description" content="Welcome to our website.">
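To audit existing pages for weak descriptions like the second example, the title and meta description can be pulled out of the HTML with Python's standard `html.parser` module. This is a minimal sketch; the class name `HeadTagExtractor` and the helper `describe_page` are made up for illustration.

```python
from html.parser import HTMLParser


class HeadTagExtractor(HTMLParser):
    """Collect the <title> text and the meta description from an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def describe_page(html: str) -> dict:
    """Return the title, the description (None if absent), and its length."""
    parser = HeadTagExtractor()
    parser.feed(html)
    description = parser.description
    return {
        "title": parser.title.strip(),
        "description": description,
        "description_length": len(description) if description is not None else 0,
    }
```

A description length far below the 150–160 characters suggested in the checklist, or a missing description, is a reasonable signal that a page needs attention.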
Canonical URL
When the same content is accessible at multiple URLs, Google may split your ranking signals (such as backlinks) across all of them instead of consolidating them on one. This dilutes your authority. The canonical tag says: "Ignore the variants. This is the definitive URL." Google treats it as a strong hint rather than a command, so keep redirects, internal links, and the sitemap consistent with it.
Common causes of duplicate URLs
# All of these might serve identical content:
https://example.com/page # clean URL
https://example.com/page/ # trailing slash variant
https://www.example.com/page # www vs non-www
http://example.com/page # http vs https
https://example.com/page?utm_source=email # UTM parameter
https://example.com/page?ref=homepage # referral parameter
https://example.com/page?sort=asc # filter/sort parameter
<!-- Place in <head>. Always use absolute URL with https:// -->
<!-- ⚑ Replace with YOUR production URL -->
<link rel="canonical"
href="https://yourdomain.com/this-page/">
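If the canonical tag is generated by a template, the variants listed above can be collapsed with a small normalisation function. The sketch below assumes one particular policy (https, no `www.`, trailing slash, and `ref`, `sort`, and `utm_*` parameters treated as irrelevant to content); those choices and the name `canonical_url` are illustrative, and a real site should use whatever policy its redirects already enforce.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical policy for this sketch: these parameters never change the page content.
IGNORED_PARAMS = {"ref", "sort"}
IGNORED_PREFIXES = ("utm_",)


def canonical_url(url: str) -> str:
    """Collapse scheme, host, trailing-slash, and tracking-parameter variants."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path or "/"
    if not path.endswith("/"):
        path += "/"
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS and not key.startswith(IGNORED_PREFIXES)
    ]
    # Always emit https and drop any fragment.
    return urlunsplit(("https", host, path, urlencode(kept), ""))
```

Note that forcing a trailing slash is wrong for file-like paths such as `/report.pdf`; the sketch ignores that case for brevity.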
Open Graph Protocol
Open Graph (OG) was created by Facebook and is now used by LinkedIn, Slack, iMessage, WhatsApp, Discord, and most platforms that generate link previews. Without OG tags, the platform guesses — and usually gets it wrong.
<!-- Use og:type="website" for homepages and standard pages -->
<!-- ⚑ Replace yourdomain.com with your actual URL throughout -->
<meta property="og:type" content="website">
<meta property="og:title" content="Your Page Title Here">
<meta property="og:description" content="Describe the page in 1–2 sentences.">
<meta property="og:image" content="https://yourdomain.com/og-image.jpg">
<meta property="og:url" content="https://yourdomain.com/this-page/">
<meta property="og:site_name" content="Your Brand Name">
<!-- Use og:type="article" for blog posts and news pages -->
<!-- IMPORTANT: a page can only have ONE og:type value. -->
<!-- Choose "website" OR "article" — never both. -->
<meta property="og:type" content="article">
<meta property="og:title" content="Your Article Title">
<meta property="og:description" content="Article summary here.">
<meta property="og:image" content="https://yourdomain.com/og-image.jpg">
<meta property="article:published_time" content="2026-02-10T00:00:00Z">
<meta property="article:author" content="https://twitter.com/yourhandle">
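When these tags are produced from page data rather than typed by hand, the values need HTML escaping and the URLs need to be absolute. The following is a rough sketch of such a generator; `og_tags` is a hypothetical helper, and the validation it performs (one required `og:type`, absolute `og:image` and `og:url`) reflects the advice in this section rather than the full Open Graph specification.

```python
from html import escape


def og_tags(properties: dict[str, str]) -> str:
    """Render Open Graph <meta> tags, one per line, escaping attribute values."""
    if "og:type" not in properties:
        raise ValueError("og:type is required")
    for key in ("og:image", "og:url"):
        value = properties.get(key)
        if value is not None and not value.startswith(("http://", "https://")):
            raise ValueError(f"{key} must be an absolute URL, got {value!r}")
    lines = []
    for prop, content in properties.items():
        lines.append(
            f'<meta property="{escape(prop, quote=True)}" '
            f'content="{escape(content, quote=True)}">'
        )
    return "\n".join(lines)
```

Because the input is a dictionary, a page cannot end up with two `og:type` values by accident, which matches the "website OR article, never both" rule above.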
Twitter Card
Twitter (X) uses its own meta tags. If absent, it falls back to Open Graph, but the fallback rendering is less reliable. Explicitly set Twitter Card tags for consistent display.
<!-- summary_large_image = full-width banner card (recommended) -->
<!-- summary = small thumbnail on the left -->
<!-- app = for app download cards -->
<!-- player = for video/audio cards -->
<!-- ⚑ Replace yourdomain.com and @yourhandle with your own -->
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="Your Page Title Here">
<meta name="twitter:description" content="Short summary for the Twitter card.">
<meta name="twitter:image" content="https://yourdomain.com/og-image.jpg">
<meta name="twitter:site" content="@yourhandle"> <!-- optional -->
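The fallback behaviour described above can be modelled in a few lines, which is handy for spotting pages that rely on it. This is only an approximation of what a preview generator might do, not X's actual implementation; `resolve_card` is a hypothetical name.

```python
def resolve_card(meta: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Resolve card fields, preferring twitter:* and falling back to og:*.

    Returns the resolved fields and the list of fields that used the fallback.
    """
    resolved: dict[str, str] = {}
    fallbacks: list[str] = []
    if meta.get("twitter:card"):
        resolved["card"] = meta["twitter:card"]
    for field in ("title", "description", "image"):
        explicit = meta.get(f"twitter:{field}")
        if explicit:
            resolved[field] = explicit
        elif meta.get(f"og:{field}"):
            resolved[field] = meta[f"og:{field}"]
            fallbacks.append(field)
    return resolved, fallbacks
```

A non-empty fallback list marks the fields where adding an explicit `twitter:*` tag would make the card more predictable.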
Structured Data & JSON-LD
Structured data gives Google machine-readable facts about your content. In return, Google can display rich results — enhanced SERP listings with star ratings, FAQs, product prices, event dates, and more. Rich results can dramatically increase CTR.
| Schema Type | Rich Result | Best for |
|---|---|---|
| Article | Top stories carousel, date | Blog posts, news |
| Product | Price, availability, reviews | E-commerce |
| FAQPage | Expandable Q&A in SERP (since 2023, shown mainly for well-known government and health sites) | FAQ sections |
| LocalBusiness | Map, hours, phone | Brick & mortar |
| BreadcrumbList | Path shown under title | Multi-level sites |
| Event | Date, location, tickets | Events, concerts |
| JobPosting | Job listing card | Career pages |
| Person | Knowledge panel | Personal/portfolio |
| WebSite | Site name in results (the sitelinks search box was retired in 2024) | Homepage |
JSON-LD format (recommended)
<!-- ⚑ Replace all yourdomain.com values with your own URLs -->
<!-- JSON does not allow comments, so the notes on "logo" are kept out here: -->
<!-- logo should be a dedicated image, NOT the article image -->
<!-- recommended: rectangular, ~60px tall, on a white background -->
<!-- if you don't have a logo yet, delete the entire "logo" block and the comma after "name" -->
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "Your Article Headline Here",
"description": "A short description of the article.",
"image": "https://yourdomain.com/og-image.jpg",
"url": "https://yourdomain.com/",
"datePublished": "2026-01-01",
"dateModified": "2026-01-01",
"author": {
"@type": "Person",
"name": "Your Name",
"url": "https://yourdomain.com/"
},
"publisher": {
"@type": "Organization",
"name": "Your Organisation Name",
"logo": {
"@type": "ImageObject",
"url": "https://yourdomain.com/logo.png"
}
},
"mainEntityOfPage": {
"@type": "WebPage",
"@id": "https://yourdomain.com/"
}
}
</script>
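Hand-edited JSON-LD breaks easily (a stray comma or comment makes the whole block unparseable), so it is often safer to build it from data and serialise it. The sketch below shows one way to do that with Python's `json` module; `article_json_ld` and its parameters are hypothetical, and it emits only a subset of the fields in the example above.

```python
import json


def article_json_ld(headline, url, date_published, author_name, image=None):
    """Return a <script type="application/ld+json"> element for an Article."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "url": url,
        "datePublished": date_published,
        "author": {"@type": "Person", "name": author_name},
        "mainEntityOfPage": {"@type": "WebPage", "@id": url},
    }
    if image:
        data["image"] = image
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # "</" inside a string could end the <script> element early; "<\/" is an
    # equivalent JSON escape that the HTML parser does not treat as a closing tag.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

Because the output comes from `json.dumps`, it is guaranteed to be syntactically valid JSON; whether it qualifies for a rich result still has to be checked with Google's own testing tools.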
robots.txt — Controlling Crawlers
The robots.txt file lives at the root of your domain (yourdomain.com/robots.txt) and tells search engine crawlers which parts of your site they can and cannot access. It is the first file most crawlers fetch when they visit your site.
# User-agent specifies WHICH crawler this rule applies to
# * means ALL crawlers
User-agent: *
Allow: / # allow everything by default
Disallow: /admin/ # block admin panel
Disallow: /api/ # block API endpoints
Disallow: /checkout/ # block checkout pages
Disallow: /search? # block internal search results
# ⚑ Point to YOUR sitemap URL — not yourdomain.com
Sitemap: https://yourdomain.com/sitemap.xml
---
# Crawl-delay: how many seconds to wait between requests
# Use if your server can't handle heavy crawler traffic
# Googlebot ignores Crawl-delay; Bingbot and some other crawlers honour it
User-agent: Bingbot
Crawl-delay: 2
---
# Block a specific crawler entirely (e.g. AI training scrapers)
User-agent: GPTBot
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: *
Allow: /
Sitemap: https://risa-source.github.io/BasicSEOImprovement/sitemap.xml
# Block AI training crawlers
User-agent: GPTBot
Disallow: /
User-agent: CCBot
Disallow: /
# Wrong (what NOT to do — this would prevent Google finding your sitemap):
# Sitemap: http://localhost/Workshop11/sitemap.xml
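Rules like these can be tested before deployment with Python's standard `urllib.robotparser`. The sketch below is illustrative: `ROBOTS_TXT` is a trimmed sample, not this site's file, and `load_rules` is a hypothetical helper. One caveat is baked into the sample: Python's parser applies the first matching rule in file order, whereas Google documents most-specific-rule precedence, so the wildcard group here lists only `Disallow` lines instead of starting with `Allow: /`.

```python
from urllib.robotparser import RobotFileParser

# Trimmed sample for illustration; yourdomain.com is a placeholder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /api/
Disallow: /checkout/

User-agent: GPTBot
Disallow: /

Sitemap: https://yourdomain.com/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)
blog_allowed = rules.can_fetch("Googlebot", "https://yourdomain.com/blog/")
admin_allowed = rules.can_fetch("Googlebot", "https://yourdomain.com/admin/users")
gptbot_allowed = rules.can_fetch("GPTBot", "https://yourdomain.com/blog/")
sitemaps = rules.site_maps()  # available from Python 3.8
```

Asserting on values like `admin_allowed` in a deployment check catches the most damaging mistake, an accidental `Disallow: /` for all crawlers, before it reaches production.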
XML Sitemap
A sitemap is a file that lists all the pages on your site you want search engines to index. It doesn't guarantee indexing, but it dramatically speeds up discovery — especially for large sites, new sites, or pages with few inbound links.
<?xml version="1.0" encoding="UTF-8"?>
<!-- ⚑ Replace yourdomain.com with your actual production URL -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<!-- loc: the canonical, absolute URL of the page -->
<loc>https://yourdomain.com/</loc>
<!-- lastmod: when the page content was last changed (YYYY-MM-DD) -->
<lastmod>2026-02-10</lastmod>
<!-- changefreq: hint to crawlers, one of always|hourly|daily|weekly|monthly|yearly|never -->
<!-- Google ignores this, but it costs nothing to include -->
<changefreq>monthly</changefreq>
<!-- priority: relative importance 0.0–1.0 within YOUR site only -->
<!-- Does not affect how Google ranks you vs other sites; Google ignores this value too -->
<priority>1.0</priority>
</url>
<url>
<loc>https://yourdomain.com/about/</loc>
<lastmod>2026-01-20</lastmod>
<changefreq>yearly</changefreq>
<priority>0.7</priority>
</url>
</urlset>
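For anything beyond a handful of pages, the sitemap is normally generated rather than written by hand. The following sketch builds the same structure with Python's `xml.etree.ElementTree`; `build_sitemap` is a hypothetical helper that emits only `loc` and `lastmod`, leaving out `changefreq` and `priority`.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages) -> str:
    """Build sitemap XML from an iterable of (loc, lastmod) tuples.

    loc must be an absolute URL and lastmod a YYYY-MM-DD string.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the text for us.
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Letting the XML library handle escaping matters here: a raw `&` in a query string is the most common reason a hand-written sitemap fails validation.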
What to exclude from your sitemap
- ✗ Pages with a noindex meta robots tag
- ✗ Paginated archive pages (e.g. /page/2/, /page/3/)
- ✗ Thank-you pages, login pages, checkout pages
- ✗ Duplicate content URLs (use canonical instead)
- ✗ URLs that return non-200 HTTP status codes
- ✗ Disallowed URLs in robots.txt
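Most of the exclusion list above translates directly into a filter applied before the sitemap is written. The sketch below assumes each page is described by a small record; the `Page` dataclass, its fields, and `sitemap_urls` are all hypothetical, and a real crawler or CMS would supply the values. Rules based on page type (paginated archives, thank-you pages) are left out because they depend on the site's URL scheme.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    url: str
    status: int = 200
    noindex: bool = False
    canonical: Optional[str] = None  # None means the page is its own canonical
    blocked_by_robots: bool = False


def sitemap_urls(pages: list[Page]) -> list[str]:
    """Keep only URLs that are indexable and canonical."""
    keep = []
    for page in pages:
        if page.status != 200 or page.noindex or page.blocked_by_robots:
            continue
        # A page that points its canonical elsewhere is a duplicate variant.
        if page.canonical is not None and page.canonical != page.url:
            continue
        keep.append(page.url)
    return keep
```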
Heading Hierarchy & Content Structure
Search engines use your heading structure as an outline of the page's topic and subtopics. A clear, logical hierarchy helps Google understand what the page is about and which parts are most important. It also drastically improves readability for humans and screen reader users.
<!-- ONE h1 per page — the primary topic -->
<h1>SEO Fundamentals: Complete On-Page Guide</h1>
<!-- h2 — major sections of the page -->
<h2>The &lt;head&gt;: Where SEO Begins</h2>
<!-- h3 — subsections within an h2 -->
<h3>The minimal viable head</h3>
<!-- h4 — subsections within an h3, use sparingly -->
<h4>Character encoding</h4>
<!-- NEVER skip levels: h1 → h3 (skipped h2) is wrong -->
<!-- NEVER use headings for visual size only — use CSS -->
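The two structural rules (exactly one h1, no skipped levels) are easy to verify automatically. Here is a minimal sketch using Python's `html.parser`; `HeadingOutline` and `heading_problems` are hypothetical names, and the check only looks at heading order, not at whether the heading text is meaningful.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Record the level of every h1-h6 element in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.levels: list[int] = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_problems(html: str) -> list[str]:
    """Return a list of hierarchy violations; empty means the outline is sound."""
    outline = HeadingOutline()
    outline.feed(html)
    problems = []
    h1_count = outline.levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one h1, found {h1_count}")
    previous = 0
    for level in outline.levels:
        # Going deeper by more than one level skips a heading; going back up is fine.
        if level > previous + 1:
            start = f"h{previous}" if previous else "start of page"
            problems.append(f"skipped level: {start} -> h{level}")
        previous = level
    return problems
```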
Semantic HTML elements
| Element | Semantic meaning | SEO signal |
|---|---|---|
| `<article>` | Self-contained piece of content | Strong — marks primary content |
| `<section>` | Thematic grouping with a heading | Good — organises topics |
| `<nav>` | Navigation links | Site structure signal |
| `<main>` | Page's primary unique content | Identifies main content |
| `<header>` | Introductory / branding content | Structural context |
| `<footer>` | Supplementary / legal content | Lower content weight |
| `<aside>` | Tangentially related content | Lower content weight |
| `<div>` | No meaning | Zero SEO signal |
Image Optimisation
Images are the leading cause of slow page loads and poor Core Web Vitals scores. They're also an independent channel for traffic via Google Image Search. Getting image SEO right means better rankings AND a faster site.
<!-- BAD — the original version -->
<img src="big-image.jpg">
<!-- No alt text: invisible to screen readers, no keyword signal -->
<!-- No dimensions: causes layout shift (hurts CLS Core Web Vital) -->
<!-- No loading attr: off-screen images are fetched immediately and compete with critical resources -->
<!-- GOOD — fully optimised -->
<figure>
<!-- Comments are not allowed inside a tag, so the notes on each attribute come first -->
<!-- alt: describe the image for screen readers AND image search -->
<!-- Do NOT: alt="image of office" or alt="keyword keyword office" -->
<!-- width/height: explicit dimensions prevent Cumulative Layout Shift (CLS); -->
<!-- the browser reserves space before the image loads, so the layout does not jump -->
<!-- loading="lazy": defer off-screen images until the user scrolls near them -->
<!-- Use loading="eager" for the hero/above-fold image only -->
<!-- decoding="async": decode the image off the main thread so it doesn't block rendering -->
<img
src="optimized-image.jpg"
alt="Modern office workspace with natural light and standing desks"
width="800"
height="600"
loading="lazy"
decoding="async"
>
<figcaption>Our office workspace — built for deep work.</figcaption>
</figure>
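The difference between the two examples can be detected automatically across a whole page. This sketch scans for `img` tags missing `alt`, `width`, or `height`; `ImageAudit` and `audit_images` are hypothetical names, and an empty `alt=""` is accepted because it is the correct markup for purely decorative images.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Collect (src, missing attribute names) for every incomplete <img>."""

    REQUIRED = ("alt", "width", "height")

    def __init__(self) -> None:
        super().__init__()
        self.findings: list[tuple[str, list[str]]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        missing = [name for name in self.REQUIRED if name not in attributes]
        if missing:
            self.findings.append((attributes.get("src") or "", missing))


def audit_images(html: str) -> list[tuple[str, list[str]]]:
    audit = ImageAudit()
    audit.feed(html)
    return audit.findings
```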
Performance & Core Web Vitals
Since 2021, Google has used Core Web Vitals (CWV) as a confirmed ranking signal, introduced with the "Page Experience" update. These are real-world UX measurements collected from Chrome users. They reflect how your page actually feels to use, not just how fast it theoretically loads.
LCP
Largest Contentful Paint
How long until the largest visible element (hero image, headline) renders. Measures perceived load speed.
Good: < 2.5s
CLS
Cumulative Layout Shift
How much page elements unexpectedly jump around during load. Caused by images without dimensions, late-loading fonts.
Good: < 0.1
INP
Interaction to Next Paint
How quickly the page responds to user interaction (click, tap, key press). Replaced FID in March 2024.
Good: < 200ms
Performance techniques used on this page
<!-- 1. dns-prefetch: resolve domain name early (~20ms saved) -->
<link rel="dns-prefetch" href="//fonts.googleapis.com">
<!-- 2. preconnect: DNS + TCP + TLS early (~150ms saved) -->
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<!-- 3. font-display:swap prevents invisible text during load -->
<link href="https://fonts.googleapis.com/css2?family=...&display=swap" ...>
<!-- 4. Non-render-blocking CSS + FOUC prevention -->
<!-- Step A: hide body instantly via inline style -->
<style>body { opacity: 0; } body.css-ready { opacity: 1; transition: opacity 0.2s ease; }</style>
<noscript><style>body { opacity: 1; }</style></noscript>
<!-- Step B: load CSS async, then reveal body once ready -->
<link rel="preload" href="style.css" as="style"
onload="this.onload=null;this.rel='stylesheet';document.body.classList.add('css-ready')">
<noscript><link rel="stylesheet" href="style.css"></noscript>
<!-- 5. Explicit image dimensions prevent CLS -->
<img width="800" height="600" loading="lazy" decoding="async" ...>
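To see whether techniques like these are paying off, measured values can be compared against the "Good" thresholds given earlier for LCP, CLS, and INP. The sketch below hard-codes those three thresholds; `GOOD_THRESHOLDS` and `failing_metrics` are hypothetical names, and the measurements are assumed to come from a tool such as PageSpeed Insights.

```python
# "Good" limits as stated in this guide: LCP in seconds, CLS unitless, INP in milliseconds.
GOOD_THRESHOLDS = {"LCP": 2.5, "CLS": 0.1, "INP": 200}


def failing_metrics(measurements: dict[str, float]) -> list[str]:
    """Return the names of metrics that are not below their 'Good' threshold."""
    return sorted(
        name
        for name, value in measurements.items()
        if value >= GOOD_THRESHOLDS[name]
    )
```

For example, `failing_metrics({"LCP": 2.1, "CLS": 0.24, "INP": 180})` flags only CLS, which points towards missing image dimensions or late-loading fonts rather than server speed.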
Complete SEO Checklist
Use this before every page goes live.
Head tags
- ✓ charset="UTF-8" is the first tag in <head>
- ✓ Viewport meta tag with width=device-width
- ✓ Unique <title> 50–60 chars, primary keyword first
- ✓ Unique meta description 150–160 chars with call to action
- ✓ Canonical link tag with absolute production URL
- ✓ meta robots content="index, follow" (or noindex where needed)
- ✓ Full Open Graph tags: type, title, description, image, url, site_name
- ✓ Twitter Card tags: card, title, description, image — twitter:site is optional
- ✓ JSON-LD structured data matching the page content type
- ✓ Favicon (SVG preferred, data URI avoids extra request)
Content & markup
- ✓ Exactly one <h1> containing the primary keyword
- ✓ Logical heading hierarchy h1 → h2 → h3 (no skipped levels)
- ✓ Semantic landmark elements: <header>, <main>, <nav>, <footer>
- ✓ lang attribute on <html> element
- ✓ All images have descriptive alt text
- ✓ All images have explicit width and height attributes
- ✓ loading="lazy" on below-fold images
- ✓ loading="eager" on above-fold/hero image
Technical files
- ✓ robots.txt exists at domain root with production URL
- ✓ robots.txt Sitemap directive points to production sitemap URL
- ✓ sitemap.xml exists with all indexable pages
- ✓ sitemap.xml submitted to Google Search Console
- ✓ All sitemap URLs return HTTP 200
- ✓ No noindex pages are included in the sitemap
Performance
- ✓ LCP < 2.5s (test with PageSpeed Insights)
- ✓ CLS < 0.1 (all images and embeds have dimensions)
- ✓ INP < 200ms (minimal JS blocking main thread)
- ✓ CSS loaded non-render-blocking (preload trick) + FOUC prevented (opacity:0 on body until css-ready)
- ✓ Google Fonts use display=swap
- ✓ preconnect hints for critical external origins