Skip to content

Course Description: Machine Learning with Graphs

Course Title

Graph Neural Networks and Machine Learning with Graphs

Subtitle

A Comprehensive Intelligent Textbook — From Graph Theory to Foundation Models

Inspiration & Standard

This textbook is built from the materials of Stanford CS224W (Jure Leskovec) but is designed to exceed that course in depth, accessibility, completeness, and interactivity. Where the course relies on lectures to fill in the gaps between terse slides, this textbook is fully self-contained. Where the course must choose between classical and modern topics due to time constraints, this textbook covers both.


Overview

Complex data can be represented as a graph of relationships between objects. Such networks are a fundamental tool for modeling social, technological, and biological systems. This textbook provides a rigorous, intuition-first, and deeply interactive treatment of machine learning on graphs — from classical graph algorithms and shallow node embeddings, through modern Graph Neural Networks (GNNs), to frontier topics like Graph Transformers, Knowledge Graph Foundation Models, and LLM+GNN integration.

Topics include: - Graph theory fundamentals and real-world network properties - Classical graph algorithms (PageRank, label propagation, community detection) - Shallow node embeddings (DeepWalk, node2vec, matrix factorization) - Graph Neural Networks: GCN, GraphSAGE, GAT, GIN, and beyond - GNN theory: expressiveness, WL test, over-smoothing, over-squashing - Graph Transformers (Graphormer, GPS, SAN) - Heterogeneous graphs and relational data - Knowledge graph embeddings and multi-hop reasoning - GNNs for recommender systems, drug discovery, and social networks - Scalable GNN training on billion-node graphs - Deep generative models for graphs - LLMs + GNNs, agents on graphs, graph foundation models - Frequent subgraph mining and motif analysis


Target Audience

This textbook is designed for: - Graduate students in CS, computational biology, data science, or any field with relational data - Advanced undergraduates who have completed an ML course and want to go deeper on graphs - Practitioners building recommendation engines, drug discovery pipelines, fraud detection systems, or knowledge bases - Researchers entering graph ML who want a unified, up-to-date reference


Prerequisites

Required

  • Programming: Python proficiency; NumPy/Pandas familiarity
  • Linear algebra: Matrix multiplication, eigenvalues/eigenvectors, SVD
  • Probability & statistics: Probability distributions, expectations, Bayes' theorem
  • ML fundamentals: Supervised/unsupervised learning, gradient descent, backpropagation

Helpful but taught in Chapter 0

  • PyTorch and automatic differentiation
  • Graph theory basics (we introduce from scratch)
  • PyTorch Geometric (PyG)

Chapter Structure (26 Chapters — Comprehensive Coverage)

The textbook is organized into 6 parts. It covers all topics from both the 2021 and 2025 CS224W offerings, plus dedicated prerequisite and application chapters not in the course.

Part 0: Prerequisites (Chapters 0–1)

Chapter Title CS224W Coverage
0 Math & Programming Prerequisites ❌ Not covered in CS224W
1 Introduction to Graphs and Networks ✅ Lecture 1

Chapter 0 covers: Linear algebra review (matrix operations, eigendecomposition, SVD), probability review, PyTorch basics, PyTorch Geometric installation and first graph, NetworkX tutorial. This is the chapter that makes the textbook self-contained.

Part 1: Classical Graph Analysis (Chapters 2–5)

Chapter Title CS224W Coverage
2 Graph Properties and Traditional ML Features ✅ 2021 Lecture 2 / ❌ dropped from 2025
3 Node Embeddings: DeepWalk and node2vec ✅ Both years
4 Link Analysis and PageRank ✅ 2021 Lecture 4 / ❌ dropped from 2025
5 Label Propagation and Semi-Supervised Learning ✅ 2021 Lecture 5 / ❌ dropped from 2025

Part 2: Graph Neural Networks (Chapters 6–11)

Chapter Title CS224W Coverage
6 GNN Foundations: Message Passing and GCN ✅ Both years
7 GNN Design Space: GraphSAGE and GAT ✅ Both years
8 GNN Training, Augmentation, and Practical Tips ✅ 2025 Lecture 5
9 Theory of GNNs: Expressiveness and the WL Test ✅ Both years
10 Designing Powerful Encoders: GIN and Beyond ✅ 2025 Lecture 7
11 Graph Transformers ✅ 2025 Lecture 8

Part 3: Knowledge Graphs and Reasoning (Chapters 12–14)

Chapter Title CS224W Coverage
12 Knowledge Graph Embeddings ✅ Both years
13 Reasoning over Knowledge Graphs ✅ 2021 Lecture 11
14 Knowledge Graph Foundation Models ✅ 2025 Lecture 15

Part 4: Graphs in the Wild (Chapters 15–19)

Chapter Title CS224W Coverage
15 Heterogeneous Graphs ✅ 2025 Lecture 9
16 GNNs for Recommender Systems ✅ Both years
17 Relational Deep Learning ✅ 2025 Lectures 12–13
18 Community Structure in Networks ✅ 2021 Lecture 14 / ❌ dropped from 2025
19 Frequent Subgraph Mining ✅ 2021 Lecture 12 / ❌ dropped from 2025

Part 5: Scalability and Generation (Chapters 20–22)

Chapter Title CS224W Coverage
20 Scaling GNNs to Billion-Node Graphs ✅ 2021 Lecture 17
21 Deep Generative Models for Graphs ✅ Both years
22 Temporal and Dynamic Graphs ❌ Not a dedicated lecture in CS224W

Part 6: Frontiers (Chapters 23–26)

Chapter Title CS224W Coverage
23 Advanced GNN Topics: In-Context Learning and Uncertainty ✅ 2025 Lecture 14
24 LLMs + GNNs: Text-Attributed Graphs and Joint Training ✅ 2025 Lecture 16
25 Agents + Graphs ✅ 2025 Lecture 17
26 Conclusion: The GNN Design Space and Open Problems ✅ Both years

Total: 27 chapters (26 + prerequisites) vs. 19 lectures in CS224W. New material not in CS224W at all: Chapters 0, 22 (temporal graphs), and expanded applications throughout.


Learning Objectives

Using Bloom's Taxonomy (2001 revision), students who complete this textbook will be able to:

Remember (Level 1)

  • Recall the definitions of: graph, node, edge, adjacency matrix, degree, path, connected component, subgraph, clique, bipartite graph, heterogeneous graph
  • List the four main graph ML tasks: node classification, link prediction, graph classification, graph generation
  • Recall the names and key equations of: GCN, GraphSAGE, GAT, GIN, GraphRNN, TransE, RotatE, LightGCN, Graphormer
  • State the Weisfeiler-Lehman graph isomorphism test procedure

Understand (Level 2)

  • Explain why message passing generalizes convolution from grids to irregular graphs
  • Describe the difference between transductive and inductive learning with concrete examples
  • Summarize why sum aggregation is strictly more expressive than mean or max for graph isomorphism
  • Explain the power-law degree distribution and why it appears in real-world networks
  • Describe how PageRank solves the spider-trap and dead-end problems via teleportation
  • Explain the geometric intuition behind TransE, ComplEx, and RotatE for different relation patterns
  • Describe what over-smoothing is, why it occurs, and three ways to mitigate it

Apply (Level 3)

  • Implement GCN, GraphSAGE, GAT, and GIN from scratch in PyTorch and using PyTorch Geometric
  • Apply node2vec for unsupervised node embedding with the appropriate p/q settings
  • Use the Open Graph Benchmark (OGB) pipeline to train and evaluate a GNN on a standard benchmark
  • Apply knowledge graph embedding methods (TransE, RotatE) to link prediction tasks
  • Build a LightGCN recommender system on a user-item interaction graph
  • Train a scalable GNN using neighbor sampling or cluster-based mini-batching
  • Generate molecular graphs conditioned on target properties using GCPN

Analyze (Level 4)

  • Analyze the expressive power of a GNN architecture and identify non-isomorphic graphs it fails to distinguish
  • Compare GCN, GraphSAGE, GAT, and GIN architecturally and empirically across node and graph classification tasks
  • Examine why certain relation patterns (symmetry, antisymmetry, inversion, composition) require different KG embedding geometry
  • Decompose the trade-offs between accuracy and scalability for different GNN sampling strategies
  • Evaluate the impact of graph structure vs. node features for a given dataset

Evaluate (Level 5)

  • Assess when graph structure adds value over feature-only baselines
  • Critically evaluate a GNN paper's experimental setup for potential confounders (data leakage, unfair baselines)
  • Judge which pooling strategy is appropriate for a graph classification task given graph size and structure
  • Compare GNN-based approaches to classical graph algorithms for a specific problem
  • Evaluate the appropriateness of different KG embedding methods given a dataset's relation patterns

Create (Level 6)

  • Design a novel end-to-end GNN system for a domain-specific task (drug interaction, fraud detection, traffic)
  • Construct a knowledge graph from raw data and develop a reasoning pipeline
  • Build interactive MicroSim visualizations of key GNN concepts using p5.js
  • Develop and tune a GNN that achieves competitive performance on an OGB benchmark
  • Write a well-structured literature review connecting classical graph algorithms to modern GNN approaches

Pedagogical Design

Every chapter follows this structure: 1. Motivation — real-world problem that this chapter's methods solve 2. Intuition first — visual explanation before any math 3. Mathematical formalism — full derivations with LaTeX 4. Algorithm / pseudocode — step-by-step procedure 5. PyTorch implementation — complete, runnable code 6. MicroSim — interactive p5.js visualization of the key concept 7. Empirical results — benchmark performance with up-to-date numbers 8. Common pitfalls — what goes wrong and how to fix it 9. Research extensions — open problems and recent work 10. Chapter summary — key equations and concepts 11. Exercises — Bloom's taxonomy-aligned questions at all 6 levels 12. Further reading — 5–8 key papers with annotated descriptions


What Makes This Textbook Better Than CS224W

Dimension CS224W Course This Textbook
Topic coverage 19 lectures (2025 drops 5 classical topics) 26 chapters (covers all 2021 + 2025 topics + more)
Prerequisite material Assumed, not taught Full Chapter 0: math + PyTorch + PyG
Explanations Slide bullets + live lecture Full prose derivations, worked examples
Intuition Conveyed verbally in lecture Written out before every formalism
Interactivity None (slides + colabs) MicroSims for every key concept
Code 6 Colabs, partial Complete runnable examples in every chapter
Exercises 3 HWs + 5 Colabs 6-level Bloom's exercises per chapter
Temporal graphs Brief mention Dedicated Chapter 22
Benchmark results 2021-era numbers Up-to-date 2024–2025 results
Accessibility Requires lecture audio Fully self-contained

Style and Technology

  • Format: MkDocs with Material theme
  • Math: LaTeX via MathJax; all major results stated as numbered theorems/propositions
  • Code: Python with PyTorch Geometric; every snippet is self-contained and runnable
  • Interactivity: p5.js MicroSims for: message passing, random walk, WL color refinement, PageRank power iteration, KG geometry, graph generation
  • Learning graph: 250+ concepts with full dependency DAG
  • Glossary: ISO 11179-compliant definitions for all key terms
  • Reading level: Graduate-level rigor; Flesch-Kincaid Grade ~15

Benchmark References

Primary evaluation datasets: - OGB (ogbn-arxiv, ogbn-products, ogbg-molhiv, ogbg-molpcba, ogbl-collab) — standard node/graph/link benchmarks - Cora, CiteSeer, PubMed — classic citation networks - TUDatasets — graph classification benchmarks - FB15k-237, WN18RR — knowledge graph completion - MovieLens-1M — recommendation - ZINC — molecular property prediction - RelBench — relational databases


Explicitly Out of Scope

The following topics are intentionally not covered in this textbook:

  • Reinforcement learning on graphs — graph-structured state/action spaces, GNN-based policy networks, and graph-based reward shaping are excluded; learners should consult dedicated RL textbooks for this intersection.
  • Graph signal processing (beyond GCN spectral derivation) — the full GSP literature (filterbanks, graph wavelets, spectral clustering via Laplacian eigenvectors) is covered only insofar as it motivates the GCN derivation in Chapter 6; deeper GSP theory is out of scope.
  • Streaming and dynamic graph algorithms — temporal graphs (Chapter 22) covers event-based and snapshot models for GNNs, but classical streaming algorithms (e.g., triangle counting under memory constraints) and fully online graph processing are not covered.
  • Hardware-level GNN optimization — GPU kernel design, sparse tensor formats for graphs, and chip-level inference optimization are not addressed; the focus stays at the software/model level.
  • Survey of GNN libraries beyond PyTorch Geometric — DGL, Spektral, GraphNets, and JAX-based frameworks are not covered; all code examples use PyTorch Geometric exclusively.
  • Continuous-time differential equations on graphs — neural ODEs on graphs and diffusion-PDE interpretations of GNNs are touched on briefly in the theory chapter but not developed in depth.
  • Quantum computing on graphs — quantum graph algorithms and quantum GNNs are entirely out of scope.