Welcome to Chapter 10! So far, our Rust-based Static Site Generator (SSG) can parse content, apply templates, generate routes, and output static HTML. However, with every change to a source file, our SSG currently rebuilds the entire site. While fast for small projects, this full rebuild approach quickly becomes a bottleneck for larger sites, leading to frustratingly long development cycles.
In this chapter, we will tackle this performance issue head-on by implementing two crucial features: incremental builds and file system watching. Incremental builds allow our SSG to intelligently detect changes and only re-process the necessary files, drastically reducing build times. Coupled with a file system watcher, this will enable an incredibly smooth developer experience: save a file, and the site automatically rebuilds and refreshes in milliseconds, showing your changes instantly.
We will design a caching mechanism to track file states, use cryptographic hashing to reliably detect modifications, and integrate a robust file system watcher. By the end of this chapter, you will have a highly responsive SSG development server that makes content creation a joy, and a deep understanding of how modern build tools achieve their speed and efficiency.
Planning & Design
The core idea behind incremental builds is to avoid redundant work. This requires a way to:
- Track File State: Store metadata about each source file (e.g., its hash or last modified timestamp) from the previous successful build.
- Detect Changes: Compare the current state of files with the tracked state to identify what has been added, modified, or deleted.
- Invalidate Cache: Based on detected changes, determine which generated output files (HTML, assets) are now stale and need to be rebuilt.
- Rebuild Selectively: Only run the build pipeline for the affected source files and their dependent outputs.
- Update Cache: After a successful incremental build, update the tracked state for the modified files.
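The change-detection step at the heart of this list can be sketched as a pure diff between two path-to-hash maps. This is a simplified model — `Change` and `diff` below are illustrative stand-ins, not the cache types we build later in the chapter:

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Change {
    Added,
    Modified,
    Deleted,
}

/// Diff two `path -> content hash` maps into a set of changes.
/// Paths only in `new` are Added, differing hashes are Modified,
/// paths only in `old` are Deleted; identical entries are omitted.
pub fn diff(
    old: &HashMap<String, String>,
    new: &HashMap<String, String>,
) -> HashMap<String, Change> {
    let mut changes = HashMap::new();
    for (path, hash) in new {
        match old.get(path) {
            None => {
                changes.insert(path.clone(), Change::Added);
            }
            Some(old_hash) if old_hash != hash => {
                changes.insert(path.clone(), Change::Modified);
            }
            _ => {} // unchanged: not reported
        }
    }
    for path in old.keys() {
        if !new.contains_key(path) {
            changes.insert(path.clone(), Change::Deleted);
        }
    }
    changes
}
```

Everything else in the pipeline hangs off a diff like this: each `Change` maps to an action (render, copy, or delete) on the corresponding output.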
For file system watching, we need a separate thread or asynchronous task that monitors specified directories for changes and triggers the incremental build process.
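In std-only terms, that pattern is a background worker pushing signals into a channel that the build loop drains. A minimal sketch (the real implementation later in this chapter uses the `notify` crate and Tokio tasks; `spawn_fake_watcher` is a made-up stand-in that emits a fixed number of ticks):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Spawn a background "watcher" thread that sends one signal per tick,
/// then hangs up by dropping its sender.
pub fn spawn_fake_watcher(ticks: u32) -> mpsc::Receiver<u32> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for i in 0..ticks {
            thread::sleep(Duration::from_millis(5));
            if tx.send(i).is_err() {
                break; // receiver dropped: stop watching
            }
        }
    });
    rx
}
```

A build loop would then block on `rx.recv()` and run an incremental build for each signal received.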
Component Architecture for Incremental Builds and Watching
Let’s walk through the enhanced build process:
Explanation of the Flow:
- Build Trigger: Can be manually invoked (e.g., `ssg build`) or automatically by the file system watcher.
- Cache Check: The builder first checks if a build cache exists and is valid.
- Initial Build: If no cache or invalid, a full scan and hash computation occur.
- Incremental Path: If a cache exists, it’s loaded, and current file hashes are compared against it.
- Change Detection: `Added`, `Modified`, and `Deleted` files are identified.
- Affected Outputs: Based on changes, the system determines which specific pages or assets need to be rebuilt. For instance, a change in a content file affects only that page; a change in a template might affect many pages.
- Partial Build: Only the necessary parts of the build pipeline are executed.
- Update Cache: The cache is updated with the new file states.
- File System Watcher: Runs in the background, listening for changes. It debounces multiple rapid events into a single build signal to prevent excessive rebuilds.
- Serve Files: The development server serves the generated output, refreshing as changes occur.
File Structure Additions/Modifications
We’ll primarily modify our src/build.rs and introduce a new src/watcher.rs module. We’ll also define a new struct for our build cache.
.
├── Cargo.toml
├── src/
│   ├── main.rs
│   ├── config.rs
│   ├── content.rs
│   ├── parser.rs
│   ├── renderer.rs
│   ├── template.rs
│   ├── router.rs
│   ├── server.rs   // Our dev server
│   ├── build.rs    // Will contain core build logic and incremental logic
│   ├── watcher.rs  // NEW: Handles file system watching
│   └── cache.rs    // NEW: Defines build cache structure and logic
Step-by-Step Implementation
a) Setup/Configuration
First, let’s add the necessary dependencies to our Cargo.toml.
Cargo.toml
[package]
name = "my_ssg"
version = "0.1.0"
edition = "2021"
[dependencies]
# ... existing dependencies ...
serde = { version = "1.0", features = ["derive"] }
serde_yaml = "0.9"
toml = "0.8"
pulldown-cmark = "0.9"
tera = "1.19"
anyhow = "1.0"
tracing = "0.1"
tracing-subscriber = "0.3"
walkdir = "2.3"
tokio = { version = "1.36", features = ["full"] } # Ensure "fs" feature is enabled for async file ops
lazy_static = "1.4"
regex = "1.10"
# New dependencies for incremental builds and watching
notify = "6.1" # For file system watching
sha2 = "0.10" # For cryptographic hashing of file contents
hex = "0.4" # To convert hash bytes to hex string
serde_json = "1.0" # To (de)serialize the build cache in cache.rs
clap = { version = "4.4", features = ["derive"] } # CLI parsing in main.rs (if not already present)
Explanation:
- `notify`: A cross-platform file system notification library.
- `sha2`: Provides SHA-2 hashing algorithms, essential for reliably detecting file changes.
- `hex`: Converts byte arrays (from `sha2`) into human-readable hexadecimal strings.
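To see why content hashing is the reliable signal: re-saving a file with identical bytes bumps its mtime but leaves its digest untouched. Here is the idea in miniature, using std's `DefaultHasher` purely as a stand-in — it is not cryptographic and not stable across Rust releases, which is exactly why the real code below uses SHA-256 from `sha2`:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Content fingerprint: identical bytes always yield the identical digest,
/// so a file rewritten with unchanged content is reported as unchanged.
/// (Illustrative only: DefaultHasher is not a cryptographic hash.)
pub fn fingerprint(content: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}
```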
b) Core Implementation
We’ll start by creating the cache.rs module to define our build cache structure and utility functions for hashing.
1. src/cache.rs - Build Cache Structure and Hashing
This module will contain the BuildCache struct, which stores information about processed files, and functions to compute file hashes.
src/cache.rs
use std::{
    collections::HashMap,
    fs,
    path::{Path, PathBuf},
    time::SystemTime,
};
use serde::{Deserialize, Serialize};
use sha2::{Digest, Sha256};
use anyhow::{Result, Context};
use tracing::{info, debug, error};

/// Represents the metadata for a single source file in the build cache.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FileMetadata {
    pub path: PathBuf,
    pub hash: String, // SHA256 hash of the file content
    pub last_modified: SystemTime,
    // Add other relevant metadata if needed, e.g., dependencies, output paths
}

/// The main build cache structure.
#[derive(Debug, Default, Serialize, Deserialize)]
pub struct BuildCache {
    pub files: HashMap<PathBuf, FileMetadata>,
    pub last_full_build_time: Option<SystemTime>,
    // Potentially add more sophisticated dependency tracking here in the future,
    // e.g., template_dependencies: HashMap<PathBuf, Vec<PathBuf>> // template -> content files using it
}

impl BuildCache {
    /// Loads the build cache from a JSON file.
    pub fn load(cache_path: &Path) -> Result<Self> {
        if !cache_path.exists() {
            debug!("Build cache file not found at {:?}. Starting with empty cache.", cache_path);
            return Ok(Self::default());
        }
        let content = fs::read_to_string(cache_path)
            .context(format!("Failed to read build cache from {:?}", cache_path))?;
        let cache: Self = serde_json::from_str(&content)
            .context(format!("Failed to deserialize build cache from {:?}", cache_path))?;
        info!("Build cache loaded from {:?}", cache_path);
        Ok(cache)
    }

    /// Saves the build cache to a JSON file.
    pub fn save(&self, cache_path: &Path) -> Result<()> {
        let content = serde_json::to_string_pretty(self)
            .context("Failed to serialize build cache")?;
        fs::write(cache_path, content)
            .context(format!("Failed to write build cache to {:?}", cache_path))?;
        info!("Build cache saved to {:?}", cache_path);
        Ok(())
    }

    /// Computes the SHA256 hash of a file's content.
    pub fn compute_file_hash(path: &Path) -> Result<String> {
        let mut file = fs::File::open(path)
            .context(format!("Failed to open file for hashing: {:?}", path))?;
        let mut hasher = Sha256::new();
        std::io::copy(&mut file, &mut hasher)
            .context(format!("Failed to read file for hashing: {:?}", path))?;
        Ok(hex::encode(hasher.finalize()))
    }

    /// Creates FileMetadata for a given path.
    pub fn create_file_metadata(path: &Path) -> Result<FileMetadata> {
        let hash = Self::compute_file_hash(path)?;
        let metadata = fs::metadata(path)
            .context(format!("Failed to get metadata for file: {:?}", path))?;
        let last_modified = metadata.modified()
            .context(format!("Failed to get last modified time for file: {:?}", path))?;
        Ok(FileMetadata {
            path: path.to_path_buf(),
            hash,
            last_modified,
        })
    }

    /// Updates the metadata for a single file in the cache.
    pub fn update_file(&mut self, path: &Path) -> Result<()> {
        let metadata = Self::create_file_metadata(path)?;
        self.files.insert(path.to_path_buf(), metadata);
        Ok(())
    }

    /// Removes a file from the cache.
    pub fn remove_file(&mut self, path: &Path) {
        self.files.remove(path);
    }
}
Explanation of src/cache.rs:
- `FileMetadata`: Stores the `path`, `hash` (SHA-256), and `last_modified` timestamp for each file. The hash is crucial for detecting content changes, and `last_modified` can serve as a quick check.
- `BuildCache`: Contains a `HashMap` where keys are file paths and values are `FileMetadata`. It also tracks `last_full_build_time` for potential future use (e.g., clearing the cache if it is too old).
- `load` and `save`: Handle serialization/deserialization of the cache to a `.ssg_cache.json` file using `serde_json`.
- `compute_file_hash`: Reads a file and calculates its SHA-256 hash. This is the most reliable way to know if a file’s content has changed.
- `create_file_metadata`: A helper to create a `FileMetadata` instance.
- `update_file` and `remove_file`: Methods to manage individual file entries in the cache.
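For intuition, a saved `.ssg_cache.json` might look roughly like this (illustrative values; `SystemTime` serializes via serde's default representation as seconds/nanos since the epoch):

```json
{
  "files": {
    "content/hello.md": {
      "path": "content/hello.md",
      "hash": "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824",
      "last_modified": {
        "secs_since_epoch": 1700000000,
        "nanos_since_epoch": 0
      }
    }
  },
  "last_full_build_time": null
}
```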
2. src/build.rs - Integrating Incremental Logic
Now, let’s modify our build.rs to leverage this cache. We’ll introduce a BuildMode (full or incremental) and a BuildContext that holds the cache.
First, ensure src/build.rs has the necessary imports:
src/build.rs (add or modify imports)
use std::{
    fs,
    path::{Path, PathBuf},
    collections::HashMap,
};
use anyhow::{Result, Context};
use tracing::{info, debug, error, warn};
use walkdir::WalkDir;
use crate::{
    config::Config,
    content::{Content, FrontMatter},
    parser::parse_markdown_to_html,
    renderer::render_page,
    router::{Route, Router},
    template::TemplateEngine,
    // New imports
    cache::{BuildCache, FileMetadata},
};

/// Defines the build mode.
// Debug is needed so we can log the mode with `{:?}`; Copy keeps it cheap to pass around.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuildMode {
    Full,
    Incremental,
}

/// Context for the build process, including the cache.
pub struct BuildContext {
    pub config: Config,
    pub template_engine: TemplateEngine,
    pub router: Router,
    pub build_cache: BuildCache,
    pub output_dir: PathBuf,
}

impl BuildContext {
    pub fn new(config: Config, output_dir: PathBuf) -> Result<Self> {
        let template_engine = TemplateEngine::new(&config.template_dir)?;
        let router = Router::new(); // Initialize an empty router; it's populated during the build.
        let build_cache = BuildCache::load(&output_dir.join(".ssg_cache.json"))?;
        Ok(Self {
            config,
            template_engine,
            router,
            build_cache,
            output_dir,
        })
    }
}
// ... existing functions like `setup_output_directory`, `copy_static_assets` ...
Now, let’s refactor the main build_site function to incorporate incremental logic. This will be a significant change.
src/build.rs (modify build_site and add helper functions)
// ... (previous code for BuildContext, setup_output_directory, copy_static_assets) ...
/// Enum to classify file changes.
#[derive(Debug, PartialEq)]
pub enum FileChange {
    Added,
    Modified,
    Deleted,
    Unchanged,
}

/// Scans the source directory and compares current file states with the cache.
/// Returns categorized lists of changed content and template files.
pub fn detect_file_changes(
    build_context: &mut BuildContext,
    source_dir: &Path,
    template_dir: &Path,
    static_dir: &Path,
) -> Result<(
    HashMap<PathBuf, FileChange>, // All changed files (content, template, static)
    Vec<PathBuf>,                 // Files whose change requires a full rebuild (e.g., templates)
)> {
    let mut changed_files: HashMap<PathBuf, FileChange> = HashMap::new();
    let mut files_to_rebuild_all: Vec<PathBuf> = Vec::new();

    // Track all current files encountered during the scan.
    let mut current_files_in_source: HashMap<PathBuf, FileMetadata> = HashMap::new();

    // 1. Scan the content, template, and static directories.
    for entry in WalkDir::new(source_dir)
        .into_iter()
        .filter_map(|e| e.ok())
    {
        let path = entry.path().to_path_buf();
        if path.is_file() {
            let relative_path = path.strip_prefix(source_dir).unwrap_or(&path).to_path_buf();
            // Skip the generated output directory and the cache file itself,
            // otherwise every build would appear to change the source tree.
            // (We assume the output directory is named "public" here.)
            if relative_path.starts_with("public")
                || relative_path.file_name().map_or(false, |name| name == ".ssg_cache.json")
            {
                continue;
            }
            let current_metadata = BuildCache::create_file_metadata(&path)?;
            current_files_in_source.insert(relative_path.clone(), current_metadata.clone());
            match build_context.build_cache.files.get(&relative_path) {
                Some(cached_metadata) => {
                    if cached_metadata.hash != current_metadata.hash {
                        debug!("File modified: {:?}", relative_path);
                        changed_files.insert(relative_path.clone(), FileChange::Modified);
                        // A template change might affect many pages, so mark it for a full rebuild.
                        if relative_path.starts_with(template_dir.strip_prefix(source_dir).unwrap_or(template_dir)) {
                            files_to_rebuild_all.push(relative_path.clone());
                        }
                    }
                    // Otherwise the file is unchanged; nothing to record.
                }
                None => {
                    debug!("File added: {:?}", relative_path);
                    changed_files.insert(relative_path.clone(), FileChange::Added);
                    if relative_path.starts_with(template_dir.strip_prefix(source_dir).unwrap_or(template_dir)) {
                        files_to_rebuild_all.push(relative_path.clone());
                    }
                }
            }
        }
    }

    // 2. Detect deleted files (present in the cache but not in the current scan).
    let mut deleted_files: Vec<PathBuf> = Vec::new();
    for cached_path in build_context.build_cache.files.keys() {
        if !current_files_in_source.contains_key(cached_path) {
            deleted_files.push(cached_path.clone());
        }
    }
    for path in deleted_files {
        debug!("File deleted: {:?}", path);
        changed_files.insert(path, FileChange::Deleted);
    }

    Ok((changed_files, files_to_rebuild_all))
}
/// The main function to build the static site.
/// Orchestrates the full or incremental build process.
pub async fn build_site(build_context: &mut BuildContext, mode: BuildMode) -> Result<()> {
    info!("Starting build in {:?} mode...", mode);
    // Clone the paths up front: `detect_file_changes` needs `&mut BuildContext`,
    // so we must not hold borrows into `build_context` across that call.
    let ssg_root = build_context.config.root_dir.clone();
    let content_dir = ssg_root.join(&build_context.config.content_dir);
    let static_dir = ssg_root.join(&build_context.config.static_dir);
    let template_dir = ssg_root.join(&build_context.config.template_dir);
    let output_dir = build_context.output_dir.clone();

    // Ensure the output directory exists and is clean for full builds.
    if let BuildMode::Full = mode {
        setup_output_directory(&output_dir)?;
    }

    // --- Step 1: Detect Changes (for incremental builds) ---
    let mut content_files_to_process: Vec<PathBuf> = Vec::new();
    let mut static_files_to_copy: Vec<PathBuf> = Vec::new();
    let mut template_files_to_update: Vec<PathBuf> = Vec::new();
    let mut pages_to_delete: Vec<PathBuf> = Vec::new(); // Original content paths of deleted pages
    let mut requires_full_rebuild = false;

    if let BuildMode::Incremental = mode {
        let (changed_files, files_triggering_full_rebuild) =
            detect_file_changes(build_context, &ssg_root, &template_dir, &static_dir)?;
        if !files_triggering_full_rebuild.is_empty() {
            warn!("Changes in {:?} detected. Triggering full rebuild.", files_triggering_full_rebuild);
            requires_full_rebuild = true;
        }
        if requires_full_rebuild {
            info!("Performing full rebuild due to critical changes.");
            // Clear cached file entries for a fresh start.
            build_context.build_cache.files.clear();
            setup_output_directory(&output_dir)?; // Re-clean output for the full rebuild
        } else {
            for (relative_path, change_type) in changed_files.iter() {
                let absolute_path = ssg_root.join(relative_path);
                if absolute_path.starts_with(&content_dir) {
                    match change_type {
                        FileChange::Added | FileChange::Modified => {
                            content_files_to_process.push(absolute_path.clone());
                            build_context.build_cache.update_file(&absolute_path)?;
                        }
                        FileChange::Deleted => {
                            // Mark for deletion from the output. For now we only remove the
                            // cache entry; a more advanced system would track output paths.
                            if build_context.build_cache.files.contains_key(relative_path) {
                                debug!("Content file deleted, removing from cache: {:?}", relative_path);
                                pages_to_delete.push(relative_path.clone());
                            }
                            build_context.build_cache.remove_file(relative_path);
                        }
                        FileChange::Unchanged => {} // Should not appear in the changed_files map
                    }
                } else if absolute_path.starts_with(&template_dir) {
                    // Template changes are handled by triggering a full rebuild (see
                    // `files_triggering_full_rebuild` above); here we just keep their
                    // cache entries up to date. A more advanced system would track
                    // template dependencies instead.
                    match change_type {
                        FileChange::Added | FileChange::Modified => {
                            template_files_to_update.push(absolute_path.clone());
                            build_context.build_cache.update_file(&absolute_path)?;
                        }
                        FileChange::Deleted => {
                            debug!("Template file deleted, removing from cache: {:?}", relative_path);
                            build_context.build_cache.remove_file(relative_path);
                            // A dependency-aware system would also trigger a full rebuild here.
                        }
                        FileChange::Unchanged => {}
                    }
                } else if absolute_path.starts_with(&static_dir) {
                    match change_type {
                        FileChange::Added | FileChange::Modified => {
                            static_files_to_copy.push(absolute_path.clone());
                            build_context.build_cache.update_file(&absolute_path)?;
                        }
                        FileChange::Deleted => {
                            debug!("Static file deleted, removing from cache: {:?}", relative_path);
                            build_context.build_cache.remove_file(relative_path);
                            // TODO: Delete the corresponding static file from output_dir.
                        }
                        FileChange::Unchanged => {}
                    }
                }
            }
        }
    }

    // --- Step 2: Process Content Files ---
    if let BuildMode::Full = mode {
        // For a full build, scan all content files.
        for entry in WalkDir::new(&content_dir)
            .into_iter()
            .filter_map(|e| e.ok())
            .filter(|e| e.path().is_file() && e.path().extension().map_or(false, |ext| ext == "md"))
        {
            content_files_to_process.push(entry.path().to_path_buf());
        }
    } else if content_files_to_process.is_empty() && static_files_to_copy.is_empty() && !requires_full_rebuild {
        info!("No relevant content or static file changes detected for incremental build.");
        // Nothing changed and no full rebuild required: save the cache and exit.
        build_context.build_cache.save(&output_dir.join(".ssg_cache.json"))?;
        return Ok(());
    }

    // For full builds we copy all static assets; for incremental builds,
    // only the static files detected as changed are copied.
    if let BuildMode::Full = mode {
        copy_static_assets(&ssg_root, &static_dir, &output_dir)?;
    } else if !static_files_to_copy.is_empty() {
        for static_file_path in static_files_to_copy {
            let relative_path = static_file_path.strip_prefix(&ssg_root)?;
            let dest_path = output_dir.join(relative_path);
            if let Some(parent) = dest_path.parent() {
                fs::create_dir_all(parent)?;
            }
            fs::copy(&static_file_path, &dest_path)
                .context(format!("Failed to copy static file from {:?} to {:?}", static_file_path, dest_path))?;
            debug!("Copied static file: {:?}", relative_path);
        }
    }

    // Delete pages marked for deletion (from the router and, eventually, the output).
    for deleted_content_path in pages_to_delete {
        // Placeholder: a robust system would track the output path associated with
        // each content path and delete that specific file. For now, we rely on
        // full rebuilds to clean up deleted pages.
        warn!("Content file {:?} deleted. Output file deletion not yet implemented for incremental builds. Full rebuild recommended for cleanup.", deleted_content_path);
        build_context.router.remove_route_by_source(&deleted_content_path);
    }

    // Process content files (all of them for a full build, only changed ones for incremental).
    let mut processed_contents: Vec<Content> = Vec::new();
    for content_file_path in content_files_to_process {
        debug!("Processing content file: {:?}", content_file_path);
        match Content::from_file(&content_file_path, &content_dir) {
            Ok(content) => {
                processed_contents.push(content);
                // Update the cache entry for this file.
                let relative_path = content_file_path.strip_prefix(&ssg_root)?;
                build_context.build_cache.update_file(&ssg_root.join(relative_path))?;
            }
            Err(e) => {
                error!("Failed to process content file {:?}: {:?}", content_file_path, e);
                // Continue processing the remaining files.
            }
        }
    }

    // --- Step 3: Register Routes and Render Pages ---
    // For incremental builds the router must stay up to date: we add/update routes
    // for `processed_contents`. A full rebuild clears and re-populates it entirely.
    if requires_full_rebuild || matches!(mode, BuildMode::Full) {
        build_context.router = Router::new(); // Reset the router
        // Re-scan all content files to rebuild the router correctly.
        let all_content_files: Vec<PathBuf> = WalkDir::new(&content_dir)
            .into_iter()
            .filter_map(|e| e.ok())
            .filter(|e| e.path().is_file() && e.path().extension().map_or(false, |ext| ext == "md"))
            .map(|e| e.path().to_path_buf())
            .collect();
        processed_contents.clear(); // Clear, then re-populate with all content
        for content_file_path in all_content_files {
            match Content::from_file(&content_file_path, &content_dir) {
                Ok(content) => processed_contents.push(content),
                Err(e) => error!("Failed to re-process content file {:?} for full rebuild: {:?}", content_file_path, e),
            }
        }
    }

    // Register routes for all processed content (either all files or only the changed ones).
    for content in &processed_contents {
        build_context.router.register_route(content, &build_context.config)?;
    }

    // Decide what to render. This is still a simplification: a truly incremental
    // system would only render pages whose content or *dependent templates* changed.
    // Here, a content change re-renders just that page; a template change already
    // triggered a full rebuild, so all pages get re-rendered.
    let pages_to_render: Vec<Route> = if requires_full_rebuild || matches!(mode, BuildMode::Full) {
        build_context.router.get_all_routes().cloned().collect()
    } else {
        processed_contents.iter()
            .filter_map(|c| build_context.router.get_route_by_source_path(&c.source_path))
            .cloned()
            .collect()
    };

    for route in pages_to_render {
        match render_page(
            &route,
            &build_context.router,
            &build_context.template_engine,
            &output_dir,
            &build_context.config,
        ) {
            Ok(_) => debug!("Rendered: {}", route.output_path.display()),
            Err(e) => error!("Failed to render route {}: {:?}", route.output_path.display(), e),
        }
    }

    // Finalize: save the updated build cache.
    build_context.build_cache.save(&output_dir.join(".ssg_cache.json"))?;
    info!("Build completed successfully in {:?} mode.", mode);
    Ok(())
}
Explanation of src/build.rs changes:
- `BuildMode` enum: Distinguishes between a `Full` build (like the first run) and an `Incremental` build.
- `BuildContext`: Now holds the `BuildCache` instance.
- `detect_file_changes`: This new function is the heart of incremental detection.
  - It walks the source directories (`content`, `templates`, `static`).
  - For each file, it computes its current hash and compares it with the hash stored in `build_context.build_cache`.
  - It categorizes files as `Added`, `Modified`, or `Deleted`.
  - Crucially, if a template file changes, it sets the `requires_full_rebuild` flag, because template changes often impact multiple pages and tracking those dependencies is complex (and often overkill for dev builds).
- `build_site` modifications:
  - Takes a `BuildMode` argument.
  - In `Incremental` mode, it calls `detect_file_changes`.
  - If `requires_full_rebuild` is true (e.g., a template change), it acts like a `Full` build.
  - Otherwise, it populates `content_files_to_process` and `static_files_to_copy` only with the detected changed files.
  - It updates the `build_cache` with new metadata for processed files and removes entries for deleted files.
  - Simplification: For now, if a content file is deleted, we print a warning and rely on a full rebuild to truly clean up the output. A more robust system would track the output path for each source file and delete it directly.
  - Template dependency: The current implementation triggers a full rebuild if any template file changes. This is a common and practical simplification for SSGs during development. A more advanced system would track which content files use which templates and only rebuild those specific content files.
  - Router update: For full builds, the router is reset. For incremental builds, it just adds/updates routes for the `processed_contents`.
  - Page rendering: For full builds, or if a full rebuild was triggered, all pages are re-rendered. Otherwise, only pages corresponding to `processed_contents` (added/modified content files) are rendered.
  - Finally, the updated `build_cache` is saved.
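The change-routing policy described above reduces to a small decision function over path prefixes. This is an illustrative model with hypothetical names and hard-coded directory names (`templates`, `content`, `static`), not code from the chapter's `build.rs`:

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
pub enum Action {
    FullRebuild, // template changed: every page may be affected
    RebuildPage, // content changed: re-render just that page
    CopyStatic,  // static asset changed: re-copy just that file
    Ignore,      // anything else
}

/// Map a changed source path (relative to the project root) to the
/// rebuild action it requires, mirroring the policy in `build_site`.
pub fn action_for(path: &Path) -> Action {
    if path.starts_with("templates") {
        Action::FullRebuild
    } else if path.starts_with("content") {
        Action::RebuildPage
    } else if path.starts_with("static") {
        Action::CopyStatic
    } else {
        Action::Ignore
    }
}
```

Keeping the policy in one place like this makes it easy to later replace the `FullRebuild` arm with real template-dependency tracking.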
3. src/watcher.rs - File System Watcher
This module will use the notify crate to monitor our source directories.
src/watcher.rs
use std::{
    path::{Path, PathBuf},
    time::Duration,
};
use notify::{
    Config, Event, EventKind, RecommendedWatcher, RecursiveMode, Watcher,
};
use tokio::sync::mpsc;
use tracing::{info, debug, error};

/// An event type that the watcher sends to the build loop.
#[derive(Debug)]
pub enum WatcherEvent {
    Change,
    Shutdown, // sent by the caller (e.g., main) when the watcher task fails
}

/// Initializes and runs a file system watcher.
/// Sends `WatcherEvent::Change` on relevant file modifications.
pub async fn start_watcher(
    ssg_root: PathBuf,
    content_dir: PathBuf,
    template_dir: PathBuf,
    static_dir: PathBuf,
    tx: mpsc::Sender<WatcherEvent>,
) -> anyhow::Result<()> {
    info!("Starting file system watcher on {:?}", ssg_root);
    let (event_tx, mut event_rx) = mpsc::channel(100); // Channel for raw notify events

    let mut watcher = RecommendedWatcher::new(
        move |res| match res {
            Ok(event) => {
                // `notify` invokes this callback on its own thread, so the
                // blocking variant of `send` is safe here.
                if let Err(e) = event_tx.blocking_send(event) {
                    error!("Failed to send watcher event: {:?}", e);
                }
            }
            Err(e) => error!("Watcher error: {:?}", e),
        },
        Config::default(),
    )?;

    // Watch the relevant directories recursively.
    watcher.watch(&content_dir, RecursiveMode::Recursive)?;
    watcher.watch(&template_dir, RecursiveMode::Recursive)?;
    watcher.watch(&static_dir, RecursiveMode::Recursive)?;
    info!("Watcher is monitoring: {:?}, {:?}, {:?}", content_dir, template_dir, static_dir);

    // Debounce deadline: pushed forward on every relevant event, so a build
    // signal fires only after 200ms of quiet.
    let mut deadline: Option<tokio::time::Instant> = None;
    loop {
        tokio::select! {
            // Receive raw events from the `notify` crate.
            maybe_event = event_rx.recv() => {
                match maybe_event {
                    Some(event) => {
                        debug!("Raw watcher event: {:?}", event);
                        // Filter out irrelevant events (output directory, temporary files).
                        if should_trigger_rebuild(&event, &ssg_root) {
                            deadline = Some(tokio::time::Instant::now() + Duration::from_millis(200));
                        }
                    }
                    // The channel closed: the watcher callback was dropped, so exit.
                    None => break,
                }
            }
            // Fire once the debounce window has elapsed with no new events.
            // (`tokio::time::Sleep` is not `Unpin`, so instead of storing one
            // sleep future we create a fresh `sleep_until` each iteration.)
            _ = tokio::time::sleep_until(deadline.unwrap_or_else(tokio::time::Instant::now)),
                if deadline.is_some() =>
            {
                info!("Debounce timer expired, triggering rebuild.");
                if let Err(e) = tx.send(WatcherEvent::Change).await {
                    error!("Failed to send build signal: {:?}", e);
                    break; // Exit the loop on send error
                }
                deadline = None;
            }
        }
    }
    Ok(())
}

/// Determines if a file system event should trigger a rebuild.
fn should_trigger_rebuild(event: &Event, ssg_root: &Path) -> bool {
    // Ignore events in the output directory and for the cache file.
    if event.paths.iter().any(|p| {
        p.starts_with(ssg_root.join("public")) || p.ends_with(".ssg_cache.json")
    }) {
        return false;
    }
    match event.kind {
        EventKind::Access(_) => false, // Ignore access events
        EventKind::Modify(_) | EventKind::Create(_) | EventKind::Remove(_) | EventKind::Any => {
            // Filter out temporary files often created by editors (e.g., `.swp`, `~`, `.#`).
            if event.paths.iter().any(|p| {
                p.file_name()
                    .and_then(|name| name.to_str())
                    .map_or(false, |s| s.starts_with('.') || s.ends_with('~') || s.starts_with('#'))
            }) {
                debug!("Ignoring temporary file event: {:?}", event.paths);
                false
            } else {
                true
            }
        }
        _ => false, // Ignore other event kinds by default
    }
}
Explanation of src/watcher.rs:
- `start_watcher`: An async function that sets up and runs the `notify` watcher.
- `mpsc::channel`: A multi-producer, single-consumer channel used to send `WatcherEvent` messages to the main build loop.
- `RecommendedWatcher`: `notify`’s best-effort watcher for the current platform.
- `watcher.watch`: Configures the watcher to monitor the `content`, `template`, and `static` directories recursively.
- Debouncing: A Tokio timer implements debouncing. When a file change event occurs, a deadline is set. If another event occurs before the deadline, it is pushed forward. Only when the deadline passes (meaning no new events for 200ms) is a `WatcherEvent::Change` sent. This prevents multiple rapid rebuilds from a single save operation (e.g., an editor might save a file, then its metadata, triggering two events).
- `should_trigger_rebuild`: A helper that filters out irrelevant file system events, such as changes in the `public` output directory or temporary editor files.
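The debounce policy is easiest to reason about as a pure function over event timestamps: a rebuild fires `window_ms` after the last event of each burst. `debounce_fire_times` is an illustrative model, not part of the watcher code:

```rust
/// Given sorted event timestamps (ms) and a quiet window, return the
/// times at which a debounced rebuild would fire: one per burst of
/// events, `window_ms` after that burst's final event.
pub fn debounce_fire_times(events: &[u64], window_ms: u64) -> Vec<u64> {
    let mut fires = Vec::new();
    let mut iter = events.iter().peekable();
    while let Some(&t) = iter.next() {
        match iter.peek() {
            // The next event lands inside the window: the burst continues.
            Some(&&next) if next - t < window_ms => {}
            // Quiet window elapsed (or no more events): fire once.
            _ => fires.push(t + window_ms),
        }
    }
    fires
}
```

So an editor that writes a file and then its metadata a few milliseconds later produces a single rebuild, not two.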
4. src/main.rs - Integrating Watcher and Incremental Build
Finally, let’s update our main.rs to handle a watch command, which will start the development server and the file watcher.
src/main.rs (update main function)
use std::path::PathBuf;
use anyhow::Result;
use tracing::{info, Level};
use tracing_subscriber::FmtSubscriber;
use clap::Parser;
use tokio::sync::mpsc;
mod config;
mod content;
mod parser;
mod renderer;
mod template;
mod router;
mod server;
mod build;
mod cache; // NEW
mod watcher; // NEW
#[derive(Parser, Debug)]
#[command(author, version, about, long_about = None)]
struct Args {
#[arg(subcommand)]
command: Commands,
}
#[derive(Parser, Debug)]
enum Commands {
/// Builds the static site
Build {
/// Path to the source directory (defaults to current directory)
#[arg(short, long, default_value = ".")]
source: PathBuf,
/// Path to the output directory (defaults to 'public')
#[arg(short, long, default_value = "public")]
output: PathBuf,
},
/// Starts a development server with live-reloading
Watch {
/// Path to the source directory (defaults to current directory)
#[arg(short, long, default_value = ".")]
source: PathBuf,
/// Path to the output directory (defaults to 'public')
#[arg(short, long, default_value = "public")]
output: PathBuf,
/// Port for the development server
#[arg(short, long, default_value_t = 8080)]
port: u16,
},
}
#[tokio::main]
async fn main() -> Result<()> {
// Setup tracing for logging
let subscriber = FmtSubscriber::builder()
.with_max_level(Level::INFO) // Set default logging level
.finish();
tracing::subscriber::set_global_default(subscriber)
.expect("setting default subscriber failed");
let args = Args::parse();
match args.command {
Commands::Build { source, output } => {
let config = config::Config::load(&source)?;
let mut build_context = build::BuildContext::new(config, output)?;
build::build_site(&mut build_context, build::BuildMode::Full).await?;
info!("Static site built successfully!");
}
Commands::Watch { source, output, port } => {
info!("Starting development server with watcher...");
let config = config::Config::load(&source)?;
let content_dir = source.join(&config.content_dir);
let template_dir = source.join(&config.template_dir);
let static_dir = source.join(&config.static_dir);
// Initial full build
let mut build_context = build::BuildContext::new(config, output.clone())?;
build::build_site(&mut build_context, build::BuildMode::Full).await?;
info!("Initial build complete. Serving on http://127.0.0.1:{}", port);
// Channel for watcher events to trigger rebuilds
let (tx, mut rx) = mpsc::channel(10);
// Start file watcher in a separate task
let watcher_tx = tx.clone();
let ssg_root_clone = source.clone();
let content_dir_clone = content_dir.clone();
let template_dir_clone = template_dir.clone();
let static_dir_clone = static_dir.clone();
tokio::spawn(async move {
if let Err(e) = watcher::start_watcher(
ssg_root_clone,
content_dir_clone,
template_dir_clone,
static_dir_clone,
watcher_tx,
).await {
error!("Watcher failed: {:?}", e);
// Send a shutdown signal if watcher fails
if let Err(send_err) = tx.send(watcher::WatcherEvent::Shutdown).await {
error!("Failed to send shutdown signal: {:?}", send_err);
}
}
});
// Start development server in a separate task
let mut server_handle = tokio::spawn(server::start_dev_server(port, output.clone()));
// Main loop to listen for watcher events and trigger rebuilds
loop {
tokio::select! {
Some(event) = rx.recv() => {
match event {
watcher::WatcherEvent::Change => {
info!("File change detected. Triggering incremental build...");
// Re-load config and create a new build context for each build
// This ensures latest config is used and build_context is fresh
let new_config = config::Config::load(&source)?;
let mut new_build_context = build::BuildContext::new(new_config, output.clone())?;
if let Err(e) = build::build_site(&mut new_build_context, build::BuildMode::Incremental).await {
error!("Incremental build failed: {:?}", e);
} else {
info!("Incremental build complete.");
// TODO: Implement live-reload signal to the browser here
// For now, manual refresh is needed, or integrate a WebSocket for live-reload.
}
}
watcher::WatcherEvent::Shutdown => {
error!("Watcher shutdown signal received. Exiting.");
break;
}
}
}
_ = &mut server_handle => {
error!("Development server stopped unexpectedly. Exiting.");
break;
}
}
}
}
}
Ok(())
}
Explanation of `src/main.rs` changes:
- New `Watch` command in the `clap::Parser` definition.
- When the `Watch` command is run:
  - An initial `Full` build is performed.
  - An `mpsc::channel` is created to communicate between the watcher and the main loop.
  - `watcher::start_watcher` is spawned as a separate Tokio task, sending events to the channel.
  - `server::start_dev_server` is also spawned.
  - The main loop then `tokio::select!`s between receiving watcher events and the server handle.
  - Upon receiving `WatcherEvent::Change`, an `Incremental` build is triggered.
  - Upon `WatcherEvent::Shutdown` or server failure, the application exits.
- Important: We re-load the `Config` and create a new `BuildContext` for each build (initial or incremental). This ensures that any changes to `config.toml` are picked up immediately without restarting the entire application.
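The protocol between the watcher task and the main loop is deliberately tiny. As a simplified, synchronous analogue of the `tokio::select!` loop above — using `std::sync::mpsc` in place of Tokio's channel, with an illustrative `event_loop` function that is not part of the actual codebase — the control flow looks like this:

```rust
use std::sync::mpsc;

/// Sketch of the events the watcher sends to the main loop.
#[derive(Debug, PartialEq)]
enum WatcherEvent {
    Change,   // a source file was added, modified, or deleted
    Shutdown, // the watcher hit a fatal error and is giving up
}

/// Consume events until shutdown; returns how many rebuilds ran.
fn event_loop(rx: mpsc::Receiver<WatcherEvent>) -> usize {
    let mut rebuilds = 0;
    while let Ok(event) = rx.recv() {
        match event {
            // In the real loop this triggers an incremental build.
            WatcherEvent::Change => rebuilds += 1,
            // Mirrors the `break` in main(): stop processing events.
            WatcherEvent::Shutdown => break,
        }
    }
    rebuilds
}

fn main() {
    let (tx, rx) = mpsc::channel();
    tx.send(WatcherEvent::Change).unwrap();
    tx.send(WatcherEvent::Change).unwrap();
    tx.send(WatcherEvent::Shutdown).unwrap();
    drop(tx);
    assert_eq!(event_loop(rx), 2); // two changes before shutdown
}
```

The asynchronous version in `main()` differs mainly in that it also races the server handle, so a crashed dev server exits the loop as well.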
c) Testing This Component
To test the incremental build and file watching:

1. Build the project: `cargo build`
2. Run the watcher: `cargo run watch --source . --output public --port 8080`. You should see output indicating the initial full build and the watcher starting.
3. Open your browser: Navigate to `http://127.0.0.1:8080`. You should see your site.
4. Make a change:
   - Open a Markdown file in your `content/` directory (e.g., `content/posts/first-post.md`).
   - Change some text.
   - Save the file.
5. Observe the console. You should see logs in your terminal indicating:
   - `File modified: "content/posts/first-post.md"`
   - `File change detected. Triggering incremental build...`
   - `Processing content file: .../content/posts/first-post.md`
   - `Rendered: public/posts/first-post/index.html`
   - `Incremental build complete.`
   The build time should be significantly faster than a full rebuild.
6. Verify in browser: Refresh your browser. You should see the updated content.
7. Test template changes:
   - Modify a template file (e.g., `templates/base.html`).
   - Save the file.
   - You should see a `warn!` message: `Changes in [...] detected. Triggering full rebuild.`, followed by a full rebuild.
8. Test static asset changes:
   - Modify a static asset (e.g., `static/css/style.css`).
   - Save the file.
   - You should see `File modified: "static/css/style.css"` and `Copied static file: "static/css/style.css"` logs.
9. Test adding/deleting files:
   - Add a new Markdown file to `content/`.
   - Delete an existing Markdown file from `content/`.
   - Observe the logs. For deletion, you’ll see a warning about output file cleanup.
Debugging Tips:
- If the watcher isn’t triggering:
  - Ensure your `source` path is correct.
  - Check `tracing` logs for `Watcher error` messages.
  - Verify the watched directories exist.
  - Make sure you are saving the file, not just changing it in the editor buffer.
- If the incremental build is slow:
  - Check `tracing` logs. Are too many files being processed?
  - Is `requires_full_rebuild` being triggered unexpectedly?
  - Verify file hashing is working correctly.
Production Considerations
- Production Builds: Incremental builds are primarily for development. For production deployments, always perform a `Full` build (`cargo run build`) to ensure a clean, consistent output. This avoids any stale-content issues that might arise from complex incremental cache-invalidation scenarios.
- Performance:
  - Hashing: SHA256 is cryptographically secure but arguably overkill for mere change detection. For extremely large sites, consider faster non-cryptographic hashes (like FNV) or simply relying on file modification timestamps if absolute reliability isn’t critical (timestamps can be unreliable across file systems or during sync operations). We’ll stick with SHA256 for robustness.
  - Cache Serialization: `serde_json` is generally fast, but for millions of files, saving and loading the cache could become a bottleneck. Binary formats like `bincode` could be faster.
  - Dependency Tracking: Our current template dependency tracking is basic (full rebuild on any template change). For massive sites with many templates and partials, a sophisticated dependency graph (tracking which content files use which templates/partials) would be necessary to achieve true incremental rendering. This is a significant complexity increase but offers the best possible performance.
- Security: The build cache (`.ssg_cache.json`) should not contain any sensitive information. It primarily stores file paths and hashes, which are not security-critical. Ensure `.ssg_cache.json` is added to `.gitignore`.
- Logging and Monitoring: The `tracing` crate is essential. During development, the `debug!` and `info!` levels provide insight; in production, `info!` and `warn!` are typically sufficient. Ensure errors are always logged for easy debugging.
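To make the "faster non-cryptographic hash" trade-off above concrete, here is a minimal FNV-1a (64-bit) implementation in plain std Rust. This is a sketch for comparison only — the chapter's build pipeline stays on SHA256 via the `sha2` crate — and the `fnv1a_64` name is ours, not an API from any crate:

```rust
/// FNV-1a, 64-bit: a tiny, fast, non-cryptographic hash.
/// Fine for "did this file change?" checks, but it offers no
/// collision resistance against adversarial input.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &b in bytes {
        hash ^= b as u64;                // xor the byte in first (the "1a" variant)
        hash = hash.wrapping_mul(PRIME); // then multiply by the FNV prime
    }
    hash
}

fn main() {
    let before = fnv1a_64(b"# My first post");
    let after = fnv1a_64(b"# My first post (edited)");
    assert_ne!(before, after); // different content, different hash
    println!("{:016x} vs {:016x}", before, after);
}
```

Swapping this in for SHA256 would only require changing the hash stored in the cache; the change-detection logic compares hashes opaquely either way.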
Code Review Checkpoint
At this point, you have implemented:
- A `BuildCache` struct in `src/cache.rs` to store file metadata (path, hash, last modified).
- Functions to compute SHA256 hashes of files.
- Modifications to `src/build.rs`:
  - Loading and saving the `BuildCache`.
  - `detect_file_changes` to compare current file states with the cache and identify `Added`, `Modified`, and `Deleted` files.
  - Incremental build logic that only processes changed content/static files, or triggers a full rebuild for template changes.
- A `src/watcher.rs` module that uses the `notify` crate to monitor source directories.
- A debouncing mechanism in the watcher to prevent excessive rebuilds.
- An updated `src/main.rs` with a `watch` command that starts the watcher and an incremental build loop alongside the development server.

Files Created/Modified:
- `Cargo.toml` (added `notify`, `sha2`, `hex`)
- `src/cache.rs` (new)
- `src/watcher.rs` (new)
- `src/build.rs` (significant modifications to `build_site`; added `BuildContext` and `detect_file_changes`)
- `src/main.rs` (added the `watch` command; integrated the watcher and incremental build loop)
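At its core, the change detection in the checklist above is a diff between two `path -> hash` maps: the one loaded from the cache and the one built by scanning the source tree. A std-only sketch of that idea — `diff_states` and `Change` are illustrative names here, not the exact signatures from `src/build.rs`:

```rust
use std::collections::HashMap;
use std::path::PathBuf;

#[derive(Debug, PartialEq)]
enum Change {
    Added,
    Modified,
    Deleted,
}

/// Diff the previous build's `path -> content hash` map against the
/// current scan and classify every difference.
fn diff_states(
    old: &HashMap<PathBuf, String>,
    new: &HashMap<PathBuf, String>,
) -> Vec<(PathBuf, Change)> {
    let mut changes = Vec::new();
    for (path, hash) in new {
        match old.get(path) {
            // Not in the cache at all: a brand-new file.
            None => changes.push((path.clone(), Change::Added)),
            // In the cache, but with a different hash: modified.
            Some(old_hash) if old_hash != hash => {
                changes.push((path.clone(), Change::Modified));
            }
            // Same hash: unchanged, nothing to rebuild.
            _ => {}
        }
    }
    // Anything in the cache that no longer exists on disk was deleted.
    for path in old.keys() {
        if !new.contains_key(path) {
            changes.push((path.clone(), Change::Deleted));
        }
    }
    changes
}

fn main() {
    let mut old = HashMap::new();
    old.insert(PathBuf::from("posts/hello.md"), "abc123".to_string());
    let mut new = HashMap::new();
    new.insert(PathBuf::from("posts/hello.md"), "def456".to_string());
    let changes = diff_states(&old, &new);
    assert_eq!(
        changes,
        vec![(PathBuf::from("posts/hello.md"), Change::Modified)]
    );
}
```

The real `detect_file_changes` layers file-system scanning and timestamp checks on top of this comparison, but the classification logic is the same.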
This completes a major enhancement to our SSG’s development workflow, making it much more responsive and enjoyable to use.
Common Issues & Solutions

Watcher not detecting changes:
- Issue: You save a file, but the console doesn’t show any `File change detected` messages.
- Solution:
  - Check Paths: Ensure the `source` directory passed to `cargo run watch` is correct and contains your `content`, `templates`, and `static` folders.
  - File Type/Name: Some editors create temporary files (e.g., `.~filename`, `#filename#`, `.filename.swp`) that our `should_trigger_rebuild` function might filter out. Verify that the actual source file is being monitored.
  - Permissions: On some systems, `notify` might have issues with file system permissions. Run with elevated privileges if necessary (though this is usually not required for user directories).
  - Large Projects: For projects with an extremely large number of files, the watcher might struggle. Consider increasing the channel buffer size or the debounce duration.
  - OS-specific quirks: `notify` relies on OS-specific APIs. Some network drives or virtualized file systems might not emit events reliably.
- Debugging: Add `debug!` logs inside `should_trigger_rebuild` and in the `tokio::select!` loop in `start_watcher` to see all raw events.
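The filtering behavior described above can be expressed as a pure function over the event path, which makes it easy to unit-test. A hedged sketch — the specific rules and the `should_trigger_rebuild` signature here are illustrative, not necessarily the book's exact list:

```rust
use std::path::Path;

/// Decide whether a file-system event should trigger a rebuild.
/// Skips the output directory, the build cache, and common editor
/// temp files (vim swap files, emacs autosaves, trailing-~ backups).
fn should_trigger_rebuild(path: &Path, output_dir: &Path) -> bool {
    // Never watch our own output: that would cause an infinite loop.
    if path.starts_with(output_dir) {
        return false;
    }
    let name = match path.file_name().and_then(|n| n.to_str()) {
        Some(n) => n,
        None => return false, // no file name (e.g., the root path)
    };
    if name == ".ssg_cache.json" {
        return false; // the cache is written by the build itself
    }
    // Hidden files, backup files, and editor temp files.
    if name.starts_with('.') || name.ends_with('~') || name.ends_with(".swp") {
        return false;
    }
    if name.starts_with('#') && name.ends_with('#') {
        return false; // emacs autosave: #filename#
    }
    true
}

fn main() {
    let out = Path::new("public");
    assert!(should_trigger_rebuild(Path::new("content/posts/first.md"), out));
    assert!(!should_trigger_rebuild(Path::new("public/index.html"), out));
    assert!(!should_trigger_rebuild(Path::new("content/first.md~"), out));
}
```

Keeping the filter free of I/O means every editor-quirk bug report can be turned directly into a new assertion.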
Stale content after incremental build:
- Issue: You change a file, an incremental build runs, but the browser still shows old content even after refreshing.
- Solution:
  - Cache Invalidation Logic: This is the trickiest part. Our current logic for template changes triggers a full rebuild. If you see stale content, it might mean a change (e.g., in a partial included by a template) wasn’t correctly identified as needing a full rebuild, or wasn’t correctly linked to the affected content files.
  - Clean Build: If you encounter this, always try `cargo run build` (a full build) to confirm whether the issue lies in the incremental logic or in the core rendering.
  - Debugging: Use `debug!` logs in `detect_file_changes` to verify that your file changes are correctly categorized as `Added` or `Modified`. Check the `.ssg_cache.json` file to see whether the hashes are being updated.
Build loop/excessive rebuilds:
- Issue: The SSG enters a continuous rebuild loop or rebuilds too frequently for minor changes.
- Solution:
  - Debouncing: Adjust the `Duration::from_millis(200)` in `src/watcher.rs`. Some editors save in bursts; increase it if needed (e.g., to 500ms).
  - Output Directory Filtering: Ensure your `output_dir` (e.g., `public/`) is correctly excluded from watching in `should_trigger_rebuild`. If the watcher monitors its own output, it will trigger an infinite loop. Also ensure `.ssg_cache.json` is excluded.
  - Temporary Files: Make sure the `should_trigger_rebuild` logic is robust in filtering out temporary editor files.
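The debouncing idea itself is easy to isolate from the async machinery. A synchronous sketch using `std::sync::mpsc` (the watcher in `src/watcher.rs` does this asynchronously with its 200 ms window; `debounced_recv` is an illustrative name): after the first event of a burst, keep draining until the channel stays quiet for the whole window, then fire exactly one rebuild:

```rust
use std::sync::mpsc;
use std::time::Duration;

/// Collapse a burst of raw file-system events into one rebuild trigger.
/// Blocks for the first event, then swallows follow-ups until the
/// channel has been quiet for `window`. Returns false once the
/// sending side has hung up.
fn debounced_recv(rx: &mpsc::Receiver<()>, window: Duration) -> bool {
    if rx.recv().is_err() {
        return false; // all senders dropped: shut down
    }
    // Keep draining until `window` passes with no new events.
    while rx.recv_timeout(window).is_ok() {}
    true // exactly one "rebuild now" for the whole burst
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // An editor save often produces several events back to back.
    for _ in 0..5 {
        tx.send(()).unwrap();
    }
    drop(tx);
    // The whole burst collapses into a single trigger...
    assert!(debounced_recv(&rx, Duration::from_millis(200)));
    // ...and the closed channel then signals shutdown.
    assert!(!debounced_recv(&rx, Duration::from_millis(200)));
}
```

Note the trade-off encoded in `window`: a larger value absorbs slower editor save bursts but adds that much latency to every rebuild.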
Testing & Verification
- Start the watcher: `cargo run watch --source . --output public --port 8080`
- Initial build check: Verify that the first build is a `Full` build and completes successfully.
- Content modification: Edit any `.md` file in your `content` directory and save it. Observe the console for `Incremental build complete` within milliseconds (usually less than 100ms for small sites). Refresh your browser to confirm the changes.
- New content creation: Create a brand-new `.md` file in `content` and save it. Verify it’s detected as `Added` and rendered correctly.
- Content deletion: Delete an existing `.md` file. Observe the logs and acknowledge the warning about output file cleanup.
- Template modification: Edit a `Tera` template file in your `templates` directory and save it. Verify a full rebuild is triggered and all pages are updated.
- Static asset modification: Change a CSS or image file in your `static` directory and save it. Verify it’s detected and copied to the `public` directory.
- Configuration modification: Change a value in `config.toml` and save it. This should trigger a full rebuild because the `BuildContext` is recreated, effectively re-parsing the config.
- Error handling: Introduce a syntax error in a Markdown file or a Tera template. The build should log the error but not crash the watcher, allowing you to fix the error and trigger another build.
Summary & Next Steps
Congratulations! You’ve successfully implemented incremental builds and a file system watcher for your Rust SSG. This is a monumental step in improving the developer experience, making your SSG feel fast and responsive, comparable to modern static site generators. You now understand the core principles of build caching, change detection, and event-driven build systems.
We’ve laid the groundwork for a highly performant development workflow. While our current incremental logic for template changes is a full rebuild, it’s a practical trade-off for simplicity and correctness during development.
In the next chapter, we will focus on Chapter 11: Search Indexing and Pagefind Integration. We’ll learn how to generate a search index for your content and integrate it with a powerful client-side search library like Pagefind to provide a seamless search experience for your users.