
I once pushed a half-built site to production because I forgot to wait for Jekyll to finish building. The broken HTML sat live for twenty minutes before I noticed. That was the day I stopped trusting manual deploys and started designing pipelines with enforced validation gates.
But production safety was only half the problem. I also needed staging previews for every feature branch, without spinning up a new bucket for each one. Production and staging need fundamentally different pipeline shapes: one optimizes for confidence, the other for visibility.
Here’s what I ended up with:
```
push to main      ──► build-and-test ──► deploy (GitHub Pages)
                            │
                            ├── htmlproofer
                            └── SEO validator

push to feature/* ──► build (staging) ──► S3 prefix deploy
delete feature/*  ──► S3 prefix cleanup
schedule (daily)  ──► build-and-test  ──► deploy
```
This post covers why each of these shapes exists.
## Production: Two Jobs, One Gate
The production pipeline lives in a single file, 85 lines of YAML. The structure itself is the interesting part, so here it is in full:
```yaml
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 22 * * *'

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Ruby
        uses: ruby/setup-ruby@v1
        with:
          ruby-version: '3.3'
          bundler-cache: true
      - name: Install dependencies
        run: bundle install
      - name: Build Jekyll site
        run: bundle exec jekyll build
        env:
          JEKYLL_ENV: production
      - name: Validate HTML
        run: |
          bundle exec htmlproofer ./_site \
            --disable-external \
            --check-html \
            --allow-hash-href \
            --ignore-urls "/^#/,/localhost/" \
            --enforce-https=false
      - name: Validate SEO
        run: ruby script/validate-seo.rb

  deploy:
    runs-on: ubuntu-latest
    needs: build-and-test
    if: >-
      github.ref == 'refs/heads/main' &&
      (github.event_name == 'push' || github.event_name == 'schedule')
    permissions:
      contents: read
      pages: write
      id-token: write
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Ruby
        uses: ruby/setup-ruby@v1
        with:
          ruby-version: '3.3'
          bundler-cache: true
      - name: Build Jekyll site
        run: bundle exec jekyll build
        env:
          JEKYLL_ENV: production
      - name: Setup Pages
        uses: actions/configure-pages@v4
      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: './_site'
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
```
The key design decision is `needs: build-and-test`. The deploy job won't run until `build-and-test` passes, which makes htmlproofer and the custom SEO validator first-class deploy gates: a broken internal link or a missing meta description blocks the deploy instead of merely raising a warning.
The `if:` condition on the deploy job is equally important. Pull requests trigger `build-and-test` for validation but never reach the deploy job. That one condition gives me PR-as-validation-gate for free: open a PR, see whether the validators pass, merge with confidence.
The `id-token: write` permission enables GitHub's OIDC integration with GitHub Pages. No long-lived AWS credentials, no stored secrets for the production path. The runner requests a short-lived token, proves its identity to GitHub Pages, and deploys. (For more on the tradeoffs between serverless hosting models, see Serverless vs Containers: A Decision Framework.)
The gate costs about 45 seconds per run. The alternative is discovering a broken link after a Medium cross-post drives traffic.
When a gate fails, the pipeline stops. If htmlproofer catches a dead internal link, the deploy job never runs and the PR shows a red check. If the SEO validator flags a missing description, same result. There’s no manual override, no “deploy anyway” button. The only path to production is fixing the issue and pushing again.
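The validator script itself isn't reproduced in this post. As a rough sketch of what such a gate can look like (the real `script/validate-seo.rb` is Ruby; this shell version and the sample files are illustrative assumptions, not its actual logic), here is a check that counts built pages missing a meta description:

```shell
# Illustrative SEO gate: count built pages lacking <meta name="description">.
# The real script/validate-seo.rb is Ruby and may check more (titles,
# canonical URLs, og: tags). Sample files stand in for a Jekyll build.
mkdir -p _site
printf '<head><meta name="description" content="ok"></head>' > _site/index.html
printf '<head><title>no description here</title></head>' > _site/bad.html

FAILURES=0
for f in $(find _site -name '*.html'); do
  if ! grep -q 'meta name="description"' "$f"; then
    echo "missing meta description: $f"
    FAILURES=$((FAILURES + 1))
  fi
done
echo "failures: $FAILURES"
# A real gate would exit nonzero when FAILURES > 0, failing the job.
```

In the workflow, a nonzero exit from this step is exactly what keeps `deploy` from ever starting.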
## The Daily Cron: Scheduled Posts Without a CMS
Jekyll has a `--future` flag that controls whether posts with a future `date:` in their front matter get rendered. By default it's off: a post dated next Tuesday simply won't appear in the build output until that date arrives.
The production pipeline doesn't pass `--future`, so future-dated posts stay invisible. But something still needs to trigger a build on the day a scheduled post's date arrives. That's what the `schedule` trigger does:
```yaml
schedule:
  - cron: '0 22 * * *'  # 6:00 AM SGT (UTC+8)
```
Every day at 6 AM Singapore time, the pipeline rebuilds. If a post’s date has arrived, it enters the build. If nothing is scheduled, the build is a no-op that validates and redeploys the same content. Because the deploy is idempotent, rerunning the job produces the same output without side effects.
The `schedule` event is included in the deploy job's `if:` condition alongside `push`, so scheduled builds go through the same validation gate as manual pushes. No special path, no shortcut.
One thing to note: staging does the opposite. The staging pipeline passes `--future` explicitly, so you can preview scheduled posts before their publish date. That asymmetry is intentional. Production answers "what do readers see right now?" and staging answers "what will readers see after this merges?"
## Staging: One Bucket, Many Branches
The staging problem is straightforward on the surface. I want a preview URL for every feature branch. The naive solution is one S3 bucket per branch, but that means creating and destroying buckets on every branch lifecycle event, plus managing the IAM policies for each one.
The solution is branch-path multiplexing: one bucket, one set of credentials, with each branch writing to its own prefix.
### Branch name as path prefix
The first step transforms the branch name into a URL-safe path segment:
```shell
BRANCH_NAME=${GITHUB_REF#refs/heads/}
BRANCH_NAME=$(echo "$BRANCH_NAME" | sed 's/\//-/g')
```
`feature/dark-mode` becomes `feature-dark-mode`. That string becomes both the S3 prefix and the `--baseurl` for Jekyll:
```
S3 bucket
├── feature-dark-mode/
│   ├── index.html
│   └── ...
└── feature-search-page/
    ├── index.html
    └── ...
```
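How the build lands in that layout isn't shown above. Assuming an `aws s3 sync` deploy (the `STAGING_BUCKET` variable name here is my placeholder, not the pipeline's), the destination follows directly from the branch prefix:

```shell
# Derive the per-branch S3 destination. STAGING_BUCKET is a placeholder;
# the real pipeline would read the bucket name from a secret.
GITHUB_REF="refs/heads/feature/dark-mode"   # set by Actions at runtime
BRANCH_NAME=${GITHUB_REF#refs/heads/}
BRANCH_NAME=$(echo "$BRANCH_NAME" | sed 's/\//-/g')
DEST="s3://${STAGING_BUCKET:-example-staging-bucket}/${BRANCH_NAME}/"
echo "$DEST"
# The deploy step is then a single sync; --delete prunes files that no
# longer exist in the new build:
#   aws s3 sync ./_site "$DEST" --delete
```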
### Config layering
Jekyll's `--config` flag accepts a comma-separated list of config files. Later files override earlier ones:
```shell
bundle exec jekyll build \
  --config _config.yml,_config.staging.yml \
  --baseurl "/feature-dark-mode" \
  --future
```
The staging config is only a few lines:
```yaml
url: ""               # Use relative URLs for S3
google_analytics: ""  # Disable analytics
staging: true
```
Blanking `url` prevents the site from generating absolute links back to the production domain. Blanking `google_analytics` keeps staging traffic out of the analytics pipeline. The `staging: true` flag is available in templates for conditional rendering (like showing a "this is a staging preview" banner).
### The Perl rewrite gotcha
Jekyll's `--baseurl` flag handles most path rewriting, but not all of it. Some absolute paths in templates and includes don't go through Jekyll's URL filters, so they render as `/assets/css/style.css` instead of `/feature-dark-mode/assets/css/style.css`.
The staging pipeline catches these with a post-build Perl regex:
```shell
find ./_site -name '*.html' -exec perl -pi -e \
  "s|href=\"/(?!\Q${BRANCH}\E/)|href=\"/${BRANCH}/|g; \
   s|src=\"/(?!\Q${BRANCH}\E/)|src=\"/${BRANCH}/|g" {} +
```
The negative lookahead prevents double-rewriting paths that already include the branch prefix. Without it, you'd end up with `/feature-dark-mode/feature-dark-mode/assets/...`.
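To convince yourself the lookahead makes the rewrite idempotent, run it twice over a throwaway file (the branch and file names here are illustrative):

```shell
BRANCH=feature-dark-mode
printf '<link href="/assets/css/style.css"><img src="/assets/logo.png">' > sample.html

rewrite() {
  perl -pi -e "s|href=\"/(?!\Q${BRANCH}\E/)|href=\"/${BRANCH}/|g; \
               s|src=\"/(?!\Q${BRANCH}\E/)|src=\"/${BRANCH}/|g" sample.html
}

rewrite
rewrite   # no-op: every path already carries the prefix
cat sample.html
```

Both attributes come out as `/feature-dark-mode/...`, and the second pass changes nothing because `href="/feature-dark-mode/` no longer matches the lookahead.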
### Auto-cleanup
When a feature branch is deleted, the `delete` event triggers the cleanup job:
```yaml
cleanup:
  runs-on: ubuntu-latest
  if: github.event_name == 'delete'
  steps:
    - name: Configure AWS credentials
      uses: aws-actions/configure-aws-credentials@v4
      with:
        aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
        aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        aws-region: ${{ secrets.AWS_REGION }}
    - name: Delete branch preview
      run: |
        BRANCH_NAME=$(echo "${{ github.event.ref }}" | sed 's/\//-/g')
        aws s3 rm s3://${{ secrets.STAGING_BUCKET }}/$BRANCH_NAME/ --recursive
```
Merge or delete the branch, and its staging prefix disappears. No stale previews accumulating in the bucket.
The preview URL gets posted to the GitHub Step Summary, so you can click through directly from the Actions run page without constructing the URL manually.
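That posting step can be as simple as appending Markdown to the file the runner exposes as `$GITHUB_STEP_SUMMARY` (the preview domain below is a placeholder, and the local fallback exists only so the sketch runs outside Actions):

```shell
# Append the preview link to the Actions run page's Step Summary.
# GITHUB_STEP_SUMMARY is set by the runner; fall back to a local file.
GITHUB_STEP_SUMMARY=${GITHUB_STEP_SUMMARY:-./step-summary.md}
BRANCH_NAME=feature-dark-mode                             # computed earlier in the job
PREVIEW_URL="https://staging.example.com/${BRANCH_NAME}/" # illustrative domain

{
  echo "### Staging preview"
  echo "[${BRANCH_NAME}](${PREVIEW_URL})"
} >> "$GITHUB_STEP_SUMMARY"
```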
## Production vs. Staging at a Glance
| Factor | Production | Staging |
|---|---|---|
| Trigger | push to `main`, daily cron | push to `feature/*`, branch delete |
| Build config | `_config.yml` | `_config.yml` + `_config.staging.yml` |
| `--future` flag | off | on |
| Analytics | enabled | disabled |
| Deploy target | GitHub Pages (OIDC) | S3 prefix (IAM access key) |
| Validation gates | htmlproofer + SEO | none |
| Auto-cleanup | n/a | on branch delete |
Staging has no validation gates on purpose. The whole point of a staging preview is to see broken things before they reach production. If staging blocked on htmlproofer failures, I couldn’t preview a half-finished post with placeholder links. Staging is for seeing the current state. Production is for ensuring correctness.
## Failure Modes
If the deploy job fails after validation passes, rerun the deploy job. The build is deterministic, so a retry produces the same artifact.
If a staging deploy fails, the branch preview URL won’t update, but the previous version remains accessible. No data is lost.
Production and staging are completely decoupled. A staging failure never blocks a production deploy, and a production gate failure never affects staging previews.
Every job emits logs and artifacts in GitHub Actions, which acts as the single source of truth for deployment history.
## What I'd Change Next
**OIDC for staging.** Production uses GitHub's OIDC integration with Pages, which means no stored secrets. Staging still uses IAM access keys stored in GitHub Secrets. The fix is an IAM role with an OIDC trust policy scoped to the repository, but the `sub` claim in the trust policy needs to match the branch pattern, and wildcard branch matching requires careful scoping to avoid granting access too broadly. It's on the list, but the blast radius of getting it wrong is higher than the risk of rotating access keys periodically.
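For reference, the trust-policy condition would look roughly like this (the account ID, `OWNER`, and `REPO` are placeholders):

```json
{
  "Effect": "Allow",
  "Principal": {
    "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
  },
  "Action": "sts:AssumeRoleWithWebIdentity",
  "Condition": {
    "StringEquals": {
      "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
    },
    "StringLike": {
      "token.actions.githubusercontent.com:sub": "repo:OWNER/REPO:ref:refs/heads/feature/*"
    }
  }
}
```

The `StringLike` wildcard is the part that needs care: it grants the role to every branch matching the pattern, so the role's permissions should be scoped to the staging bucket and nothing else.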
**Share build artifacts between jobs.** The production pipeline builds Jekyll twice: once in `build-and-test` for validation, and again in `deploy`. The cleaner approach is to upload `_site/` as a GitHub Actions artifact in the test job and download it in the deploy job. I haven't done this because the Jekyll build takes about 40 seconds and the site is small. For a larger site with a multi-minute build, it would be worth the added complexity.
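The change would be a pair of steps along these lines (step names are mine, not from the current workflow):

```yaml
# In build-and-test, after the validators pass:
- name: Upload built site
  uses: actions/upload-artifact@v4
  with:
    name: site
    path: ./_site

# In deploy, replacing the Ruby setup and the second Jekyll build:
- name: Download built site
  uses: actions/download-artifact@v4
  with:
    name: site
    path: ./_site
```

This also guarantees the deployed artifact is byte-for-byte the one that passed validation, rather than relying on the build being deterministic.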
CI/CD isn’t about automation. It’s about enforced correctness in production and safe iteration everywhere else. The validation gate enforces the first: no deploy without passing htmlproofer and SEO checks. Branch-path multiplexing enables the second: every feature branch gets a preview URL at the cost of one S3 prefix, not one S3 bucket. The daily cron fills the gap between them, turning future-dated posts into scheduled publishes without a CMS.