resume
Graeme Fawcett
Kingsville, Ontario
e: i.am@graemefawcett.ca
t: 226.246.0259
p: https://graemefawcett.ca
Summary
Senior platform engineer, 25 years across warehouse floors, ERP, logistics, and AWS infrastructure. Most recently I spent 11 years building Peregrine, a deployment platform serving 30 teams across multiple business units: 4,000+ pipelines, 135K+ monthly builds, and elite DORA metrics across all four dimensions.
I build platforms that make best practices the path of least resistance. I call it institutional laziness: find the pattern in the problem and encode it in the platform. If you make the right thing easier than the wrong thing, it soon becomes the only thing. Teams adopt best practices because it's easier than not.
Every feature I've built started as a ticket from a team with a real problem. Platform development starts with the people using it.
Experience
Platform Architect
Advaita Bio (contract) | Apr 2026 – Present
- Implementing Vivarta, a multi-cloud developer platform supporting AWS and CoreWeave. Fully automated IaC and CI/CD pipelines are linked through a KV-stored resource graph that allows lazily loaded resource distribution and healing. Multi-tier security is shared between human and autonomous agents, with attenuated roles and defence in depth. Cryptographically signed traces are captured by default at each boundary (formerly called a middleware transform) as a Merkle DAG for compliance and auditing.
- Jenkins as a Service, deployed on K8s. Declaratively configured through a containerized controller, with JCasC supplied through a ConfigMap. Job creation is templated and, for Vivarta pipelines, deployed through JCasC. Standard pipelines configure their baseline through JCasC and delegate to per-repo Jenkinsfiles.
- Documentation for Vivarta, deployed on K8s as Diátaxis-formatted documentation as code. A thin Express backend serves markdown content and YAML data to a collection of web components that enhance and render it. Living runbooks and live resource and inventory views are composed from simple text, tag, and YAML compositions.
- Dark Lantern, a Rack-powered API fronting tree-sitter, produces a source, requirements, and resource graph. It answers structured queries for source objects, which the documentation-as-code site merges into the pages that describe them, minimizing the drift between documentation and code that refactoring causes.
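The signed trace DAG mentioned in the Vivarta bullet above can be illustrated with a minimal sketch. This is not Vivarta's implementation: the function names, the node layout, and the use of an HMAC with a shared demo key are all assumptions made for illustration (a production system would more likely sign with asymmetric keys). It shows only the core idea: each node's id is a hash over its payload and its parents' ids, so altering any recorded trace invalidates the node.

```python
import hashlib
import hmac
import json

# Hypothetical signing key, for illustration only.
SIGNING_KEY = b"demo-key-not-for-production"


def _digest(payload: dict, parent_ids: list[str]) -> str:
    """Hash the payload together with the (sorted) ids of the parent nodes."""
    body = json.dumps({"payload": payload, "parents": parent_ids}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def make_node(payload: dict, parents: list[dict]) -> dict:
    """Create a trace node whose id commits to its payload and its parents."""
    parent_ids = sorted(p["id"] for p in parents)
    node_id = _digest(payload, parent_ids)
    sig = hmac.new(SIGNING_KEY, node_id.encode(), hashlib.sha256).hexdigest()
    return {"id": node_id, "payload": payload, "parents": parent_ids, "sig": sig}


def verify_node(node: dict) -> bool:
    """Recompute hash and signature; any change to payload or parents fails."""
    expected_id = _digest(node["payload"], node["parents"])
    expected_sig = hmac.new(SIGNING_KEY, expected_id.encode(), hashlib.sha256).hexdigest()
    return node["id"] == expected_id and hmac.compare_digest(node["sig"], expected_sig)


# Usage: two boundaries, the second recording the first as its parent.
build = make_node({"boundary": "build", "status": "ok"}, [])
deploy = make_node({"boundary": "deploy", "status": "ok"}, [build])
```

Because the deploy node's id covers the build node's id, an auditor who trusts the latest node can verify everything upstream of it by recomputing hashes.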
Lead Platform Engineer
Clarivate (remote) | Jan 2021 – Jan 2026
Clarivate inherited Peregrine through its acquisition of ProQuest, evaluated a replacement platform during the transition, and shelved the replacement. Led platform engineering for AWS infrastructure supporting $300M+ in revenue across 30 Jenkins instances, one per team.
- Built Nightingale, a live documentation platform and control plane, on top of a custom executable markdown engine. Code fences and structured data in markdown become API endpoints, CLI commands, and MCP tools from a single source. New tools ship as a markdown file in a git repository and a pull request.
- Used Nightingale to ship living runbooks (workflows that adapt to current conditions) and live infrastructure views unifying Cloudflare, AWS, and internal services. The runbooks and documentation couldn't go stale because each view rendered them live.
- Built tacit knowledge capture into Nightingale: morning team training sessions, transcripts, and code walkthroughs recorded into the same system, becoming guardrails and grounding context for agentic assistants. Institutional knowledge became encoded in the platform, not just its users.
- Designed the Peregrine Files: a 189-page documentation-as-code site that couldn't drift from the codebase. 165 reference pages were auto-generated from Java reflection across the pipeline library's class hierarchy; tutorials, how-to guides, explanations, and reference were organized on the Diátaxis framework. The site deployed through a Peregrine pipeline like those it documented. Every documentation release was a live integration test of the platform it described, and platform releases couldn't ship without fresh documentation.
- Made documentation staleness architecturally impossible. Annotations on pipeline activity classes in Peregrine's Jenkins-based YAML DSL performed three jobs: drove execution, enforced validation through Peregrine's runtime compiler, and generated documentation through Gradle extractors at compile time. A passing test guaranteed accurate docs because test scenario YAML files were the documentation examples. The same metadata was fed to the Peregrine Files as Hugo-ready output.
- Built a custom Terraform provider encoding InfoSec and FinOps policy as code, allowing compliance enforcement at plan time, not review time. Clarivate's tagging taxonomy required 20+ tags per resource on a schema that changed regularly; the provider replaced all of that with a single system identifier, and schema changes propagated automatically on next build instead of triggering migration projects across every team.
- Built Methuselah, a Laravel/Vue developer portal, as the canonical surface for everything Peregrine-adjacent that needed a human in the loop. Originally intended for maintenance window automation, it grew organically as new operations surfaced that required click interaction.
- Within the portal: automated maintenance windows (replacing manual DNS surgery and bash scripts to take DCs offline across dozens of products); real-time SSH for executing repetitive tasks across grouped host fleets; a multi-axis SSM parameter grid that fed a dozen pipeline configs through macro expansion, collapsing branching, environment, and version state into a single declarative input; and credential management for the globally distributed single-pipeline CDN.
- Built and operated Peregrine solo for five years before growing the team to three. I stayed the architect, while helping the new engineers grow into platform ownership.
Senior Systems Administrator – DevOps
ProQuest | Jun 2014 – Dec 2020
Designed and built the Peregrine platform from scratch, growing it from five chained Jenkins jobs into a YAML-driven CI/CD framework serving multiple product lines by the time of ProQuest's acquisition by Clarivate.
- Built Peregrine's core tooling as a Ruby gem: 34 CLI executables, ~800 files, and ~85k LoC covering 36 AWS services, Cloudflare, Jenkins, Datadog, and internal service registries. The gem was the operational surface for the entire platform, handling infrastructure management, ECS orchestration, AMI lifecycle, and pipeline state shared across jobs through a DynamoDB cache.
- Built Firkin, an autonomous cluster control plane running as an ECS service in every Peregrine cluster. Eleven pluggable reconciliation monitors handled autoscaling, bin-packing, health-driven host eviction, and load-shedding on intervals. This was the operator pattern, implemented in Ruby against the ECS APIs before ECS-native autoscaling was available or Kubernetes Operators existed. Firkin was operational on ~50 clusters; the largest carried 200–300 nodes.
- Built a trip recording system inside the gem that captured full AWS API request/response cycles as YAML. Tests replayed real API behavior deterministically without live infrastructure or mocks. The recordings were the regression tests.
- Distributed the gem through Peregrine's own AMI pipeline. The platform shipped itself in development and production versions. Developers chose between the two or rolled their own AMIs from a provided template at any specific gem version.
- Enforced immutable infrastructure across all Peregrine deployments. AMIs or OCI images as application packaging, provisioned via CloudFormation, CDK, or Terraform, with EC2, ECS or Lambda providing compute. Patching cycles were not necessary for Peregrine teams; security updates shipped automatically with regular deployments. A centralized resource graph with custom YAML extensions (!Value, cross-pipeline lookups) made shared infrastructure composable across pipelines.
- Exposed the Peregrine toolchain as a versioned REST API (Grape/Ruby, Apirah): 14 resource modules covering ECS management, Jenkins pipeline state, DNS, Cloudflare, tagging, discovery, and infrastructure health. OAuth2 authentication with full audit logging to DynamoDB on every request. Per-request STS role assumption made multi-account operations transparent to callers. Jenkins pipelines, monitoring systems, and the Jenkins EC2 cloud plugin (responsible for build agent provisioning) all ran through this layer rather than shelling out to CLI tools directly.
- Designed Peregrine's configuration language: 17 custom YAML tags that turned deployment configs into live infrastructure queries.
!Ami and !Image resolved the current artifact at execution time from DynamoDB caches; !Priority queried ALB listener rules and calculated the next available slot, with cross-account STS assumption baked in; !Conditional evaluated runtime environment conditions to select scaling parameters. !Resource and !ComponentResource turned cross-pipeline dependency graphs into a single declarative line. Pipelines could express "I need the database from system X" without coordinating with the team that owned X. Per-team deployment scripts were replaced by composable, declarative YAML that knew the state of the infrastructure it was deploying into.
- Built an observability pipeline (Filebeat → Logstash → Kinesis → KCL → StatsD) processing millions of daily events. A custom Java KCL consumer emitted metrics to Datadog with full internal instrumentation: timestamp drift bucketed by age, parser-level throughput, and consumer lag tracked against a 5-minute threshold that triggered automatic ECS-managed recovery. Two paths ran nearly hands-off: the metrics path fed Datadog dashboards, and the archive path wrote full events to S3.
- Every feature of the platform was delivered in response to a request received through Jira. The abstractions were created as patterns emerged. For example, the first deployment behind an AWS Application Load Balancer carried a hard-coded Priority parameter. By the fifth, the !Priority YAML extension had been born.
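As a rough illustration of what a !Priority-style lookup has to compute, the sketch below picks the next free ALB listener rule priority. It is not Peregrine's code (which was Ruby): the function name, the `start` parameter, and the lowest-unused-integer strategy are assumptions. The set of priorities already in use is passed in directly; in practice it would come from something like an ELBv2 DescribeRules call.

```python
# AWS accepts listener rule priorities from 1 to 50000.
ALB_MAX_PRIORITY = 50_000


def next_priority(used: set[int], start: int = 1) -> int:
    """Return the lowest unused rule priority at or above `start`.

    `used` holds the priorities already taken on the listener.
    """
    for candidate in range(start, ALB_MAX_PRIORITY + 1):
        if candidate not in used:
            return candidate
    raise RuntimeError("no free listener rule priority")


# Usage: rules 1, 2 and 4 exist, so the next deployment takes slot 3.
slot = next_priority({1, 2, 4})
```

Resolving the slot at execution time rather than hard-coding it means two pipelines deploying behind the same listener do not need to agree on numbers in advance, though a real implementation would still have to handle two deployments racing for the same slot.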
Technical Founder
MozoMedia / Zumu | Mar 2013 – May 2014
Built a full warehouse management system from scratch as sole developer using Vaadin UI, Java EJB orchestration, and Oracle stored procedures. Hosted on AWS, it supported 60+ users across UK and US warehouses at ~$200K monthly turnover. The abstractions emerged from the warehouse floor, not the architecture diagram.
Technical Analyst / Developer
SilkRoute Global | Sep 2008 – Feb 2013
Joined the team operating the systems transferred from Handleman, eventually becoming the sole technical lead for the Oracle EBS suite managing Tesco's entertainment product operations (~£800M annual turnover in music, DVDs, and video games). Designed B2B integration layers and built an order management system processing 3M+ order lines at £40M+ turnover. Flown on-site to UK warehouses every peak season, Halloween through Christmas, to support operations directly on the floor. Delivered £1.5M+ in annual savings through logistics automation and EDI integration.
Business System Analyst
Handleman Company | Jan 2005 – May 2008
Functional/technical analyst for Oracle ERP customization at Handleman, which ran the second-largest Oracle EBS installation after Dell and held a seat on Oracle's performance advisory board during this period. Wrote the integration layer between Handleman's proprietary retail forecasting systems and Tesco's order pipeline. When Handleman sold those systems to Tesco in 2008, the transition team was named in the asset sale agreement; I stayed on as a Canadian employee through Handleman's wind-down (with a small group closing the books) before joining SilkRoute Global to operate the transferred systems.
Warehouse Lead
Handleman Canada | Jan 2003 – Jan 2005
Managed returns department operations with a team of 12. Introduced efficiency improvements and daily operational reporting.
Technologies
Languages: Ruby, Java, Go, Python, JavaScript, PL/SQL, bash
AWS: EC2, ECS, Lambda, ALB, CloudFormation, CDK, DynamoDB, S3, Kinesis, STS, SSM
Platforms: Jenkins, Cloudflare, Datadog, Terraform, Filebeat/Logstash
Frameworks: Grape, Laravel, Vue, Sinatra, Hugo, tree-sitter
Methods: Diátaxis, immutable infrastructure, documentation-as-code, YAML DSL design
Education
Electronics Engineering Technologist - Radio College of Canada, 2001–2003