Resume

Available in PDF format

SRE / Infra / Platform Engineer. I solve ambiguous production challenges across app, cloud, network, and security layers.

Overview #

Staff-level SRE/platform engineer with 15+ years of experience across regulated SaaS, cloud security, Kubernetes, Terraform/IaC, observability, and incident response. I specialize in ambiguous production problems that cross application, network, cloud, and security layers, then turn each fix into automation, better delivery pipelines, and stronger operating practices.

Projects #

Outcron (outcron.com): “Cron-as-a-Service”
lambda-deploy-log-compare (nat.sh/lambdalog): AWS Lambda invocation log analysis tool

Key Skills #

Reliability Engineering: Incident response, production troubleshooting, SLOs/SLIs, toil reduction, operational readiness, escalation handling
Platform Engineering: Kubernetes/EKS, Terraform, Helm, Argo CD, Flux, GitHub Actions, GitLab CI, Jenkins
Cloud & Infrastructure: AWS, GCP, distributed systems, service delivery, server consolidation, autoscaling, cost optimization
Observability: Prometheus, Grafana, Alertmanager, Thanos, Loki, Splunk, Elasticsearch
Security & Regulated Environments: FedRAMP Moderate/High, PCI-DSS, Cloudflare One, Fastly CDN/NGWAF, WireGuard, IPsec, OpenVPN
Software & Automation: Go, Python, Ruby, Bash, Git, API integration, cloud resource automation
AI-Augmented Engineering: Claude Code, OpenAI Codex, self-hosted LLMs, Ollama, AI-assisted automation workflows

Experience #

2026 / Block #

Cloud Security Engineer / Frisco, TX (remote)

Designed and delivered an enriched error reporting pipeline for a network ACL deployment system used for Square, Cash App, and others, propagating failure context across AWS Lambda functions via S3 and surfacing actionable alerts in Slack to reduce mean time to detect (MTTD).
Improved reliability of a proxyless security pipeline by implementing retry logic that hardens automated CA bundle and system registry API calls against transient failures.
Ramped up quickly on a complex security platform, shipping production-ready code across AWS Lambda, S3, Cloudflare One, and Fastly CDN/NGWAF services while completing all new-engineer onboarding.

2024 ~ 2025 / Cisco Systems #

Site Reliability Engineer / Frisco, TX (remote)

Architected the re-implementation of services on Kubernetes with Argo and Helm.
Achieved 99.99% uptime for FedRAMP-compliant environments at Moderate and High Impact Levels.
Streamlined deployment pipelines for over 150 component services using GitHub, Kubernetes, and Argo, reducing service onboarding time by more than 50%.

2022 ~ 2024 / Schmoll Systems LLC #

Founder / Principal / Frisco, TX (remote)

Built Go-based cloud resource management tooling to automate AWS/GCP provisioning for client environments.
Reduced client infrastructure costs by 30% through consolidation, autoscaling improvements, and platform simplification.
Led production incident response for client outages, resolving issues within SLA 98% of the time.
Mentored client teams on Kubernetes, Terraform, and operational practices to improve long-term platform ownership.

2020 ~ 2022 / Salesforce.com (MuleSoft) #

Site Reliability Engineer / Santa Fe, NM (remote)

Enhanced stability of FedRAMP Moderate GovCloud environments, achieving 99.9% uptime and uplifting them to FedRAMP High.
Automated incident remediation workflows, reducing manual interventions by more than 40%.
Collaborated with development teams to implement cloud-native monitoring with Prometheus and Grafana, improving availability of common Service Level Indicators (SLIs) and establishing useful Service Level Objectives (SLOs).
Mentored 3 junior engineers in advanced troubleshooting techniques, fostering a culture of proactive incident management.

2018 ~ 2020 / Subsplash #

Site Reliability Engineer / Santa Fe, NM (remote)

Migrated 20+ Go-based microservices from AWS EC2 instances to AWS EKS, reducing deployment time by 50% and standardizing deployments on Terraform, GitLab CI, and Helm.
Ensured PCI-DSS compliance for payment card processing systems, passing all audits with zero outstanding findings.
Oversaw and implemented infrastructure consolidation across 3 distinct acquisitions, unifying networking and systems and scaling infrastructure to handle 200% user growth.
Trained 10+ developers in Kubernetes best practices, enabling daily production deployments.

2013 ~ 2018 / Salesforce.com (Pardot) #

Site Reliability Engineer / Seattle, WA (remote)

Automated infrastructure deployments with Chef and Terraform, supporting 10+ daily application code deployments in a dynamic environment of more than 50 developers.
Ensured Salesforce Trust compliance, reducing security vulnerabilities by 25% through proactive monitoring with standard tooling.
Led cross-functional teams to optimize system performance, improving application response times by more than 50%.
Mentored junior SREs and built a scalable incident response framework.

Earlier Career #

ServiceNow, Performance Engineer (2012 ~ 2013): Optimized MySQL database performance for 1,000+ instances, improving query response times by more than 25%.
SAP Concur, Unix Systems Engineer (2007 ~ 2011): Managed 100+ Red Hat Linux systems and supported a Hadoop cluster for data mining; assisted with over 1,300 servers across multiple sites.
Breakwater Security Associates, Network Engineer (2005 ~ 2006): Responded to network and system outages under strict Service Level Agreements; assisted clients with system modifications and updates.

Education #

2005 ~ 2008 / University of Washington #

Bothell, WA

Bachelor of Science in Computing & Software Systems