Brett Michaelis

[email protected] | 801-310-2818 | Orem, UT 84057 | linkedin.com/in/brettmichaelis
Summary
MLOps & Infrastructure Engineer with 10+ years of experience building and operating cloud-native systems at scale, including hands-on design of data pipelines and GPU-accelerated CUDA compute infrastructure for production ML workloads. Skilled in Kubernetes, Terraform, and CI/CD automation across GCP and AWS, with a track record of working closely with ML engineers and data scientists to bridge research requirements and production infrastructure. Strong observability foundation (Prometheus, Grafana, Mimir) and a bias toward automation: replacing manual processes with reliable, repeatable systems that let ML teams ship and iterate with confidence.
Core Skills
Experience
Operations Engineer
Smarty.com | Orem, UT
  • Lead migration of a legacy Grafana observability platform to a GitOps-managed deployment, auditing and rationalizing all alerting across production services as part of the initiative.
  • Operate observability stack using Prometheus, Grafana, Mimir, and Alloy for metrics collection, long-term storage, and dashboarding.
  • Implement canary deployments via Nomad for progressive production rollouts, enabling confident releases with automated rollback.
  • Drive company-wide migration from Bitbucket to GitHub, including full re-implementation of all CI/CD workflows in GitHub Actions.
  • Manage multi-cloud deployments (Tier.Net, UpCloud, GCP, AWS, Hetzner) with Terraform, Nomad, and Bitbucket Pipelines, improving uptime and deployment velocity.
  • Automate repetitive workflows with Bash and Go, reducing manual toil across operations.
Senior DevOps Engineer
Five9.com
  • Orchestrated deployments on GCP using Kubernetes, Helm, and Terraform to support high-availability SaaS workloads.
  • Built self-service deployment tooling and automation, enabling engineering teams to provision and release independently while preserving platform standards.
  • Partnered with product and engineering teams to define and track SLOs/SLIs, supporting customer-facing uptime goals.
  • Streamlined incident response with Five Whys root-cause analysis and blameless postmortems, improving on-call processes and reliability.
Software Engineer / DevOps Engineer
Vivint SmartHome
  • Designed and operated GCP-based ML pipeline infrastructure supporting data science and ML engineering teams, including large-scale data ingestion, processing, and storage across a 1.5 PB data lake.
  • Provisioned and managed GPU-accelerated CUDA compute clusters for ML model training workloads, optimizing resource allocation and job throughput.
  • Collaborated directly with ML engineers and data scientists to translate research requirements into scalable, production-ready infrastructure.
  • Developed and deployed Go-based microservices, optimizing performance and reducing latency.
  • Automated infrastructure operations with SaltStack and Jenkins, establishing observability via the TICK stack across ML and application workloads.
Director, IT & Software Development
Unicity International
  • Led global infrastructure modernization, migrating legacy apps to containerized, cloud-native environments.
  • Standardized AWS deployments (EC2, S3) to improve scalability and global availability.
  • Introduced reliability practices, including error budgeting and deployment automation.
Assistant Director, Web Development
Utah Valley University
  • Directed university-wide web development projects, improving service reliability and scalability for mission-critical systems.
Counterintelligence Agent
U.S. Army – Utah National Guard
  • Conducted secure intelligence operations, leveraging structured incident response and after-action review (AAR) methods.
Education
Bachelor of Science: Information Systems
Utah Valley University | Orem, UT