Maksym Puzik — Lead Platform Engineer | DevOps | MLOps

Maksym
Puzik

Platform · MLOps · DevSecOps · Remote

I build the platforms that let engineering teams ship faster, safer, and cheaper. 9+ years across AWS, GCP, and Azure — from bare-metal networking to ML pipeline orchestration. I've cut cloud bills by up to 50%, reduced release cycles by 45%, and built observability systems processing 4 trillion datapoints — while keeping security and compliance non-negotiable.

FinOps

−50% cloud cost

Cutting Cloud Spend in Half Without Touching SLAs

Multi-cloud environment · AWS + GCP · Fintech workloads

Infrastructure costs were growing faster than revenue. Ran a systematic FinOps audit: Spot Instance migration for stateless workloads, Savings Plans for baseline compute, rightsizing 200+ EC2 instances, and architectural consolidation of redundant services. Zero SLA degradation, zero prod incidents during migration.

30–50% cost reduction 0 SLA breaches 3 cloud providers

DORA Metrics

−45% release cycle

From Weekly Releases to Daily Deploys

Multi-team product org · GitLab CI → GitHub Actions + ArgoCD

Release cycles were slow and error-prone — manual steps, inconsistent environments, and deployment anxiety. Rebuilt pipelines end-to-end: standardized environments with Terraform, introduced progressive delivery (blue-green + canary), rolled out GitOps with ArgoCD. Change failure rate dropped below 2%, MTTR under 10 minutes.

45% faster releases <2% failure rate 70% less manual work

Observability

4T+ datapoints

Building Observability That Actually Works at Scale

High-throughput platform · VictoriaMetrics + Prometheus + Loki + Grafana

The existing monitoring was fragmented — infra metrics in one place, app metrics in another, business metrics nowhere. Built a unified observability stack processing 4M+ metrics/min with 4+ trillion datapoints stored, automated incident routing to PagerDuty, and custom dashboards for engineering and business stakeholders alike.

4M+ metrics/min 4T+ datapoints 1 unified stack

Security & Compliance

SOC 2 ready

Shifting Security Left Across the Entire SDLC

Multi-team engineering org · Fintech compliance requirements

Security was reactive — vulnerabilities found in prod, secrets occasionally committed to Git, IaC configs drifting from policy. Embedded tfsec, Checkov, Snyk, and TFLint as mandatory CI/CD gates. Standardized secrets handling with Vault + AWS Secrets Manager. Centralized SSO via SAML 2.0/OAuth2. Supported full SOC 2 readiness audit end-to-end.

100% IaC policy gates SOC 2 compliant 0 secrets in Git

ML Infrastructure

ML team unblocked

ML Platform That Data Scientists Actually Use

Big Data domain · Airflow MWAA + MLflow + Argo Workflows + Kafka + Spark

Data science team was bottlenecked on infra — model training jobs competing for resources, no reproducible environments, retraining pipelines requiring manual intervention. Built end-to-end ML infrastructure: managed Airflow for orchestration, MLflow for experiment tracking and model registry, Argo Workflows for scalable training, Kafka + Spark for feature pipelines.

0 manual retrain steps Full experiment tracking Auto scaling training

Reliability

RTO < 10 min

DR Strategy That Survives a Region Outage

Multi-region AWS architecture · Fintech & eCommerce

The DR plan existed on paper but had never been tested. Designed and implemented a real multi-AZ, multi-region architecture with automated failover, database replication, and runbook-driven recovery. Conducted regular DR drills. Achieved RTO under 10 minutes and near-zero RPO — tested, not estimated.

<10 min RTO ~0 RPO Tested not estimated

Maksym
Puzik

About

Industries

Success Stories

Experience

Skills

Education & Certifications

Education

Languages

Certifications & Training

MaksymPuzik