Head of Kafka Platform


New York
Permanent
USD 200,000 - USD 500,000
Financial Technology
PR/575294_1773696200

Role Summary

We're seeking a visionary Head of Kafka Platform to lead the strategy, architecture, and evolution of our enterprise event‑streaming ecosystem. In this role, you will own the end‑to‑end Kafka platform, driving its scalability, resiliency, security, and automation, while setting the standards and long‑term roadmap for real‑time data movement across the organization. You will champion a cloud‑native, Kubernetes‑driven operating model, build world‑class self‑service capabilities, and partner with engineering, data, and infrastructure leaders to ensure the platform serves as a strategic backbone for mission‑critical applications.

This is a high‑impact technical leadership role for a seasoned Kafka expert who can combine deep platform engineering with architectural vision, operational excellence, and people mentorship.


Key Responsibilities

Platform Strategy & Ownership

  • Define and execute the multi‑year roadmap for Kafka as a first‑class enterprise platform.
  • Architect, scale, and operate large‑scale Kafka clusters across self‑managed and cloud‑hosted environments.
  • Drive architectural decisions around brokers, storage layers, multi‑AZ/region resiliency, replication models, and global data distribution.

Modern Deployment & Automation Leadership

  • Champion the operation of Kafka on Kubernetes using Operators, Helm, CRDs, and GitOps patterns.
  • Build robust automation frameworks (IaC, GitOps pipelines) enabling repeatable, zero‑downtime cluster operations.
  • Establish enterprise automation guardrails, golden patterns, and best‑practice templates.

Kafka Ecosystem Stewardship

  • Lead strategy and operational excellence for Kafka Connect, Schema Registry, MirrorMaker 2, and Cluster Linking.
  • Define connector standards, build self‑service capabilities, and partner with app teams to onboard new use cases with ease.

Reliability Engineering & Operational Excellence

  • Own SLO definitions, incident management, runbooks, and postmortem culture.
  • Drive resilience engineering across clusters, including automated remediation, failure testing, capacity modeling, and disaster‑recovery patterns.

Observability & Performance Leadership

  • Build an enterprise monitoring and telemetry framework for Kafka metrics, logs, traces, partition health, consumer lag, and storage forecasting.
  • Leverage Prometheus, Grafana, Burrow, Cruise Control, and OpenTelemetry to drive proactive visibility and optimization.

Security, Governance & Compliance

  • Establish and enforce security standards including encryption, mTLS, access controls, secrets management, and audit policies.
  • Partner with Security, Infrastructure, and Compliance teams to ensure the platform meets enterprise‑grade standards.

Organizational Enablement & Architecture Guidance

  • Serve as the primary Kafka subject‑matter authority for engineering, data, and SRE teams.
  • Provide architectural mentorship on schema evolution, retention strategies, partitioning models, exactly‑once semantics, and DLQ patterns.
  • Build documentation, best‑practice guides, onboarding materials, and internal training.

Technical Leadership & Team Development

  • Mentor engineers, contribute to hiring and talent development, and shape a high‑performance platform culture.
  • Influence enterprise streaming strategy through thought leadership, tooling decisions, and cross‑team design reviews.

Core Skills & Experience

Kafka Mastery

  • Deep expertise operating Kafka at scale, including broker internals, replication mechanics, controller behavior, ISR, rebalancing, and storage tuning.
  • Proven experience leading complex upgrades, scaling efforts, global clustering, and high‑availability design.

Kubernetes & Cloud‑Native Engineering

  • Extensive experience running stateful distributed systems on Kubernetes via Operators, Helm, and CRD‑based workflows.
  • Expertise in cloud‑native deployment models, containerization strategies, and GitOps lifecycle patterns.

Automation & DevOps

  • Strong proficiency with Terraform, GitOps tooling (Argo CD, Flux), CI/CD pipelines, and automated operational workflows.

Programming & Systems Foundations

  • Proficiency in Python, Go, or Java; strong Linux fundamentals, networking, JVM tuning, and Bash scripting.

Observability & Performance

  • Expertise in capacity planning, performance tuning, monitoring stacks, alerting systems, and operational readiness reviews.

Security & Governance

  • Hands‑on experience with TLS/mTLS, SASL/OAuth2, ACL/RBAC, and secret‑management solutions such as HashiCorp Vault.

Ecosystem Proficiency

  • Strong familiarity with Kafka Connect, Schema Registry, MirrorMaker 2/Cluster Linking, and Cruise Control.

Cloud Experience

  • Solid understanding of AWS, Azure, or GCP networking, IAM, and managed streaming services (e.g., Confluent Cloud, AWS MSK).

Preferred Qualifications

  • Experience with stream‑processing frameworks (Kafka Streams, Flink, Spark Structured Streaming).
  • Hands‑on expertise operating Strimzi or Confluent for Kubernetes in production.
  • Deep understanding of CDC technologies (e.g., Debezium) and high‑volume connector operations.
  • Experience designing multi‑region architectures, active‑active replication, and disaster‑recovery strategies.

Locations

  • New York, NY
  • Chicago, IL

