Senior Kafka Platform Engineer
Senior Software Engineer - Event Streaming Platform
Locations: Chicago or New York
We are looking for a skilled software engineer with extensive experience in Apache Kafka to help advance a modern event-streaming ecosystem. This position suits an individual who blends strong development expertise with a deep understanding of distributed systems and messaging platforms.
In this role, you will build and enhance the platform services, automation, and developer tools that support a scalable streaming infrastructure. You'll combine hands-on engineering with operational rigor to deliver a secure, reliable, and easy-to-use platform for internal teams.
Key Responsibilities
- Build and maintain internal applications, APIs, and automation to streamline Kafka cluster provisioning, access controls, topic management, and operational processes.
- Design and manage production Kafka environments (self-hosted or managed services such as Confluent or cloud-native equivalents), including upgrades, scaling strategies, disaster recovery planning, and performance optimization.
- Deploy and operate Kafka workloads in Kubernetes environments using tools like Helm, operators, and GitOps workflows to ensure consistent, repeatable infrastructure delivery.
- Implement and support components such as Kafka Connect, schema management services, and cross-cluster replication technologies; standardize connector usage and promote reusable patterns.
- Strengthen platform reliability through service level objectives (SLOs), incident management practices, automated recovery mechanisms, and clear operational documentation.
- Develop comprehensive observability solutions, including metrics, logging, tracing, lag monitoring, and capacity reporting dashboards.
- Ensure platform security through encryption, authentication mechanisms, access controls, and compliance-focused automation practices.
- Promote best practices for event streaming, including topic structure, partitioning strategies, schema evolution, message ordering, and fault-tolerant processing approaches.
- Collaborate closely with application engineers, data teams, and infrastructure groups to improve adoption and usability of the platform.
- Contribute to technical leadership by mentoring team members and shaping platform direction, standards, and roadmap.
Required Qualifications
- Strong collaboration and communication skills when working with engineering and platform teams.
- Significant hands-on experience operating Kafka at scale in production (including cluster internals, replication, partitioning, and recovery processes).
- Solid software engineering background, with a track record of delivering reliable production systems.
- Experience running stateful workloads in Kubernetes environments.
- Proficiency in infrastructure automation using Terraform, Helm, GitOps tools (e.g., Argo CD or Flux), and CI/CD pipelines.
- Programming experience in at least one of the following: Python, Go, or Java, along with command-line scripting and Linux fundamentals.
- Experience implementing monitoring, alerting, and performance tuning solutions for distributed systems.
- Familiarity with security practices including encryption, identity/authentication protocols, and secrets management solutions.
- Knowledge of Kafka ecosystem tools such as connectors, schema services, and replication frameworks.
- Experience working in public cloud environments and an understanding of networking and access control concepts.
- Proven experience in incident response, operational readiness, and driving improvements through post-incident reviews.
Preferred Qualifications
- Background in building developer platforms or self-service infrastructure solutions.
- Experience with stream-processing technologies such as Kafka Streams, Flink, or Spark Streaming.
- Familiarity with Kubernetes-native Kafka solutions (e.g., operator-based deployments).
- Knowledge of change data capture (CDC) tools and scalable database integration patterns.
- Experience designing multi-region architectures and implementing disaster recovery strategies.