High Performance Computing Specialist - PR/586098_1774879291

High Performance Computing Specialist

Chicago

Permanent

CAD140000 - CAD180000

Financial Technology

PR/586098_1774879291

An elite Montreal-based trading firm is seeking an HPC Systems Specialist to join a team responsible for designing and operating high-performance GPU platforms that support advanced AI and machine learning workloads. This role sits at the intersection of infrastructure engineering, distributed systems, and performance tuning, with ownership spanning from physical hardware through large-scale model serving. You will work closely with ML practitioners and infrastructure peers to build reliable, scalable, and highly optimized compute environments.

What You'll Do

Build, operate, and continuously improve GPU-based compute platforms supporting large-scale inference and ML workloads
Design and deploy distributed model serving architectures across multi-node, multi-GPU environments
Operate and evolve Kubernetes clusters with GPU scheduling for AI and ML use cases
Configure and tune networking components such as load balancers, firewall rules, and high-throughput interconnects for GPU clusters
Develop and optimize storage solutions for model artifacts, checkpoints, and inference caches
Diagnose and resolve performance and stability issues across hardware, drivers, networking, and application layers
Partner with ML engineers to benchmark models, analyze performance characteristics, and apply inference acceleration strategies
Evaluate new GPU hardware, serving frameworks, and infrastructure patterns to improve efficiency and scalability
Improve system reliability through observability, alerting, capacity planning, and on-call/incident response processes
Automate provisioning and lifecycle management using infrastructure-as-code and scripting

What You Bring

Bachelor's or Master's degree in Computer Science, Engineering, or a related discipline
5+ years of experience managing high-performance computing environments
Hands-on experience operating GPU compute environments for ML inference or training
Familiarity with modern model serving frameworks (e.g., vLLM, SGLang, or similar) and GPU driver/runtime management
Strong Linux systems expertise, including networking, storage, and kernel-level performance considerations
Practical experience running GPU workloads on Kubernetes at scale
Experience with infrastructure automation tools such as Terraform, Ansible, or equivalent
Solid understanding of distributed systems concepts, networking fundamentals (TCP/IP, HTTP/2), and load-balancing strategies
Proficiency in Python and shell scripting for tooling and automation
Experience with monitoring and observability platforms such as Prometheus, Grafana, or comparable tools

This is a hybrid role based in the firm's Montreal office, requiring 3 days per week onsite and 2 days remote.

FAQs

I have applied for a role, what happens next?

Congratulations! We understand that taking the time to apply is a big step. When you apply, your details go directly to the consultant who is sourcing talent. Due to demand, we may not be able to respond to every applicant. However, we always keep your CV and details on file, so when we see similar roles or skillsets that drive growth in organisations, we will reach out to discuss opportunities.

This role doesn’t 100% suit my next professional move, can you help?

Yes. Even if this role isn’t a perfect match, applying allows us to understand your expertise and ambitions, ensuring you're on our radar for the right opportunity when it arises.

We work in several ways. First, we advertise available roles on our site, although for confidentiality reasons we may not post all of them. We also work with clients who are more focused on skills and on understanding what is required to future-proof their business.

That's why we recommend registering your CV so you can be considered for roles that have yet to be created.

Can you help with resume preparation, interviews, and offer negotiation?

Yes, we help with CV and interview preparation. From customised support on how to optimise your CV to interview preparation and compensation negotiations, we advocate for you throughout your next career move.
