Machine Learning Engineer

Shanghai

Permanent

Negotiable

Quantitative Analytics Research and Trading

PR/595853_1781749041

Responsibilities

Architecting and developing the next generation of the company's machine learning research platform, with an emphasis on scalability, reliability, observability, and reproducibility
Building infrastructure that enables large-scale experimentation, model training, and simulation across on-premises HPC and multi-cloud environments
Partnering closely with quantitative researchers to understand evolving research workflows and translate them into robust platform capabilities
Designing and optimizing distributed training pipelines for high-throughput, GPU-accelerated workloads
Improving experiment management, model versioning, artifact tracking, and data lineage to ensure transparent and reproducible research
Developing tools and frameworks that streamline feature engineering, dataset generation, and large-scale backtesting
Leading initiatives to improve compute efficiency, resource scheduling, and workload isolation across heterogeneous environments
Enhancing platform observability, including metrics, logging, tracing, and debugging capabilities tailored to ML workloads
Supporting rapid iteration by implementing features and fixes on tight timelines while maintaining high engineering standards
Contributing to long-term architectural decisions that enable the platform to scale with increasing data volumes and model complexity

Qualifications

2+ years of experience designing and building large-scale distributed systems, ideally in support of research or data-intensive workloads
Strong programming experience in Python, with a focus on writing clean, maintainable, and high-performance code
Experience developing and operating applications on Linux-based HPC clusters and/or cloud platforms
Solid understanding of distributed computing concepts, parallel processing, and resource management
Experience with GPU-based workloads and familiarity with modern ML frameworks (e.g., PyTorch, TensorFlow, JAX)
Experience optimizing data pipelines and handling large-scale structured and unstructured datasets
Strong troubleshooting skills with the ability to debug complex, cross-layer system issues
Ability to work independently in a fast-paced, research-driven environment
Strong communication skills and experience collaborating directly with researchers or data scientists

FAQs

I have applied for a role. What happens next?

Congratulations! We understand that taking the time to apply is a big step. When you apply, your details go directly to the consultant who is sourcing talent for the role. Due to demand, we may not be able to get back to every applicant. However, we always keep your resume and details on file, so when we see similar roles, or roles that call for skill sets like yours, we will reach out to discuss opportunities.

This role doesn’t 100% suit my next professional move. Can you help?

Yes. Even if this role isn’t a perfect match, applying allows us to understand your expertise and ambitions, ensuring you're on our radar for the right opportunity when it arises.

We also work in several ways. First, we advertise available roles on our site, although confidentiality often means we cannot post all of them. We also work with clients who are more focused on skills and on understanding what is required to future-proof their business.

That's why we recommend registering your resume so you can be considered for roles that have yet to be created.

Can you help with resume preparation, interviews, and offer negotiation?

Yes. From customized support on optimizing your resume to interview preparation and compensation negotiation, we advocate for you throughout your next career move.