We are seeking a Senior Data Engineer to spearhead the development of our in-house data analytics platform within the PARIS 2.0 ecosystem. The ideal candidate will be instrumental in delivering robust, sustainable, and flexible analytics solutions, including Machine Learning capabilities, in support of our strategic goals.
Key Responsibilities
Data Solutions Implementation
- Develop and maintain cloud-based data lakes and warehouses in accordance with best practices.
- Collaborate with the development team to design and deliver back-end data pipeline components that adhere to industry standards and architectural guidelines.
- Design, develop, and maintain efficient ETL/ELT data pipelines that ingest data from a variety of internal and external sources.
- Gather business requirements and translate them into data models that ensure data quality, integrity, and performance.
- Perform thorough testing and validation of data pipelines to guarantee accuracy and consistency.
High-Quality Analytics Solutions
- Partner with data scientists and analysts to understand their needs and develop suitable solutions.
- Diagnose data-related issues, conduct root cause analysis, and implement preventive measures.
- Clearly document architectures, data dictionaries, data mappings, and other relevant technical information.
Team Collaboration and Communication
- Stay updated on emerging technologies, tools, and best practices in data engineering to improve existing processes and systems.
- Work collaboratively with onshore and offshore IT team members to ensure the delivery of high-quality, supportable solutions.
- Communicate effectively with stakeholders at all levels to inform design decisions.
Internal and External Relationships
- Internal: Development Teams (HK & India), Ops & Support (HK & India).
- External: Third-party vendors for data source integration and external development teams.
Experience and Qualifications
Essential:
- More than 6 years of experience in a Data Engineering role with a focus on SQL and Python.
- In-depth understanding of data lake and data warehouse design principles.
- Practical experience with cloud-based ETL and orchestration services (e.g., AWS Glue, EMR, Redshift, Airflow).
- Experience in deploying and managing ML workflows on MLOps platforms (e.g., AWS SageMaker).
- Familiarity with distributed computing systems (e.g., Spark, Hive, Hadoop).
- Proficiency in relational databases such as PostgreSQL, MySQL, and Oracle.
- Excellent communication skills in English, both verbal and written.
Desirable:
- Experience with additional cloud platforms and hybrid cloud infrastructures (e.g., GCP, Azure).
- Knowledge of Machine Learning and Deep Learning concepts.
- Proficiency in real-time and near-real-time data streaming technologies (e.g., Kafka, Spark Streaming, Pub/Sub).