Data Engineer II - Python/Spark/AWS

You thrive on diversity and creativity, and we welcome individuals who share our vision of making a lasting impact.

Your unique combination of design thinking and experience will help us achieve new heights.

As a Data Engineer II - Python/Spark/AWS at JPMorgan Chase within the Corporate Sector AI/ML Data Platform's Fusion Data Onboarding and Enablement team, you are part of an agile team that works to enhance, design, and deliver the data collection, storage, access, and analytics solutions in a secure, stable, and scalable way.

As an emerging member of a data engineering team, you execute data solutions through the design, development, and technical troubleshooting of multiple components within a technical product, application, or system, while gaining the skills and experience needed to grow within your role.

Job responsibilities


* Design and build scalable, high-performance, and reliable data pipelines.


* Gather, analyze, model, and transform datasets to extract valuable insights from a large and diverse pool of both structured and unstructured data.


* Organize, update, and maintain gathered data to keep it actionable.


* Provide technical expertise in designing and implementing solutions related to data delivery.


* Ensure adherence to data governance principles, implement data quality checks, and maintain data lineage throughout the data lifecycle.


* Collaborate with cross-functional teams to gather business requirements and translate them into effective database designs and data flows.


* Prepare accurate documentation on database design, data flow architecture, and pipeline orchestration.


* Demonstrate basic knowledge of data system components to determine the controls needed to ensure secure data access.


* Make custom configuration changes in one or two tools to deliver a product at the business's or customer's request.

Required qualifications, capabilities, and skills


* Formal training or certification on software engineering concepts and 2+ years of applied experience.


* Basic knowledge of the data lifecycle and data management functions.


* Proficiency in SQL, ETL, data modeling, and Python.


* Hands-on experience building data pipelines with Python and PySpark.


* Strong database skills and a thorough understanding of data modeling concepts.


* Advanced SQL skills (e.g., joins and aggregations).


* Working understanding of NoSQL databases.


* Significant experience with statistical data analysis and the ability to choose appropriate analysis tools.


* Basic knowledge of data system components and the controls needed to secure them.

Preferred qualifications, capabilities, and skills



* Knowledge of Apache Iceberg.


* Knowledge of AWS and relevant services like S3, Glue.


* Knowledge of pipeline orchestrators like Airflow, Argo.


* Knowledge of version control systems like GitHub.
...



