

Lead Data Engineer - Pipelines, Spark Streaming and Spark Offline

Join us as we embark on a journey of collaboration and innovation, where your unique skills and talents will be valued and celebrated.

Together we will create a brighter future and make a meaningful difference.

As a Lead Data Engineer at JPMorganChase within the Commercial & Investment Bank, you are an integral part of an agile team that works to enhance, build, and deliver data collection, storage, access, and analytics solutions in a secure, stable, and scalable way.

As a core technical contributor, you are responsible for maintaining critical data pipelines and architectures across multiple technical areas within various business functions in support of the firm's business objectives.

Job responsibilities


* Collaborate with all of JPMorgan's lines of business and functions to deliver software solutions


* Experiment with, architect, develop, and productionize efficient data pipelines, data services, and data platforms that contribute to the business


* Design and implement highly scalable, efficient, and reliable data processing pipelines, and perform analysis and generate insights to drive and optimize business results


* Design and develop features and entities for ML and rules using Spark or any big data environment


* Act on previously identified opportunities to converge physical, IT, and data security architecture to manage access


* Apply reuse-first, AI-assisted practices within delivery and operational routines (e.g., backup/recovery validation and access control review support), ensuring traceability/auditability and alignment to resiliency and security expectations

Required qualifications, capabilities, and skills


* Formal training or certification on Data Engineering concepts and 5+ years applied experience


* Demonstrated experience using enterprise-authorized AI capabilities within the work environment to support data engineering workflows with strong validation habits and awareness of data sensitivity


* Ability to review and validate AI-assisted outputs (e.g., model/design summaries or operational checklists) before use, escalating when uncertain and following data handling requirements


* Proficient programming skills in Python and PySpark


* Experience across the data lifecycle, including building data frameworks and working with data lakes


* Experience with batch and real-time data processing using Spark or Flink, and with batch and real-time feature engineering using Spark, Flink, or Databricks


* Working knowledge of AWS Glue and EMR for data processing, and of real-time data processing and features using Flink, Databricks Delta Live Tables, or Spark Streaming


* Experience working with Databricks and Delta Live Tables


* Experience building services using Glue, Lambda, EMR, or Flask, and deploying them on AWS EKS or Kubernetes


* Working experience with both relational and NoSQL databases


* Experience in ETL data pipelines, covering both batch and real-time data processing, Data war...



