Data Engineer III - Python / Spark / Data Lake

Be part of a dynamic team where your distinctive skills will contribute to a winning culture and team.

As a Data Engineer III - Python / Spark / Data Lake at JPMorgan Chase within the Consumer and Community Bank - Connected Commerce Technology, you will be a seasoned member of an agile team, tasked with designing and delivering reliable data collection, storage, access, and analytics solutions that are secure, stable, and scalable.

Your responsibilities will include developing, testing, and maintaining essential data pipelines and architectures across diverse technical areas, supporting various business functions to achieve the firm's business objectives.

Job responsibilities

* Supports review of controls to ensure sufficient protection of enterprise data
* Advises on and makes custom configuration changes in one to two tools to generate a product at the business or customer's request
* Updates logical or physical data models based on new use cases
* Frequently uses SQL and understands NoSQL databases and their niche in the marketplace
* Adds to team culture of diversity, opportunity, inclusion, and respect
* Develops enterprise data models and designs, develops, and maintains large-scale data processing pipelines and infrastructure
* Leads code reviews and provides mentoring through the process
* Drives data quality and ensures data is accessible to analysts and data scientists
* Ensures compliance with data governance requirements and keeps data engineering practices aligned with business goals

Required qualifications, capabilities, and skills

* Formal training or certification on data engineering concepts and 3+ years of applied experience
* Experience across the data lifecycle, advanced experience with SQL (e.g., joins and aggregations), and a working understanding of NoSQL databases
* Experience with statistical data analysis and the ability to determine appropriate tools and data patterns to perform analysis
* Advanced proficiency in at least one programming language, such as Python, Java, or Scala
* Advanced proficiency in at least one cluster computing framework, such as Spark, Flink, or Storm
* Advanced proficiency in leveraging Gen AI models from providers such as Anthropic, OpenAI, or Google using their APIs/SDKs
* Advanced proficiency in Gen AI SDKs such as LangChain, LangGraph, and LangSmith
* Advanced proficiency in at least one cloud data lakehouse platform such as AWS data lake services, Databricks, or Hadoop; at least one relational data store such as Postgres, Oracle, or similar; and at least one NoSQL data store such as Cassandra, Dynamo, MongoDB, or similar
* Advanced proficiency in at least one scheduling/orchestration tool such as Airflow, AWS Step Functions, or similar
* Proficiency in Unix scripting, data structures, data serialization formats such as JSON, Avro, Protobuf, or similar, big-data storage formats such as Parquet, Iceberg, or similar, data processing methodologies...
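As a rough illustration of the Python / Spark / data-lake work described in this posting, a minimal PySpark batch job might read raw data from a lake, join and aggregate it with SQL-style operations, and write partitioned Parquet back out. All bucket paths, dataset names, and columns below are hypothetical examples, not actual firm systems.

```python
# Illustrative sketch only: a minimal PySpark batch job in the spirit of the
# Python / Spark / data-lake stack named above. All paths, dataset names,
# and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-order-aggregation").getOrCreate()

# Hypothetical raw (JSON) and curated (Parquet) zones of a data lake.
orders = spark.read.json("s3://example-raw-zone/orders/")
customers = spark.read.parquet("s3://example-curated-zone/customers/")

# SQL-style join and aggregation: daily order totals per customer segment.
daily_totals = (
    orders.join(customers, on="customer_id", how="inner")
    .groupBy(F.to_date("order_ts").alias("order_date"), "segment")
    .agg(
        F.sum("amount").alias("total_amount"),
        F.count("*").alias("order_count"),
    )
)

# Write back to the lake as Parquet, partitioned by date for downstream consumers.
(
    daily_totals.write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-curated-zone/daily_order_totals/")
)
```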



