

Data Engineer [Multiple Positions Available]

DESCRIPTION:

Duties: Perform solution architecture, and design and develop data ingestion processes for Machine Learning pipelines.

Evaluate new and current technologies using emerging model feature engineering standards and frameworks.

Provide technical guidance and direction to support the business and its technical teams, contractors, and vendors.

Contribute to the engineering community as an advocate of firm-wide data frameworks, tools, and practices in the AI and ML Development Life Cycle.

Influence peers and project decision-makers to consider the use and application of leading-edge technologies.

Apply advanced analytics techniques to identify, analyze, and interpret trends or patterns in complex data sets enabling superior machine learning model outcomes.

Innovate new ways of managing, transforming, and validating machine learning model outputs.

Establish and enforce guidelines to ensure consistency, quality, and completeness of machine learning feature data assets.

Act as a coach and mentor to team members on their assigned project tasks.

Develop a cohesive MLOps and DataOps pipeline to ensure scalability, reliability and resiliency.

Conduct product work reviews with team members.

QUALIFICATIONS:

Minimum education and experience required: Bachelor's degree in Electronic Engineering, Computer Engineering, Computer Science or related field of study plus 7 years of experience in the job offered or as Data Engineer, IT Project Architect, IT Consultant, Application Developer, Software Engineer, or related occupation.

Skills Required: This position requires seven (7) years of experience with the following: utilizing Data Lake and Delta Lake Management Architecture for AI and ML enablement; designing and implementing data lake management architecture for AI-driven solutions, including both traditional Data Lakes and Delta Lakes for optimized data storage and processing; technology, big data analysis, and ML features domain consulting; analyzing, designing, and conducting proofs of concept (POCs) to validate architectural decisions and data strategies; delivering incremental solutions using an Agile approach, ensuring continuous integration and delivery; implementing transformations on big data platforms, Python, PySpark and Scala programming languages, including NoSQL databases, Teradata, DB2, Hadoop, Snowflake and SAS BI tools with a focus on leveraging Delta Lake for ACID transactions and scalable data processing.

This position requires five (5) years of experience with the following: utilizing Databricks and AWS and Azure data processing tools to support ML model training; utilizing data transformation tools including AWS Glue, EMR, EKS, Redshift, MSK (Managed Streaming for Apache Kafka), AWS Kinesis, and Databricks for collaborative data engineering and machine learning workflows; handling terabyte-sized datasets with multi-threading in PySpark on cloud platforms, utilizing Databricks for enhanced performance and scalability; utiliz...



