Job Description


Job Board - Data Umbrella

Johnson & Johnson

Manager, Machine Learning and Data Engineering

Titusville, New Jersey, United States / Full Time

June 6


Title: Manager, Machine Learning & Data Engineering

Location of Position: Titusville, NJ

The Manager, Machine Learning & Data Engineering is responsible for developing and deploying data engineering solutions in support of our End-to-End patient engagement data science initiatives. The Manager will be involved in the full life-cycle development and support of leading-edge data pipelines that enable analytics and drive organizational action across the Janssen portfolio of brands. The Manager will be responsible for designing, configuring, implementing, and supporting specific data pipelines and feature sets, including highly governed and sensitive data assets.

The Manager, Machine Learning & Data Engineering will collaborate with internal data engineers, data scientists, and our data science implementation partners to understand the business requirements that drive patient data integrations and technical solutions. This is a hands-on data engineering role that will support our large-scale data ecosystem, which incorporates external data hosting solutions and internal AWS cloud computing data platforms. The scope of this role includes enriching and transforming data to support Janssen data scientists and analytic teams, partnering closely with Data Science and Advanced Analytics teams to build and maintain data pipelines for predictive/ML models, and collaborating across functions to ensure licensing and privacy compliance.

As a member of the data engineering team, the Manager, Machine Learning & Data Engineering will also be responsible for understanding Janssen's existing data science and data engineering pipelines and for providing suggestions to optimize the architecture and performance across various projects and initiatives. The Manager will be responsible for supporting the data engineering platform (infrastructure, codebase, and data processing), collaborating with internal and external partners, and managing the expectations of End-to-End Patient Engagement partners. She/he will also work closely with our sales data processing Data Stewardship teams to meet machine learning, advanced analytics, and forecasting data needs.

Major Duties & Responsibilities

End-to-End patient data engineering & data science platform support

  • Understand Janssen End-to-End Patient Engagement data, business use cases, and functional requirements to enable and support data engineering pipelines.
  • Act as a domain specialist in Data Engineering technologies and bring outstanding, innovative ideas to develop, test, and measure performance and impact of initiatives.
  • Understand existing external health data hosting solutions and internal data engineering pipelines related to Janssen’s End-to-End Patient Engagement brands, provide input and suggestions to secure and optimize the data architecture, and provide recommendations on user accessibility and performance improvement.
  • Audit data access and ensure that access is transparent to the Patient Data Governance Council.
  • Integrate de-identified data from an external vendor into Janssen’s AWS infrastructure.

Drive the design and build of new data pipelines & feature engineering layer

  • Apply data modeling, data engineering and feature engineering principles to support data science requirements and supply raw, curated, and processed data for machine learning engineers and data scientists.
  • Collaborate with other data engineers, ML professionals, business users, and partners from multiple therapeutic areas to capture learnings and synergies as they arise.
  • Own the development and implementation of data engineering and feature engineering pipelines for predictive models and model tracking.

Lead the design and implementation of an elite data engineering platform

  • Collaborate with data engineers and data scientists to build scalable data engineering and data science solutions using the AWS platform (S3, EC2, EMR, Amazon Redshift), PySpark, Python, and Dataiku.
  • Assist in developing architectural models for cloud-based data engineering solutions leveraging AWS technologies and PySpark to support a large-scale, high-performance data science and machine learning platform.
  • Collaborate with JJIT to ensure data license and privacy compliance.

Innovation and leadership

  • Provide thought leadership by researching standard methodologies, conducting experiments, and collaborating with industry leaders.
  • Work in multi-functional agile teams to implement proofs of concept (POCs), iterate, and deliver on business goals and objectives.
  • Cultivate innovation by improving ML model performance, efficiency, and infrastructure, and through experimentation and testing.



Knowledge, Skills and Abilities:

  • Bachelor’s degree in Computer Science, Computer Information Systems, Business Information Systems, Informatics, or related field with 3-5+ years of experience.
  • Proven experience as an architect, designing and implementing large-scale data engineering solutions in a fast-paced environment.
  • Proven experience in building data engineering pipelines in the cloud to support ML projects using PySpark and Python.
  • Demonstrable experience in data modeling, data access, and data storage techniques in a cloud environment.
  • Proficient in Python, Spark, EMR, EC2, and Redshift technologies.
  • Expert knowledge of a collaborative data science platform such as Dataiku is preferred.
  • Solid background and experience in the healthcare/life sciences field is highly beneficial.
  • Strong business and data analysis skills with a focus on patient engagement and commercial pharmaceutical operations are highly preferred.
  • Good track record of translating business requirements into technical designs for new technology solutions.
  • Ability to provide data pipeline implementation guidance based on standard methodologies throughout the life cycle of the project.
  • Demonstrated good leadership capabilities through technology solution ownership and adoption.
  • Understands the value of teamwork, communicates excellently, and establishes relationships with a diverse set of partners.
  • Demonstrated technical innovation and experimentation with emerging solutions in alignment with the project roadmap.

Preferred Knowledge, Skills and Abilities:

  • Knowledge of commercial and patient Life Sciences data sets (Specialty Pharmacy, Claims, Medical Payer, Patient Hub, EMR, Market Research)
  • Exposure to cloud technologies such as AWS Glue, containers, Lambda functions, and serverless architecture.

