The Data Science and Analytics COE is responsible for leading the creation and development of the overall strategy and direction of data science and advanced analytics at CDW – including ensuring continuity and seamless extension of existing programs, the development of a short- and long-term vision and roadmap, and defining and institutionalizing the role that data and analytics play throughout the organization as the fuel that drives and shapes CDW’s priorities and serves as an accelerant for CDW’s progress.
The ML Engineer is a key player in the Data Science & Analytics team. This role is responsible for data engineering and for the deployment, testing, and management of data science models across the end-to-end ML and data pipeline, including data products. The role will leverage CDW’s AI labs environment to enable delivery in a common data lake and data products.
Reporting to the Sr Manager AI Engineering & Architecture of Data Science and Analytics, the ML Engineer must have data infrastructure, data engineering, and machine learning skills; a proven track record of leading and scaling data pipelines and ML model deployments in cloud, on-prem, and big data environments; and strong operational skills to drive efficiency and speed. In addition, strong technical leadership skills are required, with a vision for how data science can proactively improve the company.
Key Areas of Responsibility
- Build and manage end-to-end data pipelines and operations, from ingestion and integration through delivery, for data science prototypes and data products.
- Write queries and reports and present findings; analyze large, complex datasets to extract insights and decide on the appropriate technique.
- Understand and use data and ML fundamentals, including data structures, algorithms, computability and complexity, and computer architecture.
- Collaborate with data engineers to build data and model pipelines, and manage the infrastructure and data pipelines needed to bring code to production.
- Provide support to engineers and product managers in implementing machine learning in the product.
- Drive the design, building and launching of new data models and ML/Data pipelines in production.
- Identify, analyze, and interpret trends or patterns in complex data sets.
- Consult with managers and product owners to determine and refine machine learning objectives.
- Transform data science prototypes and apply appropriate ML tools and technologies.
- Research and implement best practices to improve the existing machine learning infrastructure.
- Keep abreast of developments in machine learning.
- Contribute to and support the development of the overall data science and machine learning strategy and roadmap.
Education and/or Experience Qualifications
- Bachelor’s degree in Computer Science, Information Systems, or equivalent IT knowledge/experience.
- 2+ years of relevant work experience in Data Analysis, Data Engineering, Data Science, and Data Integration.
- Experience working with data engineering, data science, and ETL teams, and managing and implementing projects that utilize big data, advanced analytics, and machine learning technologies.
- Hands-on experience in building data and ML pipelines from a variety of sources such as data warehouses and in-memory OLAP models, as well as experience in NoSQL/cloud.
- Strong understanding of data, ML models, big data, relational databases, and streaming and batch data processing.
- Knowledge of machine learning evaluation metrics and best practices.
- Strong experience building an end-to-end data view with a focus on integration.
- Hands-on experience developing and implementing data solutions using technologies such as:
- Programming languages and tools: SQL, Spark, Python, R, Jupyter Notebooks, Java, Scala, C++.
- Data Exploration and ETL: Alteryx, Talend, H2O, Informatica, DataStage, Azure Data Explorer, Azure Data Factory.
- Data Warehouse Solutions: Redshift, Snowflake, Postgres, Data Lake.
- Big Data technologies: Azure, AWS, Hadoop, Spark, Hive, Kafka, Flume, NoSQL stores (HBase, Cassandra, DynamoDB, MongoDB).
- Cloud storage: S3, GCS, ADLS, Blob.
- Machine Learning: Cloudera Data Science Workbench, Azure ML, Amazon ML, Google AutoML, Vertex AI.
- Data Visualization Solutions: MS Power BI, Looker, Tableau, Azure Stream Analytics, Data Lake Analytics, Azure Time Series Insights, Azure Synapse Analytics.
- CI/CD and Code Management: Git, Maven, Docker, Jenkins, Azure DevOps.
CDW is committed to maintaining a workplace that is free of known hazards and to ensuring the safety, health, and well-being of coworkers and candidates for employment and their families, as well as the community.
CDW requires all coworkers to be fully vaccinated against COVID-19, with the only exceptions being documented, legally required medical or religious accommodations. Prior to starting with CDW, successful candidates will be required to: (i) be fully vaccinated against COVID-19 and provide CDW with proof of full vaccination; or (ii) apply for and receive a medical or religious-based accommodation to be exempt from the mandatory vaccination policy.