The Data Science and Analytics COE is responsible for leading the creation and development of the overall strategy and direction of data science and advanced analytics at CDW – including ensuring continuity and seamless extension of existing programs, the development of a short- and long-term vision and roadmap, and defining and institutionalizing the role that data and analytics play throughout the organization as the fuel that drives and shapes CDW’s priorities and serves as an accelerant for CDW’s progress.
The Sr. ML Data Engineer is a key player on the Data Science & Analytics team. This role is responsible for data engineering, testing, and management across the end-to-end ML and data pipeline, including data products. The role will leverage CDW’s AI labs environment to enable delivery in a common data lake and data products.
Reporting to the Sr. Manager, AI Engineering & Architecture of Data Science and Analytics, the Sr. ML Data Engineer must have data infrastructure, data engineering, and machine learning skills; a proven track record of leading and scaling data pipelines in cloud, on-prem, and big data environments; and strong operational skills to drive efficiency and speed. Strong technical leadership skills are also required, along with a vision for how data science can proactively improve the company.
Key Areas of Responsibility
- Responsible for building and managing end-to-end data pipelines and operations from ingestion and integration through delivery for the data products.
- Strong analytical skills with the ability to collect, organize, analyze, and disseminate information with attention to detail and accuracy.
- Outline best practices and establish standards for data strategy, data lifecycle, data ownership, data definition, and data classification.
- Adept at queries, report writing and presenting findings.
- Build cross-functional relationships with Business Stakeholders, Business Analysts, ML Data Engineering, Architects, Data Scientists, Product Managers and IT to understand data needs and deliver on those needs.
- Contribute to and support the development of the overall data science and machine learning strategy and roadmap.
- Drive the design, building and launching of new data models and ML/Data pipelines in production.
- Serve as the primary data liaison for stakeholders to drive transformation and democratize the use of data.
- Identify internal and external data sources for potential integration into the platform; lead data migration and integration projects.
- Identify, analyze, and interpret trends or patterns in complex data sets.
- Consolidate fragmented data across the company and provide simplified access to data for stakeholders, internal users, and external partners.
- Stay abreast of technology developments in retail and other industries.
- Work with multiple complex and disparate datasets, enabling data delivery through APIs and other means, to evaluate performance and combine information into strategic insights and recommendations.
- Support delivery of scalable data products.
Education and/or Experience Qualifications
- Bachelor’s degree in Computer Science, Information Systems or equivalent IT knowledge/experience.
- 5+ years of relevant work experience in Data Analysis, Data Engineering, and Data Integration.
- Experience working in Data engineering and ETL teams and on managing implementation projects that utilize big data, advanced analytics and machine learning technologies.
- Hands-on experience building pipelines from a variety of sources such as data warehouses and in-memory OLAP models, as well as experience in NoSQL/cloud.
- Strong understanding of data, Big Data, Relational databases, streaming and batch data processing.
- Strong experience building an end-to-end data view with a focus on integration.
- Demonstrated experience in teaching and/or mentoring professionals.
- Hands-on experience developing and implementing data solutions using technologies such as:
- Data Exploration and ETL: Alteryx, Talend, H2O, Informatica, DataStage, Azure Data Explorer, Azure Data Factory.
- Programming languages and tools: SQL, Spark, Python, R, Jupyter Notebooks, Java, Scala.
- Data Warehouse Solutions: Redshift, Snowflake, Postgres, Data Lake.
- Big Data technologies: Azure, AWS, Hadoop, Spark, Hive, Kafka, Flume, NoSQL stores (HBase, Cassandra, DynamoDB, MongoDB).
- Cloud storage: S3, GCS, ADLS, Blob.
- Data Visualization Solutions: MS Power BI, Looker, Tableau, Azure Stream Analytics, Data Lake Analytics, Azure Time Series Insights, Azure Synapse Analytics.
- CI/CD and Code Management: Git, Maven, Docker, Jenkins, Azure DevOps.
CDW is committed to maintaining a workplace that is free of known hazards and to ensuring the safety, health, and well-being of coworkers and candidates for employment and their families, as well as the community.
CDW requires all coworkers be fully vaccinated against COVID-19, with the only exceptions being a documented, legally required medical or religious accommodation. Prior to starting with CDW, successful candidates will be required to: (i) be fully vaccinated against COVID-19 and provide CDW with proof of full vaccination; or (ii) apply for and receive a medical or religious-based accommodation to be exempt from the mandatory vaccination policy.