The Data Science and Analytics COE is responsible for leading the creation and development of the overall strategy and direction of data science and advanced analytics at CDW – including ensuring continuity and seamless extension of existing programs, the development of a short- and long-term vision and roadmap, and defining and institutionalizing the role that data and analytics play throughout the organization as the fuel that drives and shapes CDW’s priorities and serves as an accelerant for CDW’s progress.
The Manager - Data Engineering is a key player on the Data Science & Analytics team. This role leads a team of data engineers who are accountable for data engineering, testing, and management across the end-to-end data pipeline, including data products. This role will leverage CDW’s data labs environment to enable data delivery in a common data lake. Data scientists must have access to proof-of-concept (POC) data when they need it.
The ideal candidate will have strong data infrastructure and data engineering skills, a proven track record of leading and scaling data pipelines in cloud, on-prem, and big data environments, and strong operational skills to drive efficiency and speed. In addition, strong technical leadership skills are required, along with a vision for how data science can proactively improve companies.
Key Areas of Responsibility
- Develop CDW Data Engineering and architecture standards.
- Develop and execute the data engineering strategy and roadmap for data architecture, AI and ML, and advanced analytics across the organization.
- Manage a cross-functional team of data engineers and data analysts.
- Build and manage end-to-end data pipelines and operations, from ingestion and integration through delivery.
- Build cross-functional relationships with Business Stakeholders, Architects, Data Scientists, Product Managers and IT to understand data needs and deliver on those needs.
- Drive the design, building, and launching of new data models and data pipelines in production.
- Manage the development of data resources and support new product launches.
- Lead discussions of product-oriented analysis in meetings with clients and partners; be comfortable speaking to executives.
- Serve as the primary data liaison for stakeholders to drive transformation and democratize the use of data.
- Sunset multiple redundant warehouses and marts to achieve significant cost savings, and support new integration and modernization efforts.
- Consolidate fragmented data across the company and provide simplified access to data for stakeholders, internal users, and external partners.
- Support easier compliance and auditing through a single gateway for data exchange.
- Stay abreast of technology development in retail and other industries.
- Act as a sounding board on testing, experimentation, target audience profiling, and consumer insights that analyze the relationships among customers, products, partners, conversions, engagement, revenue, and their drivers.
- Work with multiple complex and disparate datasets to enable data delivery through various means and APIs to evaluate performance and amalgamate information to derive strategic insights and recommendations.
- Contribute to and support the development of the overall data science and machine learning strategy and roadmap.
- Establish the core data foundation and common data lake to enable data-driven decisions.
- Support delivery of scalable data products.
- Build, motivate, and mentor a team of specialists to grow their skills and careers.
- Actively participate in the industry externally through internet research, white papers, or conferences.
Education and/or Experience Qualifications
- Bachelor’s degree in Computer Science, Information Systems, or equivalent IT knowledge/experience.
- 10+ years of relevant work experience as a Data Engineer.
Other Required Qualifications
- Experience leading data engineering and ETL teams and managing implementation projects that use big data, advanced analytics, and machine learning technologies.
- Experience with agile software development methodologies.
- Management of onshore and offshore resources.
- Distributed architecture and SaaS experience.
- Hands-on experience building pipelines from a variety of sources, such as data warehouses and in-memory OLAP models, as well as experience with NoSQL and cloud platforms.
- Strong understanding of data and information architecture, including experience with big data, relational databases, and streaming and batch data processing.
- Strong experience building an end-to-end data view with a focus on integration.
- Ability to effectively present information, interact with, and respond to questions from managers, employees, customers, and vendors.
- Demonstrated experience in teaching and/or mentoring professionals.
- Passion to evangelize data science and engineering, teach others and learn new techniques.
Data Science and Advanced Analytics Required Qualifications
- Expert Level - Data Exploration and ETL: Alteryx, Talend, H2O, Informatica, DataStage, etc.
- Expert Level - Programming languages and tools: Spark, Python, R, Jupyter Notebooks, Java, Scala.
- Expert Level - Data Warehouse Solutions: Redshift, Snowflake, Postgres.
- Expert Level - Big Data technologies: Azure, AWS, Hadoop, Spark, Hive, Kafka, Flume, NoSQL stores (HBase, Cassandra, DynamoDB, MongoDB).
- Expert Level - Workflow management: Airflow, Oozie, Azkaban.
- Advanced Level - Cloud storage: S3, GCS.
- Advanced Level - GitHub, Maven, etc.: modern code organizer and build process for about half of our applications.
- Advanced Level - Jenkins: modern build executor.
- Advanced Level - Containers: modern build with microservices.
- Advanced Level - Swagger: experience with modern API features, including an automatically generated user interface.
- Beginner Level - Data Visualization Solutions: Looker, Tableau, etc.
- Beginner Level - Distributed logging systems: Pulsar, Kinesis, etc.
- Beginner Level - Data Science Workbenches: Cloudera, SAS, etc.
- Experience working for consumer or business-facing digital brands.
- Bachelor’s degree in Business, Math, Engineering, Statistics, Economics, Operations Research, Data Science, Computer Science, or a related quantitative field.