Virtual Applied Data Science Training Institute
VADSTI 2025 Spring Training Series
Technological advancements and efficient use of computational tools have made it possible to generate and store large amounts of heterogeneous and complex data in many disciplines, including public health, clinical and biomedical research, and genomics. There is therefore increased demand for data analytics capabilities, including artificial intelligence (AI) and machine learning (ML), to examine trends, predict outcomes, and make better clinical and health policy decisions. The Howard University Research Centers in Minority Institutions and the Public Health Informatics Technology for District of Columbia (PHIT4DC) program are pleased to announce the VADSTI 4.0 Spring 2025 Training Series to the Howard University research community and the workforce in the District of Columbia and the surrounding region. The goal is to enhance data science capability and application by providing training in the foundations of programming and critical data analytic skills for planning and conducting research involving big data pertinent to biomedical and minority health and health disparities research.
This free Spring Training Series will cover topics including Foundations of Data Science, Introduction to Python, Basic Statistical Concepts, Data Exploration & Visualization, Experimentation in Data Science, and an introduction to algorithmic techniques in machine learning.
To register, click the following link. Click Here
For questions, contact VADSTI at vadsti@howard.edu or phit4dc@howard.edu, or John Kwagyan, Ph.D., at jkwagyan@howard.edu.
Program Objectives & Competencies
The primary objective of the 2025 VADSTI Spring Training Series is to provide training in data science fundamentals and computing skills, with hands-on application to minority health and health disparity datasets. Over the course of the training program, participants will:
- Be introduced to principles of data science.
- Gain practical, hands-on experience with Python and related libraries for accessing data from multiple sources and applying analytic methods.
- Learn about data exploration and visualization using Python.
- Be introduced to probability and statistical analysis concepts utilized in data science.
- Learn about principles and applications of A/B testing.
- Understand the concepts of data partitioning and practice behind supervised learning.
- Be introduced to algorithmic techniques in machine learning.
Digital Certificate of Completion: Participants who complete all the modules and submit their projects in the VADSTI GitHub Data Science Project Portfolio will receive a verified digital certificate of completion.
Evaluation: At the end of each training module, you will be asked to complete an electronic feedback form on the extent to which expectations and objectives were met.
Registration & Fees: No fees for participation, but registration is required to attend.
VADSTI Training Program Schedule
Basic undergraduate knowledge of algebra and probability is recommended for content knowledge topics. The training series consists of the following modules.
Module 1 | Foundations of Data Science with Python
Wednesday, January 29, & Thursday, January 30, 2025
5:00 PM – 8:00 PM EST
Module 2A | Introduction to Python I
Wednesday, February 5, & Thursday, February 6, 2025
5:00 PM – 8:00 PM EST
Module 2B | Introduction to Python II
Wednesday, February 12, & Thursday, February 13, 2025
5:00 PM – 8:00 PM EST
Module 3A | Experimentation in Data Science (A/B Testing and Statistical Analyses) I
Wednesday, February 19, & Thursday, February 20, 2025
5:00 PM – 8:00 PM EST
Module 3B | Experimentation in Data Science (A/B Testing and Statistical Analyses) II
Wednesday, February 26, & Thursday, February 27, 2025
5:00 PM – 8:00 PM EST
Module 4 | Seminal Presentation on Current Development in Data Science
Wednesday, March 12, & Thursday, March 13, 2025
5:00 PM – 8:00 PM EST
Module 5 | Introduction to Machine Learning I
Wednesday, March 19, & Thursday, March 20, 2025
5:00 PM – 8:00 PM EST
Module 6 | Machine Learning II
Wednesday, March 21 & Thursday, March 22, 2025
5:00 PM – 8:00 PM EST
VADSTI Training Program Curriculum
Here are details for each of the modules:
Week 1) Module 1 | Foundations of Data Science with Python
Wednesday, January 29, & Thursday, January 30, 2025
5:00 PM – 8:00 PM EST
INSTRUCTOR – Ebelechukwu Nwafor, Ph.D.
This module introduces students to the fundamental concepts of data science, including data collection, cleaning, exploration, and basic analysis. By the end of the module, students will have learned data visualization techniques and the statistical methods essential for understanding and interpreting data. This foundation is designed to equip students with the tools and techniques required to extract meaningful insights from data and make informed decisions. Hands-on exercises will reinforce these concepts, emphasizing data manipulation using libraries like Pandas and data visualization with tools like Matplotlib and Seaborn.
Week 2) Module 2A | Introduction to Python I
Wednesday, February 5, & Thursday, February 6, 2025
5:00 PM – 8:00 PM EST
INSTRUCTOR – Moussa Doumbia, Ph.D.
This introductory course will guide you through setting up the working environment and using Python to analyze data, create visualizations, and apply machine learning algorithms. In this module, you will be introduced to Python programming skills and the related libraries for accessing data from multiple sources. You will also learn how to create data visualizations. Topical areas will include:
- Setting the working environment
- Programming with Python
- NumPy with Python
- Using pandas DataFrames to solve complex tasks
- Using pandas to handle Excel files
Week 3) Module 2B | Introduction to Python II
Wednesday, February 12, & Thursday, February 13, 2025
5:00 PM – 8:00 PM EST
INSTRUCTOR – Moussa Doumbia, Ph.D.
This module builds upon Introduction to Python I. Topics will include:
- Web scraping with Python
- Using matplotlib and seaborn for data visualizations
- Using plotly for interactive visualizations
Week 4) Module 3A | Experimentation in Data Science (A/B Testing and Statistical Analyses) I
Wednesday, February 19, & Thursday, February 20, 2025
5:00 PM – 8:00 PM EST
INSTRUCTOR – Roland Doku, Ph.D.
This topic covers the principles and applications of A/B testing, a foundational technique for making data-driven decisions in healthcare, business and various other industries. The session will focus on hypothesis testing, including the formulation of null and alternate hypotheses, and how they apply to experimental design. Attendees will learn about statistical concepts such as the central limit theorem, the distribution of sample means, and the importance of standard error in determining statistical significance. We will also explore how to calculate and interpret z-scores and t-scores, with a focus on understanding effect size, statistical power, and their relationship to sample size.
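As a rough illustration of the ideas listed above (not taken from the course materials), the sketch below runs a two-proportion z-test, one common way to analyze a simple A/B test. The function name `two_proportion_z_test` and the conversion counts are hypothetical and invented for this example; the test assumes samples large enough for the normal approximation that the central limit theorem justifies.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both groups share one conversion rate."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled proportion under the null hypothesis of no difference
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    # Standard error of the difference between the two sample proportions
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, invented purely for illustration
z, p = two_proportion_z_test(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests that the observed difference would be unlikely if the null hypothesis were true. Because the standard error shrinks as the sample size grows, larger samples give more power to detect the same effect size.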
Week 5) Module 3B | Experimentation in Data Science (A/B Testing and Statistical Analyses) II
Wednesday, February 26, & Thursday, February 27, 2025
5:00 PM – 8:00 PM EST
INSTRUCTOR – Roland Doku, Ph.D.
Part II will build on concepts discussed in Part I. We will go through examples of how to conduct an A/B test on a real-world use case, covering all necessary processes from designing the experiment to interpreting the results. We will also discuss situations in which A/B testing does not work. By the end of the session, participants will be able to design and analyze A/B tests, ensuring reliable conclusions from their experiments to support data-driven decision making.
Week 6) SPRING BREAK
March 1 – March 8, 2025
Week 7) Module 4 | Seminal Presentation on Current Development in Data Science
Wednesday, March 12, & Thursday, March 13, 2025
5:00 PM – 8:00 PM EST
INSTRUCTORS – TBD.
Topics to be determined.
Week 8) Module 5 | Introduction to Machine Learning I
Wednesday, March 19, & Thursday, March 20, 2025
5:00 PM – 8:00 PM EST
INSTRUCTOR – Ebelechukwu Nwafor, Ph.D.
This two-part module delves into the core principles of machine learning. In Part I, students will learn about supervised learning techniques, covering linear regression, classification, and model evaluation metrics. They will explore foundational algorithms, including decision trees, support vector machines, and k-nearest neighbors, while focusing on applications in real-world scenarios.
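To give a flavor of supervised learning, the minimal sketch below partitions a dataset into training and test sets, classifies the held-out points with k-nearest neighbors, and reports accuracy as the evaluation metric. It uses only the Python standard library. The helper names `knn_predict` and `accuracy` and the synthetic data are assumptions made for this illustration, not part of the course materials.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, point, k=3):
    """Classify `point` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point
    ranked = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], point))
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of held-out points whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == label
        for x, label in zip(test_X, test_y)
    )
    return correct / len(test_y)


# Synthetic two-class data, generated only for illustration
rng = random.Random(0)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
X += [(rng.gauss(4, 1), rng.gauss(4, 1)) for _ in range(50)]
y = [0] * 50 + [1] * 50

# Data partitioning: shuffle, then hold out 20% of the points for testing
indices = list(range(len(X)))
rng.shuffle(indices)
split = int(0.8 * len(indices))
train_idx, test_idx = indices[:split], indices[split:]
train_X = [X[i] for i in train_idx]
train_y = [y[i] for i in train_idx]
test_X = [X[i] for i in test_idx]
test_y = [y[i] for i in test_idx]

acc = accuracy(train_X, train_y, test_X, test_y, k=3)
print(f"Held-out accuracy: {acc:.2f}")
```

Evaluating on points the model never saw during training gives a more honest estimate of how it would perform on new data than measuring accuracy on the training set itself.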
Week 9) Module 6 | Machine Learning II
Wednesday, March 21 & Thursday, March 22, 2025
5:00 PM – 8:00 PM EST
INSTRUCTOR – Ebelechukwu Nwafor, Ph.D.
Part II builds on the basics from Part I, introducing unsupervised learning techniques such as clustering and dimensionality reduction. The module will also cover key concepts like overfitting and model selection. By the end, students will understand both theoretical and practical aspects of machine learning, with hands-on experience in building and evaluating models using tools like Scikit-Learn.