Course Info
Staff
Professor: Dennis Sun (dlsun@stanford)
- Office: Sequoia 124
- Office Hours (in person and on Zoom):
- Mon 3 - 4 PM
- Wed 11:30 AM - 12:30 PM
Teaching Assistants:
- Amber Hu (amberhu@stanford)
- Section (02/03): Tues, Thurs 9:30 - 10:20 AM in 380-380Y
- Office Hour: Wed 1:30 - 2:30 PM in Sequoia 207 (Bowker)
- Rahul Kanekar (rkanekar@stanford)
- Section (04): Tues, Thurs 10:30 - 11:20 AM in Gates B12
- Office Hour: Tues 1:30 - 2:30 PM in Sequoia 105 (Library)
- Ran Xie (ranxie@stanford)
- Section (05/08): Tues, Thurs 10:30 - 11:20 AM in 380-380Y
- Office Hour: Mon 9:30 - 10:30 AM in Sequoia 207 (Bowker)
- Ben Seiler (bbseiler@stanford)
- Section (06): Tues, Thurs 4:30 - 5:20 PM in 160-326
- Office Hour: Fri 1 - 2 PM in Sequoia 220 (Fishbowl)
- Michael Howes (mhowes@stanford)
- Section (07/09): Tues, Thurs 4:30 - 5:20 PM in 50-51P
- Office Hour: Tues 3:15 - 4:15 PM in Sequoia 220 (Fishbowl)
- Sophia Lu (sophialu@stanford)
- Office Hours:
- Thurs 1:30 - 2:30 PM in Sequoia 207 (Bowker)
- Fri 9:30 - 10:30 AM in Sequoia 220 (Fishbowl)
Course Description
A hands-on introduction to the principles and methods of data science. This course
is designed to equip you with tools to begin extracting insights and making decisions
from data in the real world, as well as to prepare you for further study in statistics, machine learning,
and artificial intelligence. We will analyze and visualize data of different shapes and sizes
(e.g., tabular, textual, hierarchical, geospatial). We will discuss common patterns and
pitfalls of data analysis. We will build and evaluate machine learning models, focusing on
general concepts (rather than specific methods), such as supervised vs. unsupervised learning,
training vs. testing error, hyperparameter tuning, and ensemble methods. The focus will be on
intuition and implementation, rather than theory and math. Implementation will be in Python
and Jupyter notebooks, using libraries such as pandas and scikit-learn. This course
culminates in a project where you apply the ideas to a data science problem of your choosing.
This course satisfies the WAY-AQR requirement.
Who Should Take This Class?
This course is designed for undergraduates, particularly freshmen and sophomores.
- This is a hands-on class where you will be analyzing real data.
To do this, you need to know how to code (preferably in Python).
So CS 106A is a prerequisite. You don't need to be a very experienced programmer;
if you know how to write a for loop and use a dict in Python, you'll likely be fine.
But we won't review basic programming concepts.
- You need to be comfortable doing basic math. We'll be aggregating,
transforming, and comparing numbers every day in this class. But there will be
no calculus and no linear algebra. We will focus on understanding the intuition
behind data science methods, rather than the math.
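As a rough gauge of the programming level described above, the short snippet below uses a for loop and a dict to count how often each word appears in a list. The function name and the sample data are made up for illustration and are not course material. If you can read this comfortably, your Python background is probably sufficient.

```python
def count_words(words):
    """Count how many times each word appears, using a for loop and a dict."""
    counts = {}
    for word in words:
        # dict.get returns 0 the first time we see a word
        counts[word] = counts.get(word, 0) + 1
    return counts


print(count_words(["data", "science", "data"]))
# prints {'data': 2, 'science': 1}
```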
This course is a good fit for the following audiences:
- freshmen or sophomores who are considering majoring in Data Science
- students who want to know what Data Science is and how it applies to the real world
- students needing to pick up practical Data Science skills for an internship
- undergraduates looking to fulfill the WAY-AQR requirement
This course is likely not a good fit for students who already have experience in
data science and/or machine learning. You'll almost certainly
pick up some useful skills, but you might
find the pace of the class slow. Consider taking STATS 216 instead.
Class Structure
There will be class every weekday:
- On Mondays, Wednesdays, and Fridays, we will meet as an entire class for lecture.
In lecture, the professor will introduce data science concepts, do some coding demos, and
assign some exercises for you to try at home.
- On Tuesdays and Thursdays, you will meet with a TA in small sections. In section,
you will present and discuss solutions to the exercises posed in lecture the day before.
- If you have a one-time conflict with your section and want to attend another section,
please e-mail both your section leader and the leader of the section you want to attend.
- If you are unable to come to campus one day (e.g., illness, sports travel), please e-mail your
section leader with documentation.
Attendance is an essential part of the learning experience. Therefore, it counts
towards your participation grade (see below).
Grading
- Weekly Labs: 15%
- 2 Exams: 35% (15% each, with 5% added to your higher exam)
- Final Project: 40%
- Participation: 10%
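To see how the weights above combine, here is a minimal sketch of the arithmetic. It assumes every component is scored on a 0-100 scale and reads "5% added to your higher exam" as weighting the higher exam at 20% and the lower exam at 15%. The function name and the sample scores are hypothetical, and the official calculation may differ in details such as rounding.

```python
def course_grade(labs, exam1, exam2, project, participation):
    """Combine component scores (each 0-100) into an overall percentage.

    Assumed weights: labs 15%, higher exam 20%, lower exam 15%,
    final project 40%, participation 10%.
    """
    higher = max(exam1, exam2)
    lower = min(exam1, exam2)
    # Weights are kept as whole percentages and divided once at the end.
    total = 15 * labs + 20 * higher + 15 * lower + 40 * project + 10 * participation
    return total / 100


# Hypothetical scores: labs 90, exams 80 and 70, project 85, participation 100
print(course_grade(90, 80, 70, 85, 100))  # prints 84.0
```

Under this reading, the order in which you take the exams does not matter: whichever score is higher receives the extra weight.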
About the Participation Grade: There are several ways
to earn this participation grade:
- presenting your solutions in section
- answering other students' questions on the Ed Discussion board