Syllabus

  

qsbrii2025.blogspot.com


Policy on video recording or remote access

1. All lectures are in-person. Students are strongly encouraged to attend, especially for R-sessions. 
2. Video recordings of the lectures will NOT be made available, regardless of personal or professional reasons. However, all lecture notes and homework are available online, and students are encouraged to schedule (remote or in-person) office hours with me. 
3. Remote access via Zoom will be available. All remote attendees will be muted, but you can communicate through chat. I expect you to join Zoom from a private and quiet environment. 

Canvas vs blogspot.com
1. ALL lecture notes, homework assignments, and other teaching material are only available on qsbrii2025.blogspot.com.
2. Canvas will be used for class announcements ONLY.

Time and Location 
MW 1:20-2:40
F 1:20-3:00
Foch 3rd floor Lecture Hall 
or ED Center lower floor (10/1, 10/17)

GOAL: Build your confidence in comprehending and evaluating the statistical approach in most publications in your field of research, and in using some statistical methods not taught in this course when you need them in your future research.

SCOPE: Apply the basic concepts of statistical inference to explore relations between two or more variables, with focus on statistical reasoning and the ART of data analysis. 

TIPS: 
The best way to learn statistics is to use your own common sense and reasoning, and to apply statistical methods to real problems encountered in your research.
Although math plays an important role in statistics, for the vast majority of biomedical researchers, it is more important to understand what a particular statistical method tries to do than to know the details of the mathematical formula and computational algorithms. 

"Any fool can know. The point is to understand" - Albert Einstein

In other words, you want to have the big picture before getting into the details. Mathematics mostly serves the purpose of justifying our common sense and enabling us to handle complicated problems.

For data analysis, it is often much more important to make sense of the data using a variety of visualization tools before describing them with numbers and statistical models.

Why R? 

The #1 reason for using R for data analysis is Reproducible Research. 
But it is also good for 
1. visualizing the data,
2. numerical simulation to help understand statistical methods,
3. performing modern statistical computational methods,
4. generating beautiful graphs for your publications.  


Jupyter Notebook and Google Colab

All homework involving R should be submitted as a Jupyter Notebook. The main benefit of a Jupyter notebook is that it keeps your R script, results, and comments/annotations in one file. This practice is called literate programming.

The easiest way to start using Jupyter Notebook is through Google Colab. As long as you have a Google account, you have access to it: just go to colab.google.com. The TAs will hold a help session on Sept 30th to get you started. Note that to run R instead of Python, you need to "Change runtime type" to R. Once you open a Jupyter notebook, look for the "Connect" button in the upper right corner, and you will see the options.

An alternative is to install Jupyter Notebook and R locally on your computer. See my old post for a guide on installation. 


Tentative Schedule

Sept 29, Introduction, Fisher's Exact Test and Hypothesis Testing
Oct 1, Chi-square Tests and Simulation with R (Education Center Lower floor)
Oct 3, Correlation and Linear Regression
Oct 6, Regression Diagnostics  
Oct 10, One-Way ANOVA (The contents of final paper/project will be announced)
Oct 13, Two-Way ANOVA and Statistical Interactions 
Oct 17, Permutation tests +  R session  (Education Center Lower floor)
Oct 20, TBD
Oct 22, Logistic Regression

Office Hours:

MW 2:40-3:40 or by appointment

TAs:

Todd Richmond
Gordon Huang
Renae Irving
Felipe Vilicich 


TAs' office hours: 

Monday 3-4pm: Renae

Tuesday 5-6pm: Todd

Wednesday 3-4pm: Gordon

Thursday 7:30-8:30pm: Felipe

 

Location: Library 1st floor, Group Study Room 1 

Recommended Books

Peter Dalgaard: Introductory Statistics with R
Robert Elston: Basic Biostatistics for Geneticists and Epidemiologists

Homework Policy

1. I will assign homework for every lecture. Unless otherwise specified, it is due by midnight on the day of the next lecture (so that you can ask me questions before and after my lectures).

2. Homework has to be submitted by email. An assignment that requires R MUST be submitted as a Jupyter notebook, with your code and relevant output, as well as your comments. Do NOT send R code without any explanation. An assignment that does not require R can be in any common format, such as txt, docx, pdf, jpg, or gif. I will also accept it if you write on paper, take a photo or scan, and email it. 

VERY IMPORTANT: The TITLE of the email and the FILE NAME of the homework have to be in the format of QSBRII_HW#_FirstName_LastName, where # is the homework number, e.g. QSBRII_HW1_Joe_Doe, QSBRII_HW2_Jane_Doe, etc. Be aware that the hash sign (#) itself should not appear in the file name. Note that you can split a homework assignment into multiple files, such as QSBRII_HW1a_Joe_Doe, QSBRII_HW1b_Joe_Doe, etc. If you have any questions on this, do NOT hesitate to ask me or the TAs.

3. At any time during the course, you can re-submit your homework to fix any mistakes. Your resubmission should have the filename in the format of QSBRII_HW#.%_FirstName_LastName, where # is the homework number and % is a number denoting the version, e.g. QSBRII_HW1.1_Joe_Doe for your first resubmission, QSBRII_HW1.2_Joe_Doe for your 2nd resubmission, etc.

4. The TAs will post homework grades online in a Google Sheet (with the last 4 digits of your student ID as the identifier), and you may ask them about your grades as well. 
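
As an illustration only (not an official checker), the naming rule in items 2 and 3 can be written as a regular expression. The sketch below is in Python. The pattern, the function name parse_homework_name, the list of file extensions, and the assumption that a part letter and a version number may be combined are all illustrative assumptions, not part of the course policy.

```python
import re

# Hypothetical pattern for the naming rule described in items 2 and 3:
#   QSBRII_HW<number>[part letter][.version]_FirstName_LastName
# Assumption: names contain only letters and hyphens.
PATTERN = re.compile(
    r"^QSBRII_HW(?P<hw>\d+)(?P<part>[a-z])?(?:\.(?P<version>\d+))?"
    r"_(?P<first>[A-Za-z-]+)_(?P<last>[A-Za-z-]+)$"
)

# Assumed set of extensions, taken from the formats mentioned in item 2.
KNOWN_EXTENSIONS = (".ipynb", ".txt", ".docx", ".pdf", ".jpg", ".gif")


def parse_homework_name(name):
    """Return the parsed fields of a homework name, or None if it does not match."""
    stem = name
    # Strip only a known extension, so the dot in "HW1.1" is left alone.
    for ext in KNOWN_EXTENSIONS:
        if name.lower().endswith(ext):
            stem = name[: -len(ext)]
            break
    match = PATTERN.match(stem)
    if match is None:
        return None
    version = match.group("version")
    return {
        "hw": int(match.group("hw")),
        "part": match.group("part"),
        "version": int(version) if version is not None else None,
        "first": match.group("first"),
        "last": match.group("last"),
    }


if __name__ == "__main__":
    for example in ["QSBRII_HW1_Joe_Doe", "QSBRII_HW1a_Joe_Doe",
                    "QSBRII_HW1.2_Joe_Doe.ipynb", "QSBRII_HW#1_Joe_Doe"]:
        print(example, "->", parse_homework_name(example))
```

A name that still contains the literal hash sign, such as QSBRII_HW#1_Joe_Doe, does not match the pattern and returns None.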

Final Paper/Project

The scope of a final paper/project will be announced on Oct 10th, and will be due on Oct 30th. 

Final Grade

This course is graded pass/fail, and usually 10-15% of students receive honors. The grade will be based on the quality and effort of the homework (60%), the final paper (30%), and class participation (10%). The grade should be posted in the system about two weeks after the due date of the final paper/project. 
