Posts

Mixed-Effect Models

Data rats.txt Jupyter Notebook Slides

HW7 (Due Oct 22nd): Find an example in your research area that should be modeled by a mixed-effect model instead of a regular linear regression model. You only need to describe the study design.
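To make the contrast between the two kinds of model concrete, here is a minimal sketch in R. It does not use rats.txt from this post; it assumes the Orthodont data set and the lme function from the nlme package (a recommended package that ships with standard R installations), where each child is measured at several ages. The object names fit.lm and fit.lme are chosen only for this illustration.

```r
library(nlme)  # assumption: nlme is available (recommended package in standard R)

# Orthodont: a growth measurement (distance) taken at four ages on each Subject
data(Orthodont)

# Regular linear regression: treats all measurements as independent
fit.lm <- lm(distance ~ age, data = Orthodont)

# Mixed-effect model: a random intercept per Subject accounts for the
# correlation among repeated measurements on the same child
fit.lme <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)

# Compare the coefficient tables of the two fits
summary(fit.lm)$coefficients
summary(fit.lme)$tTable

# Estimated between-subject and residual variance components
VarCorr(fit.lme)
```

A study design calls for this kind of model whenever observations are grouped (repeated measures on one animal, cells from one donor, samples from one batch), because the grouping violates the independence assumption of ordinary regression.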

Lecture 7

Slides Jupyter Notebook Lab3

HW7 (Due Oct 22): Compare the statistical power of the one-way ANOVA F-test and a permutation test on the range statistic under the following scenario:

Number of groups: 4
Number of observations per group: 10
Data in the four groups are generated from the normal distributions N(0,1), N(-1,1), N(1,1), and N(0,1), respectively.

Note that this numerical experiment may take a while to finish, because the permutation test is repeated 1000 times. Below is a code template to help you get started.

    pval.F <- rep(0,1000)
    pval.R <- rep(0,1000)
    for(i in 1:1000){
        ### Simulate data
        x <- rep(c("A","B","C","D"),rep(10,4))
        y <-

        ### F statistic and degrees of freedom
        fstat <- summary(lm(y~x))$fstat

        ### Get the p-value from the F distribution
        pval.F[i] <- 1 - pf(fstat[1],fstat[2],fstat[3])

        ### do...
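A single permutation test on a range statistic can be sketched as below. This is an illustration under stated assumptions, not the course's reference solution: the range statistic is assumed to be the largest group mean minus the smallest group mean, the helper names range.stat and perm.range.test are hypothetical, and the toy data use a deliberately large shift in one group rather than the homework scenario.

```r
# Hypothetical helper: range of the group means (assumed definition)
range.stat <- function(y, x) {
  m <- tapply(y, x, mean)
  max(m) - min(m)
}

# Hypothetical helper: permutation p-value for the range statistic.
# Shuffling y breaks any link between responses and group labels.
perm.range.test <- function(y, x, B = 1000) {
  obs <- range.stat(y, x)
  perm <- replicate(B, range.stat(sample(y), x))
  # Count the observed statistic as one of the permutations
  (sum(perm >= obs) + 1) / (B + 1)
}

# Toy data: four groups of 10, with one group shifted by 3
set.seed(1)
x.toy <- rep(c("A", "B", "C", "D"), rep(10, 4))
y.toy <- rnorm(40, mean = rep(c(0, 0, 3, 0), rep(10, 4)))

pval.toy <- perm.range.test(y.toy, x.toy)
pval.toy
```

Estimating power then amounts to repeating such a test over many simulated data sets and recording the fraction of p-values below the chosen significance level.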

Lecture 6

Slides Jupyter Notebook

HW6 (Due Oct 17): Analyze the data set warpbreaks, which is included in the R distribution. It records the number of breaks in yarn during weaving. Describe how the two factors affect the response. You can view the description and access the data in R with data(warpbreaks).

Briefly explain the difference between "correlation" and "interaction".

Self-administer Lab 2. You do not have to show me your work.
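One possible starting point for the warpbreaks analysis is sketched below: a table of cell means followed by a two-way ANOVA with an interaction term. The column names breaks, wool, and tension come from the data set itself; the object names cell.means, fit.int, and aov.tab are chosen only for this illustration, and this is not necessarily the analysis expected in class.

```r
data(warpbreaks)

# Mean number of breaks for each wool type (rows) and tension level (columns)
cell.means <- with(warpbreaks, tapply(breaks, list(wool, tension), mean))
cell.means

# Two-way ANOVA: wool * tension expands to wool + tension + wool:tension
fit.int <- lm(breaks ~ wool * tension, data = warpbreaks)
aov.tab <- anova(fit.int)
aov.tab
```

The wool:tension row tests whether the effect of tension on breaks differs between the two wool types, which is what "interaction" refers to; correlation, by contrast, describes the association between two variables.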

Lecture 5

Jupyter Notebook Slides lymphocyte data

HW5: For the example given in the lecture, using the same data, fit a one-way ANOVA with the R function lm:

    aov.fit <- lm(count ~ drug, data = lympho)

Then use the function anova to obtain the same ANOVA table as shown in class:

    anova(aov.fit)

You are not required to submit this homework assignment!
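The same lm-then-anova pattern can be tried without the lecture data. The sketch below assumes the built-in PlantGrowth data set as a stand-in (it is not the lymphocyte data) and checks that anova applied to an lm fit gives the same F statistic as the aov function.

```r
data(PlantGrowth)

# One-way ANOVA via lm, then the ANOVA table via anova()
fit.lm <- lm(weight ~ group, data = PlantGrowth)
tab.lm <- anova(fit.lm)
tab.lm

# The same model fitted with aov(); summary() returns a list whose
# first element is the ANOVA table
fit.aov <- aov(weight ~ group, data = PlantGrowth)
tab.aov <- summary(fit.aov)[[1]]

# The two F statistics agree
all.equal(tab.lm[["F value"]][1], tab.aov[["F value"]][1])
```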

The final project

How much does an LLM "know" about statistical analysis?

1. Identify a set of original data from your research lab, generated either by you or by other lab members, whose goal is to explore the relationship between at least TWO variables. For unpublished data, the data and your analysis will be kept strictly confidential and will not be shared with anybody, including the TAs. If you have trouble accessing the original data, you can use recently published data from your lab. Prepare a brief description of the background and hypothesis, and a clear description of the study design and data collection, to be presented to the LLM.

2. Use an LLM (ChatGPT, Gemini, Llama, etc.) to present your question and the data, and see how it responds. Try different prompts and push the model to test its limits. You may use multiple LLMs.

3. Follow the LLM's instructions to carry out the analysis. Feed the results back to it and ask it to interpret them.

4...

Lecture 4

Slides Jupyter Notebook

HW4:

1. Following Problem 3 of HW3 on the relationship between body weight and brain weight:
   a) Reproduce the residual plot, but without the red line. In the lm object, you can access the fitted values with your.lm.object$fit and plot them against the residuals.
   b) Find the correlation coefficient between brain weight and body weight, before and after the log transformation. Does the value of the correlation coefficient change?

2. Generate two random vectors of the same length. Take the position of the lowest value in the first vector, and see where the value at the same position in the second vector is ranked. Then take the positions whose values are below the median of the first vector, and see where the values at the same positions in the second vector are ranked.
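The mechanics of the second problem can be sketched as follows. The vector length of 100, the use of independent standard normal draws, and the names v1, v2, r2, pos.min, and pos.below are assumptions made for this illustration; the assignment does not fix any of them.

```r
set.seed(2024)
n <- 100
v1 <- rnorm(n)
v2 <- rnorm(n)

# Rank of every element of the second vector (1 = smallest)
r2 <- rank(v2)

# Position of the lowest value of the first vector,
# and the rank of the second vector at that position
pos.min <- which.min(v1)
r2[pos.min]

# Positions where the first vector is below its median,
# and the ranks of the second vector at those positions
pos.below <- which(v1 < median(v1))
summary(r2[pos.below])
```

Because the two vectors here are generated independently, the selected ranks in the second vector should look like a random draw from 1 to n.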