This script uses the following packages:

# library(dplyr)
# library(ggplot2)
# library(tidyr)
# library(lme4)
# library(knitr)
# library(kableExtra)
# library(AICcmodavg)

Overview

Previously, we fit a model on children’s accuracy in fixating the target image averaged over the critical window (2_MeanAcc). We also fit multiple models on children’s accuracy in fixating the target image for each time frame during the critical window and then used permutation analyses to determine whether a cluster of sequential effects was statistically significant (3_Cluster).

Now, we will instead fit a model using growth curves to quantify how children’s accuracy in fixating the target image changes throughout the critical window. We will use different orthogonal time terms to quantify different aspects of this change (more on this in a moment).

I learned how to conduct these growth curve analyses (GCA) from a 2-day workshop led by Dan Mirman (https://www.danmirman.org/gca) and from a book that Dan wrote on GCA:

Mirman, D. (2014). Growth Curve Analysis and Visualization Using R. Boca Raton, FL: Chapman and Hall / CRC Press. https://www.routledge.com/Growth-Curve-Analysis-and-Visualization-Using-R/Mirman/p/book/9781466584327

This book is very practical and covers both conceptual aspects of GCA and practical information about how to conduct GCA in R (with lots of examples of R code).

I’ll be showing you the specific GCA models that I use for my eye-tracking data, but I encourage you to check out Dan’s book if you are interested in learning more.

Time Course

Here are time course plots of the changes in children’s accuracy over time on trials in the Pre-Switch and Post-Switch blocks of the LWL task. These are the raw data that we will then be modelling.

Figure Legend: Time course of changes in children’s accuracy in fixating the target object over time for trials in the pre-switch (blue) and post-switch (red) blocks. Solid lines are children’s accuracy in fixating the target image for each time frame (i.e., every 33 ms) averaged across trials within a condition and across children. Ribbons around the solid lines are +/- 1 SE. The gray vertical dashed line at -924 ms indicates the onset of the carrier phrase (e.g., “Find the”), the black vertical dashed line at 0 ms indicates the onset of the target word (e.g., “Sock”), and the gray vertical dashed line at 986 ms indicates the offset of the target word.

GCA

Growth Curve Analysis (GCA) was used to quantify changes in the timecourse of children’s word recognition accuracy during a critical window 300 to 1800 ms after the onset of the target word. Children’s accuracy in fixating the target object was calculated as the empirical log odds of fixations to the target over fixations to the distractor at each time point (i.e., every 33 ms).

Time Terms

Time course changes were measured using the following orthogonal polynomial time terms:

  • intercept, which quantifies overall accuracy (i.e., average across the entire window)
  • linear, which quantifies the average increase (i.e., slope of the line connecting accuracy from the onset to offset)
  • quadratic, which quantifies the steepness of the peak in accuracy (i.e., more negative value means sharper inverted u-shape)
  • cubic, which quantifies asymptotes in accuracy at the tails (i.e., delayed increase from chance at onset and maintained peak accuracy at offset)

A few comments on time terms are necessary:

First, there is healthy skepticism about whether the time terms (especially quadratic and cubic) reflect actual underlying cognitive constructs. Put another way, a steeper curve captured by the quadratic term does not necessarily mean faster lexical processing. See, for instance, this paper by Bob McMurray: https://psyarxiv.com/pb2c6/

Second, the estimates for these time terms are highly dependent on the critical window that you choose. You want the window to begin right when fixations deviate from chance and end shortly after they asymptote. Too long an asymptote at either end will warp the time terms as they try to fit those flat portions of the curve. If this is happening, you will notice a poor fit between the model predictions and your raw data (which we will be plotting later).

Here are two further examples:

  1. With a longer window (like here, from 300 to 1800 ms), the quadratic time term will be negative and capture the inverted u-shaped curve, as the increase in accuracy is initially steep and then decelerates as it reaches the peak. With a shorter window (like my coarticulation paper, from 300 to 900 ms), the quadratic time term is positive and captures a u-shaped curve, as children’s fixations are initially flat (at chance) and then rapidly accelerate.
  2. If there is a baseline effect (i.e., children’s accuracy is above chance in one condition, but not the other), you may find a significant effect of linear time that is spurious. Linear time captures the change in children’s accuracy from the start of the critical window to the end. If children’s accuracy ends at 90% in both conditions, but starts at 50% in condition A and 60% in condition B, you may find a difference in linear time between condition A (40% improvement) and condition B (30% improvement).

Third, we need to use orthogonal time terms to remove the covariance between the time terms. With non-orthogonal (natural) polynomial time terms, linear and quadratic time are correlated: as linear time increases, so does quadratic time.
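As a quick numerical check (a minimal sketch; poly() with raw = TRUE gives the natural, non-orthogonal terms, while the default gives the orthogonal terms used below):

# 46 time bins, matching the 33-ms frames in the 300 to 1800 ms window
round(cor(poly(1:46, 3, raw = TRUE)), 2)  # natural terms: highly correlated
round(cor(poly(1:46, 3)), 2)              # orthogonal terms: correlations are zero by construction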

Let me show you what I mean graphically:

timebin = data.frame(
  TimeC = sort(unique(d$TimeC[d$TimeC >= 300 & d$TimeC <= 1801])),
  Bin = seq(from = 1, to = length(unique(d$TimeC[d$TimeC >= 300 & d$TimeC <= 1801])), by = 1))

timebin = cbind(timebin,poly(1:max(timebin$Bin), 3)) 

timebin = timebin %>% rename('linear'=`1`,'quadratic'=`2`,'cubic'=`3`)
TimeC Bin linear quadratic cubic
300 1 -0.25 0.31 -0.34
333 2 -0.24 0.27 -0.25
367 3 -0.23 0.23 -0.17
400 4 -0.22 0.19 -0.10
433 5 -0.21 0.16 -0.04
467 6 -0.19 0.12 0.02
500 7 -0.18 0.09 0.06
533 8 -0.17 0.06 0.10
567 9 -0.16 0.03 0.12
600 10 -0.15 0.01 0.15
633 11 -0.14 -0.02 0.16
667 12 -0.13 -0.04 0.17
700 13 -0.12 -0.06 0.17
733 14 -0.11 -0.08 0.17
767 15 -0.09 -0.10 0.17
800 16 -0.08 -0.11 0.16
833 17 -0.07 -0.13 0.14
867 18 -0.06 -0.14 0.13
900 19 -0.05 -0.15 0.11
933 20 -0.04 -0.15 0.09
967 21 -0.03 -0.16 0.06
1000 22 -0.02 -0.16 0.04
1033 23 -0.01 -0.16 0.01
1067 24 0.01 -0.16 -0.01
1100 25 0.02 -0.16 -0.04
1133 26 0.03 -0.16 -0.06
1167 27 0.04 -0.15 -0.09
1200 28 0.05 -0.15 -0.11
1233 29 0.06 -0.14 -0.13
1267 30 0.07 -0.13 -0.14
1300 31 0.08 -0.11 -0.16
1333 32 0.09 -0.10 -0.17
1367 33 0.11 -0.08 -0.17
1400 34 0.12 -0.06 -0.17
1433 35 0.13 -0.04 -0.17
1467 36 0.14 -0.02 -0.16
1500 37 0.15 0.01 -0.15
1533 38 0.16 0.03 -0.12
1567 39 0.17 0.06 -0.10
1600 40 0.18 0.09 -0.06
1633 41 0.19 0.12 -0.02
1667 42 0.21 0.16 0.04
1700 43 0.22 0.19 0.10
1733 44 0.23 0.23 0.17
1767 45 0.24 0.27 0.25
1800 46 0.25 0.31 0.34
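The table shows the values; to see these terms graphically, here is a minimal base-R sketch using the timebin data frame we just created:

# plot the three orthogonal time terms across the critical window
matplot(timebin$TimeC, timebin[, c("linear", "quadratic", "cubic")],
        type = "l", lty = 1, lwd = 2, col = c("black", "dodgerblue", "coral2"),
        xlab = "Time since target word onset (in ms)", ylab = "Orthogonal time term")
legend("top", legend = c("linear", "quadratic", "cubic"),
       col = c("black", "dodgerblue", "coral2"), lty = 1, lwd = 2)

Notice that the linear term increases steadily, the quadratic term is a u-shape, and the cubic term has two bends; the model combines these shapes to fit the time course.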

Formatting data

Let’s start with a simple model that just includes the effect of condition, by first creating the orthogonal time terms (using slightly different column headers):

timebin = data.frame(
  TimeC = sort(unique(d$TimeC[d$TimeC >= 300 & d$TimeC <= 1801])),
  TimeBin = seq(from = 1, to = length(unique(d$TimeC[d$TimeC >= 300 & d$TimeC <= 1801])), by = 1))
timebin = cbind(timebin,poly(1:max(timebin$TimeBin), 3))
colnames(timebin)=c('TimeC','TimeBin','ot1','ot2','ot3')

We will be using a different DV for these models: empirical logits. In all of our analyses we are modeling bounded data (accuracy can be no higher than 100% and no lower than 0%). When we previously conducted the cluster-based permutation analyses we kept the data at the individual trial level (1’s and 0’s) and used logistic mixed effects modeling.

As we’ve already seen, the logistic models can take a while to fit and (more problematically) they often do not converge. Therefore, for GCA Dan Mirman recommends analyzing the data averaged across trials using empirical logits (also known as empirical log odds). We could use raw accuracy as before, but as I already mentioned, that measure can be problematic if there are a lot of data points near the boundaries (which is likely when keeping the data separate for each time frame). We cannot use regular logits, because these are calculated as the log of target fixations over distractor fixations. On time frames where the child was fixating the target image on every trial, we cannot calculate the logit, because log(8/0) is positive infinity. Similarly, on time frames where the child was fixating the distractor image on every trial, we also cannot calculate the logit, because log(0/8) is negative infinity. So the empirical logit adjusts for this by adding 0.5 to both the numerator and denominator.

One useful feature of logits is that they capture an important property of bounded measures: as you approach floor (0%) or ceiling (100%), changes in accuracy become more difficult. For example, it is harder to increase accuracy by 10% from 90 to 100% than it is from 50 to 60%.

We can see this numerically if we had an experiment with 10 trials:

  • a 10% change in accuracy from 50% (log(5.5/5.5)=0) to 60% (log(6.5/4.5)=0.368) corresponds to a 0.368 increase in empirical log odds
  • a 10% change in accuracy from 90% (log(9.5/1.5)=1.846) to 100% (log(10.5/0.5)=3.045) corresponds to a 1.199 increase in empirical log odds
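We can verify these numbers directly in R; here is a minimal sketch with a hypothetical helper elogit() for a 10-trial block:

# empirical logit for k target fixations out of n codable trials
elogit <- function(k, n) log((k + 0.5) / (n - k + 0.5))

elogit(6, 10) - elogit(5, 10)    # 50% -> 60%: increase of ~0.368
elogit(10, 10) - elogit(9, 10)   # 90% -> 100%: increase of ~1.199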

Okay, so we are aggregating our data by averaging across all of the trials within a condition for each time frame during our critical window and calculating the empirical logits (elog):

d.gca = d %>%
  filter(TimeC>=300 & TimeC <=1800) %>%
  right_join(timebin,by='TimeC') %>%
  group_by(Sub.Num,Condition,TimeC) %>%
  summarise(
    TimeBin=TimeBin[1],
    ot1=ot1[1],
    ot2=ot2[1],
    ot3=ot3[1],
    TrialN = sum(!is.na(Accuracy)),            # number of trials with codable data at this time frame
    TargetN = sum(Accuracy==1,na.rm=TRUE),     # trials fixating the target
    DistractorN = sum(Accuracy==0,na.rm=TRUE), # trials fixating the distractor
    Accuracy = mean(Accuracy,na.rm=TRUE)) %>%
  mutate(
    elog=log((TargetN+0.5)/(DistractorN+0.5)),     # empirical logit
    wts=1/(TargetN+0.5) + (1/(DistractorN+0.5))    # approximate variance of elog (model weights = 1/wts)
    )

# deviation-code Condition: pre-switch = -0.5, post-switch = +0.5
d.gca$Condition <- factor(d.gca$Condition, c('pre-switch','post-switch'))
contrasts(d.gca$Condition) = c(-.5,.5)
colnames(attr(d.gca$Condition,"contrasts")) = "" # drop the contrast label so coefficients print as "Condition"

Recall, however, that at any given time frame a participant may not have data from the maximum number of trials (e.g., the child was not looking at either image, their eyes were blocked, etc.). When fitting our model, we therefore want to give more weight to elog values that were calculated from more data than to those calculated from less. We will use these weights as an input to our model.
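To make that concrete, here is a small numerical illustration (elog_var() is a hypothetical helper, not part of the analysis code):

# approximate variance of an empirical logit: smaller when more trials contribute
elog_var <- function(target_n, distractor_n) 1/(target_n + 0.5) + 1/(distractor_n + 0.5)

elog_var(target_n = 6, distractor_n = 2)  # 8 usable trials -> smaller wts, larger weight (1/wts)
elog_var(target_n = 1, distractor_n = 1)  # 2 usable trials -> larger wts, smaller weight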

Model fit

m <- lmer(elog ~ (ot1+ot2+ot3)*Condition + ((ot1+ot2+ot3)*Condition|Sub.Num),
          data=d.gca, weights=1/wts,
          control=lmerControl(optimizer='bobyqa'), REML=FALSE)

Because it is computationally and theoretically difficult to estimate the degrees of freedom in mixed-effects models (using the Kenward-Roger approximation can take 5 to 10 minutes), we will conduct tests of significance by assuming a Gaussian distribution for our t-values; therefore, |t| values > 1.96 are treated as statistically significant.

coefs <- data.frame(coef(summary(m)))
coefs$p.value <- 2*(1-pnorm(abs(coefs[,"t.value"]))) # two-tailed p-value from the normal approximation
coefs$sig[coefs$p.value < .10] = '+'
coefs$sig[coefs$p.value < .05] = '*'
coefs$sig[coefs$p.value > .10] = ''
coefs$p.value = round(coefs$p.value,digits=4)

To make it easier to show our tables (without lots of decimals), I have created a roundDF() function that goes through a data frame, identifies the columns that are numeric, and rounds them to 3 decimal places:

roundDF <- function(df) {
  for (col in 1:ncol(df)) {
    if (sapply(df,class)[col][[1]] == 'numeric') {
      df[,col] = round(df[,col],digits=3)
    }
  }
  return(df)
}
Estimate Std..Error t.value p.value sig
(Intercept) 1.533 0.041 37.603 0.000 *
ot1 3.294 0.289 11.388 0.000 *
ot2 -1.518 0.196 -7.761 0.000 *
ot3 0.171 0.139 1.236 0.217
Condition -0.249 0.093 -2.671 0.008 *
ot1:Condition -0.096 0.495 -0.193 0.847
ot2:Condition 0.546 0.313 1.743 0.081 +
ot3:Condition -0.487 0.333 -1.461 0.144

Model plot

Let’s now plot the growth curve model fit with the raw data. To do this we need to create a data frame with the values of all of the fixed effects that we want the model to generate predictions for:

predictions = as.data.frame(expand.grid(ot1=timebin$ot1,Condition=c("pre-switch","post-switch"))) %>%
  right_join(timebin,by='ot1') %>%
  arrange(Condition,TimeC) %>%
  select("TimeC","ot1","ot2","ot3","Condition")
TimeC ot1 ot2 ot3 Condition
1 300 -0.2498843 0.3088653 -0.3423492 pre-switch
2 333 -0.2387784 0.2676833 -0.2510561 pre-switch
3 367 -0.2276724 0.2283731 -0.1701372 pre-switch
4 400 -0.2165664 0.1909349 -0.0991100 pre-switch
5 433 -0.2054605 0.1553686 -0.0374919 pre-switch
6 467 -0.1943545 0.1216742 0.0151994 pre-switch
47 300 -0.2498843 0.3088653 -0.3423492 post-switch
48 333 -0.2387784 0.2676833 -0.2510561 post-switch
49 367 -0.2276724 0.2283731 -0.1701372 post-switch
50 400 -0.2165664 0.1909349 -0.0991100 post-switch
51 433 -0.2054605 0.1553686 -0.0374919 post-switch
52 467 -0.1943545 0.1216742 0.0151994 post-switch

Then we feed this data frame into the model and get the predictions with standard errors using the predictSE() function from the AICcmodavg package:

predictions = cbind(predictions,as.data.frame(predictSE(m,predictions)))
TimeC ot1 ot2 ot3 Condition fit se.fit
1 300 -0.2498843 0.3088653 -0.3423492 pre-switch 0.1269761 0.1312904
2 333 -0.2387784 0.2676833 -0.2510561 pre-switch 0.2756989 0.1174043
3 367 -0.2276724 0.2283731 -0.1701372 pre-switch 0.4167679 0.1067480
4 400 -0.2165664 0.1909349 -0.0991100 pre-switch 0.5503832 0.0991128
5 433 -0.2054605 0.1553686 -0.0374919 pre-switch 0.6767448 0.0941316
6 467 -0.1943545 0.1216742 0.0151994 pre-switch 0.7960528 0.0913012
47 300 -0.2498843 0.3088653 -0.3423492 post-switch 0.2370387 0.1248662
48 333 -0.2387784 0.2676833 -0.2510561 post-switch 0.3177508 0.1127208
49 367 -0.2276724 0.2283731 -0.1701372 post-switch 0.3968832 0.1040736
50 400 -0.2165664 0.1909349 -0.0991100 post-switch 0.4744010 0.0985101
51 433 -0.2054605 0.1553686 -0.0374919 post-switch 0.5502692 0.0954324
52 467 -0.1943545 0.1216742 0.0151994 post-switch 0.6244530 0.0941500

And now we can create a plot that includes both these model predictions and the raw data:

# average the raw data across children within each condition and time frame (for plotting only)
plot = d.gca %>% 
  group_by(Condition,TimeC) %>% 
  summarise(elog = mean(elog,na.rm=T))

ggplot(predictions, aes(x=TimeC, y=elog, color=Condition, fill=Condition)) +
  geom_line(aes(y=fit)) +
  # stat="identity" draws the ribbons from the predictSE() standard errors rather than re-smoothing
  geom_smooth(aes(y=fit, ymin=fit-se.fit, ymax=fit+se.fit), stat="identity") +
  geom_point(data=plot) +   # raw data points
  geom_hline(aes(yintercept=0), linetype="dashed", color="gray") +
  theme_bw(base_size=14) +
  labs(x='Time since target word onset (in ms)',y='Fixation Empirical Logit') +
  scale_x_continuous(breaks=seq(from=300,to=1800,by=300)) +
  scale_y_continuous(breaks=seq(from=-1,to=2.5,by=.25)) +
  coord_cartesian(xlim=c(300,1800),ylim=c(-.55,2.2)) +
  scale_color_manual(values=c("dodgerblue","coral2")) +
  scale_fill_manual(values=c("dodgerblue","coral2"))+
  theme(legend.justification=c(1,0), legend.position=c(1,0),legend.background=element_rect(fill= NA, color=NA), legend.text = element_text(size = 12), legend.title=element_blank(), strip.text = element_text(size=12))

Figure legend: Time course of changes in children’s accuracy in fixating the target object over time for trials in the pre-switch (blue) and post-switch (red) blocks. Solid lines are the growth curve model fits. The ribbons around the lines represent +/- 1 SE. Data points are the raw data averaged across trials and children for each condition and time frame. The dashed horizontal line at 0 is chance (i.e., equal likelihood of fixating the target and the distractor object).

Model results

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: 
## elog ~ (ot1 + ot2 + ot3) * Condition + ((ot1 + ot2 + ot3) * Condition |  
##     Sub.Num)
##    Data: d.gca
## Weights: 1/wts
## Control: lmerControl(optimizer = "bobyqa")
## 
##      AIC      BIC   logLik deviance df.resid 
##   6706.3   7000.9  -3308.1   6616.3     5107 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.9213 -0.4043  0.0625  0.6947  3.0303 
## 
## Random effects:
##  Groups   Name          Variance Std.Dev. Corr                               
##  Sub.Num  (Intercept)    0.09115 0.3019                                      
##           ot1            4.60229 2.1453    0.21                              
##           ot2            2.05829 1.4347   -0.21 -0.11                        
##           ot3            0.99592 0.9980   -0.14 -0.39  0.06                  
##           Condition      0.48013 0.6929    0.06  0.10  0.14 -0.22            
##           ot1:Condition 13.40177 3.6608   -0.01  0.07 -0.01 -0.20  0.08      
##           ot2:Condition  5.16656 2.2730    0.03  0.10 -0.06 -0.02 -0.21  0.03
##           ot3:Condition  5.90748 2.4305    0.12 -0.08  0.04  0.10  0.13 -0.51 -0.29
##  Residual                0.12807 0.3579                                      
## Number of obs: 5152, groups:  Sub.Num, 56
## 
## Fixed effects:
##               Estimate Std. Error t value
## (Intercept)    1.53258    0.04076  37.603
## ot1            3.29420    0.28926  11.388
## ot2           -1.51784    0.19556  -7.761
## ot3            0.17115    0.13851   1.236
## Condition     -0.24926    0.09331  -2.671
## ot1:Condition -0.09572    0.49525  -0.193
## ot2:Condition  0.54619    0.31338   1.743
## ot3:Condition -0.48694    0.33332  -1.461
## 
## Correlation of Fixed Effects:
##             (Intr) ot1    ot2    ot3    Condtn ot1:Cn ot2:Cn
## ot1          0.205                                          
## ot2         -0.204 -0.102                                   
## ot3         -0.134 -0.368  0.066                            
## Condition    0.056  0.100  0.135 -0.215                     
## ot1:Conditn -0.013  0.068 -0.010 -0.185  0.083              
## ot2:Conditn  0.032  0.096 -0.062 -0.022 -0.201  0.032       
## ot3:Conditn  0.116 -0.074  0.038  0.088  0.129 -0.490 -0.263
## optimizer (bobyqa) convergence code: 0 (OK)
## maxfun < 10 * length(par)^2 is not recommended.

Here’s how I would write up the results:

There’s a significant effect of Condition on the following time terms:

  • intercept, b = -0.249, p = 0.008

But not on any of the other time terms, ps ≥ 0.081.

Children have overall higher accuracy in fixating the target object on trials before the dimensional switch (pre-switch b = 1.6575) compared to trials after the dimensional switch (post-switch b = 1.4085).
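Because Condition is deviation-coded (pre-switch = -0.5, post-switch = +0.5), these condition-specific intercepts can be recovered from the fixed effects; a quick sketch using the fitted model:

# condition-specific intercepts from the grand-mean intercept and the Condition effect
b <- fixef(m)
b["(Intercept)"] - 0.5 * b["Condition"]   # pre-switch intercept
b["(Intercept)"] + 0.5 * b["Condition"]   # post-switch intercept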

Although the effect of Condition on quadratic time was only marginally significant, let’s look at how the growth curve changes when we manipulate just the quadratic time term: from the estimate averaging across conditions (b = -1.518), to the estimate in the pre-switch condition (b = -1.791), to the estimate in the post-switch condition (b = -1.245).

# fixed-effect estimates from the model (rounded)
b_int = 1.53
b_lin = 3.29
b_quad = -1.52
b_cub = 0.17

predictions.hand = as.data.frame(expand.grid(ot1=timebin$ot1)) %>%
  right_join(timebin,by='ot1') %>%
  arrange(TimeC) %>%
  select("TimeC","ot1","ot2","ot3")

predictions.hand = predictions.hand %>% 
  mutate(
    average = b_int + b_lin*ot1 + b_quad*ot2 + b_cub * ot3,
    `pre-switch` = b_int + b_lin*ot1 + -1.79*ot2 + b_cub * ot3,    # quadratic set to the pre-switch estimate
    `post-switch` = b_int + b_lin*ot1 + -1.25*ot2 + b_cub * ot3    # quadratic set to the post-switch estimate
  ) %>%
  pivot_longer(cols=average:`post-switch`,names_to="Condition",values_to="fit") %>% 
  mutate(se.fit=0) %>%    # no standard errors for these hand-computed curves
  arrange(Condition)

predictions.hand$Condition=factor(predictions.hand$Condition,c("pre-switch","average","post-switch"))
  
  
ggplot(predictions.hand, aes(x=TimeC, y=fit, color=Condition, fill=Condition)) +
  geom_line(aes(y=fit)) +
  geom_smooth(aes(y=fit, ymin=fit-se.fit, ymax=fit+se.fit), stat="identity") +
  geom_hline(aes(yintercept=0), linetype="dashed", color="gray") +
  theme_bw(base_size=14) +
  labs(x='Time since target word onset (in ms)',y='Fixation Empirical Logit') +
  scale_x_continuous(breaks=seq(from=300,to=1800,by=300)) +
  scale_y_continuous(breaks=seq(from=-1,to=2.5,by=.25)) +
  coord_cartesian(xlim=c(300,1800),ylim=c(-.55,2.2)) +
  scale_color_manual(values=c("dodgerblue","green","coral2")) +
  scale_fill_manual(values=c("dodgerblue","green","coral2"))+
  theme(legend.justification=c(1,0), legend.position=c(1,0),legend.background=element_rect(fill= NA, color=NA), legend.text = element_text(size = 12), legend.title=element_blank(), strip.text = element_text(size=12))