Longitudinal Data Measured in Twins:
Growth Curve Models
Odense, 26-29 April 2010
Corrado Fagnani, Italian Twin Register
Summary
- Longitudinal data
- Why conduct longitudinal analyses of twin data
- Approaches to the analysis of longitudinal twin data
- Linear growth model (objectives, assumptions, Mx implementation)
- Practical (next session)
Longitudinal data
Longitudinal data include:
- ‘Time to event’ data
Information on time of occurrence of a well-defined outcome (e.g., disease onset, death) in individuals is available (survival models, correlated frailty models for survival twin data)
- ‘Repeated measures’ data
Multiple occasions of measurement of a given trait from the same individuals are available [short term trials of an experiment, long term snapshots of a dynamic process (e.g., growth, ageing)]
Why conduct longitudinal analyses of twin data
- Using multiple observations from the same individuals can improve statistical power and reduce random variation
- It becomes possible to investigate time-dependent genetic and environmental effects
What sorts of questions:
- Are there changes in the magnitude of genetic and environmental influences over time? (If there are no cohort effects, a cross-sectional study design could be adopted)
- Do the same genetic and environmental factors operate over time? (A longitudinal approach is needed)
Approaches to the analysis of longitudinal twin data
- Cholesky decomposition
- Simplex model
- Growth curve model
Cholesky decomposition
Advantages
- Has a useful conceptual interpretation: all factors are constrained to impact current and later (but not earlier) time points
- Can predict any pattern of change
Disadvantages
- Not falsifiable
- Makes no prediction about future (not yet measured) time points
- Feasible only for a limited number of measurements
Questions you can address
- Magnitude of genetic/environmental influences at each time point
- Extent to which genetic/environmental influences overlap across time points
Simplex model
Advantages
- Makes restrictive predictions about the covariance pattern
- Falsifiable
Disadvantages
- Future depends on current state only
- Number of parameters increases with the number of measurements
Growth curve model
Focuses on two growth variables - initial level (‘intercept’) and rate of change (‘slope’) - to predict level at a series of time points
Longitudinal shape of individual trajectories needs to be specified
- Constant rate of change over time (linear growth model)
- Possible extensions to nonlinear processes, including quadratic, exponential and logistic models
Growth variables are modelled as (first-order) latent factors which load on the observed longitudinal measures
Biometric analysis of growth variables and their mutual interplay
- Decomposition of variance/covariance structure of growth variables in terms of (second-order) latent genetic and environmental components
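As a minimal sketch of what the linear growth model predicts for one individual, the snippet below computes the expected level at each time point from an intercept and a slope. The time coding 0, 1, 2, ... follows the factor-loading matrix F used in the Mx scripts; the function name and the numeric values are hypothetical and chosen only for illustration.

```python
def linear_trajectory(intercept, slope, n_timepoints):
    """Expected level at times 0, 1, ..., n_timepoints - 1 under linear growth."""
    return [intercept + slope * t for t in range(n_timepoints)]


# Hypothetical growth variables for two individuals (illustrative values only)
twin1 = linear_trajectory(intercept=24.0, slope=0.5, n_timepoints=4)
twin2 = linear_trajectory(intercept=22.0, slope=1.0, n_timepoints=4)

print(twin1)  # [24.0, 24.5, 25.0, 25.5]
print(twin2)  # [22.0, 23.0, 24.0, 25.0]
```

The observed measures differ from these expected levels by the occasion-specific residuals.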
How to decide about the shape of the process
Data exploration
- Look at the observed trajectories
- Look at the marginal change of trait level over time
Data modelling
- Compare the fit of models implying different shapes for the longitudinal process (linear vs nonlinear models)
- Select the best-fitting model
Hjelmborg JV et al. Genetic influences on growth traits of BMI: A longitudinal study of adult twins. Obesity, 2008; 16(4): 847-852.
Linear growth model
[Path diagram: the observed longitudinal measures T1-T4 load on a latent intercept I (loadings 1, 1, 1, 1) and a latent slope S (loadings 0, 1, 2, 3); each measure has its own residual R1-R4 (loading 1).]
Matrix representation
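A sketch of the matrix form for four time points, written to agree with the loadings in the path diagram and with the matrix F of the Mx scripts. The symbols \Psi (covariance matrix of intercept and slope) and \sigma_R^2 (common residual variance) are notation assumed here for illustration; the covariance line additionally assumes residuals that are mutually uncorrelated, uncorrelated with the growth variables, and of equal variance across time.

```latex
% Notation beyond T, I, S, R and F is assumed for illustration.
\[
\begin{pmatrix} T_1 \\ T_2 \\ T_3 \\ T_4 \end{pmatrix}
=
\underbrace{\begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}}_{F}
\begin{pmatrix} I \\ S \end{pmatrix}
+
\begin{pmatrix} R_1 \\ R_2 \\ R_3 \\ R_4 \end{pmatrix}
\]
\[
\operatorname{Cov}(T) = F \, \Psi \, F^{\top} + \sigma_R^{2} \, \mathbf{Id}_{4}
\]
```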
What sorts of questions
Are there inter-individual differences in initial level and rate of change?
- Variance of intercept and slope
Are initial level and rate of change associated within an individual?
- Within-twin correlation between intercept and slope
To what extent are inter-individual differences in each of the longitudinal measures accounted for by initial level and rate of change?
- Variance of residuals
Do genetic or environmental factors explain inter-individual differences in initial level and rate of change?
- Cross-twin/within-trait correlation of intercept and of slope in MZ and DZ twins
Do genetic or environmental factors explain the within-individual association between initial level and rate of change?
- Cross-twin/cross-trait correlation between intercept and slope in MZ and DZ twins
Mx implementation of (saturated) linear growth model (NT time points – MZ vs DZ – Raw data as input)
GROUP #1 Initialization
BEGIN MATRICES;
F FULL NT 2 FIXED ! matrix of factor loadings
X LOWER 4 4 FREE ! Cholesky factor for variance-covariance of intercept and slope, MZ
Y LOWER 4 4 FREE ! Cholesky factor for variance-covariance of intercept and slope, DZ
L FULL 1 1 FREE ! residual standard deviation (assuming equal standard deviations of residuals)
V IDEN NT NT FIXED ! identity matrix
W ZERO NT NT FIXED ! zero matrix
C FULL 1 1 FREE ! effect of a time-invariant covariate (e.g., age at baseline, gender)
END MATRICES;
MATRIX F
1 0
1 1
1 2
…..
1 NT-1
BEGIN ALGEBRA;
M = X*X'; ! variance-covariance matrix of intercept and slope, MZ
K = Y*Y'; ! variance-covariance matrix of intercept and slope, DZ
O = \Stnd(M); ! correlations, MZ
P = \Stnd(K); ! correlations, DZ
G = (L*L')@V | W _
W | (L*L')@V; ! matrix of residual variance
END ALGEBRA;
GROUP #2 MZ twins
Definition obscov; ! ‘obscov’ is one of the input variables
BEGIN MATRICES;
F FULL NT 2 = F1 ! equates this matrix to the matrix F declared in GROUP #1
M COMPUTED = M1
G COMPUTED = G1
C FULL 1 1 = C1
N FULL 1 1 FIXED ! observed time-invariant covariate (e.g., age at baseline, gender)
A FULL 2 1 FREE ! mean intercept & slope twin1
B FULL 2 1 FREE ! mean intercept & slope twin2
I IDEN 2 2
J UNIT 1 NT
END MATRICES;
Specify N obscov
MEANS (F*A)' + (N*C)@J | (F*B)'+ (N*C)@J / ! individual differences in obscov are adjusted for
COVARIANCES (I@F)&M + G /
END
GROUP #3 DZ twins
Definition obscov; ! ‘obscov’ is one of the input variables
BEGIN MATRICES;
F FULL NT 2 = F1 ! equates this matrix to the matrix F declared in GROUP #1
K COMPUTED = K1
G COMPUTED = G1
C FULL 1 1 = C1
N FULL 1 1 FIXED ! observed covariate
A FULL 2 1 FREE ! mean intercept & slope twin1
B FULL 2 1 FREE ! mean intercept & slope twin2
I IDEN 2 2
J UNIT 1 NT
END MATRICES;
Specify N obscov
MEANS (F*A)' + (N*C)@J | (F*B)'+ (N*C)@J / ! individual differences in obscov are adjusted for
COVARIANCES (I@F)&K + G /
END
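To make the COVARIANCES statement concrete, here is a minimal Python sketch (standard library only) of the expected covariance matrix for a twin pair, (I@F)&M + G, where @ is the Kronecker product and & the quadratic product P*M*P'. The helper names and the numeric values of the Cholesky factor are hypothetical; this is an illustration of the algebra, not a substitute for the Mx script.

```python
def matmul(a, b):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Matrix transpose."""
    return [list(col) for col in zip(*a)]


def kron(a, b):
    """Kronecker product (the @ operator in Mx)."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def expected_covariance(chol, resid_sd, n_timepoints):
    """(I@F)&M + G for one zygosity group.

    chol: 4x4 lower-triangular Cholesky factor (X for MZ, Y for DZ),
          ordered as intercept and slope of twin 1, then of twin 2.
    resid_sd: common residual standard deviation (matrix L).
    """
    f = [[1, t] for t in range(n_timepoints)]       # loadings: 1 and 0..NT-1
    loadings = kron([[1, 0], [0, 1]], f)            # I@F, 2*NT x 4
    m = matmul(chol, transpose(chol))               # M = X*X'
    sigma = matmul(matmul(loadings, m), transpose(loadings))
    for i in range(len(sigma)):                     # add G (diagonal residual variance)
        sigma[i][i] += resid_sd ** 2
    return sigma


# Hypothetical Cholesky factor for MZ pairs (illustrative values only)
x_mz = [[1.0, 0.0, 0.0, 0.0],
        [0.2, 0.5, 0.0, 0.0],
        [0.8, 0.1, 0.6, 0.0],
        [0.1, 0.3, 0.1, 0.4]]
for row in expected_covariance(x_mz, resid_sd=0.5, n_timepoints=4):
    print([round(v, 3) for v in row])
```

The result is a 2*NT x 2*NT matrix: the two diagonal blocks hold the within-twin covariances across time, the off-diagonal blocks the cross-twin covariances.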
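Biometric linear growth model
[Path diagram: the observed longitudinal measures T1-T4 load on a latent intercept I (loadings 1, 1, 1, 1) and a latent slope S (loadings 0, 1, 2, 3); I and S are each influenced by latent A, C and E components; each measure has its own residual R1-R4 (loading 1).]
Assumptions (to reduce computational burden)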
Equal variance of residuals across time
No genetic effects on residuals
What sorts of questions
What is the contribution of genetic factors to inter-individual variation in initial level and rate of change?
- Heritability of intercept and slope
What is the contribution of environmental factors to inter-individual variation in initial level and rate of change?
- Shared and unique environmental proportions of variance of intercept and slope
Do the same or different genes influence initial level and rate of change?
- Genetic correlation between intercept and slope
Do the same or different environments influence initial level and rate of change?
- Shared and unique environmental correlations between intercept and slope
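The quantities listed above can be derived from the 2x2 Cholesky factors of the genetic, shared environmental and unique environmental covariance matrices of intercept and slope. The sketch below assumes the standard ACE definitions (proportions of total variance, and correlations obtained by standardising each component matrix); the function names and the numeric inputs are hypothetical.

```python
import math


def outer_chol(l):
    """Return L * L' for a 2x2 lower-triangular matrix L."""
    (a, _), (b, c) = l
    return [[a * a, a * b], [a * b, b * b + c * c]]


def correlation(m):
    """Intercept-slope correlation implied by a 2x2 covariance matrix."""
    denom = math.sqrt(m[0][0] * m[1][1])
    return m[0][1] / denom if denom > 0 else float("nan")


def biometric_summary(x, y, z):
    """Variance proportions and A/C/E correlations of intercept and slope."""
    a, c, e = outer_chol(x), outer_chol(y), outer_chol(z)
    summary = {}
    for idx, name in enumerate(("intercept", "slope")):
        total = a[idx][idx] + c[idx][idx] + e[idx][idx]
        summary[name] = {
            "h2": a[idx][idx] / total,   # heritability
            "c2": c[idx][idx] / total,   # shared environmental proportion
            "e2": e[idx][idx] / total,   # unique environmental proportion
        }
    summary["rA"] = correlation(a)       # genetic correlation
    summary["rC"] = correlation(c)       # shared environmental correlation
    summary["rE"] = correlation(e)       # unique environmental correlation
    return summary


# Hypothetical Cholesky factors (illustrative values only)
result = biometric_summary(x=[[3, 0], [4, 3]],
                           y=[[2, 0], [1, 1]],
                           z=[[1, 0], [0, 5]])
for key, value in result.items():
    print(key, value)
```

A correlation is returned as NaN when either variance in the component is zero, since it is undefined in that case.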
Mx implementation of biometric (ACE) linear growth model (NT time points – MZ vs DZ – Raw data as input)
GROUP #1 Initialization
CALCULATION NG=3
BEGIN MATRICES;
F FULL NT 2 FIXED ! matrix of factor loadings
X LOWER 2 2 FREE ! Cholesky factor for genetic variance-covariance of intercept and slope
Y LOWER 2 2 FREE ! Cholesky factor for shared environmental variance-covariance of intercept and slope
Z LOWER 2 2 FREE ! Cholesky factor for unique environmental variance-covariance of intercept and slope
L FULL 1 1 FREE ! residual standard deviation (assuming equal standard deviations of residuals)
B IDEN NT NT FIXED ! identity matrix
J IDEN 2 2 FIXED ! identity matrix
W ZERO NT NT FIXED ! zero matrix
H FULL 1 1 FIXED ! scalar = 1/2
Q FULL 2 1 FREE ! mean intercept & slope
C FULL 1 1 FREE ! effect of a time-invariant covariate (e.g., age at baseline, gender)
END MATRICES;
MATRIX H .5
MATRIX F
1 0
1 1
1 2
…..
1 NT-1
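The algebra section of this script is not shown here. As a hedged illustration of how the matrices declared above are typically combined, the sketch below builds the expected 4x4 covariance matrix of the growth variables (intercept and slope of twin 1, then of twin 2) under the standard ACE expectations: A + C + E within a twin, A + C across MZ twins, and 0.5*A + C across DZ twins (the scalar H). The function names and numeric values are hypothetical, and the actual script may organise the algebra differently.

```python
def combine(a, c, e, r_genetic, include_e):
    """Return r_genetic*A + C (+ E if include_e) for 2x2 matrices."""
    return [[r_genetic * a[i][j] + c[i][j] + (e[i][j] if include_e else 0)
             for j in range(2)] for i in range(2)]


def growth_covariance(a, c, e, zygosity):
    """Expected covariance of (I1, S1, I2, S2) for MZ or DZ pairs."""
    r_genetic = 1.0 if zygosity == "MZ" else 0.5    # H = 1/2 for DZ pairs
    within = combine(a, c, e, 1.0, include_e=True)  # A + C + E
    cross = combine(a, c, e, r_genetic, include_e=False)
    top = [within[i] + cross[i] for i in range(2)]
    bottom = [cross[i] + within[i] for i in range(2)]
    return top + bottom


# Hypothetical component matrices A = X*X', C = Y*Y', E = Z*Z'
a = [[4, 2], [2, 2]]
c = [[1, 0], [0, 1]]
e = [[1, 0], [0, 1]]
for zyg in ("MZ", "DZ"):
    print(zyg)
    for row in growth_covariance(a, c, e, zyg):
        print(row)
```

Pre- and post-multiplying this matrix by the Kronecker product of a 2x2 identity matrix and F, then adding the residual variance on the diagonal, would give the expected covariance of the observed measures.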
Comments