Multivariate Analysis of Twin Data: Independent and Common Pathway Models
Odense, 26-29 April 2010
Corrado Fagnani, Italian Twin Register
Summary
– Insights from observed multivariate twin data (prior to modelling)
– Multivariate twin modelling: Independent and Common Pathway models
– Practical (next session)
– Why conduct multivariate twin analysis
– Multivariate data
– Relevant statistics
Multivariate data
Measures on two (bivariate data) or more (multivariate data) traits are available for the subjects (twins) under study.
Addressable issues:
– Co-occurrence (co-morbidity) of different diseases
– Inter-relations between different symptoms or biomarkers (endophenotypes) of a disease, or between disease and symptoms (or biomarkers)
– Interactions (i.e. moderation effects)
– Causal inferences
– Confounding effects
– Developmental changes (longitudinal data with repeated measurements)
An appealing and powerful design, but it requires complex (i.e. computationally demanding) statistical models.
Objectives of multivariate twin analysis
To determine the sources of correlation (or covariation) between multiple traits.
The correlation between traits can arise from genetic or environmental effects shared by the traits.
Partition of the covariation between traits into genetic and environmental components:
– Genetic or environmental overlap between traits (origin of co-morbidity)
– Genetic overlap: search for ‘pleiotropic’ genes
– Environmental overlap: prevention strategies impacting on multiple traits/diseases
Insights from observed data
[Figure: traits P1 to PN measured in Twin 1 (P11, ..., PJ1, ..., PK1, ..., PN1) and in Twin 2 (P12, ..., PJ2, ..., PK2, ..., PN2)]
Within-twin/cross-trait (phenotypic) covariance (between any two traits within a twin)
– Are there etiological influences common to traits?
Cross-twin/cross-trait covariance (between one trait in a twin and another trait in the co-twin)
– Are these common etiological influences familial?
Cross-twin/cross-trait covariance in MZ vs DZ pairs
– Are these common familial etiological influences genetic or environmental?
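The two observed covariances described above can be computed directly from paired data before any model is fitted. The following is a minimal Python sketch, assuming each twin pair is stored as a tuple (PJ1, PK1, PJ2, PK2); the function names and the data layout are hypothetical and only for illustration.

```python
from statistics import mean


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def observed_covariances(pairs):
    """Within-twin and cross-twin cross-trait covariances for traits J and K.

    pairs: list of (pj1, pk1, pj2, pk2) tuples, one tuple per twin pair
    (assumed layout: trait J and trait K in twin 1, then in twin 2).
    """
    pj1, pk1, pj2, pk2 = (list(col) for col in zip(*pairs))
    # Within-twin/cross-trait: trait J with trait K in the same twin,
    # averaged over twin 1 and twin 2.
    within = 0.5 * (covariance(pj1, pk1) + covariance(pj2, pk2))
    # Cross-twin/cross-trait: trait J in one twin with trait K in the co-twin,
    # averaged over both directions.
    cross = 0.5 * (covariance(pj1, pk2) + covariance(pk1, pj2))
    return {"within_twin_cross_trait": within, "cross_twin_cross_trait": cross}
```

Running the function separately on MZ and DZ pairs gives the two cross-twin/cross-trait covariances whose comparison is the subject of the last question above.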
Relevant statistics
Proportions of covariance explained by genetic and environmental factors
Cov(PJ,PK) = CovG(PJ,PK) + CovC(PJ,PK) + CovE(PJ,PK)
CovG(PJ,PK)/Cov(PJ,PK) (‘bivariate heritability’)
CovC(PJ,PK)/Cov(PJ,PK)
CovE(PJ,PK)/Cov(PJ,PK)
Genetic and environmental correlations
rG(PJ,PK) = CovG(PJ,PK)/[SDG(PJ)*SDG(PK)]
rC(PJ,PK) = CovC(PJ,PK)/[SDC(PJ)*SDC(PK)]
rE(PJ,PK) = CovE(PJ,PK)/[SDE(PJ)*SDE(PK)]
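As an illustration of the formulas above, the sketch below computes the three covariance proportions and the three correlations from genetic (A), shared environmental (C) and unshared environmental (E) variance-covariance matrices. It is a minimal Python sketch: the function name is hypothetical and the matrices are assumed to be plain nested lists.

```python
import math


def bivariate_statistics(A, C, E, j=0, k=1):
    """Covariance proportions and component correlations for traits j and k.

    A, C, E: square variance-covariance component matrices (nested lists).
    Assumes the total covariance and the component variances are non-zero.
    """
    total = A[j][k] + C[j][k] + E[j][k]  # Cov(PJ,PK) = CovG + CovC + CovE

    def correlation(M):
        # Component covariance divided by the product of component SDs
        return M[j][k] / math.sqrt(M[j][j] * M[k][k])

    return {
        "prop_G": A[j][k] / total,  # bivariate heritability
        "prop_C": C[j][k] / total,
        "prop_E": E[j][k] / total,
        "rG": correlation(A),
        "rC": correlation(C),
        "rE": correlation(E),
    }


# Toy component matrices, chosen only to illustrate the arithmetic
A = [[0.5, 0.3], [0.3, 0.6]]
C = [[0.2, 0.1], [0.1, 0.1]]
E = [[0.3, 0.1], [0.1, 0.3]]
stats = bivariate_statistics(A, C, E)
```

The three proportions sum to 1 by construction, whereas the three correlations are each scaled by their own component variances and need not relate to one another in any simple way.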
More about relevant statistics
Bivariate heritability
– If two traits have a bivariate heritability of 0.4, we can conclude that 40% of phenotypic correlation is mediated by shared genetic influence
– If two traits have a phenotypic correlation of 0.6 and a bivariate heritability of 0.5, then the genetic component of phenotypic correlation is 0.3
Genetic correlation
– Is the correlation between genetic effects on two traits, regardless of their heritabilities
– Interpretable as the likelihood that a gene found to be associated with one trait is also associated with the other trait
– If two traits have high genetic correlation and low heritability, then their bivariate heritability is likely to be low
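The arithmetic behind these statements can be sketched in a few lines of Python. The first function reproduces the 0.6 × 0.5 = 0.3 example; the second uses the standard identity that the genetic part of the phenotypic correlation equals sqrt(h2_J) × rG × sqrt(h2_K), which follows from the definitions of rG and heritability but is not stated in the slides. Function names and the input values in the comments are illustrative assumptions.

```python
import math


def genetic_component_of_correlation(r_p, bivariate_h2):
    """Genetic part of the phenotypic correlation: r_P times bivariate heritability."""
    return r_p * bivariate_h2


def genetic_component_from_rg(h2_j, h2_k, r_g):
    """Same quantity from the two heritabilities and the genetic correlation.

    CovG / (SD_P(J) * SD_P(K)) = sqrt(h2_J) * rG * sqrt(h2_K)
    """
    return math.sqrt(h2_j) * r_g * math.sqrt(h2_k)


# Example from the text: r_P = 0.6 and bivariate heritability = 0.5
example = genetic_component_of_correlation(0.6, 0.5)

# Illustrative values: high genetic correlation, low heritabilities
small = genetic_component_from_rg(0.1, 0.1, 0.9)
```

With the illustrative values, a genetic correlation of 0.9 yields a genetic component of only 0.09, because both heritabilities are 0.1. The bivariate heritability is this component divided by the phenotypic correlation, so its size also depends on how strongly the traits correlate overall.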
Independent Pathway Model
[Path diagram: traits P1 to P4 in Twin 1 and Twin 2; common factors AC and EC; trait-specific factors A1 to A4 and E1 to E4; genetic factors correlated 1 (MZ) or 0.5 (DZ) across twins]
The covariation between traits is caused by genetic and environmental factors common to traits, each having its own paths to each trait.
There are trait-specific residual variances, further partitioned into genetic and environmental components.
Model-predicted relevant statistics
Path coefficients: x1 to x4 (AC to traits), z1 to z4 (EC to traits), t1 to t4 (trait-specific A), v1 to v4 (trait-specific E)
Bivariate heritability of P1 and P2 = $x_1 x_2 / (x_1 x_2 + z_1 z_2)$
Genetic correlation of P1 and P2 = $x_1 x_2 / \sqrt{(x_1^2 + t_1^2)(x_2^2 + t_2^2)}$
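The two model-predicted statistics for the independent pathway model can be evaluated from path coefficients as in the following minimal Python sketch. It assumes the AE model of the diagram (no shared environment), with x, z and t holding the common genetic, common unshared environmental and trait-specific genetic paths; the function names and the numeric values are hypothetical.

```python
import math


def ip_bivariate_heritability(x, z, j=0, k=1):
    """x_j*x_k / (x_j*x_k + z_j*z_k) for traits j and k (AE model)."""
    genetic = x[j] * x[k]
    environmental = z[j] * z[k]
    return genetic / (genetic + environmental)


def ip_genetic_correlation(x, t, j=0, k=1):
    """x_j*x_k / sqrt((x_j^2 + t_j^2) * (x_k^2 + t_k^2))."""
    var_j = x[j] ** 2 + t[j] ** 2  # total genetic variance of trait j
    var_k = x[k] ** 2 + t[k] ** 2  # total genetic variance of trait k
    return x[j] * x[k] / math.sqrt(var_j * var_k)


# Illustrative path coefficients for two traits (not estimates from any data)
x = [0.6, 0.8]  # common genetic factor AC
z = [0.4, 0.3]  # common unshared environmental factor EC
t = [0.8, 0.6]  # trait-specific genetic factors
biv_h2 = ip_bivariate_heritability(x, z)
r_g = ip_genetic_correlation(x, t)
```

Note that the trait-specific paths t enter only the genetic correlation: they add genetic variance to each trait without adding genetic covariance between traits.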
Matrix model representation (one twin)
[Equation: observed traits = common factors coefficients × common factors + specific genetic coefficients × specific genetic factors + specific unshared environmental coefficients × specific unshared environmental factors]
Mx implementation
GROUP #1 Define matrices
CALCULATION
BEGIN MATRICES ;
X FULL NVAR 1 FREE ! common factor additive genetic paths
Y FULL NVAR 1 FREE ! common factor shared environment paths
Z FULL NVAR 1 FREE ! common factor unshared environment paths
T DIAG NVAR NVAR FREE ! trait-specific additive genetic paths
U DIAG NVAR NVAR FREE ! trait-specific shared environment paths
V DIAG NVAR NVAR FREE ! trait-specific unshared environment paths
M FULL 1 NVAR FREE ! means
END MATRICES ;
BEGIN ALGEBRA ;
A = X*X' + T*T' ; ! additive genetic variance-covariance components
C = Y*Y' + U*U' ; ! shared environment variance-covariance components
E = Z*Z' + V*V' ; ! unshared environment variance-covariance components
END ALGEBRA ;
END
GROUP #2 MZ twins
BEGIN MATRICES = GROUP 1 ;
END MATRICES ;
MEANS M | M ; ! means model
COVARIANCE A+C+E | A+C _ ! variance-covariance model
A+C | A+C+E ;
END
GROUP #3 DZ twins
BEGIN MATRICES = GROUP 1 ;
H FULL 1 1 FIXED
END MATRICES ;
MATRIX H .5
MEANS M | M ; ! means model
COVARIANCE A+C+E | H@A+C _ ! variance-covariance model
H@A+C | A+C+E ;
END
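To make the covariance algebra of the script above concrete, the following Python sketch builds the same expected MZ and DZ covariance matrices from path coefficients using plain nested lists. It is an illustrative re-expression of the algebra, not a replacement for the Mx script; all function names are hypothetical, and the diagonal matrices T, U and V are passed as lists of their diagonal elements.

```python
def outer(col):
    """col * col' for a column vector given as a list."""
    return [[a * b for b in col] for a in col]


def diag_squared(d):
    """D * D' for a diagonal matrix given by its diagonal elements."""
    n = len(d)
    return [[d[i] ** 2 if i == j else 0.0 for j in range(n)] for i in range(n)]


def add(*matrices):
    """Element-wise sum of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*matrices)]


def scale(s, m):
    """Scalar times matrix (the role of H@A in the DZ group)."""
    return [[s * v for v in row] for row in m]


def block(tl, tr, bl, br):
    """Assemble a 2 x 2 block matrix (the | and _ operators in Mx)."""
    return [a + b for a, b in zip(tl, tr)] + [a + b for a, b in zip(bl, br)]


def expected_covariances(x, y, z, t, u, v):
    """Expected MZ and DZ covariance matrices under the independent pathway model."""
    A = add(outer(x), diag_squared(t))  # A = X*X' + T*T'
    C = add(outer(y), diag_squared(u))  # C = Y*Y' + U*U'
    E = add(outer(z), diag_squared(v))  # E = Z*Z' + V*V'
    P = add(A, C, E)                    # within-twin covariance
    mz_cross = add(A, C)                # cross-twin covariance, MZ
    dz_cross = add(scale(0.5, A), C)    # cross-twin covariance, DZ
    return block(P, mz_cross, mz_cross, P), block(P, dz_cross, dz_cross, P)
```

The only difference between the two returned matrices is the 0.5 weight on A in the off-diagonal blocks, which is what allows genetic and shared environmental covariance to be separated.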
Common Pathway Model
[Path diagram: traits P1 to P4 in Twin 1 and Twin 2 loading on a latent phenotype L; common factors AC and EC acting on L; trait-specific factors A1 to A4 and E1 to E4; genetic factors correlated 1 (MZ) or 0.5 (DZ) across twins]
The covariation between traits is caused by a single underlying latent phenotype, which is in turn influenced by genetic and environmental factors.
There are trait-specific residual variances, further partitioned into genetic and environmental components.
Model-predicted relevant statistics
Path coefficients: x (AC to L), z (EC to L), f1 to f4 (L to traits), t1 to t4 (trait-specific A), v1 to v4 (trait-specific E)
Bivariate heritability of P1 and P2 = $f_1 x^2 f_2 / [f_1 f_2 (x^2 + z^2)]$
Genetic correlation of P1 and P2 = $f_1 x^2 f_2 / \sqrt{(f_1^2 x^2 + t_1^2)(f_2^2 x^2 + t_2^2)}$
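The common pathway statistics can be evaluated in the same way. The minimal Python sketch below assumes the AE model of the diagram, with scalar paths x and z into the latent phenotype, loadings f and trait-specific genetic paths t; function names and numeric values are hypothetical.

```python
import math


def cp_bivariate_heritability(f, x, z, j=0, k=1):
    """f_j*x^2*f_k / [f_j*f_k*(x^2 + z^2)] for traits j and k (AE model)."""
    genetic = f[j] * x ** 2 * f[k]
    total = f[j] * f[k] * (x ** 2 + z ** 2)
    return genetic / total


def cp_genetic_correlation(f, x, t, j=0, k=1):
    """f_j*x^2*f_k / sqrt((f_j^2*x^2 + t_j^2) * (f_k^2*x^2 + t_k^2))."""
    var_j = f[j] ** 2 * x ** 2 + t[j] ** 2  # total genetic variance of trait j
    var_k = f[k] ** 2 * x ** 2 + t[k] ** 2  # total genetic variance of trait k
    return f[j] * x ** 2 * f[k] / math.sqrt(var_j * var_k)


# Illustrative values (not estimates from any data)
f = [1.0, 0.5]  # loadings of the traits on the latent phenotype
t = [0.6, 0.3]  # trait-specific genetic paths
biv_h2 = cp_bivariate_heritability(f, x=0.8, z=0.6)
r_g = cp_genetic_correlation(f, x=0.8, t=t)
```

Because the loadings cancel in the first formula, the bivariate heritability reduces to x²/(x² + z²), the heritability of the latent phenotype, and is therefore the same for every pair of traits under this model.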
Matrix model representation (one twin)
[Equation: observed traits = latent phenotype coefficients × (common factors coefficients × common factors) + specific genetic coefficients × specific genetic factors + specific unshared environmental coefficients × specific unshared environmental factors]
Mx implementation
GROUP #1 Define matrices
CALCULATION
BEGIN MATRICES ;
X FULL 1 1 FREE ! latent factor genetic path coefficient
Y FULL 1 1 FREE ! latent factor shared environment path coefficient
Z FULL 1 1 FREE ! latent factor unshared environment path coefficient
T DIAG NVAR NVAR FREE ! trait-specific additive genetic paths
U DIAG NVAR NVAR FREE ! trait-specific shared environment paths
V DIAG NVAR NVAR FREE ! trait-specific unshared environment paths
F FULL NVAR 1 FREE ! loadings of traits on latent factor
M FULL 1 NVAR FREE ! Means
I IDEN 2 2
END MATRICES ;
BEGIN ALGEBRA ;
A = F&(X*X') + T*T' ; ! additive genetic variance-covariance components
C = F&(Y*Y') + U*U' ; ! shared environment variance-covariance components
E = F&(Z*Z') + V*V' ; ! unshared environment variance-covariance components
L = X*X' + Y*Y' + Z*Z' ; ! variance of latent factor
END ALGEBRA ;
END
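The algebra of the common pathway script can be sketched in Python as follows. Each component is the latent-factor contribution F (path²) F' plus the squared trait-specific paths on the diagonal, so that the result is an NVAR × NVAR matrix. Function names are hypothetical; the remark about constraining the latent variance is an assumption, since the script shown computes L but does not include a constraint.

```python
def cp_component(f, path, specific):
    """One variance-covariance component under the common pathway model.

    f: loadings of the traits on the latent factor (matrix F)
    path: scalar path from the common factor to the latent phenotype (X, Y or Z)
    specific: diagonal elements of the trait-specific path matrix (T, U or V)
    Returns F * (path^2) * F' plus diag(specific^2) as a nested list.
    """
    n = len(f)
    return [
        [
            f[i] * path ** 2 * f[j] + (specific[i] ** 2 if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


def latent_variance(x, y, z):
    """L = X*X' + Y*Y' + Z*Z' for scalar paths x, y, z."""
    return x ** 2 + y ** 2 + z ** 2


# Illustrative values (not estimates from any data)
f = [1.0, 0.5]
A = cp_component(f, path=0.8, specific=[0.6, 0.3])
L = latent_variance(0.8, 0.0, 0.6)
```

With these values the latent variance is 1. Keeping it at a fixed value is one common way to set the scale of the latent phenotype, which would otherwise be confounded with the loadings in F.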