Mixing Multivariate Genetic Analysis with Neuroscientific Techniques

Multivariate genetic analysis lets you simultaneously consider multiple traits in identical and non-identical twins to draw conclusions about heritability. Here I explain what this means, how this framework is implemented and how it can be mixed with other techniques, such as EEG, MRI and TMS.

[Figure: MGA graph]

Before attempting to combine MGA with other specialist techniques, we need to introduce the framework on its own, so that it is clear what we want to carry across methodologies.

As opposed to the univariate model, MGA allows for simultaneous analysis of multiple traits in monozygotic (MZ) and dizygotic (DZ) twins. For instance, trait 1 and trait 2 would be considered at once. Hence we are now also interested in the inter-trait associations and co-occurrences (Boomsma et al., 2002: 875). When we examine multiple traits, then on top of the correlations supplied by the classical twin design, we also get estimates of the co-variation between traits (Posthuma, 2009: 47). In the output of MGA, we often find that inter-trait correlations are very high even for traits that show low correlations on their own between individuals (Bouchard, 2004: 151), which suggests that even for traits that are mostly influenced by the environment we can still trace common genetic causes. This is important for two reasons: 1) we can now model how certain traits co-occur with similar genotypes, and 2) we can examine polygenes, or genes which are sufficient for producing a trait only in conjunction with one another. To address these questions, we need to add another set of numerical associations to the model, namely “cross-twin/cross-trait”, “within-individual/cross-trait” and “cross-twin/within-trait” correlations (Boomsma et al., 2002: 875). Hence in our example, we correlate trait 1 in one twin with trait 2 in the other. Unfortunately this cannot be extrapolated from the data supplied in the question. MGA also increases statistical power in comparison to univariate approaches (Schmitz et al., 1998; in Posthuma, 2009: 47; Martin et al., 1997).
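To make the three kinds of correlation concrete, here is a minimal sketch in Python that computes them from twin-pair data. Everything in it is an assumption made for illustration: the data are simulated rather than real, the loadings (0.8 and 0.6) are arbitrary, and the function names `pearson`, `simulate_pairs` and `twin_correlations` are hypothetical helpers, not part of any genetics package.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_pairs(n_pairs, genetic_r, seed):
    """Simulate twin pairs whose two traits share one genetic factor.

    genetic_r is the assumed correlation of that factor between twins
    (1.0 for MZ, 0.5 for DZ). Loadings are arbitrary illustration values.
    Each row is (twin1_trait1, twin1_trait2, twin2_trait1, twin2_trait2).
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        g1 = rng.gauss(0, 1)
        g2 = genetic_r * g1 + math.sqrt(1 - genetic_r ** 2) * rng.gauss(0, 1)
        row = []
        for g in (g1, g2):
            row.append(0.8 * g + 0.6 * rng.gauss(0, 1))  # trait 1
            row.append(0.6 * g + 0.8 * rng.gauss(0, 1))  # trait 2
        pairs.append(tuple(row))
    return pairs


def twin_correlations(pairs):
    """Return the correlation types used in a bivariate twin analysis."""
    t1x, t1y, t2x, t2y = (list(col) for col in zip(*pairs))
    return {
        # trait 1 with trait 2 in the same person (both twins pooled)
        "within_individual_cross_trait": pearson(t1x + t2x, t1y + t2y),
        # the same trait in twin 1 and twin 2
        "cross_twin_within_trait_1": pearson(t1x, t2x),
        "cross_twin_within_trait_2": pearson(t1y, t2y),
        # trait 1 in one twin with trait 2 in the co-twin (double entry)
        "cross_twin_cross_trait": pearson(t1x + t2x, t2y + t1y),
    }


mz = twin_correlations(simulate_pairs(5000, 1.0, seed=1))
dz = twin_correlations(simulate_pairs(5000, 0.5, seed=2))
for name in mz:
    print(f"{name}: MZ={mz[name]:.2f}  DZ={dz[name]:.2f}")
```

Because the simulated traits share only a genetic factor, the cross-twin/cross-trait correlation should come out roughly twice as large for MZ as for DZ pairs, which is the pattern the text describes as evidence for a common genetic cause.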

The simplest MGA model considers only two traits at once and is called the bivariate Cholesky model. Consider the MGA figure above. The latent variables for each source of variance are drawn in circles and load on the observed variables, trait 1 and trait 2, which are drawn in boxes. The correlation between the additive genetic factors (A) is 1.0 for MZ twins and .5 for DZ twins; the correlation between the dominance genetic factors (D) is 1.0 for MZ twins and .25 for DZ twins. These genetic correlations are the only difference between the models for MZ and DZ twins. Other latent factors in this simple model include shared environment (C) and non-shared environment (E) (note that we could also include an epistatic factor). We now divide the latent factors into those common to traits 1 and 2 (A1, C1, D1 and E1) and those that are specific to trait 2 only (A2, C2, D2 and E2); factor loadings carry a subscript c or s to indicate this. To begin, we calculate a full Cholesky model with all possible latent variables present, where the number of latent factors per source of variance equals the number of observed variables. From here, our goal is to reduce this load of unobserved variables to fewer latents by means of factor analysis (Posthuma, 2009: 50). Ideally, we want to reduce the various A, D, C and E influences to single latent variables for all traits in a twin. Whether such a reduced model still fits the data reflects whether there is in fact a common underlying genetic or environmental cause of the traits.
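The structure of the bivariate Cholesky model can be sketched numerically. The Python below is an illustration under stated assumptions, not a model-fitting routine: it leaves out the D factor for brevity (an ACE model), the loading values are invented, and `gram`, `weighted_sum` and `expected_covariances` are hypothetical helper names. Each loading matrix is lower triangular: its first column holds the loadings of the common factor on traits 1 and 2, and its second column holds the loading of the factor specific to trait 2.

```python
def gram(loadings):
    """Return L * L^T for a square loading matrix given as nested lists."""
    n = len(loadings)
    return [[sum(loadings[i][k] * loadings[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]


def weighted_sum(matrices, weights):
    """Element-wise weighted sum of equally sized square matrices."""
    n = len(matrices[0])
    return [[sum(w * m[i][j] for m, w in zip(matrices, weights))
             for j in range(n)] for i in range(n)]


def expected_covariances(A, C, E, r_a):
    """Model-implied covariances for one zygosity group.

    r_a is the correlation between the twins' additive genetic factors
    (1.0 for MZ, 0.5 for DZ). Returns (within-person, cross-twin) matrices.
    """
    va, vc, ve = gram(A), gram(C), gram(E)
    within = weighted_sum([va, vc, ve], [1.0, 1.0, 1.0])
    # E is non-shared by definition, so it does not enter the cross-twin part
    cross = weighted_sum([va, vc], [r_a, 1.0])
    return within, cross


# Invented loadings: rows are traits 1 and 2, columns are factors 1 and 2
A = [[0.7, 0.0], [0.4, 0.5]]
C = [[0.3, 0.0], [0.2, 0.3]]
E = [[0.6, 0.0], [0.1, 0.6]]

within_mz, cross_mz = expected_covariances(A, C, E, r_a=1.0)
within_dz, cross_dz = expected_covariances(A, C, E, r_a=0.5)
print("cross-twin/cross-trait covariance, MZ:", round(cross_mz[0][1], 3))
print("cross-twin/cross-trait covariance, DZ:", round(cross_dz[0][1], 3))
```

In an actual analysis, the loadings would be estimated by fitting these expected matrices to the observed MZ and DZ covariance matrices, and sub-models with loadings fixed to zero would be compared against the full model.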

Following this modelling approach, we can use the estimated factor loadings to derive two sets of correlations: the cross-twin/cross-trait correlations for MZ twins and for DZ twins. For instance, we could observe such a correlation between the occurrence of trait 1 in twin 1 and trait 2 in the sibling of twin 1. If this measure for MZ twins exceeds that for DZ twins, then we can conclude that the covariation between the two traits is traceable to genetic influences. If the cross-twin/cross-trait correlation lies at about the same level for MZ and DZ twins, then we can infer some shared environmental influence (ibid: 47). Suppose for our example that the correlation between trait 1 and trait 2 for any individual is the same as this cross-twin/cross-trait correlation. In that case we can furthermore infer that the relationship between the two traits is not significantly influenced by non-shared environmental factors. Moreover, a prediction that a person carries trait 2 can be made at the same level of reliability from that same person carrying trait 1 as from her identical twin carrying that trait. In essence, simply by creating another set of correlations that uses both traits at the same time, we can draw a variety of conclusions from the data. Specifically, consider the following formula, which utilises factor loading estimates from figure 2 (adapted from calculations used in Snieder et al., 1999): [Formula: narrow-sense heritability]
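As a hedged reconstruction of the kind of calculation meant here, the standard expressions for a bivariate Cholesky model can be written as follows. The notation is my assumption rather than that of Snieder et al. (1999): a_{11} is the loading of trait 1 on A1, a_c and a_s are the loadings of trait 2 on the common factor A1 and the specific factor A2, and the d, c and e loadings are indexed in the same way.

```latex
% Narrow-sense heritability of trait 2: additive genetic variance
% divided by total phenotypic variance
h^2_{2} = \frac{a_c^2 + a_s^2}
               {a_c^2 + a_s^2 + d_c^2 + d_s^2 + c_c^2 + c_s^2 + e_c^2 + e_s^2}

% Narrow-sense heritability of trait 1, which loads on the first factors only
h^2_{1} = \frac{a_{11}^2}{a_{11}^2 + d_{11}^2 + c_{11}^2 + e_{11}^2}

% Additive genetic correlation between traits 1 and 2
r_g = \frac{a_{11}\, a_c}{\sqrt{a_{11}^2}\,\sqrt{a_c^2 + a_s^2}}
```

Broad-sense heritability would add the dominance terms to the numerator. Note that a high r_g says that the same genetic factor influences both traits; it does not by itself say how heritable either trait is.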

To model covariates such as sex in this type of analysis, we first test a bivariate model in which the factor loadings are constrained to be the same for both sexes, and then test it again with the loadings allowed to differ in smaller sub-models (see, for example, Snieder et al., 1999: 433). One question that we can ask in MGA is whether there is a causal relationship between the occurrence of traits 1 and 2. For instance, Neale & Kendler (1995) found that co-occurrence between depression and generalised anxiety is likely “caused by correlated susceptibilities to the disorders” (in Boomsma et al., 2002: 875). Moreover, they could identify that while there is a common genetic vulnerability to both, the E components associated with the expression of each are distinct. It is hence important to note that a high genetic correlation does not imply direct causal links between those genes and traits; the link can be mediated by the environment. The upshot of the multivariate approach is hence its capability of distinguishing the genotypical from the environmental influences on multiple traits, which allows the identification of shared genes that underlie varieties of phenotypes. To give another example, consider the relationship between hay fever, asthma and dust allergies, for which Duffy et al. (1990) found an underlying genetic basis common to all three using this very type of modelling.

Once the MGA framework is properly specified, it is fortunately very straightforward to integrate other specialist techniques, such as MRI, EEG or eye tracking. Instead of a personality measure, we simply plug in the output values of these techniques as observed traits, for which we then estimate factor loadings as in figure 2. For instance, we can use MGA to model phenotypes at the same time as endophenotypes, i.e. traits that accompany our main measure; we could, for example, also observe levels of a hormone that we theorise to differ in depressed persons, such as cortisol (Boomsma et al., 2002: 876). Here, we observe a physiological measure as trait 1 and depression as trait 2. The main scientific benefit of this bivariate application is the investigation of a single genetic influence on multiple phenotypical effects that do not necessarily seem related (Martin et al., 1997).

In combining MGA with EEG, Trubnikov et al. (1995) investigated the genetic influences on electrophysiological markers of schizophrenia. What was innovative about this study was its use of MGA to calculate the heritability of specific EEG patterns. They took about 50 families of schizophrenics and performed a more complex MGA under the framework introduced above. Since they were investigating families rather than twins, the correlations in genetic similarity were much lower, but the basic principle remains essentially the same, replacing twin 1 and twin 2 with relatives of different degrees. By also recording an array of other phenotypical traits, including working memory capacity and behavioural abnormalities, they constructed a highly sophisticated model that was able to estimate the narrow-sense heritability of EEG parameters at .41 (see formula above) and that of working memory capacity at about .5.

Another example of a successful integration comes from Hardoon et al. (2009), who attempted to correlate genetic factors with individual differences in brain volumes. This study is particularly interesting because it applies MGA well outside the original twin design in order to integrate it with sMRI at only n=16. Ordinarily in genetic MRI research, we perform a t-test or ANOVA between groups that carry a specific gene. However, with gene interaction and dominance effects in mind, multivariate methods are starting to emerge. What Hardoon et al. did was take actual genetic samples from all participants, which were sequenced for nine genes of interest that were then co-analysed. While they were not able to generate statistically solid results on the combined effect of these genes on cortical volume, their research shows methodological promise.

To finish, I would like to propose a study that stays closer to the Cholesky multivariate model and integrates TMS. There are very large individual differences in the strength of one’s reaction to TMS over the primary motor cortex (Wassermann, 2005: 303). Because TMS studies usually involve very small samples, this can be very problematic: stimulation of the primary motor cortex is what we usually use to decide on an appropriate stimulation intensity. Therefore, the output of the actual paradigm might be influenced by individual differences in the sensitivity of the primary motor cortex (PMC) to TMS. A study that explains the source of these differences is needed. Consequently, I propose utilising the bivariate MZ/DZ twin approach, recording both personality measures and the resting motor threshold under TMS, i.e. the minimum stimulation intensity required to produce a muscle response. MGA would reveal how much of this variability is intrinsic to the individual, or genetically mediated. This would shed light, in more general terms, on the genetic basis of susceptibility to brain stimulation.
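Before fitting a full Cholesky model to such data, a rough first impression could be obtained from the twin correlations of the resting motor threshold alone, using Falconer's classical formulas. The sketch below is only that: the correlations of 0.8 and 0.5 are invented placeholders rather than findings, the function name `falconer_estimates` is hypothetical, and the formulas assume an ACE model without dominance.

```python
def falconer_estimates(r_mz, r_dz):
    """Rough variance components from MZ and DZ twin correlations.

    Assumes additive genes (A), shared environment (C) and non-shared
    environment (E) only; dominance effects are ignored.
    """
    h2 = 2 * (r_mz - r_dz)   # additive genetic share
    c2 = 2 * r_dz - r_mz     # shared environmental share
    e2 = 1 - r_mz            # non-shared environment, including error
    return {"h2": h2, "c2": c2, "e2": e2}


# Placeholder correlations for the resting motor threshold, not real data
rmt = falconer_estimates(r_mz=0.8, r_dz=0.5)
for component, share in rmt.items():
    print(f"{component}: {share:.2f}")
```

Applying the same arithmetic to the cross-twin/cross-trait correlations between a personality measure and the motor threshold would give a first indication of how much of their covariation is genetic, which the bivariate model would then test formally.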


Armour, S., & Haynie, D. L. 2007. Adolescent Sexual Debut and Later Delinquency. Journal of Youth and Adolescence 36: 141-152.

Boomsma, D., Busjahn, A., & Peltonen, L. 2002. Classical Twin Studies and Beyond. Nature Reviews Genetics 3: 872-882.

Borkenau, P., Riemann, R., Angleitner, A., & Spinath, F. M. 2002. Similarity of Childhood Experiences and Personality Resemblance in Monozygotic and Dizygotic Twins: A Test of the Equal Environments Assumption. Personality and Individual Differences 33: 261-269.

Bouchard, T. J. 2004. Genetic Influence on Human Psychological Traits. Current Directions in Psychological Science 13(4): 148-151.

Duffy, D. L., Martin, N. G., Battistutta, D., Hopper, J. L., & Mathews, J. D. 1990. Genetics of Asthma and Hay Fever in Australian Twins. American Review of Respiratory Disease 142: 1351-1358.

Falconer, D. S. 1960. Introduction to Quantitative Genetics. New York: Ronald Press.

Harden, K. P., Mendle, J., Hill, J. E., Turkheimer, E., & Emery, R. E. 2008. Rethinking Timing of First Sex and Delinquency. Journal of Youth and Adolescence 37: 373-385.

Hardoon, D. C., Ettinger, U., Mourão-Miranda, J., Antonova, E., Collier, D., Kumari, V., Williams, S. C. R. & Brammer, M. 2009. Correlation-based Multivariate Analysis of Genetic Influence on Brain Volume. Neuroscience Letters 450: 281-286.

Jang, K. L. 2005. The Behavioral Genetics of Psychopathology: A Clinical Guide. London: Routledge.

Johnson, W., Turkheimer, E., Gottesman, I. I. & Bouchard, T. J. 2009. Beyond Heritability: Twin Studies in Behavioural Research. Current Directions in Psychological Science 18(4): 217-220.

MacGregor, A. J., Snieder, H., Schork, N. J. & Spector, T. D. 2000. Twins: Novel Uses to Study Complex Traits and Genetic Diseases. Trends in Genetics 16(3): 131-134.

Martin, N., Boomsma, D. & Machin, G. 1997. A Twin-pronged Attack on Complex Traits. Nature Genetics 17: 387-392.

Neale, M. C. & Kendler, K. S. 1995. Models of Comorbidity for Multifactorial Disorders. American Journal of Human Genetics 57: 935-953.

Posthuma, D. 2009. Multivariate Genetic Analysis. In Kim, Y. K. (Ed.). Handbook of Behavior Genetics: 47-59. New York: Springer.

Rutter, M. 2002. Nature, Nurture, and Development: From Evangelism through Science toward Policy and Practice. Child Development 73: 1-21.

Schmitz, S., Cherny, S. S. & Fulker, D. W. 1998. Increase in Power through Multivariate Analysis. Behavior Genetics 28(5): 357-363.

Snieder, H., Boomsma, D. I., van Doornen, L. J. P., & Neale, M. C. 1999. Bivariate Genetic Analysis of Fasting Insulin and Glucose Levels. Genetic Epidemiology 16: 426-446.

Trubnikov, V. I., Alfimova, M. V., Uvarova, L. G., Orlova, V. A. 1995. A Multivariate Genetic Analysis of the Data from a Complex Study of the Predisposition of Schizophrenia. [article published in Russian]. Zh Nevrol Psikhiatr Im S S Korsakova 95(2): 50-56.

Wassermann, E. M. 2005. Individual Differences in Response to Transcranial Magnetic Stimulation of the Motor Cortex. In Hallett, M. & Chokroverty, S. (Eds.). Magnetic Stimulation in Clinical Neurophysiology. Philadelphia: Elsevier Health Sciences.