Sex-specific trajectories of molecular cardiometabolic traits from childhood to young adulthood

Background The changes which typically occur in molecular causal risk factors and predictive biomarkers for cardiometabolic diseases across early life are not well characterised.
Methods We quantified sex-specific trajectories of 148 metabolic trait concentrations including various lipoprotein subclasses from age 7 years to 25 years. Data were from 7065 to 7626 offspring (11 702 to 14 797 repeated measures) of the Avon Longitudinal Study of Parents and Children birth cohort study. Outcomes were quantified using nuclear magnetic resonance spectroscopy at 7, 15, 18 and 25 years. Sex-specific trajectories of each trait were modelled using linear spline multilevel models.
Results Females had higher very-low-density lipoprotein (VLDL) particle concentrations at 7 years. VLDL particle concentrations decreased from 7 years to 25 years with larger decreases in females, leading to lower VLDL particle concentrations at 25 years in females. For example, females had a 0.25 SD (95% CI 0.20 to 0.31) higher small VLDL particle concentration at 7 years; mean levels decreased by 0.06 SDs (95% CI −0.01 to 0.13) in males and 0.85 SDs (95% CI 0.79 to 0.90) in females from 7 years to 25 years, leading to 0.42 SDs (95% CI 0.35 to 0.48) lower small VLDL particle concentrations in females at 25 years. Females had lower high-density lipoprotein (HDL) particle concentrations at 7 years. HDL particle concentrations increased from 7 years to 25 years with larger increases among females leading to higher HDL particle concentrations in females at 25 years.
Conclusion Childhood and adolescence are important periods for the emergence of sex differences in atherogenic lipids and predictive biomarkers for cardiometabolic disease, mostly to the detriment of males.


Laboratory
For each sample, the nuclear magnetic resonance (NMR) spectra were analysed for absolute metabolite quantification (molar concentration) in an automated fashion. A ridge regression model was applied for quantification of each metabolite to overcome the problems of heavily overlapping spectral data. Quantification of lipoprotein lipid data was performed by calibrating against high-performance liquid chromatography methods, and then individually cross-validated against NMR-independent lipid data. Low-molecular-weight metabolites, as well as lipid extract measures, were quantified as mmol/l based on regression modelling calibrated against a set of manually fitted metabolite measures. The calibration data were quantified based on iterative line-shape fitting analysis using PERCH NMR software (PERCH Solutions Ltd., Kuopio, Finland). Absolute quantification could not be directly established for the lipid extract measures due to experimental variation in the lipid extraction protocol. Therefore, serum extract metabolites have been scaled via the total cholesterol as quantified from the native serum LIPO spectrum. We have previously shown strong correlation between NMR and clinical chemistry measures for traits that are available from both methods.

Data preparation
Prior to statistical analysis, preparation of metabolomics data was performed for each occasion separately using the R package metaboprep (https://github.com/MRCIEU/metaboprep) (version 0.0.1) 1. Quality control was performed with the derived metabolomics measures excluded from the assessment of missingness and clustering. Briefly, individuals, and then metabolites, with high missingness (>=80%) were removed. Missingness was then re-calculated for individuals and metabolites, with removal based on >=20% missingness. Individuals were then removed based
BMJ Publishing Group Limited (BMJ) disclaims all liability and responsibility arising from any reliance Supplemental material placed on this supplemental material which has been supplied by the author(s)
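The two-pass missingness filter described above can be sketched as follows. This is an illustrative sketch only, not the metaboprep implementation: the data layout (a dictionary of samples, with None marking a missing value) and the function names are assumptions made for the example, and any further quality control steps are not covered.

```python
from typing import Dict, Iterable, Optional

# Hypothetical layout: sample id -> {metabolite name -> value, None if missing}
Matrix = Dict[str, Dict[str, Optional[float]]]


def missing_fraction(values: Iterable[Optional[float]]) -> float:
    """Fraction of entries that are missing (None)."""
    values = list(values)
    return sum(v is None for v in values) / len(values) if values else 0.0


def filter_once(data: Matrix, threshold: float) -> Matrix:
    """Drop individuals, then metabolites, with missingness >= threshold."""
    # Step 1: individuals, judged on all metabolites currently present.
    samples = {s: row for s, row in data.items()
               if missing_fraction(row.values()) < threshold}
    if not samples:
        return {}
    # Step 2: metabolites, judged only on the individuals that remain.
    names = list(next(iter(samples.values())).keys())
    kept = [m for m in names
            if missing_fraction(row[m] for row in samples.values()) < threshold]
    return {s: {m: row[m] for m in kept} for s, row in samples.items()}


def two_pass_filter(data: Matrix) -> Matrix:
    """First pass at >=80% missingness, then recalculate and apply >=20%."""
    return filter_once(filter_once(data, 0.80), 0.20)
```

Recalculating missingness between passes matters: a metabolite that is almost entirely missing inflates every individual's missingness until it is removed in the first pass.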

Model selection
To select the optimal linear spline model for 144 trait concentrations measured from 7 years (y) to 25y, we ran a series of models: model 1, with two linear spline periods (7y to 18y and 18y to 25y); model 2, with two linear spline periods (7y to 15y and 15y to 25y); and model 3, a single slope model (a single age term which assumed constant change from 7y to 25y). Linear spline periods were chosen to reflect ages in whole years that were closest to mean age at clinics and hence where the density of measures was greatest. The same process was carried out to select models for the four traits with measures only available to 18y, comparing a model with two linear spline periods (7y to 15y and 15y to 18y) with a single slope model (7y to 18y). For each trait and model, we examined Akaike's Information Criterion (AIC) as an indicator of model fit, with lower AIC values indicating better model fit. Upon selection of the best fitting model based on AIC, we examined observed and predicted values to further assess model fit. Model selection was carried out in both sexes combined to select an optimal model for each trait that would be comparable between the sexes. However, all trajectories were allowed to vary by sex in our final model for each trait by including an interaction term between the linear spline periods/age and sex. S1 Table shows a complete list of all 148 outcomes (144 measured up to 25y and four measured to 18y) and the model details for each outcome. Following the above process of model comparisons, 68 of the 144 outcomes measured to 25y had two linear spline periods from 7-15y and 15-25y, 75 of the 144 outcomes had two linear spline periods from 7-18y and 18-25y, and one of the 144 outcomes (acetoacetate) was a single slope model from 7y to 25y. The final models selected for the four outcomes measured only up to 18y had two linear spline periods from 7-15y and 15-18y.
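The AIC comparison described above can be illustrated with a minimal sketch. It uses the standard definition AIC = 2k − 2 ln(L), where k is the number of estimated parameters and L the maximised likelihood. The model labels, log-likelihoods and parameter counts in the usage lines are hypothetical placeholders, not values from this study.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's Information Criterion: 2k - 2*ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(candidates: Dict[str, Tuple[float, int]]) -> Tuple[str, Dict[str, float]]:
    """Return the name of the lowest-AIC model and all AIC scores.

    candidates maps a model label to (maximised log-likelihood, parameter count).
    """
    scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fit summaries for one trait (illustrative numbers only).
candidates = {
    "spline_7_18_25": (-1000.0, 12),
    "spline_7_15_25": (-995.0, 12),
    "single_slope": (-1010.0, 8),
}
best, scores = select_by_aic(candidates)
```

In this toy example the 7-15y/15-25y spline model has the lowest AIC and would be carried forward to the check of observed against predicted values.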
Age (in years) was centred at the first available measure (7y). Models were estimated with robust standard errors for both fixed effects and individual level random effects to account for skewed distributions in some traits. Unstructured variance-covariance matrices for the individual level random effects were used to estimate most trajectories; the optimal linear spline model selected and other model details, including details of the variance-covariance matrix for each trait, are shown in S1 Table. All models included individual level random effects for the intercept and each linear spline period selected. For 35 of 148 outcomes modelled (S1 Table), the covariances of some individual level random effects (level 2) were set to zero to improve model convergence. Models allowing occasion level measurement error to vary with age (level 1 random effects for the slopes) were also explored for each risk factor.
However, due to difficulties with convergence given sparsity of measures, our models included only a random effect for the intercept at level 1. The model for each outcome took the form:
$$\text{metabolite}_{ij} = \beta_0 + \beta_1\,\text{female}_j + u_{0j} + (\beta_2 + u_{1j})\,s_{ij1} + (\beta_3 + u_{2j})\,s_{ij2} + \beta_4\,\text{female}_j\,s_{ij1} + \beta_5\,\text{female}_j\,s_{ij2} + e_{ij}$$
where, for person $j$ at measurement occasion $i$: $\text{female}_j$ is an indicator equal to 1 for females and 0 for males; $s_{ij1}$ and $s_{ij2}$ are the two linear spline terms for age; $\beta_0$ represents the fixed effect coefficient for the average intercept in males; $\beta_1$ represents the difference between the intercept for females compared with males; $\beta_2$ and $\beta_3$ represent fixed effect coefficients for the average linear slopes of each linear spline in males; $\beta_4$ and $\beta_5$ represent the difference in the fixed effect coefficients for the average linear slopes of each linear spline in females compared with males; $u_{0j}$ to $u_{2j}$ indicate person-specific (or individual level/level 2) random effects for the intercept and slopes respectively; and $e_{ij}$ represents the occasion-specific residuals or measurement error, which was allowed to vary by the intercept.
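As an illustration of the fixed part of the model above, the sketch below builds two linear spline terms from age and evaluates the mean predicted level for each sex. It assumes one common parameterisation, in which each spline term counts the years accrued within its period so that each coefficient is the slope in that period, and it assumes sex is coded as a 0/1 female indicator. The function names and coefficient values are hypothetical.

```python
from typing import Sequence, Tuple


def spline_terms(age: float, knot: float, start: float = 7.0) -> Tuple[float, float]:
    """Return (s1, s2): years accrued in the first and second spline period.

    Age is centred at `start` (7y), so both terms are zero at age 7.
    """
    s1 = min(age, knot) - start   # grows until the knot, then stays constant
    s2 = max(age - knot, 0.0)     # zero before the knot, then grows
    return s1, s2


def predicted_mean(age: float, female: int, beta: Sequence[float], knot: float) -> float:
    """Fixed-effects prediction with random effects and residual set to zero.

    beta = (b0, b1, b2, b3, b4, b5) as defined in the model equation;
    female is 1 for females and 0 for males (assumed coding).
    """
    b0, b1, b2, b3, b4, b5 = beta
    s1, s2 = spline_terms(age, knot)
    return b0 + b1 * female + (b2 + b4 * female) * s1 + (b3 + b5 * female) * s2


# Hypothetical coefficients for a trait with a knot at 15y.
beta = (1.0, 0.5, 0.25, -0.5, 0.125, 0.25)
male_at_25 = predicted_mean(25, 0, beta, knot=15)
female_at_25 = predicted_mean(25, 1, beta, knot=15)
```

With this coding, the female slope in each period is the male slope plus the corresponding interaction coefficient, which matches the interpretation of the difference coefficients given in the text.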
For each sex, models directly estimate mean predicted level of each metabolite at 7y (the intercept) and mean predicted slopes in original units (mostly mmol/l), with slopes interpreted as change per year in each metabolite in the respective spline period/age period. Following analysis, these estimates were then used to calculate mean predicted absolute change in each trait level from 7y to 25y using the slopes given by each model. The mean predicted level of each trait at 25y was also estimated. Post-analysis, all the above estimates were then converted to SD units by dividing by the sex-combined SD of the observed metabolite at 7y, to aid comparison of results between metabolites.
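The post-analysis calculations described above amount to simple arithmetic on the estimated intercept and slopes. The sketch below shows one way to do this for a single sex and a two-period model; the function names and the numeric inputs are hypothetical, and uncertainty (confidence intervals) is not handled here.

```python
from typing import Tuple


def change_and_level(intercept: float, slope1: float, slope2: float,
                     knot: float, start: float = 7.0, end: float = 25.0) -> Tuple[float, float]:
    """Absolute change from `start` to `end` and the predicted level at `end`.

    slope1 and slope2 are changes per year in the first and second spline
    period, and intercept is the predicted level at `start` (7y).
    """
    change = slope1 * (knot - start) + slope2 * (end - knot)
    return change, intercept + change


def to_sd_units(value: float, sd_at_7: float) -> float:
    """Express an estimate in units of the sex-combined SD observed at 7y."""
    return value / sd_at_7


# Hypothetical estimates for one sex, with a knot at 15y.
change, level_25 = change_and_level(intercept=2.0, slope1=-0.25, slope2=0.5, knot=15.0)
change_sd = to_sd_units(change, sd_at_7=1.5)
```

Dividing every estimate for a trait by the same sex-combined SD keeps male and female estimates on a common scale, so sex differences in SD units remain directly comparable across metabolites.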

Characteristics of included vs. excluded participants
We examined characteristics associated with not being included in our analyses due to missing data on sex or molecular traits. To do this, we compared the socio-demographic characteristics at birth of mothers and partners of participants included in the analyses with those of participants excluded from the analyses. All characteristics were measured during pregnancy or at birth through questionnaires or from routine health records.
Marital status was obtained from antenatal questionnaires and classified as never married, widowed, divorced, separated, first marriage, marriage 2 or 3. Household social class was measured as the highest of the mother's or her partner's occupational social class using data on job title and details of occupation collected about the mother and her partner from the mother's questionnaire at 32 weeks gestation. Social class was derived using the standard occupational classification (SOC) codes developed by the United Kingdom Office of Population Census and Surveys and classified as I professional, II managerial and technical, IIINM non-manual, IIIM manual, and IV&V part skilled occupations and unskilled occupations. A questionnaire at 32 weeks gestation asked mothers to report their educational attainment, which was categorized as below O-Level (Ordinary Level; exams taken in different subjects usually at age 15-16 at the completion of legally required school attendance, equivalent to today's UK General Certificate of Secondary Education), O-Level only, A-Level (Advanced-Level; exams taken in different subjects usually at age 18), or university degree or above. A questionnaire at 32 weeks gestation asked partners to report their educational attainment, which was categorized as below O-Level (Ordinary Level; exams taken in different subjects usually at age 15-16 at the completion of legally required school attendance, equivalent to today's UK General Certificate of Secondary Education), O-Level only, A-Level (Advanced-Level; exams taken in different subjects usually at age 18), or university degree or above. Smoking in the first trimester of pregnancy was self-reported by mothers at 18 weeks gestation. Birthweight and gestational age were derived from clinical records.
Maternal age was reported in the mother's antenatal questionnaires. Maternal pre-pregnancy weight and height were self-reported in antenatal questionnaires.

Sensitivity analyses
We compared sex differences in metabolic traits at 7y and 25y from the multilevel models with the same differences generated from linear regression analyses at each age separately. This was done to explore the appropriateness of our modelling strategy, compared with more conventional analytic approaches. As outlined, participants included in our analyses required data on sex and at least one measure of each metabolite from 7y to 25y. Mothers of participants included in the analyses tended to be of higher household social class and more educated than mothers of participants excluded due to missing data, and these differences were similar between females and males (S2 Table). Thus, we performed sensitivity analyses weighted by the probability of being included in our analyses to account for the higher probability of being included due to greater social advantage. The participant level weights were estimated using logistic regression using all socio-demographic characteristics listed above with the addition of sex, and were subsequently incorporated into the multilevel models as level two weights which adjust for the unequal probability of selection of the participants. We repeated all SD unit analyses standardising with the sex-specific SD of each metabolite at 7y to examine whether our main results (standardised with sex-combined SDs) were similar.
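The weighting step can be illustrated with a short sketch. It assumes the conventional inverse probability form, in which each included participant is weighted by one over their predicted probability of inclusion; the text above does not state the exact form of the weights, so this is an assumption. The logistic regression coefficients and covariate names are hypothetical placeholders rather than fitted values.

```python
import math
from typing import Dict


def inclusion_probability(intercept: float, coefs: Dict[str, float],
                          covariates: Dict[str, float]) -> float:
    """Predicted probability of inclusion from a fitted logistic regression."""
    linear = intercept + sum(coefs[name] * covariates[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-linear))


def inverse_probability_weight(p_included: float) -> float:
    """Level two (participant) weight under the assumed inverse probability form."""
    return 1.0 / p_included


# Hypothetical coefficients: a maternal degree raises the chance of inclusion.
coefs = {"maternal_degree": 1.0, "female": 0.0}
p_adv = inclusion_probability(0.0, coefs, {"maternal_degree": 1.0, "female": 1.0})
p_ref = inclusion_probability(0.0, coefs, {"maternal_degree": 0.0, "female": 1.0})
w_adv = inverse_probability_weight(p_adv)
w_ref = inverse_probability_weight(p_ref)
```

Participants with a lower predicted probability of inclusion receive larger weights, so the weighted sample leans back towards the socio-demographic profile of the full cohort.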