Research Paper
Aging Clocks, Entropy, and the Challenge of Age Reversal
Authors
Andrei E. Tarkhov,1,2,* Kirill A. Denisov,1 Peter O. Fedichev,1,*
1Gero PTE, Paya Lebar Square, Singapore
2Present address: Retro Biosciences Inc., Redwood City, CA, USA
*Corresponding authors: at@gero.ai; pf@gero.ai
DOI:https://doi.org/10.59368/agingbio.20240031
Received: 2/28/2023, Revised: 6/3/2024, Accepted: 6/12/2024, Published: 8/12/2024
Abstract
The ready availability of large longitudinal datasets, such as the UK Biobank, enables analyses of complex aging traits at previously unattainable levels. We analyze the aging signatures of DNA methylation and longitudinal electronic medical records from the UK Biobank and demonstrate that their dynamics can be recapitulated by rare and independent stochastic transitions among numerous metastable states, so that the accumulated effect of aging changes can be captured by a single stochastic variable, termed thermodynamic biological age (tBA), in agreement with other aging omics. In the proposed theoretical model, tBA increases linearly with age, tracks the entropy produced (and hence information lost) during the aging process, and causes an irreversible drift in physiological state variables, reduced resilience, and an exponential acceleration of the incidence of chronic diseases and mortality risks. The entropic nature of aging drift may constrain the possibility of complete age reversal and highlights important distinctions between aging in humans and mice, thus necessitating a re-examination of strategies for engineering negligible human senescence.
Introduction
Aging is a complex process manifesting itself across different organismal levels (see hallmarks of aging1) and leading to the exponential acceleration of the incidence of chronic diseases2 and mortality3. It is both practically and intellectually appealing to reduce the effects of the multitude of phenotypic changes to a few, or, even better, a single actionable indicator, mostly referred to as “biological age” (BA). BA models can be trained to predict the chronological age or mortality risks of an individual from different sources of biomedical data, ranging from DNA methylation (DNAm)4–14 to physical activity records from wearable devices15,16. Excessive BA (or BA acceleration) is associated with all-cause mortality as well as the prevalence, future incidence, and severity of chronic10,17,18 and transient diseases, such as COVID-1916,19–21. Unsurprisingly, BA predictors have increasingly gained traction in clinical trials22–24.
The dynamic properties of BA and the exact relation between BA variation and aging are not entirely understood. For example, DNAm age may increase without an appreciable increase in all-cause mortality in negligibly senescent species25,26. Moreover, even in the healthiest individuals, BA levels can transiently change throughout the day following circadian rhythms27 or in response to stress factors and lifestyle choices such as smoking18,28. The characteristic time required for an organism state to relax to homeostatic equilibrium and the range of BA fluctuations progressively increase as a function of age18. The number of individuals exhibiting slow recovery increases exponentially and doubles approximately every 8 y, which is close to the mortality doubling time in humans16. Further applications of BA models in aging research and medicine require a better understanding of the dynamics and causal relation between, on the one hand, underlying biological and physiological variations of the organism state captured by various BA indicators and, on the other hand, mortality, prevalence and severity of diseases, and the effects of medical interventions.
To address these fundamental questions, we reviewed the universal features of aging signatures in biomedical data. We performed a principal component analysis (PCA) in a large cross-sectional white-blood-cell DNAm dataset29 and the longitudinal electronic medical records (EMRs) from the UK Biobank30. In both cases, aging dynamics can be recapitulated by rare and independent stochastic transitions among numerous metastable states of the methylation of individual CpG sites (5'-C-phosphate-G-3' sequence of nucleotides) or the incidence of specific diseases during life. At the same time, most of the variance in the data could be explained by a single factor linearly increasing with age and demonstrating the strongest correlation with Horvath’s DNAm age or the number of chronic diseases in the DNAm and EMR datasets, respectively.
To explain the dynamics behind the universally observed aging signatures, we put forward a semiquantitative model of aging in a complex regulatory network. We assumed that living systems are collections of a vast number of interacting functional units (FUs) that are initialized to a metastable state at the end of development. Aging then results from relaxation of the organism state toward equilibrium through a sequence of stochastic transitions, representing microscopic state changes in all FUs.
Dynamically accessible states are countless, as is the number of stochastic transitions among them. Hence, the accumulated effect of random transitions on individual biological processes can be quantified by a stochastic variable with a linearly increasing mean and variance. The quantity progressively increases over time in a sufficiently large regulatory network and hence may emerge as a natural aging clock: the thermodynamic BA (tBA). We argue that tBA is thus the fundamental aging variable. It is best associated with the dominant principal component (PC) score in biomedical data and Horvath’s methylation clock. It is proportional to configuration entropy and quantifies information lost during aging.
Materials and Methods
PCA of the DNAm data
We took the white-blood-cell methylation data from the GSE87571 dataset29. It contains 729 samples (more than 440k features each), collected from patients of both genders (341 males and 388 females), covering the age range between 14 and 94 y.
To focus the analysis on aging, we filtered out patients younger than 20 y old (620 samples remaining). We filtered out CpG sites according to Pearson’s correlation between the DNAm levels and the chronological age at the level of (where N is the total number of the reported features), thus obtaining 96,536 sites. We performed and reported the results of the PCA on the resulting data.
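The filtering and decomposition steps described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names, the family-wise level `alpha`, and the Fisher z approximation for the p-value are assumptions introduced here for illustration, and the synthetic input stands in for the real methylation matrix.

```python
import math
import numpy as np

def age_associated_sites(beta, age, alpha=0.05):
    """Indices of sites whose Pearson r with age passes a Bonferroni cut.

    beta: (n_samples, n_sites) methylation levels; age: (n_samples,).
    alpha is a hypothetical family-wise level, not a value from the paper.
    """
    n, n_sites = beta.shape
    xc = beta - beta.mean(axis=0)
    yc = age - age.mean()
    num = (xc * yc[:, None]).sum(axis=0)
    denom = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant sites have zero variance; give them r = 0 instead of NaN.
    r = np.where(denom > 0, num / np.where(denom > 0, denom, 1.0), 0.0)
    # Two-sided p-value from the Fisher z approximation (an assumption;
    # an exact t-test could be substituted).
    z = np.arctanh(np.clip(r, -0.999999, 0.999999)) * math.sqrt(n - 3)
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return np.flatnonzero(p < alpha / n_sites), r

def pca_scores(x, k=3):
    """PCA via SVD of the column-centered matrix.

    Returns (scores, loading vectors, explained variance fractions)
    for the first k components.
    """
    xc = x - x.mean(axis=0)
    u, s, vt = np.linalg.svd(xc, full_matrices=False)
    explained = s ** 2 / (s ** 2).sum()
    return u[:, :k] * s[:k], vt[:k], explained[:k]
```

A typical call would be `keep, r = age_associated_sites(beta, age)` followed by `pca_scores(beta[:, keep])`. Dividing `alpha` by the number of tested sites is the Bonferroni correction.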
We computed Horvath’s methylation age4. A few CpG sites (cg17099569, cg00431549, cg11025793, and cg14409958) were not present in the data and were therefore excluded from the calculation.
DNAm-PC3 increased with age at a rate faster than linear. We collected all the pairs of the DNAm-PC3 scores and the chronological age for every patient n in the dataset and used the available age range to produce a fit of the data to average from Equation (8):
Gene set enrichment analysis
We collected the CpG sites best associated with DNAm-PC1 and DNAm-PC3 according to the values of the respective vector components. We retrieved the gene IDs from Illumina’s 450k methylation arrays documentation. Finally, we performed gene ontology and disease ontology enrichment with the help of the R “clusterProfiler4.0” package31.
Preprocessing of EMRs from the UK Biobank
To avoid using disease labels corresponding to transient diseases, we selected 111 chronic disease diagnoses using Chronic Condition Indicators for ICD-1032. Overall, patients are included in the EMR dataset, mostly of Caucasian origin ( or ), of both sexes ( males and females) in the age range of .
Entropy/entropy production rate determination
In the DNAm dataset, we computed the configuration entropy as the mean Shannon entropy over all individual’s CpG sites indexed by in a sample (patient) indexed by as follows:
In the EMR dataset, the probabilities shown in Equation (2) corresponded to the incidence of the diseases in the subsequent 5-y age bins.
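As an illustration of the quantity described in this subsection, the sketch below computes the mean binary Shannon entropy over sites for one or several samples. The function name, the use of natural logarithms, and the clipping constant `eps` are assumptions made here; the exact normalization used in the study is not recoverable from the text.

```python
import numpy as np

def mean_binary_entropy(p, eps=1e-12):
    """Mean Shannon entropy (in nats) of independent two-state sites.

    p: probabilities of finding each site in one of its two states,
       shape (n_sites,) or (n_samples, n_sites).
    eps: clipping constant (an assumption) that keeps log() finite
         for fully polarized sites with p equal to 0 or 1.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    h = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    # Average over the last axis, i.e. over sites within a sample.
    return h.mean(axis=-1)
```

Fully polarized sites (p near 0 or 1) contribute almost nothing, while a site with p = 0.5 contributes the maximum of ln 2, so the mean grows as sites depolarize.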
Theory: Aging in a complex regulatory network
We propose to model the effect of the interactions among FUs with the help of the auxiliary variables—the effective “regulatory fields” evolving over time according to
We start from Equation (3) and observe that the regulatory fields change over time in response to the deterministic (the direct linear and the higher-order nonlinear interactions between units) and stochastic forces . We naturally assume that the stochastic force terms are not correlated over long time intervals: , where is the power of stochastic noise, represents the averaging along the individual trajectory and over all specimen, and is the Dirac delta function.
In spite of apparent simplicity, Equation (3) is nonlinear and may have highly nontrivial solutions leading to applications in condensed matter physics34 and neurophysiology35. In our discussion, it is important that the stochastic noise drives the system toward equilibrium at an effective temperature controlled by the power of the noise .
The data suggest that there is a large “bulk” of units characterized by excessive lifetimes. Mechanistically, this may be explained by operating within a vicinity of a metastable state with a very high activation energy relative to the effective temperature, (Fig. 3A).
We will assume that the effects of aging are small on the scale of and hence the depolarization rates are not only very small but also do not considerably depend on age. Accordingly, the depolarization is on average a linear function of age and the total number of configuration transitions : and .
Let us assume that the aging drift in the form of simultaneously occurring configuration transitions progresses slowly compared with fast functional responses in the organism. We linearize the equations for the regulatory fields around the youthful state :
The solutions of the linearized Equation (4) can be best understood with the help of a linear decomposition: , where is the pathway activations and is the right eigenvectors of the interaction matrix corresponding to the smallest eigenvalues (the matrix is nonsymmetric and hence its complete eigensystem must include the left eigenvectors , and the right eigenvectors ). The components of the vector characterize the participation of the FU in the pathway .
Substituting the solution into the equation and multiplying both sides by the corresponding left eigenvector, we find that
It is important to understand that none of the relevant vectors and constants can be derived; they can only be measured experimentally. By virtue of the central limit theorem, the large number of configuration transitions ensures that the effect of the mean field is exactly linear in .
Qualitatively, the net effect of the rare transitions and the associated mean field together produce a persistent pathway activation, on average, slowly increasing with age. This is often referred to as an enslavement principle: stochastic depolarization transitions produce a slowly evolving mean field that disturbs pathways characterized by fast relaxation times having thus enough time to adjust to its current level.
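The enslavement principle can be illustrated with a toy calculation. The sketch below assumes a single pathway obeying dx/dt = -ε x + g t, where ε is a recovery rate and g t is a slowly growing mean field. This equation, the symbols ε and g, and the function name are illustrative assumptions, not the model equations of the paper, and the noise term is omitted to keep the example deterministic.

```python
def track_slow_drift(recovery_rate, drift_rate, t_end, dt=0.01):
    """Euler integration of dx/dt = -recovery_rate * x + drift_rate * t.

    A toy pathway (hypothetical) driven by a linearly growing mean field.
    When recovery is fast compared with the drift, x follows the
    quasi-static level drift_rate * t / recovery_rate with a small lag.
    """
    x = 0.0
    n_steps = int(round(t_end / dt))
    for n in range(n_steps):
        t = n * dt
        x += dt * (-recovery_rate * x + drift_rate * t)
    return x
```

With a recovery rate of 1 per year and a drift rate of 0.01 per year squared, the value after 80 y is close to 0.01 × 80 / 1, minus a small lag of 0.01 / 1². The fast variable has enough time to adjust to the current level of the slow field.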
Results
Aging signatures in cross-sectional DNAm data
We start by analyzing a dataset of DNAm in aging white blood cells29. Each of the reported DNAm levels is the average of a binary single-cell signal over a bulk tissue sample comprising many cells. In other words, is the probability of finding a CpG site in a methylated state.
To avoid complications due to the crossover between development and aging, we only analyzed donors older than 25 y. Furthermore, to counter the “curse of dimensionality”36 due to the shallow nature of the dataset ( CpGs measured in less than 800 patients), we focused our analysis only on CpGs significantly correlated with age (after the Bonferroni correction for multiple testing, ). Of approximately CpGs significantly correlated with age (almost of all reported), most were either initially hypermethylated (, ) or hypomethylated (, ).
To normalize the distribution of the DNAm signal confined in the interval , we converted DNAm levels to log odds ratios . We refer to as “regulatory fields” by analogy with condensed matter physics (see the Materials and Methods section).
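A minimal sketch of the log odds transformation follows. The function name and the clipping constant `eps` are assumptions added here so that levels of exactly 0 or 1 remain finite.

```python
import numpy as np

def regulatory_fields(beta, eps=1e-6):
    """Convert methylation levels in [0, 1] to log odds ratios.

    eps is a hypothetical clipping constant; without it, levels of
    exactly 0 or 1 would map to minus or plus infinity.
    """
    b = np.clip(np.asarray(beta, dtype=float), eps, 1.0 - eps)
    return np.log(b / (1.0 - b))
```

The transform maps 0.5 to zero and is antisymmetric around it, so hypermethylated and hypomethylated sites end up on opposite sides of the origin.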
The PCA of regulatory fields reveals a few PCs associated with age (DNAm-PC1 and DNAm-PC3 explained and of variance in the data but changed with age, respectively, and DNAm-PC2 explained of variance but did not change with age and was omitted in the analysis of aging changes; see Figs. S1 and S2). The dominant PC (DNAm-PC1) evolves approximately linearly as a function of age (Fig. 1A; Pearson’s , ). The variance of DNAm-PC1 also increases linearly with age (Fig. 1B), which is a feature of a stochastic process (random walk).
Aside from DNAm-PC1, the best correlation with chronological age was produced by the third PC, DNAm-PC3 (Pearson’s , ). Across subsequent age-adjusted bins, DNAm-PC3 increased faster than linearly with age (Fig. 1C). The variance of DNAm-PC3 also increased faster than linearly, so that the inverse variance decreased approximately linearly in patients older than 40 y (Fig. 1D). By extrapolation, the inverse variance of DNAm-PC3 would approach zero (and hence the variance would diverge) at some age within the age range of y. This behavior hints at a nonlinear coupling of DNAm-PC3 to (and hence its dependence on) DNAm-PC1.
The loading vectors corresponding to DNAm-PC1 and DNAm-PC3 describe two distinct methylation profile changes with age. The distribution of the PC1 loading vector’s components is non-Gaussian and bimodal. Hence, the dominant aging signature in DNAm data involves two large groups of CpG sites (Fig. 1E) changing their methylation (“polarization”) with age in opposite directions. The first PC score is then proportional to the total number of polarization transitions.
In contrast, the distribution of the loading vector’s components from DNAm-PC3 has a single peak and clear leading contributions from non-Gaussian tails (Fig. 1E). The gene set enrichment analysis of methylation regions associated with the PC3 variation reveals pathways involved in innate immunity and cancer (Fig. 1G,H).
The age-associated PC scores demonstrate the best correlation with Horvath’s DNAm age4 (Fig. 1F). The corresponding Pearson’s correlation coefficients were () and () for DNAm-PC1 and DNAm-PC3, respectively (see also Figs. S1 and S2 for a summary of other PC scores’ correlation with age and Horvath’s DNAm age).
Next, we checked which characteristic features of dynamics associated with aging in DNAm can be observed in other forms of biomedical data. To confirm the stochastic character of the dominant aging signature in humans, one would need to analyze a large longitudinal dataset. We did not have access to a high-quality set of longitudinal DNAm measurements. Instead, we turned to an extensive EMRs collection from the UK Biobank. Irrespective of the age at the first assessment, the EMRs provided information on the prevalence of chronic diseases from inception until the end of the follow-up (slightly more than 10 y after enrollment, on average). We represent each patient by a vector of binary variables indicating the presence or absence of a disease (see the Materials and Methods section).
Most of UK Biobank’s subjects are healthy early in life. Hence, the states representing the presence of diseases are initially polarized (). Most chronic diseases are relatively infrequent: the most prevalent diseases are metabolic disorders (with the prevalence of ), joint disorders (), and arthrosis (). Hence, most of the states stay polarized for life, with only a small fraction of patients exhibiting depolarization transitions leading to the incidence of specific diseases.
The PCA results for the binary-valued vectors representing the health state in the EMRs of UK Biobank’s subjects at the time of the first assessment resemble the PCA results from the white-blood-cell DNAm study above. This time, we observed only two PCs significantly associated with age (see the blue and green lines and the respective ranges corresponding to the mean levels and one standard deviation in Fig. 2A).
The dominant aging signature, the first PC in the UK Biobank’s EMR data (EMR-PC1), evolves approximately linearly as a function of age and is linearly associated with the total number of diagnosed diseases (Fig. 2B). Hence, in line with the results of our DNAm analysis above, the first PC correlates with the total number of depolarization transitions (this time being equal to the disease burden at the time of measurement).
As expected, the variance of EMR-PC1 increases linearly with age (Fig. 2C), which is a feature of a stochastic process. This time, however, due to the longitudinal nature of the EMR dataset, we can make a stronger claim by computing the autocorrelation function of EMR-PC1. We observe that the autocorrelator increases linearly as a function of the time lag between the observations, which is typical for a result of a stochastic process with a drift (Fig. 2D).
Aging in a complex regulatory network
To explain the key features of dynamics of aging signatures, let us consider an organism as a network of interacting FUs. Each of the units can be observed in multiple states of varying physiological capacity. We have already presented examples of such microscopic states corresponding to the different methylation levels of CpGs or disease states. However, the language may be used to describe other situations involving, e.g., mutations or conformation changes in biomolecules.
For any given FU , we will focus on the two most-occupied microscopic states (Fig. 3A) corresponding to two adjacent potential wells in the free energy landscape shaped by regulatory interactions. We encode the pair of states by a binary variable taking values of and , respectively. At the end of development, most FUs are polarized, so that most of the subjects occupy one of the selected states.
According to the model, the initial states set during development are metastable states. Hence, over time, the organism state relaxes toward thermal equilibrium via a series of configuration transitions between microscopic states driven by fluctuations. Both the DNAm and EMR data suggest that, in most cases, the transitions are infrequent. On average for each FU, we observe fewer than a single transition between the states over the lifetime of an organism. In other words, the corresponding transition rates are slow (, see Fig. 3B). Slow transition rates depend on the activation energies and the effective temperature exponentially, 37. Therefore, we expect that are barely affected by the effects of aging.
The effective temperature characterizes the statistical properties of regulatory noise, which may depend on the fidelity of regulatory interactions and the deleteriousness of the environment33. The effective temperature should not be confused with (although it may be related to) the body or environmental temperature (see the Materials and Methods section).
Quantitatively, stochastic fluctuations leading to configuration transitions change the average polarization of every FU linearly over time :
The PCA of a dataset modeled in Equation (6) would produce the first PC, which is directly proportional to the total number of configuration transitions , where is the average depolarization rate and is the total number of FUs. Only a tiny fraction of all FUs is practically observable in any given experiment. However, we may expect that the total number of the depolarization transitions in any sufficiently large subset of the data (such as DNAm or EMRs) is proportional to .
The depolarization rate for each FU may be slow, . However, the total number of FUs available for the configuration transitions is practically infinite , and their compound effect is not necessarily small. If an organism is sufficiently long lived, the number of configuration transitions is substantial, and the aging signature described in Equation (6) would dominate the variance in real-life biomedical data. This is consistent with our observations in the PCA of the DNAm (Fig. 1A) and EMR data (Fig. 2A) above. The distribution of the first PC vector’s components would be bimodal for DNAm (bidirectional gain and loss of methylation, Fig. 1E) and unimodal for EMRs (the number of diagnoses would only grow with age).
Given that is a cumulative number of random transitions, its dynamics would obey a stochastic Langevin equation with a drift (diffusion). Hence, the variance of in age-adjusted bins should increase linearly with age. This prediction is consistent with the observed behavior of the first PC in the DNAm (Fig. 1B) and EMR (Fig. 2C) data.
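The linear growth of both the mean and the variance can be checked in a small simulation. The sketch below assumes independent, irreversible two-state units that each flip at the same small constant rate. The function name, the parameters, and the irreversibility are simplifying assumptions for illustration, not the fitted model.

```python
import numpy as np

def simulate_transition_counts(n_people, n_units, rate, ages, seed=0):
    """Cumulative number of flipped units per person at the given ages.

    Each of n_units independent units flips once, irreversibly, at a
    constant rate (per year). All parameter values are hypothetical.
    Returns an integer array of shape (n_people, len(ages)).
    """
    rng = np.random.default_rng(seed)
    ages = np.asarray(ages, dtype=float)
    z = np.zeros((n_people, ages.size), dtype=np.int64)
    current = np.zeros(n_people, dtype=np.int64)
    prev_age = 0.0
    for k, age in enumerate(ages):
        # Probability that a still-unflipped unit flips in this interval.
        p_flip = 1.0 - np.exp(-rate * (age - prev_age))
        current = current + rng.binomial(n_units - current, p_flip)
        z[:, k] = current
        prev_age = age
    return z
```

As long as the flipped fraction stays small, the counts are close to Poisson distributed, so `z.mean(axis=0)` and `z.var(axis=0)` both grow approximately in proportion to age.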
More evidence in favor of the stochastic character of could be produced by the investigation of the autocorrelation function , where represents the averaging, first, along the individual trajectory and, then over all patients. The autocorrelation function of the leading PC in the EMR data increased linearly as a function of the time lag in the range between and 10 y (Fig. 2D). The diffusion coefficient’s estimates from the variance and autocorrelation increase turned out to be close: 0.012 and 0.009 per year, respectively, thus confirming the association of the leading PC score with the increasing number of configuration transitions .
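One way to estimate a diffusion coefficient from longitudinal scores is sketched below. It assumes equally spaced observations and the convention that the variance of an increment over a lag τ equals 2Dτ. The function name, this convention, and the use of pooled increment variances are assumptions made here, and the estimator may differ from the autocorrelation function used in the study.

```python
import numpy as np

def diffusion_coefficient(z, dt, max_lag):
    """Estimate D from the growth of increment variance with time lag.

    z: (n_people, n_times) scores sampled every dt years.
    Assumed convention: Var[z(t + tau) - z(t)] = 2 * D * tau.
    np.var subtracts the pooled mean increment, which removes a
    constant drift before the variance is taken.
    """
    z = np.asarray(z, dtype=float)
    lags = np.arange(1, max_lag + 1)
    var_inc = np.array([np.var(z[:, k:] - z[:, :-k]) for k in lags])
    slope, _ = np.polyfit(lags * dt, var_inc, 1)
    return slope / 2.0
```

For a random walk with drift, the increment variance grows linearly with the lag, and the slope of that line gives the estimate. A clearly nonlinear dependence would argue against a simple diffusion picture.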
Assuming that we start from highly polarized states, , we show that the configuration entropy is equal to the number of depolarization transitions
Due to the inverse exponential dependence between the transition rates and the activation barrier, the lowest activation barriers would lead to the highest depolarization rates (the top FUs in Fig. 3B). Below, we show that the interactions between such FUs can no longer be neglected and thus the FUs should form coregulated clusters (see the Materials and Methods section). We expect that the joint activation of FUs forming a cluster (or a pathway) labeled by affects all other FUs in the cluster via a shift of regulatory fields according to , where and are the pathway activation strength and the participation vectors’ components, respectively.
In Figure 3C, the solid blue line represents the cross-sectional view of the free energy as a function of the pathway activation variable experiencing stochastic fluctuations in response to stress factors. The dynamics of the pathway activation depend on the power of stochastic noise (proportional to the effective temperature ) and persistent stress factors . The effects of the regulatory interactions can be described by the recovery rate, , which is directly related to the curvature of the basin of attraction for . The recovery rate is the inverse recovery time and characterizes the pathway’s ability to respond to stress and relax toward the equilibrium position after a shock.
The model suggests that the depolarization transitions occur independently from pathway activation, but their cumulative effect slowly reshapes the free energy landscape of regulatory fields for other FUs (see the dashed blue line in Fig. 3C). In other words, stochastically accumulated changes are tiny and not critical in the short term, but their accumulated effect slowly gnaws away the organism’s resilience and stability. Because the number of transitions is enormous, the central limit theorem comes into play38 and ensures that the net effect of configuration changes on any physiological process must be proportional to the total number of depolarization transitions .
Over longer time scales, well exceeding pathway equilibration times , the stochastic component of fluctuations averages out. The mean pathway activation and variance are then given by (see the Materials and Methods section)
Accordingly, the fluctuations of organism state variables other than stochastic variables described in Equation (6) can be attributed to a few clusters of coregulated features participating in pathways characterized by the slowest recovery rates (vanishing denominators in Eq. 8). In this case, the participation vectors and the pathway activation variables should approximately coincide with the leading PC loading vectors and scores, respectively.
According to Equation (8), an increasing number of depolarization transitions causes progressive shifts in pathway activation. Notably, this effect is indistinguishable from, albeit smaller than, the effects of constant stress modeled by . More subtly, aging in the form of progressive depolarization of an organism state measured by also affects the recovery rates in the denominator of Equation (8). The two effects combine and cause the mean pathway activations, and therefore the leading PC scores in the data, to depend on age in a nonlinear (hyperbolic) fashion (see the dynamics of DNAm-PC3 in Fig. 1C and EMR-PC2 in Fig. 2A).
The nonlinear coupling of organism-state fluctuations with depolarization transitions may reduce one of the smallest recovery rates to zero: at some point late in life at age . This situation is a critical point corresponding to the complete loss of resilience, that is, the inability of the system to retain its homeostatic equilibrium, and hence it is incompatible with survival18.
There is no way to measure the recovery rate in cross-sectional data. However, according to Equation (8), the vanishing recovery rate should lead to the simultaneous divergence of one of the leading PC scores and its variance at a certain advanced age. In our analysis, DNAm-PC3 increases faster than linearly as a function of chronological age. The fit of the DNAm-PC3 scores to the hyperbolic solution for the average from Equation (8) gives y (see the solid line in Fig. 1C and the Materials and Methods section, for the details of the calculations).
In agreement with Equation (8), the extrapolation shows that the inverse variance of DNAm-PC3 hits zero and hence the variance of DNAm-PC3 diverges at approximately 120 y (see the solid line in Fig. 1D). The estimations of the limiting age from the behavior of DNAm-PC3 mean and its variance are comfortably close. Hence, our calculations support the existence of a critical point in the age range of y.
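The extrapolation of the inverse variance can be sketched as a straight-line fit whose zero crossing estimates the limiting age. The function name and the plain least-squares fit are assumptions for illustration; the inputs would be age-bin centers and the per-bin variances of a PC score.

```python
import numpy as np

def limiting_age_from_inverse_variance(bin_ages, bin_variances):
    """Age at which a linear fit of 1/variance against age reaches zero.

    bin_ages: centers of the age bins; bin_variances: variance of the
    score within each bin. Raises ValueError if the inverse variance
    does not decrease with age, since no crossing then lies ahead.
    """
    ages = np.asarray(bin_ages, dtype=float)
    inv_var = 1.0 / np.asarray(bin_variances, dtype=float)
    slope, intercept = np.polyfit(ages, inv_var, 1)
    if slope >= 0:
        raise ValueError("inverse variance does not decrease with age")
    return -intercept / slope
```

The estimate is an extrapolation beyond the observed age range, so it is sensitive to which bins enter the fit and should be read together with an uncertainty interval.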
In reality, the disintegration of the organism state happens well before reaching the criticality at the limiting age . Stress factors and the depolarization of the organism state do not merely shift the mean pathway activation levels. Both factors may also decrease the activation energy separating the organism state from disintegration and death (Fig. 3C). In the linear regime, the activation energy linearly depends on the mean field, , where .
Mortality in the model is nothing else but the probability of barrier crossing per unit time: . Therefore, the aging drift in the form of the linearly increasing number of the configuration transitions registered by the progressively increasing may drive the exponential acceleration of all-cause mortality with age: . The mortality doubling rate, , in the model depends on the details of the regulatory interactions (through ), the rate of the aging drift , and the effective temperature .
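The link between a linearly decreasing barrier and exponential mortality can be made concrete with a short sketch. It assumes an Arrhenius-type rate in which the barrier falls linearly with age. The names `u0`, `slope`, `temperature`, and `prefactor` are hypothetical placeholders, with `slope` standing for the combined effect of the coupling to the mean field and the rate of the aging drift.

```python
import math

def mortality(age, u0, slope, temperature, prefactor=1.0):
    """Barrier-crossing rate with a barrier u0 - slope * age.

    Arrhenius-type form (an assumption): prefactor * exp(-barrier / T).
    A barrier that falls linearly with age yields a rate that grows
    exponentially with age, i.e. a Gompertz-like law.
    """
    return prefactor * math.exp(-(u0 - slope * age) / temperature)

def doubling_time(slope, temperature):
    """Time over which the rate above doubles: ln(2) * T / slope."""
    return math.log(2.0) * temperature / slope
```

In this toy form, the doubling time shortens when the barrier falls faster and lengthens at a higher effective temperature. It does not depend on the initial barrier height, which only sets the overall mortality level.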
Discussion
We put forward a semiquantitative model of aging in a complex regulatory network and applied it to the analysis of human aging signatures in a cross-sectional white-blood-cell DNAm dataset29 and the extensive collection of longitudinal EMRs from the UK Biobank30. The model explains the dynamics of aging signatures in both signals by the cumulative effect of numerous stochastic configuration changes accompanied by increasing entropy.
The data suggest that the rates of transitions among microscopic states of methylation or the incidence of specific diseases are slow. In most cases, fewer than a single transition occurs throughout the lifetime for each FU. Even though the transition rates are slow, the number of transitions is vast: more than of all CpG sites exhibited age-related dynamics. Hence, the compound effects of transitions accumulate and dominate the dynamics of the physiological state in the long term. We observed that in the leading aging signature explaining most of the variance in the data (the first PC score), which increased linearly with age in the DNAm and EMR data. The first PC score was proportional to the number of configuration transitions (the number of DNAm level changes or chronic diseases). Simultaneously, the first PC variance grew linearly with age in both datasets, as is expected for a stochastic quantity representing accumulation of a large number of independent random transitions. Accordingly, we propose using as a quantitative measure of the net effect of entropic changes on the aging organism: the thermodynamic biological age (tBA). The first PC score is then a good proxy of tBA for a specific signal well correlated with age and Horvath’s DNAm age. Due to the central limit theorem38 for a large number of random transitions, tBA would increase linearly with age with a high accuracy. That explains why it is almost always possible to build an accurate predictor of chronological age for different biomedical signals4,5,8.
Configuration transitions do not only work as a natural clock in aging organisms but also define the thermodynamic arrow of time. Our model suggests that tBA is proportional to entropy produced (information lost) during aging. Particular depolarization patterns may differ in various cells of one organism or among different organisms of the same age. However, the number of transitions would be comparable and hence quantify the overall aging state of an organism. In other words, the older an organism gets the more “rust” it accumulates, which is captured by tBA. We expect that the present model can generalize to other FUs and readouts changing over time, such as gene expression changes39, conformation or chemical modifications of macromolecules, DNA damage, etc. Because configuration changes occur simultaneously in every part of an organism and increase its tBA, observation of a change in any single data modality cannot be interpreted as the cause of aging drift in other modalities. The inherent stochasticity of aging epigenome may explain why genetically identical pairs of twins diverge with age40,41 and may drive transcriptomic dysregulation in cancer42.
The aging drift manifests itself as a “mean field” causing the shift of physiological indices that is proportional to tBA. The mean-field theory is a powerful approximation for understanding the behavior of interacting systems first developed in physical sciences43 and since then applied in statistical inference44 (see, e.g., protein structure prediction45,46). Here, we modeled the overwhelming complexity of FU interactions by a simpler approximation, where each FU or large FU clusters operate independently and substitute cumulative effects of all other FUs by their mean field represented by tBA. Other than the dominant aging PC1 (tBA) in biological signals, other leading PCs with the longest recovery times and strongest fluctuations would also change with age. The aging drift affects those modules or pathways in a way similar to other stresses (such as smoking or diet). Because tBA increases linearly with age, we expect that, in the first approximation, all pathways “follow” the aging process by increasing (or decreasing) activation barriers linearly with age. In a higher-order approximation, the nonlinearity of regulatory interactions produces significant deviations from a simple linear age dependence of nondominant PC scores in biomedical data (DNAm-PC3 and EMR-PC2). Nonlinear regulatory interactions let the configuration changes (but not stresses) affect the resilience defined as the ability of an interacting FU cluster to respond to a perturbation and relax to equilibrium afterward. If the recovery rate is slow, this may lead to a divergence of organism state fluctuations at a critical age, where the recovery rate would vanish completely. For example, this happens for the dynamics of DNAm-PC3 because the extrapolation of the mean and variance of DNAm-PC3 diverges at an age close to y.
Recently, we also demonstrated that linear log-mortality predictors built from complete blood counts and physical activity18 exhibited similar diverging fluctuations and a vanishing recovery rate at about the same limiting age y.
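The link between a vanishing recovery rate and diverging fluctuations can be made concrete with a minimal sketch. It assumes, purely for illustration, that a pathway activation x obeys linear relaxation dynamics dx = -rate · x dt + noise_amplitude dW, and that the recovery rate falls linearly to zero at a critical age. Neither the functional form nor the parameter names come from the analyses above, and ages are expressed in units of the critical age rather than in years.

```python
import math


def recovery_rate(age, rate_young, critical_age):
    """Recovery rate assumed (for illustration) to fall linearly to zero at `critical_age`."""
    return rate_young * (1.0 - age / critical_age)


def recovery_time(rate):
    """Characteristic relaxation time of a perturbation for a positive recovery rate."""
    if rate <= 0.0:
        raise ValueError("recovery rate must be positive")
    return 1.0 / rate


def stationary_variance(noise_amplitude, rate):
    """Stationary variance of x for dx = -rate * x dt + noise_amplitude dW."""
    if rate <= 0.0:
        raise ValueError("no stationary state: recovery rate must be positive")
    return noise_amplitude ** 2 / (2.0 * rate)


if __name__ == "__main__":
    # Ages are fractions of the (hypothetical) critical age.
    for age in (0.0, 0.5, 0.9, 0.99):
        rate = recovery_rate(age, rate_young=1.0, critical_age=1.0)
        print(age, recovery_time(rate), stationary_variance(1.0, rate))
```

In this sketch both the recovery time and the variance scale as the inverse of the recovery rate, so they grow without bound as the age approaches the critical value, which is the qualitative behavior described for DNAm-PC3.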
Hence, the prediction of mortality (or the remaining lifespan) in humans requires an estimate of tBA and of a few of the most crucial pathway activations (which also, on average, depend on tBA). Consequently, no single BA measure fully describes longevity in humans. We expect that the BA models trained to predict chronological age should yield better estimates of tBA. On the other hand, the models trained to predict the remaining lifespan (such as PhenoAge47, GrimAge10, DOSI18, etc.) should return a combination of pathway activations associated with the prevalence of diseases and accelerated mortality13 and hence be better suited for the detection of reversible effects of diseases, lifestyles, and medical interventions15.
The PCA of human biomedical data is peculiar because it produces more than a single age-dependent feature, which is not the case in simpler animals such as worms48, flies49, or mice50, where aging could be explained by a dynamic instability leading to the exponential disintegration of an organism state49,51. We expect that the entropic contribution to aging is not dominant in those cases, that the BA is a dynamic (not entropic) factor, and that the aging effects may be reversible in those animal models50. The signs of stability loss in DNAm-PC3 hint at a two-stage aging process in humans. Our model explains how the entropic changes reduce resilience and the recovery rates of protective mechanisms (hallmarks of aging). The stochastic accumulation of regulatory noise or the deleteriousness of the environment leads to an exponential destabilization of the organism state. Therefore, we conclude that the cumulative effect of entropic changes captured by tBA explains the exponential acceleration of mortality and disease incidence, a characteristic feature of human aging.
The present model, along with other direct dynamic stability analyses of organism state fluctuations in longitudinal biomedical data16,18, supports the idea that the human organism state (and potentially that of other long-lived mammals, such as naked mole rats) stays metastable until very late in life and slowly loses its stability and resilience with age due to the accumulation of entropic changes. According to the model, human aging has a significant entropic component, which not only dominates the variance in biomedical data but also causes increasing stress on adaptive subsystems (stress responses). Our approach is in line with Hayflick’s proposal52 that distinguishes the genetic determinism of longevity from the stochasticity of aging. If the proposition is accurate, we must expect that although the hallmarks of aging (features or activations of specific adaptive pathways leading to mortality and morbidity acceleration1) can, in principle, be reverted, the expected effects of such interventions on lifespan may be transient and limited.
We expect that attempts to reduce the dominant aging signature tBA would require the availability and timely application of an immense number of precise interventions. This is, to say the least, technologically challenging. Accordingly, we predict that aging in humans can be reversed only partially. The fact that there is a strong entropic contribution to aging does not necessarily mean that one cannot reset some of the organismal subsystems closer to a younger state. The entropic character of aging implies that age reversal would be limited to a specific organismal subsystem without a full rejuvenation of the whole organism. For example, recent epigenetic reprogramming experiments53–55 led to the reversal of epigenetic clock readouts.
However, our model suggests that achieving strong and lasting rejuvenation effects in humans may remain a remote prospect. There may nevertheless be a more practical way to intercept aging: dramatically slowing its rate. The rates of entropic transitions between any two states depend exponentially on the effective temperature. Hence, even minor alterations of the effective temperature may cause a dramatic drop in the rate of aging. In condensed matter physics, this situation is known as the glass transition, where the viscosity and relaxation times may grow by 10–15 orders of magnitude in a relatively narrow temperature range56–63. We note that living organisms are nonequilibrium open systems, and hence the effective temperature is not the same as the body or environment temperature. Rather, the effective temperature is a measure of how deleterious the environment is for an organism33.
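The sensitivity to the effective temperature can be illustrated with an Arrhenius-type rate law, rate = attempt_rate · exp(−barrier / T_eff). The sketch below assumes this standard activated-rate form. The barrier-to-temperature ratio of 30 is a hypothetical value chosen only to show the scaling and is not an estimate from the data.

```python
import math


def transition_rate(attempt_rate, barrier, effective_temperature):
    """Arrhenius-type rate of an activated transition (barrier and temperature in the same units)."""
    return attempt_rate * math.exp(-barrier / effective_temperature)


def fold_slowdown(barrier_over_temperature, temperature_factor):
    """Factor by which the rate drops when the effective temperature is
    multiplied by `temperature_factor` (for example, 0.9 for a 10% reduction)."""
    return math.exp(barrier_over_temperature * (1.0 / temperature_factor - 1.0))


if __name__ == "__main__":
    # Hypothetical barrier/temperature ratio, chosen only for illustration.
    ratio = 30.0
    for factor in (1.0, 0.95, 0.9, 0.8):
        print(factor, fold_slowdown(ratio, factor))
```

With these illustrative numbers, a 10% reduction of the effective temperature slows the transitions by more than an order of magnitude, and the effect grows with the height of the barrier relative to the effective temperature.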
We speculate that the evolution of long-lived mammals may have provided an example of tuning the effective temperature. Naked mole rats are known for their exceptional stress resistance, DNA repair efficacy64–67, and translational fidelity68,69. Those factors should reduce noise in regulatory circuits and lower the effective temperature of the system. Such tuning may also explain recent studies indicating that naked mole rat breeders age more slowly than their nonbreeding peers, at least according to the DNAm clock25. Social status and mental health also impact the aging rate measured by DNAm and other clocks in humans31,70, possibly via the neuroendocrine system. Higher socioeconomic status, somewhat counter-intuitively, significantly increases the mortality doubling rate and simultaneously reduces the age-independent mortality in such a way that mortality in the highest and lowest income groups converges at an age close to our estimates71. Such behavior of mortality is consistent with a reduction in the effective temperature in the higher-income cohorts in our model.
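The convergence of mortality curves can be sketched with a Gompertz form, M(t) = M0 · exp(α t). This reading is an assumption made for illustration: the age-independent mortality is taken to be the prefactor M0, and the doubling rate is taken to be set by α. A cohort with a lower M0 but a higher α then starts below the other cohort and meets it at a single age. All parameter values below are hypothetical and are not fitted to any dataset.

```python
import math


def gompertz_mortality(age, initial_mortality, growth_rate):
    """Gompertz mortality rate M(t) = M0 * exp(alpha * t)."""
    return initial_mortality * math.exp(growth_rate * age)


def doubling_time(growth_rate):
    """Time over which the Gompertz mortality rate doubles."""
    return math.log(2.0) / growth_rate


def convergence_age(m0_a, rate_a, m0_b, rate_b):
    """Age at which two Gompertz curves with different growth rates intersect."""
    if rate_a == rate_b:
        raise ValueError("curves with equal growth rates never converge")
    return math.log(m0_a / m0_b) / (rate_b - rate_a)


if __name__ == "__main__":
    # Hypothetical cohorts in arbitrary units: B has lower M0 but faster doubling.
    cohort_a = (2e-4, 0.08)
    cohort_b = (1e-4, 0.09)
    age = convergence_age(cohort_a[0], cohort_a[1], cohort_b[0], cohort_b[1])
    print(age, gompertz_mortality(age, *cohort_a), gompertz_mortality(age, *cohort_b))
```

Before the convergence age, the cohort with the lower prefactor has the lower mortality at every age, so a faster doubling rate is compatible with an overall survival advantage.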
Future studies should help establish the best ways to “cool down” the organism state and reduce the rate of aging in humans. The simple linear PCA exemplified here may only help gain a qualitative understanding of underlying processes. We expect that increasing availability of high-quality longitudinal biomedical data will lead to a better understanding of the most critical factors behind the kinetics of aging and diseases, including those controlling entropy production in the course of aging. This should lead to a discovery of actionable targets influencing the rate of aging, help slow down aging, and thus produce a dramatic extension of human healthspan.
P.S. Since the first publication of a preprint of this article, it has inspired a number of follow-up works by other groups. For example, three papers investigated the stochastic nature of aging through extensive simulations and analyses of gene expression and DNAm data and supported our original proposal that the dominant component of aging changes is stochastic: on a single-cell level72, for DNAm-based clocks73, and for gene expression74. The studies of dynamical properties of stochastic changes in DNAm75 and other longitudinal signals in mice50 in the course of aging and in response to antiaging interventions also corroborated our analyses and predictions regarding the distinction between entropic/irreversible and dynamic/reversible components of aging signatures.
Acknowledgments
We would like to thank A. Velikanova, D. Kriukov, K. Avchaciov, and T. Pyrkov for insightful discussions and help with data preparation and M. Kholin and A. Kadet for stimulating discussions and comments on the article. The work was funded by Gero LLC (Singapore).
Author Contributions
A.E.T. and P.O.F. proposed, formulated, and developed the theory. A.E.T., K.A.D., and P.O.F. conducted data analyses and validated the model assumptions against the data. A.E.T. and P.O.F. wrote and prepared this article.
Conflicts of Interests
P.O.F. is a shareholder of Gero PTE. A.E.T., K.A.D., and P.O.F. were employed by Gero PTE during the work on the article. The study was funded by Gero PTE.
Supplementary Materials
Supplemental information can be found here: Supplementary.