Open Access

Peer-reviewed

Research Article

Anxiety, Affect, Self-Esteem, and Stress: Mediation and Moderation Effects on Depression

Affiliations Department of Psychology, University of Gothenburg, Gothenburg, Sweden, Network for Empowerment and Well-Being, University of Gothenburg, Gothenburg, Sweden

Affiliation Network for Empowerment and Well-Being, University of Gothenburg, Gothenburg, Sweden

Affiliations Department of Psychology, University of Gothenburg, Gothenburg, Sweden, Network for Empowerment and Well-Being, University of Gothenburg, Gothenburg, Sweden, Department of Psychology, Education and Sport Science, Linnaeus University, Kalmar, Sweden

* E-mail: [email protected]

Affiliations Network for Empowerment and Well-Being, University of Gothenburg, Gothenburg, Sweden, Center for Ethics, Law, and Mental Health (CELAM), University of Gothenburg, Gothenburg, Sweden, Institute of Neuroscience and Physiology, The Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden

  • Ali Al Nima, 
  • Patricia Rosenberg, 
  • Trevor Archer, 
  • Danilo Garcia

PLOS

  • Published: September 9, 2013
  • https://doi.org/10.1371/journal.pone.0073265

23 Sep 2013: Nima AA, Rosenberg P, Archer T, Garcia D (2013) Correction: Anxiety, Affect, Self-Esteem, and Stress: Mediation and Moderation Effects on Depression. PLOS ONE 8(9): 10.1371/annotation/49e2c5c8-e8a8-4011-80fc-02c6724b2acc. https://doi.org/10.1371/annotation/49e2c5c8-e8a8-4011-80fc-02c6724b2acc View correction

Mediation analysis investigates whether a variable (i.e., a mediator) changes in response to an independent variable and, in turn, affects a dependent variable. Moderation analysis, on the other hand, investigates whether the statistical interaction between independent variables predicts a dependent variable. Although the difference between these two types of analysis is explicit in the current literature, there is still confusion with regard to the mediating and moderating effects of different variables on depression. The purpose of this study was to assess the mediating and moderating effects of anxiety, stress, positive affect, and negative affect on depression.

Two hundred and two university students (males  = 93, females  = 113) completed questionnaires assessing anxiety, stress, self-esteem, positive and negative affect, and depression. Mediation and moderation analyses were conducted using techniques based on standard multiple regression and hierarchical regression analyses.

Main Findings

The results indicated that (i) anxiety partially mediated the effects of both stress and self-esteem upon depression, (ii) that stress partially mediated the effects of anxiety and positive affect upon depression, (iii) that stress completely mediated the effects of self-esteem on depression, and (iv) that there was a significant interaction between stress and negative affect, and between positive affect and negative affect upon depression.

The study highlights different research questions that can be investigated depending on whether researchers decide to use the same variables as mediators and/or moderators.

Citation: Nima AA, Rosenberg P, Archer T, Garcia D (2013) Anxiety, Affect, Self-Esteem, and Stress: Mediation and Moderation Effects on Depression. PLoS ONE 8(9): e73265. https://doi.org/10.1371/journal.pone.0073265

Editor: Ben J. Harrison, The University of Melbourne, Australia

Received: February 21, 2013; Accepted: July 22, 2013; Published: September 9, 2013

Copyright: © 2013 Nima et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: The authors have no support or funding to report.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Mediation refers to the covariance relationships among three variables: an independent variable (1), an assumed mediating variable (2), and a dependent variable (3). Mediation analysis investigates whether the mediating variable accounts for a significant amount of the shared variance between the independent and the dependent variables–the mediator changes in regard to the independent variable, in turn, affecting the dependent one [1] , [2] . On the other hand, moderation refers to the examination of the statistical interaction between independent variables in predicting a dependent variable [1] , [3] . In contrast to the mediator, the moderator is not expected to be correlated with both the independent and the dependent variable–Baron and Kenny [1] actually recommend that it is best if the moderator is not correlated with the independent variable and if the moderator is relatively stable, like a demographic variable (e.g., gender, socio-economic status) or a personality trait (e.g., affectivity).

Although the two types of analysis lead to different conclusions [3] and the distinction between the statistical procedures is part of the current literature [2] , there is still confusion about the use of moderation and mediation analyses on data pertaining to the prediction of depression. There are, for example, contradictions among studies that investigate mediating and moderating effects of anxiety, stress, self-esteem, and affect on depression. Depression, anxiety and stress are suggested to influence individuals' social relations and activities, work, and studies, as well as compromising decision-making and coping strategies [4] , [5] , [6] . Successfully coping with anxiety, depressiveness, and stressful situations may contribute to high levels of self-esteem and self-confidence, in addition to increasing well-being and psychological and physical health [6] . Thus, it is important to disentangle how these variables are related to each other. However, while some researchers perform mediation analysis with some of the variables mentioned here, other researchers conduct moderation analysis with the same variables. Seldom are both moderation and mediation performed on the same dataset. Before disentangling mediation and moderation effects on depression in the current literature, we briefly present the methodology behind the analysis performed in this study.

Mediation and moderation

Baron and Kenny [1] postulated several criteria for the analysis of a mediating effect: the independent variable must be significantly correlated with the dependent variable, the independent variable must be significantly associated with the mediator, the mediator must predict the dependent variable even when the independent variable is controlled for, and the correlation between the independent and the dependent variable must be eliminated or reduced when the mediator is controlled for. All the criteria are then tested using the Sobel test, which shows whether indirect effects are significant or not [1] , [7] . A complete mediating effect occurs when the correlation between the independent and the dependent variable is eliminated when the mediator is controlled for [8] . Analyses of mediation can, for example, help researchers to move beyond answering whether high levels of stress lead to high levels of depression. With mediation analysis, researchers might instead answer how stress is related to depression.
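The criteria above can be illustrated with three ordinary least-squares fits. The following is a minimal sketch on simulated data, not the authors' analysis: the helper `ols`, the effect sizes, and the random seed are assumptions made for illustration, and only the sample size of 202 echoes the study.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, b1, b2, ...] for y ~ predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated data with assumed effect sizes (not the study's data).
rng = np.random.default_rng(0)
n = 202
stress = rng.normal(size=n)
anxiety = 0.5 * stress + rng.normal(size=n)                      # X -> M
depression = 0.3 * stress + 0.4 * anxiety + rng.normal(size=n)   # X, M -> Y

c = ols(depression, stress)[1]                    # total effect of X on Y
a = ols(anxiety, stress)[1]                       # effect of X on the mediator
_, c_prime, b = ols(depression, stress, anxiety)  # direct effect; M -> Y given X

indirect = a * b
print(f"total c = {c:.3f}, direct c' = {c_prime:.3f}, indirect a*b = {indirect:.3f}")
```

For OLS with a single mediator, the total effect decomposes exactly as c = c' + a·b, so the drop from c to c' equals the indirect effect. Partial mediation corresponds to c' shrinking but remaining non-zero.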

In contrast to mediation, moderation investigates the unique conditions under which two variables are related [3] . The third variable here, the moderator, is not an intermediate variable in the causal sequence from the independent to the dependent variable. For the analysis of moderation effects, the relation between the independent and dependent variable must be different at different levels of the moderator [3] . Moderators are included in the statistical analysis as an interaction term [1] . When analyzing moderating effects, the variables should first be centered (i.e., shifted so that the mean becomes 0; standardizing additionally rescales them so that the standard deviation becomes 1) in order to avoid problems with multicollinearity [8] . Moderating effects can be calculated using multiple hierarchical linear regressions whereby main effects are entered in the first step and interactions in the second step [1] . Analysis of moderation, for example, helps researchers to answer when or under which conditions stress is related to depression.
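The two-step procedure described above can be sketched as follows. This is an illustrative example on simulated data: the variable names, means, effect sizes, and seed are assumptions, and the helper `r_squared` is hypothetical rather than part of the study's method.

```python
import numpy as np

def r_squared(y, X):
    """R^2 of an OLS fit of y on the columns of X (intercept added here)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(1)
n = 202
stress = rng.normal(20.0, 5.0, size=n)        # raw scores with non-zero means
neg_affect = rng.normal(15.0, 5.0, size=n)
depression = (0.2 * stress + 0.1 * neg_affect
              + 0.02 * (stress - 20.0) * (neg_affect - 15.0)
              + rng.normal(size=n))

# Centering: subtract the mean only; the standard deviation is unchanged.
stress_c = stress - stress.mean()
na_c = neg_affect - neg_affect.mean()

# Correlation between a predictor and the product term, raw vs. centered.
r_raw = np.corrcoef(stress, stress * neg_affect)[0, 1]
r_centered = np.corrcoef(stress_c, stress_c * na_c)[0, 1]

# Hierarchical regression: main effects first, then the interaction term.
step1 = np.column_stack([stress_c, na_c])
step2 = np.column_stack([stress_c, na_c, stress_c * na_c])
r2_step1 = r_squared(depression, step1)
r2_step2 = r_squared(depression, step2)
delta_r2 = r2_step2 - r2_step1
print(f"r(raw) = {r_raw:.2f}, r(centered) = {r_centered:.2f}, delta R^2 = {delta_r2:.3f}")
```

Centering leaves the interaction coefficient unchanged but reduces the correlation between each predictor and the product term, which is the multicollinearity concern mentioned in the text.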

Mediation and moderation effects on depression

Cognitive vulnerability models suggest that maladaptive self-schema mirroring helplessness and low self-esteem explain the development and maintenance of depression (for a review see [9] ). These cognitive vulnerability factors become activated by negative life events or negative moods [10] and are suggested to interact with environmental stressors to increase risk for depression and other emotional disorders [11] , [10] . In this line of thinking, the experience of stress, low self-esteem, and negative emotions can cause depression, but also be used to explain how (i.e., mediation) and under which conditions (i.e., moderation) specific variables influence depression.

Using mediational analyses to investigate how cognitive therapy interventions reduced depression, researchers have shown that the intervention reduced anxiety, which in turn was responsible for 91% of the reduction in depression [12] . In the same study, reductions in depression brought about by the intervention accounted for only 6% of the reduction in anxiety. Thus, anxiety seems to affect depression more than depression affects anxiety and, together with stress, is both a cause of and a powerful mediator influencing depression (see also [13] ). Indeed, there are positive relationships between depression, anxiety and stress in different cultures [14] . Moreover, while some studies show that stress (independent variable) increases anxiety (mediator), which in turn increases depression (dependent variable) [14] , other studies show that stress (moderator) interacts with maladaptive self-schemata (independent variable) to increase depression (dependent variable) [15] , [16] .

The present study

In order to illustrate how mediation and moderation can be used to address different research questions, we first focus our attention on anxiety and stress as mediators of different variables that have earlier been shown to be related to depression. Secondly, we use all variables to find which of them moderate the effects on depression.

The specific aims of the present study were:

  • To investigate if anxiety mediated the effect of stress, self-esteem, and affect on depression.
  • To investigate if stress mediated the effects of anxiety, self-esteem, and affect on depression.
  • To examine moderation effects between anxiety, stress, self-esteem, and affect on depression.

Ethics statement

This research protocol was approved by the Ethics Committee of the University of Gothenburg and written informed consent was obtained from all the study participants.

Participants

The present study was based upon a sample of 206 participants (males  = 93, females  = 113). All the participants were first year students in different disciplines at two universities in South Sweden. The mean age for the male students was 25.93 years ( SD  = 6.66), and 25.30 years ( SD  = 5.83) for the female students.

In total, 206 questionnaires were distributed to the students. Together, 202 questionnaires were completed, leaving a total dropout of 1.94%. The dropout consisted of three questionnaires in which participants chose not to respond to a section at all and one in which a section was completed incorrectly. None of these four questionnaires was included in the analyses.

Instruments

Hospital Anxiety and Depression Scale [17] .

The Swedish translation of this instrument [18] was used to measure anxiety and depression. The instrument consists of 14 statements (7 measuring depression and 7 measuring anxiety), for which participants rate their degree of agreement on a Likert scale (0 to 3). The utility, reliability, and validity of the instrument have been shown in multiple studies (e.g., [19] ).

Perceived Stress Scale [20] .

The Swedish version [21] of this instrument was used to measure individuals' experience of stress. The instrument consists of 14 statements, which participants rate on a Likert scale (0 = never, 4 = very often). High values indicate that the individual expresses a high degree of stress.

Rosenberg's Self-Esteem Scale [22] .

Rosenberg's Self-Esteem Scale (Swedish version by Lindwall [23] ) consists of 10 statements focusing on general feelings toward the self. Participants are asked to report their degree of agreement on a four-point Likert scale (1 = agree not at all, 4 = agree completely). This is the most widely used instrument for estimating self-esteem, with high levels of reliability and validity (e.g., [24] , [25] ).

Positive Affect and Negative Affect Schedule [26] .

This is a widely applied instrument for measuring individuals' self-reported mood and feelings. The Swedish version has been used among participants of different ages and occupations (e.g., [27] , [28] , [29] ). The instrument consists of 20 adjectives, 10 positive affect (e.g., proud, strong) and 10 negative affect (e.g., afraid, irritable). The adjectives are rated on a five-point Likert scale (1 =  not at all , 5 =  very much ). The instrument is a reliable, valid, and effective self-report instrument for estimating these two important and independent aspects of mood [26] .

Questionnaires were distributed to the participants at several different locations within the university, including the library and lecture halls. Participants were asked to complete the questionnaire after being informed about the purpose and duration (10–15 minutes) of the study. Participants were also assured of complete anonymity and informed that they could end their participation whenever they liked.

Correlational analysis

Depression showed positive, significant relationships with anxiety, stress and negative affect. Table 1 presents the correlation coefficients, mean values and standard deviations (SD), as well as Cronbach's α for all the variables in the study.
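For readers unfamiliar with Cronbach's α, the standard formula is α = k/(k − 1) · (1 − Σ item variances / variance of the sum score), where k is the number of items. A minimal sketch follows; the function name and the response matrix are hypothetical and are not the study's data.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondent rows (one score per item)."""
    k = len(rows[0])                                   # number of items
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])   # variance of sum scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 5 respondents x 3 items on a 0-3 scale.
responses = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 2, 2],
    [3, 2, 3],
    [1, 0, 1],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items vary together, which is usually read as higher internal consistency of the scale.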

Table 1.

https://doi.org/10.1371/journal.pone.0073265.t001

Mediation analysis

Regression analyses were performed in order to investigate if anxiety mediated the effect of stress, self-esteem, and affect on depression (aim 1). The first regression showed that stress (B = .03, 95% CI [.02, .05], β = .36, t = 4.32, p < .001), self-esteem (B = −.03, 95% CI [−.05, −.01], β = −.24, t = −3.20, p < .001), and positive affect (B = −.02, 95% CI [−.05, −.01], β = −.19, t = −2.93, p = .004) each had a unique effect on depression. Surprisingly, negative affect did not predict depression (p = 0.77) and was therefore removed from the mediation model and not included in further analysis.

The second regression tested whether stress, self-esteem and positive affect uniquely predicted the mediator (i.e., anxiety). Stress was found to be positively associated (B = .21, 95% CI [.15, .27], β = .47, t = 7.35, p < .001), whereas self-esteem was negatively associated (B = −.29, 95% CI [−.38, −.21], β = −.42, t = −6.48, p < .001), with anxiety. Positive affect, however, was not associated with anxiety (p = .50) and was therefore removed from further analysis.

A hierarchical regression analysis using depression as the outcome variable was performed using stress and self-esteem as predictors in the first step, and anxiety as predictor in the second step. This analysis allows the examination of whether stress and self-esteem predict depression and if this relation is weakened in the presence of anxiety as the mediator. The result indicated that, in the first step, both stress (B = .04, 95% CI [.03, .05], β = .45, t = 6.43, p < .001) and self-esteem (B = .04, 95% CI [.03, .05], β = .45, t = 6.43, p < .001) predicted depression. When anxiety (i.e., the mediator) was controlled for, predictability was reduced somewhat but was still significant for stress (B = .03, 95% CI [.02, .04], β = .33, t = 4.29, p < .001) and for self-esteem (B = −.03, 95% CI [−.05, −.01], β = −.20, t = −2.62, p = .009). Anxiety, as a mediator, predicted depression even when both stress and self-esteem were controlled for (B = .05, 95% CI [.02, .08], β = .26, t = 3.17, p = .002). Anxiety improved the prediction of depression over and above the independent variables (i.e., stress and self-esteem) (ΔR² = .03, F(1, 198) = 10.06, p = .002). See Table 2 for the details.
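The F statistic attached to ΔR² in a hierarchical regression is conventionally computed as F = (ΔR² / m) / ((1 − R²_full) / (n − k − 1)), where m predictors are added, k is the number of predictors in the full model, and n is the sample size. The sketch below is illustrative: the function name is hypothetical, and the two R² values are assumed for the example, since the text reports only ΔR² and the degrees of freedom.

```python
def f_change(r2_reduced, r2_full, n, k_full, m):
    """F statistic for the R^2 increase when m predictors are added.

    n: sample size; k_full: number of predictors in the full model.
    Returns (F, df1, df2).
    """
    df1 = m
    df2 = n - k_full - 1
    f = ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)
    return f, df1, df2

# n = 202 and three predictors (stress, self-esteem, anxiety) follow the text;
# the two R^2 values are hypothetical, chosen only so that delta R^2 = .03.
f, df1, df2 = f_change(r2_reduced=0.38, r2_full=0.41, n=202, k_full=3, m=1)
print(f"F({df1}, {df2}) = {f:.2f}")
```

With n = 202 and three predictors in the second step, the denominator degrees of freedom come out as 198, matching the F(1, 198) reported above.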

Table 2.

https://doi.org/10.1371/journal.pone.0073265.t002

A Sobel test was conducted to test the mediating criteria and to assess whether indirect effects were significant or not. The result showed that the complete pathway from stress (independent variable) to anxiety (mediator) to depression (dependent variable) was significant (z = 2.89, p = .003). The complete pathway from self-esteem (independent variable) to anxiety (mediator) to depression (dependent variable) was also significant (z = 2.82, p = .004). This indicates that anxiety partially mediates the effects of both stress and self-esteem on depression. The result may also indicate that both stress and self-esteem contribute directly to explaining the variation in depression and indirectly via the experienced level of anxiety (see Figure 1 ).
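The Sobel statistic is commonly computed as z = a·b / sqrt(b²·SE_a² + a²·SE_b²), where a is the path from the independent variable to the mediator and b is the path from the mediator to the outcome. The sketch below applies that formula to the stress pathway. It is an approximation under stated assumptions: the standard errors are not given in the text, so they are recovered here as SE = B / t from the rounded values reported above, and the function names are hypothetical.

```python
import math

def sobel_z(a, se_a, b, se_b):
    """Sobel z for the indirect effect a*b (first-order standard error)."""
    return (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)

def two_tailed_p(z):
    """Two-tailed p-value of z under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Stress -> anxiety -> depression, using the rounded B and t values reported
# above; SE = B / t is an approximation, not a value taken from the paper.
a, se_a = 0.21, 0.21 / 7.35   # stress predicting anxiety
b, se_b = 0.05, 0.05 / 3.17   # anxiety predicting depression, predictors controlled

z = sobel_z(a, se_a, b, se_b)
p = two_tailed_p(z)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these rounded inputs the sketch gives a z of about 2.9, in line with the reported z = 2.89; small differences are expected from rounding.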

Figure 1. Changes in beta weights when the mediator is present are highlighted in red.

https://doi.org/10.1371/journal.pone.0073265.g001

For the second aim, regression analyses were performed in order to test if stress mediated the effect of anxiety, self-esteem, and affect on depression. The first regression showed that anxiety ( B  = .07, 95% CI [.04,.10], β = .37, t  = 4.57, p <.001), self-esteem ( B  = −.02, 95% CI [−.05, −.01], β = −.18, t  = −2.23, p  = .03), and positive affect ( B  = −.03, 95% CI [−.04, −.02], β = −.27, t  = −4.35, p <.001) predicted depression independently of each other. Negative affect did not predict depression ( p  = 0.74) and was therefore removed from further analysis.

The second regression investigated if anxiety, self-esteem and positive affect uniquely predicted the mediator (i.e., stress). Stress was positively associated with anxiety (B = 1.01, 95% CI [.75, 1.30], β = .46, t = 7.35, p < .001), negatively associated with self-esteem (B = −.30, 95% CI [−.50, −.01], β = −.19, t = −2.90, p = .004), and negatively associated with positive affect (B = −.33, 95% CI [−.46, −.20], β = −.27, t = −5.02, p < .001).

A hierarchical regression analysis using depression as the outcome and anxiety, self-esteem, and positive affect as the predictors in the first step, and stress as the predictor in the second step, allowed the examination of whether anxiety, self-esteem and positive affect predicted depression and if this association would weaken when stress (i.e., the mediator) was present. In the first step of the regression anxiety ( B  = .07, 95% CI [.05,.10], β = .38, t  = 5.31, p  = .02), self-esteem ( B  = −.03, 95% CI [−.05, −.01], β = −.18, t  = −2.41, p  = .02), and positive affect ( B  = −.03, 95% CI [−.04, −.02], β = −.27, t  = −4.36, p <.001) significantly explained depression. When stress (i.e., the mediator) was controlled for, predictability was reduced somewhat but was still significant for anxiety ( B  = .05, 95% CI [.02,.08], β = .05, t  = 4.29, p <.001) and for positive affect ( B  = −.02, 95% CI [−.04, −.01], β = −.20, t  = −3.16, p  = .002), whereas self-esteem did not reach significance ( p < = .08). In the second step, the mediator (i.e., stress) predicted depression even when anxiety, self-esteem, and positive affect were controlled for ( B  = .02, 95% CI [.08,.04], β = .25, t  = 3.07, p  = .002). Stress improved the prediction of depression over-and-above the independent variables (i.e., anxiety, self-esteem and positive affect) (Δ R 2  = .02, F (1, 197)  = 9.40, p  = .002). See Table 3 for the details.

Table 3.

https://doi.org/10.1371/journal.pone.0073265.t003

Furthermore, the Sobel test indicated that the complete pathways from the independent variables (anxiety: z  = 2.81, p  = .004; self-esteem: z  =  2.05, p  = .04; positive affect: z  = 2.58, p <.01) to the mediator (i.e., stress), to the outcome (i.e., depression) were significant. These specific results might be explained on the basis that stress partially mediated the effects of both anxiety and positive affect on depression while stress completely mediated the effects of self-esteem on depression. In other words, anxiety and positive affect contributed directly to explain the variation in depression and indirectly via the experienced level of stress. Self-esteem contributed only indirectly via the experienced level of stress to explain the variation in depression. In other words, stress effects on depression originate from “its own power” and explained more of the variation in depression than self-esteem (see Figure 2 ).

Figure 2.

https://doi.org/10.1371/journal.pone.0073265.g002

Moderation analysis

Multiple linear regression analyses were used in order to examine moderation effects between anxiety, stress, self-esteem and affect on depression. The analysis indicated that about 52% of the variation in the dependent variable (i.e., depression) could be explained by the main effects and the interaction effects ( R 2  = .55, adjusted R 2  = .51, F (55, 186)  = 14.87, p <.001). When the variables (dependent and independent) were standardized, both the standardized regression coefficients beta (β) and the unstandardized regression coefficients beta (B) became the same value with regard to the main effects. Three of the main effects were significant and contributed uniquely to high levels of depression: anxiety ( B  = .26, t  = 3.12, p  = .002), stress ( B  = .25, t  = 2.86, p  = .005), and self-esteem ( B  = −.17, t  = −2.17, p  = .03). The main effect of positive affect was also significant and contributed to low levels of depression ( B  = −.16, t  = −2.027, p  = .02) (see Figure 3 ). Furthermore, the results indicated that two moderator effects were significant. These were the interaction between stress and negative affect ( B  = −.28, β = −.39, t  = −2.36, p  = .02) (see Figure 4 ) and the interaction between positive affect and negative affect ( B  = −.21, β = −.29, t  = −2.30, p  = .02) ( Figure 5 ).
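The remark that B and β coincide for main effects once all variables are standardized can be checked numerically. The following is a small sketch on simulated data; the helpers `slopes` and `zscore`, the coefficients, and the seed are assumptions for illustration only.

```python
import numpy as np

def slopes(y, X):
    """OLS slopes (intercept dropped) for y regressed on the columns of X."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]

def zscore(v):
    """Standardize to mean 0 and standard deviation 1 (column-wise)."""
    return (v - v.mean(axis=0)) / v.std(axis=0, ddof=1)

rng = np.random.default_rng(2)
X = rng.normal(loc=[10.0, 25.0], scale=[2.0, 6.0], size=(202, 2))
y = 1.0 + 0.3 * X[:, 0] - 0.1 * X[:, 1] + rng.normal(size=202)

B_raw = slopes(y, X)                                          # unstandardized B
beta_from_raw = B_raw * X.std(axis=0, ddof=1) / y.std(ddof=1)  # converted to beta
B_standardized = slopes(zscore(y), zscore(X))                  # B on z-scores

print(beta_from_raw, B_standardized)
```

The two printed vectors agree, which is why a single coefficient can be reported for the main effects. A product of two z-scores is not itself standardized, so B and β can still differ for interaction terms, as they do in the values reported above.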

Figure 3.

https://doi.org/10.1371/journal.pone.0073265.g003

Figure 4. Low stress and low negative affect lead to lower levels of depression compared to high stress and high negative affect.

https://doi.org/10.1371/journal.pone.0073265.g004

Figure 5. High positive affect and low negative affect lead to lower levels of depression compared to low positive affect and high negative affect.

https://doi.org/10.1371/journal.pone.0073265.g005

The results in the present study show that (i) anxiety partially mediated the effects of both stress and self-esteem on depression, (ii) that stress partially mediated the effects of anxiety and positive affect on depression, (iii) that stress completely mediated the effects of self-esteem on depression, and (iv) that there was a significant interaction between stress and negative affect, and positive affect and negative affect on depression.

Mediating effects

The study suggests that anxiety contributes directly to explaining the variance in depression, while stress and self-esteem might contribute directly to explaining the variance in depression and indirectly by increasing feelings of anxiety. Indeed, individuals who experience stress over a long period of time are susceptible to increased anxiety and depression [30] , [31] and previous research shows that high self-esteem seems to buffer against anxiety and depression [32] , [33] . The study also showed that stress partially mediated the effects of both anxiety and positive affect on depression and that stress completely mediated the effects of self-esteem on depression. Anxiety and positive affect contributed directly to explaining the variation in depression and indirectly via the experienced level of stress. Self-esteem contributed only indirectly, via the experienced level of stress, to explaining the variation in depression, i.e., stress affects depression on the basis of ‘its own power’ and explains much more of the variation in depressive experiences than self-esteem. In general, individuals who experience low anxiety and frequently experience positive affect seem to experience low stress, which might reduce their levels of depression. Academic stress, for instance, may increase the risk for experiencing depression among students [34] . Although self-esteem did not emerge as an important variable here, under circumstances in which difficulties in life become chronic, some researchers suggest that low self-esteem facilitates the experience of stress [35] .

Moderator effects/interaction effects

The present study showed that the interaction between stress and negative affect and between positive and negative affect influenced self-reported depression symptoms. Moderation effects between stress and negative affect imply that the students experiencing low levels of stress and low negative affect reported lower levels of depression than those who experience high levels of stress and high negative affect. This result confirms earlier findings that underline the strong positive association between negative affect and both stress and depression [36] , [37] . Nevertheless, negative affect by itself did not predict depression. In this regard, it is important to point out that the absence of positive emotions is a better predictor of morbidity than the presence of negative emotions [38] , [39] . A modification to this statement, as illustrated by the results discussed next, could be that the presence of negative emotions in conjunction with the absence of positive emotions increases morbidity.

The moderating effects between positive and negative affect on the experience of depression imply that the students experiencing high levels of positive affect and low levels of negative affect reported lower levels of depression than those who experience low levels of positive affect and high levels of negative affect. This result fits previous observations indicating that different combinations of these affect dimensions are related to different measures of physical and mental health and well-being, such as, blood pressure, depression, quality of sleep, anxiety, life satisfaction, psychological well-being, and self-regulation [40] – [51] .

Limitations

The result indicated a relatively low mean value for depression (M = 3.69), perhaps because the studied population was university students. This might limit the generalizability of the results and might also explain why negative affect, commonly associated with depression, was not related to depression in the present study. Moreover, there is a potential influence of single source/single method variance on the findings, especially given the high correlation between all the variables under examination.

Conclusions

The present study highlights the different results that can be arrived at depending on whether researchers decide to use variables as mediators or moderators. For example, when using mediational analyses, anxiety and stress seem to be important factors that explain how the different variables used here influence depression–increases in anxiety and stress by any other factor seem to lead to increases in depression. In contrast, when moderation analyses were used, the interaction of stress and affect predicted depression and the interaction of both affectivity dimensions (i.e., positive and negative affect) also predicted depression–stress might increase depression under the condition that the individual is high in negative affectivity; in turn, negative affectivity might increase depression under the condition that the individual experiences low positive affectivity.

Acknowledgments

The authors would like to thank the reviewers for their openness and suggestions, which significantly improved the article.

Author Contributions

Conceived and designed the experiments: AAN TA. Performed the experiments: AAN. Analyzed the data: AAN DG. Contributed reagents/materials/analysis tools: AAN TA DG. Wrote the paper: AAN PR TA DG.

  • 3. MacKinnon DP, Luecken LJ (2008) How and for Whom? Mediation and Moderation in Health Psychology. Health Psychol 27 (2 Suppl.): s99–s102.
  • 4. Aaroe R (2006) Vinn över din depression [Defeat depression]. Stockholm: Liber.
  • 5. Agerberg M (1998) Ut ur mörkret [Out from the Darkness]. Stockholm: Nordstedt.
  • 6. Gilbert P (2005) Hantera din depression [Cope with your Depression]. Stockholm: Bokförlaget Prisma.
  • 8. Tabachnick BG, Fidell LS (2007) Using Multivariate Statistics, Fifth Edition. Boston: Pearson Education, Inc.
  • 10. Beck AT (1967) Depression: Causes and treatment. Philadelphia: University of Pennsylvania Press.
  • 21. Eskin M, Parr D (1996) Introducing a Swedish version of an instrument measuring mental stress. Stockholm: Psykologiska institutionen Stockholms Universitet.
  • 22. Rosenberg M (1965) Society and the Adolescent Self-Image. Princeton, NJ: Princeton University Press.
  • 23. Lindwall M (2011) Självkänsla – Bortom populärpsykologi & enkla sanningar [Self-Esteem – Beyond Popular Psychology and Simple Truths]. Lund:Studentlitteratur.
  • 25. Blascovich J, Tomaka J (1991) Measures of self-esteem. In: Robinson JP, Shaver PR, Wrightsman LS (Red.) Measures of personality and social psychological attitudes San Diego: Academic Press. 161–194.
  • 30. Eysenck M (Ed.) (2000) Psychology: an integrated approach. New York: Oxford University Press.
  • 31. Lazarus RS, Folkman S (1984) Stress, Appraisal, and Coping. New York: Springer.
  • 32. Johnson M (2003) Självkänsla och anpassning [Self-esteem and Adaptation]. Lund: Studentlitteratur.
  • 33. Cullberg Weston M (2005) Ditt inre centrum – Om självkänsla, självbild och konturen av ditt själv [Your Inner Centre – About Self-esteem, Self-image and the Contours of Yourself]. Stockholm: Natur och Kultur.
  • 34. Lindén M (1997) Studentens livssituation. Frihet, sårbarhet, kris och utveckling [Students' Life Situation. Freedom, Vulnerability, Crisis and Development]. Uppsala: Studenthälsan.
  • 35. Williams S (1995) Press utan stress ger maximal prestation [Pressure without Stress gives Maximal Performance]. Malmö: Richters förlag.
  • 37. Garcia D, Kerekes N, Andersson-Arntén A–C, Archer T (2012) Temperament, Character, and Adolescents' Depressive Symptoms: Focusing on Affect. Depress Res Treat. DOI:10.1155/2012/925372.
  • 40. Garcia D, Ghiabi B, Moradi S, Siddiqui A, Archer T (2013) The Happy Personality: A Tale of Two Philosophies. In Morris EF, Jackson M-A editors. Psychology of Personality. New York: Nova Science Publishers. 41–59.
  • 41. Schütz E, Nima AA, Sailer U, Andersson-Arntén A–C, Archer T, Garcia D (2013) The affective profiles in the USA: Happiness, depression, life satisfaction, and happiness-increasing strategies. In press.
  • 43. Garcia D, Nima AA, Archer T (2013) Temperament and Character's Relationship to Subjective Well- Being in Salvadorian Adolescents and Young Adults. In press.
  • 44. Garcia D (2013) La vie en Rose: High Levels of Well-Being and Events Inside and Outside Autobiographical Memory. J Happiness Stud. DOI: 10.1007/s10902-013-9443-x.
  • 48. Adrianson L, Djumaludin A, Neila R, Archer T (2013) Cultural influences upon health, affect, self-esteem and impulsiveness: An Indonesian-Swedish comparison. Int J Res Stud Psychol. DOI: 10.5861/ijrsp.2013.228.

  • Published: 01 December 2015

Points of Significance

Multiple linear regression

Martin Krzywinski & Naomi Altman

Nature Methods volume 12, pages 1103–1104 (2015)


When multiple variables are associated with a response, the interpretation of a prediction equation is seldom simple.


Last month we explored how to model a simple relationship between two variables, such as the dependence of weight on height [1]. In the more realistic scenario of dependence on several variables, we can use multiple linear regression (MLR). Although MLR is similar to linear regression, the interpretation of MLR regression coefficients is confounded by the way in which the predictor variables relate to one another.

In simple linear regression [1], we model how the mean of variable $Y$ depends linearly on the value of a predictor variable $X$; this relationship is expressed as the conditional expectation $E(Y \mid X) = \beta_0 + \beta_1 X$. For more than one predictor variable $X_1, \ldots, X_p$, this becomes $\beta_0 + \sum_j \beta_j X_j$. As for simple linear regression, one can use the least-squares estimator (LSE) to determine estimates $b_j$ of the $\beta_j$ regression parameters by minimizing the residual sum of squares, $\mathrm{SSE} = \sum_i (y_i - \hat{y}_i)^2$, where $\hat{y}_i = b_0 + \sum_j b_j x_{ij}$. When we use the regression sum of squares, $\mathrm{SSR} = \sum_i (\hat{y}_i - \bar{Y})^2$, the ratio $R^2 = \mathrm{SSR}/(\mathrm{SSR} + \mathrm{SSE})$ is the amount of variation explained by the regression model and in multiple regression is called the coefficient of determination.
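The least-squares fit and the $R^2$ ratio described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the helper names (`solve`, `fit_ols`, `sums_of_squares`) and the six data points are made up for the example, and the estimates are obtained by solving the normal equations with plain Gaussian elimination.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Return [b0, b1, ..., bp] minimising SSE; predictors is a list of columns."""
    cols = [[1.0] * len(y)] + [[float(v) for v in col] for col in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)


def sums_of_squares(predictors, y, b):
    """Return (SSE, SSR) for the fitted coefficients b."""
    y_bar = sum(y) / len(y)
    y_hat = [b[0] + sum(bj * col[i] for bj, col in zip(b[1:], predictors))
             for i in range(len(y))]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)
    return sse, ssr


# Hypothetical data, roughly y = 1 + 2*x1 + 3*x2 plus small perturbations.
x1 = [0, 1, 2, 3, 4, 5]
x2 = [1, 0, 2, 1, 3, 2]
y = [4.1, 2.9, 11.2, 9.8, 18.1, 16.9]

b = fit_ols([x1, x2], y)
sse, ssr = sums_of_squares([x1, x2], y, b)
r2 = ssr / (ssr + sse)
print("b0, b1, b2 =", [round(v, 3) for v in b])
print("R^2 =", round(r2, 4))
```

Because the model includes an intercept, SSR + SSE equals the total sum of squares about the mean, which is why the ratio can be read as a share of the variation in the response.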

The slope $\beta_j$ is the change in $Y$ if predictor $j$ is changed by one unit and others are held constant. When normality and independence assumptions are fulfilled, we can test whether any (or all) of the slopes are zero using a $t$-test (or regression $F$-test). Although the interpretation of $\beta_j$ seems to be identical to its interpretation in the simple linear regression model, the innocuous phrase “and others are held constant” turns out to have profound implications.

To illustrate MLR, and some of its perils, here we simulate predicting the weight ($W$, in kilograms) of adult males from their height ($H$, in centimeters) and their maximum jump height ($J$, in centimeters). We use a model similar to that presented in our previous column [1], but we now include the effect of $J$ as $E(W \mid H, J) = \beta_H H + \beta_J J + \beta_0 + \varepsilon$, with $\beta_H = 0.7$, $\beta_J = -0.08$, $\beta_0 = -46.5$ and normally distributed noise $\varepsilon$ with zero mean and $\sigma = 1$ (Table 1). We set $\beta_J$ negative because we expect a negative correlation between $W$ and $J$ when height is held constant (i.e., among men of the same height, lighter men will tend to jump higher). For this example we simulated a sample of size $n = 40$ with $H$ and $J$ normally distributed with means of 165 cm ($\sigma = 3$) and 50 cm ($\sigma = 12.5$), respectively.

Although the statistical theory for MLR seems similar to that for simple linear regression, the interpretation of the results is much more complex. Problems in interpretation arise entirely as a result of the sample correlation [2] among the predictors. We do, in fact, expect a positive correlation between $H$ and $J$: tall men will tend to jump higher than short ones. To illustrate how this correlation can affect the results, we generated values using the model for weight with samples of $J$ and $H$ with different amounts of correlation.
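A simulation of this kind can be sketched as follows. The generating model and its parameters are the ones stated in the text; everything else is an assumption made for illustration: the function names, the way correlation is induced (mixing two independent standard normal draws), the seed, and a sample size of 5,000 instead of the column's 40 so that the pattern is stable from run to run.

```python
import math
import random


def simulate(n, r, rng):
    """Draw n values of (H, J, W) with population correlation r between H and J."""
    hs, js, ws = [], [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        h = 165 + 3 * z1
        # Mixing z1 and z2 gives J standard deviation 12.5 and correlation r with H.
        j = 50 + 12.5 * (r * z1 + math.sqrt(1 - r * r) * z2)
        w = 0.7 * h - 0.08 * j - 46.5 + rng.gauss(0, 1)
        hs.append(h)
        js.append(j)
        ws.append(w)
    return hs, js, ws


def simple_slope(x, y):
    """Least-squares slope of y regressed on the single predictor x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(0)
slopes = {}
for r in (-0.9, 0.0, 0.9):
    h, j, w = simulate(5000, r, rng)
    slopes[r] = simple_slope(h, w)  # J is deliberately left out of the fit
    print(f"r(H, J) = {r:+.1f}: slope of W on H alone = {slopes[r]:.2f}")
```

With $J$ omitted, the fitted slope for $H$ drifts away from 0.7 as the predictors become correlated, even though the data were generated with $\beta_H = 0.7$ in every case.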

Let's look first at the regression coefficients estimated when the predictors are uncorrelated, $r(H, J) = 0$, as evidenced by the zero slope in association between $H$ and $J$ (Fig. 1a). Here $r$ is the Pearson correlation coefficient [2]. If we ignore the effect of $J$ and regress $W$ on $H$, we find $\hat{W} = 0.71H - 51.7$ ($R^2 = 0.66$) (Table 1 and Fig. 1b). Ignoring $H$, we find $\hat{W} = -0.088J + 69.3$ ($R^2 = 0.19$). If both predictors are fitted in the regression, we obtain $\hat{W} = 0.71H - 0.088J - 47.3$ ($R^2 = 0.85$). This regression fit is a plane in three dimensions ($H$, $J$, $W$) and is not shown in Figure 1. In all three cases, the results of the $F$-test for zero slopes show high significance ($P \le 0.005$).

Figure 1: (a) Simulated values of uncorrelated predictors, $r(H, J) = 0$. The thick gray line is the regression line, and thin gray lines show the 95% confidence interval of the fit. (b) Regression of weight ($W$) on height ($H$) and of weight on jump height ($J$) for uncorrelated predictors shown in a. Regression slopes are shown ($b_H = 0.71$, $b_J = -0.088$). (c) Simulated values of correlated predictors, $r(H, J) = 0.9$. Regression and 95% confidence interval are denoted as in a. (d) Regression (red lines) using correlated predictors shown in c. Light red lines denote the 95% confidence interval. Notice that $b_J = 0.097$ is now positive. The regression line from b is shown in blue. In all graphs, horizontal and vertical dotted lines show average values.

When the sample correlations of the predictors are exactly zero, the regression slopes ($b_H$ and $b_J$) for the “one predictor at a time” regressions and the multiple regression are identical, and the simple regression $R^2$ sums to multiple regression $R^2$ (0.66 + 0.19 = 0.85; Fig. 2). The intercept changes when we add a predictor with a nonzero mean to satisfy the constraint that the least-squares regression line goes through the sample means, which is always true when the regression model includes an intercept.

Figure 2: Shown are the values of regression coefficient estimates ($b_H$, $b_J$, $b_0$) and $R^2$ and the significance of the test used to determine whether the coefficient is zero from 250 simulations at each value of predictor sample correlation $-1 < r(H, J) < 1$ for each scenario where either $H$ or $J$ or both $H$ and $J$ predictors are fitted in the regression. Thick and thin black curves show the coefficient estimate median and the boundaries of the 10th–90th percentile range, respectively. Histograms show the fraction of estimated $P$ values in different significance ranges, and correlation intervals are highlighted in red where >20% of the $P$ values are >0.01. Actual regression coefficients ($\beta_H$, $\beta_J$, $\beta_0$) are marked on vertical axes. The decrease in significance for $b_J$ when jump height is the only predictor and $r(H, J)$ is moderate (red arrow) is due to insufficient statistical power ($b_J$ is close to zero). When predictors are uncorrelated, $r(H, J) = 0$, $R^2$ of individual regressions sum to $R^2$ of multiple regression (0.66 + 0.19 = 0.85). Panels are organized to correspond to Table 1, which shows estimates of a single trial at two different predictor correlations.

Balanced factorial experiments show a sample correlation of zero among the predictors when their levels have been fixed. For example, we might fix three heights and three jump heights and select two men representative of each combination, for a total of 18 subjects to be weighed. But if we select the samples and then measure the predictors and response, the predictors are unlikely to have zero correlation.
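The 18-subject design mentioned above can be written out to confirm that its predictors have zero sample correlation. The specific height and jump levels below are hypothetical choices for illustration; only the 3 × 3 × 2 layout comes from the text.

```python
import math
from itertools import product

# Hypothetical fixed levels (cm); the text specifies only three of each.
heights = [160, 165, 170]
jumps = [35, 50, 65]
replicates = 2  # two men per height/jump combination

design = [(h, j) for h, j in product(heights, jumps) for _ in range(replicates)]


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


hs = [h for h, _ in design]
js = [j for _, j in design]
r = pearson_r(hs, js)
print(len(design), "subjects, r(H, J) =", r)
```

Every height level is paired equally often with every jump level, so the cross-products of deviations cancel and the correlation is zero by construction, whatever levels are chosen.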

When we simulate highly correlated predictors, $r(H, J) = 0.9$ (Fig. 1c), we find that the regression parameters change depending on whether we use one or both predictors (Table 1 and Fig. 1d). If we consider only the effect of $H$, the coefficient $\beta_H = 0.7$ is inaccurately estimated as $b_H = 0.44$. If we include only $J$, we estimate $\beta_J = -0.08$ inaccurately, and even with the wrong sign ($b_J = 0.097$). When we use both predictors, the estimates are quite close to the actual coefficients ($b_H = 0.63$, $b_J = -0.056$).

In fact, as the correlation between predictors $r(H, J)$ changes, the estimates of the slopes ($b_H$, $b_J$) and intercept ($b_0$) vary greatly when only one predictor is fitted. We show the effects of this variation for all values of predictor correlation (both positive and negative) across 250 trials at each value (Fig. 2). We include negative correlation because although $J$ and $H$ are likely to be positively correlated, other scenarios might use negatively correlated predictors (e.g., lung capacity and smoking habits). For example, if we include only $H$ in the regression and ignore the effect of $J$, $b_H$ steadily decreases from about 1 to 0.35 as $r(H, J)$ increases. Why is this? For a given height, larger values of $J$ (an indicator of fitness) are associated with lower weight. If $J$ and $H$ are negatively correlated, as $J$ increases, $H$ decreases, and both changes result in a lower value of $W$. Conversely, as $J$ decreases, $H$ increases, and thus $W$ increases. If we use only $H$ as a predictor, $J$ is lurking in the background, depressing $W$ at low values of $H$ and enhancing $W$ at high levels of $H$, so that the effect of $H$ is overestimated ($b_H$ increases). The opposite effect occurs when $J$ and $H$ are positively correlated. A similar effect occurs for $b_J$, which increases in magnitude (becomes more negative) when $J$ and $H$ are negatively correlated. Supplementary Figure 1 shows the effect of correlation when both regression coefficients are positive.
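The drift described above follows from the standard omitted-variable result for linear models. The sketch below is not taken from the column itself; it assumes the generating model stated earlier, treats $r$, $\sigma_H = 3$ and $\sigma_J = 12.5$ as population values, and gives the slope expected when the other predictor is left out.

```latex
% Assumed model: W = \beta_0 + \beta_H H + \beta_J J + \varepsilon,
% with r = r(H, J), \sigma_H = 3, \sigma_J = 12.5.
\begin{align*}
E(b_H \mid \text{only } H \text{ fitted})
  &\approx \beta_H + \beta_J \, r \, \frac{\sigma_J}{\sigma_H}
   = 0.7 - 0.08 \cdot \frac{12.5}{3} \, r
   \approx 0.7 - 0.33\, r, \\
E(b_J \mid \text{only } J \text{ fitted})
  &\approx \beta_J + \beta_H \, r \, \frac{\sigma_H}{\sigma_J}
   = -0.08 + 0.7 \cdot \frac{3}{12.5} \, r
   \approx -0.08 + 0.17\, r.
\end{align*}
```

Under these assumptions the first line gives roughly 1.03 at $r = -1$ and 0.37 at $r = 1$, in line with the decrease "from about 1 to 0.35" quoted above. The second line crosses zero near $r \approx 0.48$, which is consistent with $b_J$ turning positive at $r(H, J) = 0.9$.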

When both predictors are fitted (Fig. 2), the regression coefficient estimates ($b_H$, $b_J$, $b_0$) are centered at the actual coefficients ($\beta_H$, $\beta_J$, $\beta_0$) with the correct sign and magnitude regardless of the correlation of the predictors. However, the standard error in the estimates steadily increases as the absolute value of the predictor correlation increases.
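The growth in standard error can be quantified for the two-predictor case: the sampling variance of each slope is multiplied by $1/(1 - r^2)$ relative to uncorrelated predictors, a quantity commonly called the variance inflation factor. The column does not use that term, and the function name below is chosen here for illustration.

```python
import math


def variance_inflation(r):
    """Factor by which slope variance grows for two predictors with correlation r."""
    if abs(r) >= 1:
        raise ValueError("slopes are not estimable when |r| = 1")
    return 1.0 / (1.0 - r * r)


for r in (0.0, 0.5, 0.9, 0.99):
    vif = variance_inflation(r)
    print(f"r = {r:4.2f}: variance x{vif:6.2f}, standard error x{math.sqrt(vif):5.2f}")
```

The factor depends only on $r^2$, so positive and negative correlations of the same size inflate the standard error equally.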

Neglecting important predictors has implications not only for $R^2$, which is a measure of the predictive power of the regression, but also for interpretation of the regression coefficients. Unconsidered variables that may have a strong effect on the estimated regression coefficients are sometimes called 'lurking variables'. For example, muscle mass might be a lurking variable with a causal effect on both body weight and jump height. The results and interpretation of the regression will also change if other predictors are added.

Given that missing predictors can affect the regression, should we try to include as many predictors as possible? No, for three reasons. First, any correlation among predictors will increase the standard error of the estimated regression coefficients. Second, having more slope parameters in our model will reduce interpretability and cause problems with multiple testing. Third, the model may suffer from overfitting. As the number of predictors approaches the sample size, we begin fitting the model to the noise. As a result, we may seem to have a very good fit to the data but still make poor predictions.
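The overfitting point can be made concrete with a small experiment, offered here as an illustrative sketch rather than anything from the column: a response that is pure noise is regressed on a growing number of predictors that are also pure noise. The sample size of 10, the seed and the helper names are arbitrary choices, and $R^2$ is computed as $1 - \mathrm{SSE}/\mathrm{SST}$, which equals $\mathrm{SSR}/(\mathrm{SSR} + \mathrm{SSE})$ when the model has an intercept.

```python
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x


def r_squared(predictors, y):
    """R^2 of the least-squares fit of y on an intercept plus the given columns."""
    cols = [[1.0] * len(y)] + predictors
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    b = solve(xtx, xty)
    y_hat = [sum(bj * col[i] for bj, col in zip(b, cols)) for i in range(len(y))]
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


rng = random.Random(1)
n = 10
y = [rng.gauss(0, 1) for _ in range(n)]  # response unrelated to any predictor
noise = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n - 1)]

r2_by_p = {p: r_squared(noise[:p], y) for p in (1, 3, 6, 9)}
for p, r2 in r2_by_p.items():
    print(f"{p} noise predictors: R^2 = {r2:.3f}")
```

With nine predictors plus an intercept there are as many parameters as observations, so the fit passes through every point even though none of the predictors carries any information about the response.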

MLR is powerful for incorporating many predictors and for estimating the effects of a predictor on the response in the presence of other covariates. However, the estimated regression coefficients depend on the predictors in the model, and they can be quite variable when the predictors are correlated. Accurate prediction of the response is not an indication that regression slopes reflect the true relationship between the predictors and the response.

References

1. Altman, N. & Krzywinski, M. Nat. Methods 12, 999–1000 (2015).
2. Altman, N. & Krzywinski, M. Nat. Methods 12, 899–900 (2015).

Author information

Naomi Altman is a Professor of Statistics at The Pennsylvania State University.

Martin Krzywinski is a staff scientist at Canada's Michael Smith Genome Sciences Centre.

Ethics declarations

Competing interests.

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1: Regression coefficients and $R^2$.

The significance and value of regression coefficients and $R^2$ for a model with both regression coefficients positive, $E(W \mid H, J) = 0.7H + 0.08J - 46.5 + \varepsilon$. The format of the figure is the same as that of Figure 2.


About this article

Cite this article.

Krzywinski, M., Altman, N. Multiple linear regression. Nat Methods 12 , 1103–1104 (2015). https://doi.org/10.1038/nmeth.3665

Issue Date: December 2015

research paper on multiple regression

multiple linear regression Recently Published Documents


The Effect of Conflict and Termination of Employment on Employee's Work Spirit

This study aims to determine whether conflict and termination of employment, both partially and simultaneously, have a significant effect on the morale of employees at PT. Medan Technical Benefits, and how large that effect is. The method used in this research is quantitative, with several tests, namely reliability analysis, a classical assumption deviation test and linear regression. Based on the results of regression on primary data processed using SPSS 20, the following multiple linear regression equation was obtained: Y = 1.031 + 0.329 X1 + 0.712 X2. Partially, the conflict variable (X1) has a significant effect on the employees' work spirit (Y) at PT. Medan Technical Benefits. This means that the hypothesis in this study was accepted, as shown by t calculated > t table (3.952 > 2.052). The termination of employment variable (X2) also has a significant influence on the work spirit of employees (Y) at PT. Medan Technical Benefits. This means that the hypothesis in this study was accepted, as shown by t calculated > t table (7.681 > 2.052). Simultaneously, conflict (X1) and termination of employment (X2) have a significant influence on the morale of employees (Y) at PT. Medan Technical Benefits. This means that the hypothesis in this study was accepted, as evidenced by F calculated > F table (221.992 > 3.35). The conflict (X1) and termination of employment (X2) variables accounted for 94.3% of the variation in employee morale (Y), while the remaining 5.7% was influenced by other variables not examined in this study. Based on these conclusions, the author advises that employees and leaders should reduce prolonged conflict so that work spirit can increase. Leaders should be more selective in severing employment relationships so that decent employees are not dismissed unilaterally. Employees should work in high spirits so that the company can see the quality that employees have.

Prediction of Local Government Revenue using Data Mining Method

Local Government Revenue, commonly abbreviated as PAD, is the part of regional income that finances the running of a regional government. Each local government must plan its Local Government Revenue for the coming year, so a forecasting method is needed to determine that value. This study discusses several methods for predicting Local Government Revenue using data on realized Local Government Revenue in previous years. It proposes three forecasting methods: Multiple Linear Regression, Artificial Neural Network (ANN), and Deep Learning. The data used are Local Revenue data from 2010 to 2020. The research was conducted using RapidMiner software and the CRISP-DM framework. The tests showed an RMSE of 97 billion and R² of 0.942 for the Multiple Linear Regression method, an RMSE of 135 billion and R² of 0.911 for the ANN method, and an RMSE of 104 billion and R² of 0.846 for the Deep Learning method. This study shows that, for the prediction of Local Government Revenue, the Multiple Linear Regression method is better than the ANN or Deep Learning method. Keywords: Local Government Revenue, Multiple Linear Regression, Artificial Neural Network, Deep Learning, Coefficient of Determination

Analysis of the Role of Motivation as a Mediator of the Effect of the Leadership Trilogy and Job Satisfaction on the Work Productivity of Employees at PT. Mataram Tunggal Garment

The purpose of this study is to determine whether motivation mediates the effect of the leadership trilogy and job satisfaction on employee work productivity at PT. Mataram Tunggal Garment. The method used in this study is quantitative. Primary data were obtained from questionnaires completed by 78 respondents, selected with a saturated sampling technique. The data were then analyzed using descriptive analysis, multiple linear regression tests, t (partial) tests, the coefficient of determination (R²) and Sobel tests. The results showed that job satisfaction had a significant influence on motivation, leadership trilogy and job satisfaction had a significant influence on employee work productivity, leadership trilogy and motivation had no significant effect on employee work productivity, and motivation as a mediator of leadership trilogy and job satisfaction had no significant effect on employee work productivity. Keywords: Leadership Trilogy, Motivation, Job Satisfaction and Employee Productivity.

Prevalence of asymptomatic hyperuricemia and its association with prediabetes, dyslipidemia and subclinical inflammation markers among young healthy adults in Qatar

Aim: The aim of this study is to investigate the prevalence of asymptomatic hyperuricemia in Qatar and to examine its association with changes in markers of dyslipidemia, prediabetes and subclinical inflammation. Methods: A cross-sectional study of young adult participants aged 18-40 years, devoid of comorbidities, with data collected between 2012 and 2017. Exposure was defined as uric acid level, and outcomes were defined as levels of different blood markers. De-identified data were collected from Qatar Biobank. T-tests, correlation tests and multiple linear regression were all used to investigate the effects of hyperuricemia on blood markers. Statistical analyses were conducted using STATA 16. Results: The prevalence of asymptomatic hyperuricemia is 21.2% among young adults in Qatar. Differences between hyperuricemic and normouricemic groups were observed using multiple linear regression analysis and found to be statistically and clinically significant after adjusting for age, gender, BMI, smoking and exercise. Significant associations were found between uric acid level and HDL-c (p = 0.019; correlation coefficient -0.07, 95% CI [-0.14, -0.01]), c-peptide (p = 0.018; correlation coefficient 0.38, 95% CI [0.06, 0.69]) and monocyte to HDL ratio (MHR) (p = 0.026; correlation coefficient 0.47, 95% CI [0.06, 0.89]). Conclusions: Asymptomatic hyperuricemia is prevalent among young adults and associated with markers of prediabetes, dyslipidemia, and subclinical inflammation.

Screen Time, Age and Sunshine Duration Rather Than Outdoor Activity Time Are Related to Nutritional Vitamin D Status in Children With ASD

Objective: This study aimed to investigate the possible association among vitamin D, screen time and other factors that might affect the concentration of vitamin D in children with autism spectrum disorder (ASD). Methods: In total, 306 children with ASD were recruited, and data, including their age, sex, height, weight, screen time, time of outdoor activity, ASD symptoms [including Autism Behavior Checklist (ABC), Childhood Autism Rating Scale (CARS) and Autism Diagnostic Observation Schedule–Second Edition (ADOS-2)] and vitamin D concentrations, were collected. A multiple linear regression model was used to analyze the factors related to the vitamin D concentration. Results: A multiple linear regression analysis showed that screen time (β = −0.122, P = 0.032), age (β = −0.233, P < 0.001), and blood collection month (reflecting sunshine duration) (β = 0.177, P = 0.004) were statistically significant. The vitamin D concentration in the children with ASD was negatively correlated with screen time and age and positively correlated with sunshine duration. Conclusion: The vitamin D levels in children with ASD are related to electronic screen time, age and sunshine duration. Since age and season are uncontrollable, identifying the length of screen time in children with ASD could provide a basis for the clinical management of their vitamin D nutritional status.

Determining Factors of Fraud in Local Government

The objective of this research is to analyze the determining factors of fraud in local government. This study used internal control effectiveness, compliance with accounting rules, compensation compliance, and unethical behavior as independent variables, with fraud as the dependent variable. The research was conducted at Bantul local government (OPD). The sample comprised 86 respondents, selected using a purposive sampling method. The respondent data were analyzed with multiple linear regression. The results showed: Internal control effectiveness has an impact on fraud. Compliance with accounting rules does not affect fraud. Compensation compliance does not affect fraud. Unethical behavior has an impact on fraud.

The Effect of the Effectiveness of Cash, Receivables, and Working Capital Turnover on Economic Rentability at the New Grogolan Market Traders Cooperative (KOPPASGOBA), 2016-2020

This study aims to test and analyze the effect of the effectiveness of cash turnover, receivables, and working capital on economic rentability in the New Grogolan Market Traders Cooperative of Pekalongan City from 2016 to 2020. The method used in this study was a quantitative research method with documentation techniques, and the data were analyzed using multiple linear regression analysis. The results of this study showed (1) the effectiveness of cash turnover has no significant effect on economic rentability, (2) the effectiveness of receivables turnover has no significant effect on economic rentability, (3) the effectiveness of working capital turnover has a positive and significant effect on economic rentability, and (4) the effectiveness of cash turnover, receivables, and working capital together has a positive and significant effect on economic rentability. Keywords: Turnover of cash, turnover of receivables, turnover of working capital, and economic rentability.

Improvement of AHMES Using AI Algorithms

This research aims to improve the rationality and intelligence of AUTOMATICALLY HIGHER MATHEMATICALLY EXAM SYSTEM (AHMES) through some AI algorithms. AHMES is an intelligent and high-quality higher math examination solution for the Department of Computer Engineering at Pai Chai University. This research redesigned the difficulty system of AHMES and used some AI algorithms for initialization and continuous adjustment. This paper describes the multiple linear regression algorithm involved in this research and the AHMES learning (AL) algorithm improved by the Q-learning algorithm. The simulation test results of the upgraded AHMES show the effectiveness of these algorithms.

Analysis of the Effect of Service Quality, Promotion and Price on Customer Satisfaction with the JNE Goods Delivery Service in Besuki

This research was conducted to see the effect of service quality, promotion and price on customer satisfaction. This research was conducted at the Besuki branch of JNE. Sampling was done by random sampling technique where all the population was taken at random to be the research sample. This is done to increase customer satisfaction at JNE Besuki branch through service quality, promotion and price. The analytical tool used is multiple linear regression to determine service quality, promotion and price on customer satisfaction. The results show that service quality affects customer satisfaction, promotion affects customer satisfaction, price affects customer satisfaction. Keyword : service quality, promotion, price, customer satisfaction

The Effect of Product Quality, Price and Influencer Marketing on Purchase Decisions for Scarlett Body Whitening

This research aimed to determine the influence of product quality, price and marketing influencers on the purchasing decision for Scarlett Body Whitening in East Java. The research instrument was a questionnaire used to collect data from Scarlett Body Whitening consumers in East Java. Since there were no valid data on the number of consumers, the research used the Roscoe method to take the sample. Data were analyzed using a multiple linear regression test. Product quality and price have a positive and significant effect on purchasing decisions. Meanwhile, the marketing influencer had no significant effect on the purchase decision for Scarlett Body Whitening. Further research is needed to establish whether marketing influencers affect the purchase decision. Keywords: Product quality, price, marketing influencer, buying decision


A Comprehensive Study of Regression Analysis and the Existing Techniques


Education, income inequality, and mortality: a multiple regression analysis

  • Andreas Muller , professor ( axmuller{at}ualr.edu )
  • Department of Health Services Administration, University of Arkansas at Little Rock, 207 Ross Hall, 2801 South University Ave, Little Rock, AR 72204, USA
  • Accepted 4 October 2001

Objective: To test whether the relation between income inequality and mortality found in US states is because of different levels of formal education.

Design: Cross sectional, multiple regression analysis.

Setting: All US states and the District of Columbia (n=51).

Data sources: US census statistics and vital statistics for the years 1989 and 1990.

Main outcome measure: Multiple regression analysis with age adjusted mortality from all causes as the dependent variable and 3 independent variables—the Gini coefficient, per capita income, and percentage of people aged ≥18 years without a high school diploma.

Results: The income inequality effect disappeared when percentage of people without a high school diploma was added to the regression models. The fit of the regression significantly improved when education was added to the model.

Conclusions: Lack of high school education accounts for the income inequality effect and is a powerful predictor of mortality variation among US states.

What is already known on this topic

Aggregate studies have shown a positive relation between income inequality and mortality and three possible explanations have been suggested (relative deprivation, absolute deprivation, and aggregation artefact)

Income inequality may reflect the effects of other socioeconomic variables that are also related to mortality

What this study adds

Multiple regression analysis of the 50 US states and District of Columbia for 1989-90 indicates that the relation between income inequality and age adjusted mortality is due to differences in high school educational attainment: education absorbs the income inequality effect and is a more powerful predictor of variation in mortality among US states

Lack of high school education seems to affect mortality by economic resource deprivation, risk of occupational injury, and learnt risk behaviour. It may also measure the lifetime, cumulative effect of adverse socioeconomic conditions

Introduction

Several recent studies have reported a positive relation between income inequality and mortality. The association has been observed in US metropolitan areas and states and, to varying degrees, in international studies [1-3]. The relation remains intact when different measures of income inequality are used. The critical question is how this relation should be interpreted.

Three competing interpretations have been advanced. Wilkinson believes that income inequality produces psychosocial stresses for individuals placed at lower ranks of the socioeconomic hierarchy [4-6]. Continuous stress due to deprivation of status will lead to deteriorating health and higher mortality over time. The fact that median or per capita household income cannot account for the relation has been taken as evidence that “relative income,” or income inequality, is more important than absolute income for human health and longevity.

Gravelle argues that the correlation between income inequality and mortality may be artefactual in part [7]. He shows mathematically that the aggregate relation is consistent with a negative, curvilinear relation between income and the probability of dying for individuals. Wolfson et al's clever test of Gravelle's hypothesis indicates, however, that the individual relation between income and mortality cannot fully account for the aggregate relationship [8].

The “neo-material” interpretation asserts that income inequality reflects individual and community forms of absolute deprivation. Lynch et al argue that poorer individuals disproportionately experience health taxing events and lack of resources throughout their lives [9]. They live in deprived communities characterised by “underinvestment” in the social and physical infrastructure. Both forms of deprivation produce cumulative wear and tear. The experience depletes health, resulting in higher mortality for those in lower socioeconomic strata. The aggregate effect is that societies with increasing income inequality will experience higher mortality than they would otherwise. Lynch et al suggest that material conditions may be sufficient in explaining the relation between income inequality and mortality.

The neo-material interpretation gives only a broad indication of which material circumstances are important. Kaplan et al's analysis of US states, however, suggests some potential answers [2]. They report that income inequality is significantly correlated with certain risk factors (homicide rates and unemployment rates), social resources (food stamps and lack of health insurance), and measures of human capital (educational attainment). The substantial correlations with some measures of human capital imply that income inequality may not have a direct effect on mortality. Instead, income inequality may reflect the effects of other socioeconomic variables that are also related to mortality. Among those variables, the contribution of formal education deserves most attention since it typically precedes work and income. It is also related to mortality.

Higher educational degrees are typical prerequisites for highly compensated work in the United States and other industrialised nations. According to US census data for the year 1998, the median earnings of adult, year round workers with professional degrees are about four times higher than those of adults who had not completed high school. 10 Thus, the level of education ought to be correlated with cumulative income, which is the basis for measuring income inequality.

In addition, more schooling seems to extend life. 11 – 14 In econometric studies years of schooling typically had a stronger negative effect on age adjusted mortality than per capita income when other measures were controlled for. Therefore, the association between income inequality and mortality found in aggregate studies may be partially the result of variation in educational attainment. I tested this hypothesis using data for the US states, which have shown substantial associations between measures of income inequality and age adjusted mortality.

Data and methods

The study is based on a cross sectional analysis of US census statistics and vital statistics for the years 1989 and 1990 for all US states including the District of Columbia (n=51). Age adjusted mortality from all causes was the main dependent variable of the analysis. 15 I used the CDC WONDER data extraction tool to standardise the age specific death rates by the direct method, 15 using the US age distribution for 1990 as the standard population. The data were pooled for the years 1989 and 1990 to make death rates more reliable.
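The direct method of standardisation named above can be sketched as follows. This is a minimal illustration and not the CDC WONDER procedure itself: the function name, the three age bands, and all numbers are hypothetical.

```python
def direct_standardised_rate(age_specific_rates, standard_population):
    """Weighted average of age specific rates; weights are the standard population counts."""
    if len(age_specific_rates) != len(standard_population):
        raise ValueError("rates and standard population must cover the same age bands")
    total = sum(standard_population)
    weighted = sum(r * p for r, p in zip(age_specific_rates, standard_population))
    return weighted / total


# Hypothetical deaths per 1000 in three age bands (young, middle, old) for one state
rates_state = [1.0, 5.0, 50.0]
# Hypothetical standard population counts for the same three bands
standard_pop = [500, 400, 100]

adjusted_rate = direct_standardised_rate(rates_state, standard_pop)
print(adjusted_rate)  # deaths per 1000, standardised to the hypothetical population
```

Because every state is weighted by the same standard population, differences in the adjusted rates cannot come from differences in age structure.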

The Gini coefficient for households was the main independent variable of interest. 16 This measures the difference between the areas under the curve of a graph of actual distribution of cumulative income and one indicating equality of income distribution. The Gini coefficient ranges from 0 to 1 and measures the degree of income inequality. A value of 0 indicates that each household obtains the same amount of income, while a value of 1 indicates that only one household earns all income. 17 18
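For readers who want to see the measure computed, the sketch below uses a common rank-based formula for the Gini coefficient of a finite sample. It is an illustration under assumptions, not the Census Bureau's procedure: the function name and the income values are hypothetical. Note that with n households the maximum is (n-1)/n, which approaches 1 only as n grows.

```python
def gini(incomes):
    """Gini coefficient of a list of non-negative incomes (rank-based sample formula)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one household and a positive total income")
    # Sum of rank * income, with ranks starting at 1 for the poorest household
    rank_weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * rank_weighted / (n * total) - (n + 1.0) / n


equal = gini([30, 30, 30, 30])      # every household has the same income
concentrated = gini([0, 0, 0, 120])  # one household holds all income
print(equal, concentrated)
```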

To control for varying income levels among states, I included the per capita income of all people in the regression model. 19 The per capita income variable was log (ln) transformed to reduce positive skew. Both income variables pertain to the calendar year 1989. I measured educational attainment by the percentage of people aged ≥18 years without a high school diploma in 1990. 20

I analysed age adjusted mortality by multiple regression. 21 The proportion of the population living in each state in 1990 was the weighting factor, and STATISTICA software 22 estimated the regression models.
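A population-weighted least squares fit of the kind described can be sketched in a few lines. This is an illustrative sketch only, not the STATISTICA routine: it uses a single predictor for brevity, whereas the analysis used several, and the function name and data are hypothetical.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x with weights w."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical states: inequality measure, death rate per 1000, population share
x = [0.38, 0.42, 0.46]
y = [2.0 + 3.0 * xi for xi in x]   # points placed exactly on y = 2 + 3x
w = [0.1, 0.6, 0.3]                # larger states count for more

intercept, slope = weighted_linear_fit(x, y, w)
print(intercept, slope)
```

With weights proportional to population, a large state pulls the fitted line towards itself more strongly than a small one.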

Fig 1 shows the relation between the measure of income inequality and age adjusted mortality. The scatterplot indicates a positive linear relation, with the District of Columbia being an apparent outlier. The range in income inequality between states was about 0.1. The regression coefficient indicates that a 0.1 unit increase in the Gini coefficient was associated with an increase of 1.6 deaths per 1000 population.

Age adjusted death rates by Gini coefficient for the 50 US states and the District of Columbia (DC), 1989-90 (y=1.831+15.705×x; R 2 =0.24; weighted regression). (Data sources US Public Health Service 15 and US Census Bureau 16 )

Fig 2 shows a positive, linear relation between education and age adjusted mortality. The observations cluster around the regression line except for the District of Columbia. The range in the education variable was about 20 percentage points. The related increase in age adjusted mortality was about 2.1 deaths per 1000 population.

Age adjusted death rates by educational attainment for the 50 US states and the District of Columbia (DC), 1989-90 (y=6.16+0.103×x; R 2 =0.51; weighted regression). (Data sources US Public Health Service 15 and US Census Bureau 20 )
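As a check on the arithmetic, the changes in mortality quoted for figs 1 and 2 follow from multiplying each fitted slope in the figure legends by the approximate range of its predictor (0.1 Gini units and 20 percentage points):

\[\Delta y_{\text{Gini}} = 15.705 \times 0.1 \approx 1.57, \qquad \Delta y_{\text{education}} = 0.103 \times 20 = 2.06\]

Both values are in deaths per 1000 population and round to the 1.6 and 2.1 reported in the text.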

Fig 3 presents the percentage of variation in age adjusted mortality explained by five regression specifications. All regression models were statistically significant at P<0.001. The two income measures accounted for 27.7% of the variation in age adjusted mortality. Lack of high school education by itself explained over half of the variation in the dependent variable. The regression coefficients for both income variables were non-significant when added to a model including the education measure: they accounted for no additional variation in the dependent variable when the education variable was controlled. The adjusted R 2 values slightly decreased with the addition of the income measures, since the adjustment corrects for redundancy. Deleting the District of Columbia from the analysis improved the fit of the regression specifications that included education but did not substantively change the results shown in fig 3 .

Percentage of variation in age adjusted mortality explained by education and income variables for the 50 US states and District of Columbia, 1989-90

Subgroup analysis

A preliminary analysis of age specific mortality indicated that the findings might best reflect the experience of people aged ≥45 years. For the 15-44 year age group, the Gini coefficient was significant and positively related to age specific death rates, whereas the education variable was only marginally significant. Since the analysis did not restrict the age range of the independent variables to people aged 15-44, the results might be biased. Deaths for 15-44 year olds comprised 8.3% of all US deaths in 1989-90, with accidental and violent deaths among the leading causes.

The definition of the education variable excludes children. Therefore, I estimated all regressions with the dependent variable restricted to people aged ≥20 years. The results of the analysis paralleled those in fig 3 , with model fit reduced by 1 to 3 percentage points.

Gini coefficients for individual states were not available by householder's race or sex. As an alternative, I included the percentage of African-American and Latino people in populations in a regression model that included education, per capita income, and the Gini coefficient. The variable measuring the effect of belonging to economically depressed minorities was significant (b=0.03; t=3.26) and reduced the direct education effect to b=0.07 (t=2.97).

I also ran the regressions for each sex. The education and income variables predicted age adjusted mortality for males better (R 2 adj=0.54) than for females (R 2 adj=0.34). However, the results of the sex specific analyses were consistent with those in fig 3 .

This study had two main findings. Income inequality, as measured by the Gini coefficient, had no unique effect on US age adjusted mortality when the level of formal education was controlled for. Educational attainment, as measured by lack of completed high school education, was a more powerful predictor of differences in mortality than income inequality in US states.

Over a decade has passed since the 1990 US census was taken. Therefore, my findings may not be applicable today. When data on income inequality and vital statistics are released for individual states for the years 1999-2000 this concern can be examined.

The potential role of education has been overlooked in previous research on income inequality and mortality, 1 2 which focused more on the potentially contaminating effects of income and poverty. In my analysis I did not directly control for poverty, but the effect of poverty was not excluded. It was indirectly reflected in the per capita income and education measures.

Implications of results

Lack of high school education completely captured the income inequality effect and income level effect in my age adjusted analysis. This finding suggests that physical and social conditions associated with low levels of education may be sufficient for interpreting the relation between income inequality and mortality. My results therefore seem to support the idea that absolute deprivation rather than relative deprivation is important for influencing mortality.

One reviewer pointed out that this view might be too narrow. The income inequality measure might also express the “burden of relative deprivation” in society, as discussed by Marmot and Wilkinson. 23 Lack of high school education may indicate low status, which, by definition, implies a relative position in the social hierarchy. However, low educational status may indicate only lack of material resources and other adverse life circumstances. It remains to be seen whether low educational status produces the additional stressful, invidious hierarchical comparisons that lead to poorer health and greater mortality. Since aggregate data are not well suited for examining hypotheses at the individual level, my study cannot confirm or rule out the importance of psychosocial processes.

An expanded regression analysis (available on request) indicated that lack of high school education was related to lack of health insurance, belonging to economically depressed minority groups, working in jobs with high risk of injury, and smoking. This finding suggests that lack of material resources, occupational exposure to risk, and certain learnt health risk behaviour might be reflected in the large education-mortality effect.

Less educated people may be concentrated in areas that are more risky to life and health. Some research has suggested that these communities may lack sufficient investment in health related infrastructure such as access to health care, proper police protection, and healthy housing. 24 These potential risk factors are only indirectly assessed by the variables used in my study.

Lack of high school education may also represent lifetime effects of socioeconomic deprivation. Davey Smith et al found that socioeconomic conditions during childhood adversely affected adult mortality in a large, prospective study of adult Scottish men. 25 My study could not determine intergenerational effects of educational attainment. However, this path of research seems promising since considerable linkage between parents and offspring has been seen for educational attainment and for incomes in Britain 26 and in the United States. 27 28 Lack of high school education may also capture the lifetime effect of adverse social conditions increasing mortality. Income inequality is only one aspect of this broader experience.

Acknowledgments

I thank Drs Wilkinson, Davey Smith, and Altman for their valuable comments on an earlier version of this paper.

Funding My study was supported by my sabbatical leave granted by the University of Arkansas at Little Rock.

Competing interests None declared.

  • Kaplan GA ,
  • Balfour JL ,
  • Wolfson MC ,
  • Berthelot JM ,
  • Wilkinson RG
  • Wolfson M ,
  • Davey Smith G ,
  • US Census Bureau
  • Tenleckyi NE
  • Backlund E ,
  • Sorlie PD ,
  • US Public Health Service, Centers for Disease Control
  • Atkinson AB
  • Weinberg DH
  • Kawachi I ,
  • Kennedy BP ,
  • Lochner K ,
  • Prothrow-Stith D
  • McMurrer DP ,

  • Open access
  • Published: 24 May 2022

Multiple regression model to analyze the total LOS for patients undergoing laparoscopic appendectomy

  • Teresa Angela Trunfio 1 ,
  • Arianna Scala 2 ,
  • Cristiana Giglio 3 ,
  • Giovanni Rossi 4 ,
  • Anna Borrelli 4 ,
  • Maria Romano 5 &
  • Giovanni Improta 2 , 6  

BMC Medical Informatics and Decision Making volume 22, Article number: 141 (2022)

The rapid growth in the complexity of services and stringent quality requirements present a challenge to all healthcare facilities, especially from an economic perspective. The goal is to implement strategies that bring health processes closer to standards. The Length Of Stay (LOS) is a very useful parameter for managing services within the hospital and an index used in managing costs. A patient's LOS can be affected by a number of factors, including their particular condition, medical history, or medical needs. To reduce and better manage the LOS, it is necessary to be able to predict this value.

In this study, a predictive model was built for the total LOS of patients undergoing laparoscopic appendectomy, one of the most common emergency procedures. Demographic and clinical data of the 357 patients admitted to “San Giovanni di Dio e Ruggi d’Aragona” University Hospital of Salerno (Italy) were used as independent variables of the multiple linear regression model.

The obtained model had an R 2 value of 0.570 and, among the independent variables, the significant variables that most influence the total LOS were Age, Pre-operative LOS, Presence of Complication and Complicated diagnosis.

This work designed an effective and automated strategy for improving the prediction of LOS, that can be useful for enhancing the preoperative pathways. In this way it is possible to characterize the demand and to be able to estimate a priori the occupation of the beds and other related hospital resources.

Introduction

The appendix is a protrusion of the large intestine, located where the large intestine joins the small intestine. The appendix performs some immunological functions, but it is not a fundamental organ [ 1 ]. When something, such as undigested food residue, obstructs the internal lumen, the appendix becomes inflamed, causing appendicitis.

In emergency surgery, one of the most common conditions requiring an operation is appendicitis [ 2 ]. Appendicitis is primarily a disease of adolescents and young adults, with a peak incidence in the second and third decades of life. There is a slight male preponderance of 3:2 in teenagers and young adults. In adults, the incidence of appendicitis is approximately 1.4 times greater in men than in women [ 3 ]. In general, the risk for men and women is estimated at 8.6% and 6.7%, respectively [ 4 ]. Per 100,000 cases of acute appendicitis, between 114.44 and 481.60 require a surgical procedure [ 5 ]. This value depends on the socioeconomic level of the countries considered; in fact, the risk of appendicitis is rising sharply, especially in industrialized countries.

In the post-war period, thanks to the use of antibiotics and in particular penicillin, mortality was reduced (from over 40% to 2%). In the case of uncomplicated diagnosis, mortality is 0.08–0.4%, while it rises to 12% in the case of perforation [ 6 ]. The diagnosis of acute appendicitis is predominantly clinical, in that it is based on the accurate evaluation of the data provided by the anamnestic collection and on the patient's physical examination. It can be difficult, occasionally taxing the diagnostic skills of even the most experienced surgeon [ 7 ]. Early diagnosis is an essential condition for effective treatment.

Appendectomy is a surgical procedure that can basically be performed in two ways: laparoscopic appendectomy (LA) and open appendectomy (OA). Both procedures can be decisive, and the choice depends in the first place on the patient's age and the severity of appendicitis, and also on the surgeon's skills and the availability of hospital resources [ 8 ].

Since its introduction in 1983, LA has quickly become a common and increasingly adopted practice [ 9 ]. Nguyen et al. showed both an increased use of LA compared with OA and that patients undergoing LA generally have an uncomplicated diagnosis, a shorter length of stay (LOS) and fewer post-operative complications, without an increase in healthcare costs [ 10 ]. Kwok KayYau et al., instead, showed the efficacy of LA in complicated appendicitis [ 11 ]. There, LA again proved to be feasible and safe, with a significantly shorter operative time, lower incidence of wound infection, and reduced LOS compared with OA.

The LOS—measured in days—is defined as the difference between the date of admission and the date of discharge of the patient. It is linked to the severity of the medical conditions, age of patient and any complication of the medical diagnosis, or the treatment received [ 12 ].

LOS is useful for planning admissions and is therefore a direct indicator of effectiveness and efficiency that affects organization and costs. For these reasons, many works in the literature have used LOS as an indicator of quality [ 13 , 14 , 15 ]. In all aspects of the healthcare sector, the extraction of clinical and organizational data for advanced analysis [ 16 , 17 , 18 , 19 ] and for process improvement [ 20 , 21 , 22 , 23 ] has proven to be a fundamental support in patient management.

LOS modeling is also not new in the literature. Verburg et al. [ 24 ] compared the performance of eight regression models when predicting intensive care unit LOS, failing to obtain optimal results for any of them, while Lee et al. [ 25 ] show the high performance of robust gamma mixed regression for the study of pediatric LOS. Multiple linear regression has also been used to predict the LOS for patients undergoing valvuloplasty by considering their characteristics [ 26 ]. Austin et al. [ 27 ] use statistical analysis for analyzing LOS in a cohort of patients undergoing CABG surgery, while Scala et al. [ 28 ] show the benefits of implementing classifiers for predicting LOS [ 29 , 30 , 31 , 32 , 33 ].

In this study, a predictive model of the hospital stay of patients undergoing laparoscopic appendectomy was constructed to study how certain clinical and demographic variables affect the LOS prediction. The present research extends our previous work [ 34 ]: the dataset was enlarged both in years of observation and in the comorbidities considered, and the impact of comorbidities was also evaluated. The model used is multiple linear regression, which has proven effective in different healthcare implementations.

The dataset used in this study included the information of 357 patients who underwent an appendectomy in the five years 2016–2020 at the University Hospital “San Giovanni di Dio e Ruggi d’Aragona” of Salerno (Italy). The following variables were extracted from the hospital information system QuaniSDO:

Gender (Male / Female);

Age;

Comorbidities;

Diagnostic Related Group (DRG);

Date of admission, discharge and LC procedure;

From these, the independent and dependent variables of the MLR model were obtained. In particular, from the analysis of the DRG it was possible to identify whether a patient had Complications during surgery or a Complicated diagnosis. From the dates, the pre-operative LOS (date of LC procedure minus date of admission) and the total LOS were calculated. From the comorbidities, the following additional independent variables were defined:

Presence of comorbidities (yes / no);

Heart Disease (yes / no);

Diabetes (yes / no);

Hypertension (yes / no);

Obesity (yes / no);

Peritonitis (yes / no);

Cancer (yes / no).
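The date-derived variables described above (pre-operative LOS and total LOS) can be computed as simple date differences. The sketch below is only an illustration: the function name and the three dates are hypothetical and are not taken from the QuaniSDO records.

```python
from datetime import date


def los_days(admission, procedure, discharge):
    """Return (pre-operative LOS, total LOS) in days from the three recorded dates."""
    if not (admission <= procedure <= discharge):
        raise ValueError("expected admission <= procedure <= discharge")
    pre_operative = (procedure - admission).days
    total = (discharge - admission).days
    return pre_operative, total


# Hypothetical patient: admitted 4 March, operated 5 March, discharged 8 March 2019
pre_op_los, total_los = los_days(date(2019, 3, 4), date(2019, 3, 5), date(2019, 3, 8))
print(pre_op_los, total_los)
```

The post-operative LOS, if needed, is the difference between the two returned values.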

Table 1 shows the distribution of the features into the sample.

The frequency of the identified comorbidity groups in the population was calculated (Table 2 ). Here, frequency measures how common a disease or health condition is in a population over a given period [ 35 ], in this case the five years 2016–2020.

IBM SPSS (Statistical Package for Social Science) ver. 27 was used to build an MLR model to predict the total LOS [ 36 ].

  • Multiple linear regression

In recent years, several data analytics methodologies have been proposed for supporting different applications [ 37 , 38 ]. One of the most widely used is Multiple Linear Regression (MLR), a statistical technique that uses several explanatory variables to predict the outcome of a response variable. Multiple linear regression is an extension of the simple linear regression model, which uses just one explanatory variable. In this work, an MLR model was implemented to predict the value of the dependent variable Y (total LOS) from several independent variables (Age, Gender, Pre-operative LOS, Complications during surgery, Complicated diagnosis, Presence of comorbidities, Heart Disease, Diabetes, Hypertension, Obesity, Peritonitis and Cancer).

The equation for a multiple linear regression is:

\[Y = \beta_{0} + \sum_{i = 1}^{12} \beta_{i} x_{i} + \varepsilon\]

where Y is the total LOS, β 0 is intercept value, x i are the twelve independent variables (pre-operative LOS, presence of complications, complicated diagnosis, gender, age, presence of comorbidities, heart disease, diabetes, hypertension, obesity, peritonitis and cancer) and β i are the estimated regression coefficients of respective independent variables. \(\varepsilon\) is the model error, i.e. the variation of our estimate of Y with respect to the real value. Before creating the model, the following six hypotheses must be verified:

The linear relationship between the independent and dependent variable. It can be checked through the scatter plot.

Absence of multicollinearity. Multicollinearity causes substantial changes in the values of the regression coefficients. Tolerance = 1- \(R_{i}^{2}\) and Variance Inflation Factor (VIF) =  \(\frac{1}{{1 - R_{i}^{2} }}\) are used to verify this assumption, where \(R_{i}^{2}\) is the coefficient of determination obtained by regressing the i-th independent variable on all the other independent variables.

The independence of the residuals. In this case, the result of Durbin-Watson statistical test is analyzed.

The residuals have constant variance. It is possible to verify it by building the graph of "standardized residuals" against the "standardized predicted value".

The residuals are normally distributed. To verify this assumption a quantile–quantile (Q-Q) plot can be used.

Absence of outliers. Cook's distance values that are always less than 1 indicate that no observation unduly influences the model.
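Two of the checks listed above reduce to short formulas and can be sketched directly. This is an illustration only, not the SPSS output: the residuals and the R² value are hypothetical, and the VIF helper assumes the standard definition in which \(R_{i}^{2}\) comes from regressing predictor i on the remaining predictors.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: values near 2 suggest uncorrelated successive residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def tolerance(r2_i):
    """Tolerance of predictor i, given R^2 from regressing it on the other predictors."""
    return 1.0 - r2_i


def vif(r2_i):
    """Variance Inflation Factor of predictor i (reciprocal of its tolerance)."""
    return 1.0 / (1.0 - r2_i)


# Hypothetical residuals in observation order
residuals = [0.5, -0.2, 0.1, -0.4, 0.3]
dw = durbin_watson(residuals)

# Hypothetical predictor whose variance is half explained by the other predictors
print(dw, tolerance(0.5), vif(0.5))
```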

As a measure of the goodness of fit of a multiple regression model, the coefficient of determination, known as R 2 , is used. The linear determination index R 2 represents the fraction of variance of Y which is explainable by the X regressors included in the model.
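To make the fit and the R 2 computation concrete, the sketch below estimates the coefficients by ordinary least squares through the normal equations and then computes R 2 as one minus the ratio of residual to total sum of squares. It is a toy illustration in plain Python, not the SPSS procedure used in the study: the function names, the two predictors, and the five data rows are all hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Return [beta_0, beta_1, ...] minimising the sum of squared residuals."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(beta, row):
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


def r_squared(beta, rows, y):
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - predict(beta, r)) ** 2 for r, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical patients: (age in years, pre-operative LOS in days)
rows = [(20, 0), (30, 1), (40, 0), (50, 2), (60, 1)]
# Noise-free toy outcome: total LOS = 1 + 0.05 * age + 1 * pre-operative LOS
y = [1 + 0.05 * age + pre for age, pre in rows]

beta = fit_mlr(rows, y)
r2 = r_squared(beta, rows, y)
print(beta, r2)
```

Because the toy outcome contains no noise, the fit recovers the generating coefficients and R 2 equals 1 up to rounding; real data would give a value below 1.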

R 2 shows how well the data points fit a curve or line, whereas Adjusted-R 2 indicates the same but adjusts for the number of terms in the model. This is why in multiple linear regression with several predictors it is advisable to observe Adjusted-R 2 [ 39 ]:

\[\overline{{R^{2} }} = 1 - \left( {1 - R^{2} } \right)\frac{n - 1}{n - m - 1}\]

where n represents the total sample size and m is the number of predictors. In most cases it turns out: 0 ≤ R 2  ≤ 1. The \(R^{2}\) and \(\overline{{R^{2} }}\) tell whether the regressors are suitable for predicting the values of the dependent variable in the sample of data used. If \(R^{2}\) (or \(\overline{{R^{2} }}\) ) tends to one, the regressors produce good predictions of the dependent variable; if \(R^{2}\) (or \(\overline{{R^{2} }}\) ) tends to 0, the opposite is true. The significance level α is equal to 0.05.
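The adjustment for the number of predictors can be illustrated with the standard adjusted R 2 formula. The inputs below (R 2 = 0.570, n = 357, m = 12) are the figures stated in the text; the function name is hypothetical, and the result is only an illustration that need not match the value in the study's own tables.

```python
def adjusted_r_squared(r2, n, m):
    """Adjusted R^2 for a model with m predictors fitted on n observations."""
    if n - m - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - m - 1)


adj_r2 = adjusted_r_squared(0.570, 357, 12)
print(round(adj_r2, 3))
```

With a sample this large relative to the number of predictors, the penalty is small; it grows quickly as m approaches n.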

Before building the MLR model, the six hypotheses were tested. The result of the Durbin-Watson test was 1.505, which lies within the acceptable range of [1.5; 2.5] and supports the independence of the residuals. The Cook’s distance for each observation was less than 1, so there were no outliers in the dataset that negatively affect the estimate of the coefficients. For the second assumption, Table 3 shows the values of VIF and Tolerance obtained for each independent variable.

The VIF values were always less than 10 and the Tolerance values were always greater than 0.2, so the absence of multicollinearity was verified.

Figure 1 shows the Q-Q plot, a graph of “observed value” against “expected normal value” used to test the normal distribution of the residual values.

Figure 1: Normal Q-Q Plot of Standardized Residual

As can be seen from Fig. 1, the points are quite close to the line. There are a few outliers, but they were shown not to affect the quality of the coefficient estimates. In fact, Cook's distance was calculated for each point and the maximum value obtained was 0.8, which is below the required threshold of 1.

Figure  2 shows the graph of "standardized residuals" against the "standardized predicted value" used to verify that the variance of the residuals is constant.

Figure 2: Plot of "standardized residuals" against the "standardized predicted value"

The variance of residuals was not constant across predicted values, so there was a moderate violation of homoscedasticity, which was however considered acceptable. In fact, Table 4 shows that the analysis of variance is significant, i.e. there is indeed a linear dependence between the dependent variable and the regressor variable (p-value < 0.05). Then, the MLR model was implemented. Table 4 shows the performance of the model.

The coefficient of determination (R 2 ) was greater than 0.5 so it can be considered a good preliminary model to represent the problem. The p-values below the alpha value are highlighted in bold.

Table 5 shows the coefficients of the model and the results of the t-test, used to study the significance of the regression coefficients (βi). P-values < 0.05 were considered statistically significant.

The p-value was less than 0.05 for the Pre-operative LOS, the Presence of complication, Complicated diagnosis and Age. Among these variables that significantly influence LOS, the pre-operative LOS has the highest coefficient in accordance with the definition of total LOS (pre-operative LOS + post-operative LOS).

The aim of this work was to build a predictive model, using the multiple linear regression, of the total LOS for patients undergoing a laparoscopic appendectomy at "San Giovanni di Dio e Ruggi d’Aragona" University Hospital of Salerno (Italy) in the five-year period 2016–2020. Starting from a group of selected information (Gender, Age, Comorbidities, Diagnostic Related Group (DRG), Date of admission, Date of discharge and Date of LC procedure) the independent variables of the model were obtained. In particular, the analysis of the comorbidities made it possible to divide patients into subgroups by categories of pathologies with higher frequency in our sample.

A simple model has been obtained with a value of R 2 equal to 0.570. The value of R 2 exceeds 0.5, even if slightly, which supports its use for this task. In fact, linear models have the advantage of being easy to understand and use during the activities carried out by healthcare staff. The results of the t-test demonstrate that Pre-operative LOS, Presence of complication, Complicated Diagnosis and Age are the variables that most influence the total LOS. The influence of Pre-operative LOS was expected because it is part of the definition of total LOS. These influences are in line with the literature on the topic. For example, Liu et al. [ 40 ] show how age is a factor influencing procedures related to 18 different DRGs. On the subject of appendectomy, Ponsiglione et al. [ 41 ] showed that in urgent procedures there is a strong link between LOS and comorbidities, while Demir et al. [ 42 ] highlight how both postoperative and total LOS of patients undergoing appendectomy are more likely to be affected by patients' demographic characteristics and clinical needs. In addition, other variables not included in this study have significant effects on LOS. For example, Crandall et al. [ 43 ] showed that the operative time of day was a surprisingly important determinant of hospital LOS, while Cheong et al. [ 44 ] highlighted that a significantly longer hospital stay was associated with open appendectomy, pediatric surgeon, and the Territories for simple appendicitis in pediatric patients.

The multi-year study showed a dependence of total LOS on age that was not evident in the previous model [ 30 ]. This information is important for the possible creation of pathways for specific age groups, for the management of complications or for the standardization of the pre-operative phase, as already done by the hospital for femur fracture in patients older than 65 years [ 45 ].

This work demonstrated that the MLR represents a valid preliminary support to characterize the demand and to be able to estimate a priori the occupation of the beds and the use of other hospital resources.

Although the work is novel in terms of sample size and number of comorbidities analyzed, it is not without limitations. In particular, the model has not been validated on datasets from other hospitals; the impact that other procedures, such as those related to possible complications, may have on LOS is not included; and the value of R 2 is only slightly above 0.5, which makes it necessary to search for a more robust predictive model. For example, classification algorithms (such as Logistic Regression) could be a valid alternative [ 46 ].

In this work, the data of 357 patients undergoing LC at "San Giovanni di Dio e Ruggi d’Aragona" University Hospital of Salerno (Italy) in the five-year period 2016–2020 were studied using an MLR model, whose aim is to predict LOS on the basis of patients' clinical and demographic variables. Among the independent variables, Pre-operative LOS, presence of complication, complicated diagnosis and age are the variables that most influence the total LOS. The results are in line with the scientific literature, in which the impact of age, complicated diagnoses, and complications is discussed for several clinical procedures including appendectomy. In addition, the model performs well enough to support its use by clinicians as a prediction tool. The linear model, however, although very simple to interpret, may not be robust enough. Therefore, future developments will include validation of the model with multicenter studies as well as the use of advanced data processing tools.

Availability of data and materials

The datasets generated and/or analyzed during the current study are not publicly available for privacy reasons but are available from the corresponding author on reasonable request.

Abbreviations

LA: Laparoscopic Appendectomy

OA: Open Appendectomy

LOS: Length Of Stay

https://www.news-medical.net/health/Why-do-Humans-have-an-Appendix-(Italian).aspx .

Cervellin G, Mora R, Ticinesi A, et al. Epidemiology and outcomes of acute abdominal pain in a large urban Emergency Department: retrospective analysis of 5,340 cases. Ann Transl Med. 2016;4:362.

Alvarado A. Clinical approach in the diagnosis of acute appendicitis. In: Garbuzenko D (ed) Current issues in the diagnostics and treatment of acute appendicitis. Intech Open;2018: p. 13–42.

Krzyzak M, Mulrooney SM. Acute appendicitis review: background, epidemiology, diagnosis, and treatment. Cureus. 2020;12(6):e8562. https://doi.org/10.7759/cureus.8562 .

Salomon JA, Wang H, Freeman MK, Vos T, Flaxman AD, Lopez AD, Murray CJ. Healthy life expectancy for 187 countries, 1990–2010: a systematic analysis for the Global Burden Disease Study 2010. The Lancet. 2012;380(9859):2144–62.

Stein GY, Rath-Wolfson L, Zeidman A, et al. Sex differences in the epidemiology, seasonal variation, and trends in the management of patients with acute appendicitis. Langenbecks Arch Surg. 2012;397:1087–92. https://doi.org/10.1007/s00423-012-0958-0 .

Marudanayagam R, Williams GT, Rees BI. Review of the pathological results of 2660 appendicectomy specimens. J Gastroenterol. 2006;41(8):745–9. https://doi.org/10.1007/s00535-006-1855-5 .

Prystowsky JB, Pugh CM, Nagle AP. Appendicitis. Curr Probl Surg. 2005;42(10):694–742. https://doi.org/10.1067/j.cpsurg.2005.07.005 .

Mandrioli M, et al. Advances in laparoscopy for acute care surgery and trauma. World J Gastroenterol. 2016;22(2):668–80. https://doi.org/10.3748/wjg.v22.i2.668 .

Article   CAS   PubMed   PubMed Central   Google Scholar  

Nguyen NT, et al. Trends in utilization and outcomes of laparoscopic versus open appendectomy. Am J Surg. 2004;188(6):813–20.

Yau KK, et al. Laparoscopic versus open appendectomy for complicated appendicitis. J Am Coll Surg. 2007;205(1):60–5.

McAleese P, Odling-Smee W. The effect of complications on length of stay. Ann Surg. 1994;220(6):740.

Article   CAS   Google Scholar  

McVeigh TP, et al. Assessing the impact of an ageing population on complication rates and in-patient length of stay. Int J Surg. 2013;11(9):872–5.

Moore L, et al. Derivation and validation of a quality indicator of acute care length of stay to evaluate trauma care. Ann Surg. 2014;260(6):1121–7.

Picone I, Latessa I, Fiorillo A, Scala A, Angela Trunfio T, Triassi M (2021) Predicting length of stay using regression and Machine Learning models in Intensive Care Unit: a pilot study. In: 2021 11th international conference on biomedical engineering and technology; p. 52–8.

Ponsiglione AM, Cesarelli G, Amato F, Romano M. Optimization of an artificial neural network to study accelerations of foetal heart rhythm. In: 2021 IEEE 6th international forum on research and technology for society and industry (RTSI); 2021. p. 159–64. https://doi.org/10.1109/RTSI50628.2021.9597213 .

Cesarelli M, Romano M, Bifulco P, Improta G, D’Addio G. An application of symbolic dynamics for FHRV assessment. Stud Health Technol Inform. 2012;180:123–7.

PubMed   Google Scholar  

Improta G, Ponsiglione AM, Parente G, Romano M, Cesarelli G, Rea T et al. Evaluation of medical training courses satisfaction: Qualitative analysis and analytic hierarchy process. In: European medical and biological engineering conference; p. 518–26; 2020. Springer, Cham.

Cesarelli G, Scala A, Vecchione D, Ponsiglione AM, Guizzi G. An innovative business model for a multi-echelon supply chain inventory management pattern. J Phys Conf Ser. 2021;1828(1):012082.

Improta G, Luciano MA, Vecchione D, Cesarelli G, Rossano L, Santalucia I, Triassi M. Management of the diabetic patient in the diagnostic care pathway. In: Jarm T, Cvetkoska A, Mahnič-Kalamiza S, Miklavcic D (eds) 8th European medical and biological engineering conference. EMBEC 2020. IFMBE proceedings, vol 80;2021. Springer, Cham. https://doi.org/10.1007/978-3-030-64610-3_88

Converso G, Improta G, Mignano M, Santillo LC. A simulation approach for agile production logic implementation in a hospital emergency unit. In: Intelligent software methodologies, tools and techniques, vol. 532, p. 623–34;2015. Springer.

Trunfio TA, Scala A, Borrelli A, Sparano M, Triassi M, Improta G. Application of the Lean Six Sigma approach to the study of the LOS of patients who undergo laparoscopic cholecystectomy at the San Giovanni di Dio and Ruggi d'Aragona University Hospital. In 2021 5th international conference on medical and health informatics (ICMHI 2021). Association for Computing Machinery, New York, NY, USA, 50–54;2021. https://doi.org/10.1145/3472813.3472823

Raiola E, Triassi M, Improta G, Di Cicco MV, Montella E, Ferraro A, Cerchione R, Centobelli P. Implementation of lean practices to reduce healthcare associated infections. Int J Healthc Technol Manag. 2020;18:51. https://doi.org/10.1504/IJHTM.2020.10039887 .

Verburg IWM, et al. Comparison of regression methods for modeling intensive care length of stay. PLoS ONE. 2014;9(10):e109684.

Lee AH, et al. A robustified modeling approach to analyze pediatric length of stay. Ann Epidemiol. 2005;15(9):673–7.

Scala A, Trunfio TA, De Coppi L, Rossi G, Borrelli A, Triassi M, Improta G. Regression models to study the total LOS related to valvuloplasty. Int J Environ Res Public Health. 2022;19(5):3117.

Austin PC, Rothwell DM, Tu JV. A comparison of statistical modeling strategies for analyzing length of stay after CABG surgery. Health Serv Outcomes Res Method. 2002;3(2):107–33.

Scala A, Angela Trunfio T, Lombardi A, Giglio C, Borrelli A, Triassi M. A comparison of different Machine Learning algorithms for predicting the length of hospital stay for patients undergoing cataract surgery. In: 2021 International symposium on biomedical engineering and computational biology; p. 1–4.

Austin PC, Tu JV, Daly PA, Alter DA. The use of quantile regression in health care research: a case study examining gender differences in the timeliness of thrombolytic therapy. Stat Med. 2005;24(5):791–816.

Scala A, Trunfio TA, Borrelli A, Ferrucci G, Triassi M, Improta G. Modelling the hospital length of stay for patients undergoing laparoscopic cholecystectomy through a multiple regression model. In 2021 5th international conference on medical and health informatics (ICMHI 2021). Association for Computing Machinery, New York, NY, USA. P. 68–72; 2021. https://doi.org/10.1145/3472813.3472826 .

Trunfio TA, Maria Ponsiglione A, Ferrara A, Borrelli A, Gargiulo PA. Comparison of different regression and classification methods for predicting the length of hospital stay after cesarean sections. In: 2021 5th international conference on medical and health informatics. 2021.

Lukong AMY, Jafaru Y. Covid-19 pandemic challenges, coping strategies and resilience among healthcare workers: A multiple linear regression analysis. Afr J Health Nurs Midwif. 2021;4:16–27.

Google Scholar  

Turgeman L, May JH, Sciulli R. Insights from a machine learning model for predicting the hospital Length of Stay (LOS) at the time of admission. Expert Syst Appl. 2017;78:376–85.

Trunfio TA, Scala A, Giglio C, Rossi G, Borrelli A, Gargiulo P, Romano M. Modelling the hospital length of stay for patients undergoing laparoscopic appendectomy through a Multiple Regression Model. In 2021 International Symposium on Biomedical Engineering and Computational Biology (BECB 2021). Assoc Comput Mach. 2021;36:1–5. https://doi.org/10.1145/3502060.3503644 .

https://www.health-ni.gov.uk/articles/prevalence-statistics#:~:text=Prevalence%20is%20a%20measure%20of,within%20a%20particular%20time%20period .

IBM Corp. IBM SPSS statistics for windows; version 27.0; IBM Corp: Armonk, NY, USA, 2020.

Sperlí G. A deep learning based community detection approach. In: Proceedings of the 34th ACM/SIGAPP symposium on applied computing, p. 1107–1110. 2019. https://doi.org/10.1145/3297280.3297574 .

De Santo A, Galli A, Gravina M, Moscato V, Sperlì G. Deep Learning for HDD health assessment: an application based on LSTM. IEEE Trans Comput. 2020;71(1):69–80. https://doi.org/10.1109/TC.2020.3042053 .

Everitt BS, Skrondal A. The Cambridge dictionary of statistics. Cambridge: Cambridge University Press; 2010.

Book   Google Scholar  

Yingxin L, Jim PMC. Factors influencing patients’ length of stay. Aust Health Rev. 2001;24:63–70.

Maria Ponsiglione A., et al. Modeling the variation in length of stay for appendectomy and cholecystectomy interventions in the emergency general surgery. In: 2021 international symposium on biomedical engineering and computational biology. 2021.

Demir C, et al. The factors affecting length of stay of the patients undergoing appendectomy surgery in a military teaching hospital. Mil Med. 2007;172(6):634–9.

Crandall M, et al. Acute uncomplicated appendicitis: case time of day influences hospital length of stay. Surg Infect. 2009;10(1):65–9.

Cheong LHA, Emil S. Determinants of appendicitis outcomes in Canadian children. J Pediatr Surg. 2014;49(5):777–81.

Scala A, Ponsiglione AM, Loperto I, Della Vecchia A, Borrelli A, Russo G, Triassi M, Improta G. Lean six sigma approach for reducing length of hospital stay for patients with femur fracture in a University Hospital. Int J Environ Res Public Health. 2021;18:2843. https://doi.org/10.3390/ijerph18062843 .

Scala A, Loperto I, Carrano R, Federico S, Triassi M, Improta G. Assessment of proteinuria level in nephrology patients using a machine learning approach. In: 2021 5th international conference on medical and health informatics (ICMHI 2021). Association for Computing Machinery, New York, NY, USA, 13–16;2021. https://doi.org/10.1145/3472813.3472816 .

Download references

Acknowledgements

The authors thank the organizers of the 2021 International Symposium on Biomedical Engineering and Computational Biology (BECB 2021) for giving us the opportunity to publish the short version of this work. It was an important recognition that prompted us to continue and deepen our study.

Not applicable.

Author information

Authors and affiliations.

Department of Advanced Biomedical Sciences, University Hospital of Naples ‘Federico II’, Naples, Italy

Teresa Angela Trunfio

Department of Public Health, University of Naples “Federico II”, Naples, Italy

Arianna Scala & Giovanni Improta

University of Rome “La Sapienza”, Rome, Italy

Cristiana Giglio

“San Giovanni di Dio e Ruggi d’Aragona” University Hospital, Salerno, Italy

Giovanni Rossi & Anna Borrelli

Department of Electrical Engineering and Information Technology, University of Study of Naples “Federico II”, Naples, Italy

Maria Romano

Interdepartmental Center for Research in Healthcare Management and Innovation in Healthcare (CIRMIS), University of Naples “Federico II”, Naples, Italy

Giovanni Improta

Contributions

Conceptualization, A.B., G.R., M.R. and G.I.; methodology, A.S., C.G. and T.A.T.; validation, A.S. and T.A.T.; formal analysis, A.S. and T.A.T.; investigation, A.S., C.G. and T.A.T.; resources, A.B., M.R. and G.I.; data curation, A.L., A.S., C.G. and T.A.T.; writing (original draft preparation), A.S. and T.A.T.; writing (review and editing), A.B., M.R. and G.I.; visualization, A.S. and T.A.T.; supervision, A.B., M.R. and G.I.; project administration, A.B., M.R. and G.I. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Arianna Scala .

Ethics declarations

Ethics approval and consent to participate.

In compliance with the Declaration of Helsinki and with the Italian Legislative Decree 211/2003, Implementation of the 2001/20/CE directive, since no patients/children were involved in the study, the signed informed consent form and the ethical approval are not mandatory for this type of study. Furthermore, in compliance with the regulations of the Italian National Institute of Health, our study is not reported among those needing assessment by the Ethical Committee of the Italian National Institute of Health. The hospital management authorised us to access and use the database, and the hospital's medical director is listed as an author.

Consent to publication

Competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Trunfio, T.A., Scala, A., Giglio, C. et al. Multiple regression model to analyze the total LOS for patients undergoing laparoscopic appendectomy. BMC Med Inform Decis Mak 22 , 141 (2022). https://doi.org/10.1186/s12911-022-01884-9

Received : 28 December 2021

Accepted : 16 May 2022

Published : 24 May 2022

DOI : https://doi.org/10.1186/s12911-022-01884-9

  • Appendectomy
  • Length of stay
  • Public health

BMC Medical Informatics and Decision Making

ISSN: 1472-6947

  • Front Physiol

Estimation of Health-Related Physical Fitness Using Multiple Linear Regression in Korean Adults: National Fitness Award 2015–2019

Sung-Woo Kim

1 Physical Activity and Performance Institute, Konkuk University, Seoul City, South Korea

Hun-Young Park

2 Department of Sports Medicine and Science, Graduate School, Konkuk University, Seoul City, South Korea

Hoeryong Jung

3 Department of Mechanical Engineering, Konkuk University, Seoul City, South Korea

4 Department of Physical Education, Konkuk University, Seoul City, South Korea

Associated Data

The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

Continuous health care and the measurement of health-related physical fitness (HRPF) are necessary for the prevention of chronic diseases; however, HRPF measurements, including laboratory methods, may not be practical for large populations owing to constraints such as time, cost, and the requirement for qualified technicians. This study aimed to develop a multiple linear regression model to estimate the HRPF of Korean adults using easy-to-measure independent variables, such as gender, age, body mass index, and percent body fat. The National Fitness Award datasets of South Korea were used in this analysis. The participants were aged 19–64 years; the model-development sample comprised 319,643 adults (147,600 males and 172,043 females). HRPF included hand grip strength (HGS), flexibility (sit and reach), muscular endurance (sit-ups), and cardiorespiratory fitness (estimated VO₂max). An estimation multiple linear regression model was developed using the stepwise technique. Outliers in the multiple regression model were identified and removed when the absolute value of the studentized residual was ≥2. In the regression model, the coefficients of determination for HGS (adjusted R²: 0.870, P < 0.001), muscular endurance (adjusted R²: 0.751, P < 0.001), and cardiorespiratory fitness (adjusted R²: 0.885, P < 0.001) were high. However, the coefficient of determination for flexibility was low (adjusted R²: 0.298, P < 0.001). Our findings suggest that easy-to-measure independent variables can predict HGS, muscular endurance, and cardiorespiratory fitness in adults. The prediction equations will allow coaches, athletes, healthcare professionals, researchers, and the general public to better estimate the expected HRPF.

Introduction

Physical fitness is defined as a physiological state of wellbeing in which one can perform daily activities without strain, or that provides the basis for exercise performance. Health-related physical fitness (HRPF) includes components related to a health condition, such as musculoskeletal and cardiorespiratory fitness (CRF; Liguori and American College of Sports Medicine, 2020 ).

Health-related physical fitness and physical activity (PA) level are often used together, with physical fitness generally considered a more accurate measurement of PA level than self-reported assessments ( Williams, 2001 ). PA involves body movements caused by skeletal muscle contractions that increase energy consumption beyond the basic level ( Meredith and Welk, 2010 ; Liguori and American College of Sports Medicine, 2020 ). Systematic research on the association between PA and health conditions began six decades ago, and since then, the scientific literature has confirmed the relationship between these two areas ( Liguori and American College of Sports Medicine, 2020 ). Physical fitness was reported to be similar to PA in terms of its association with morbidity and mortality ( Blair and Brodney, 1999 ; Erikssen, 2001 ). However, physical fitness predicts health outcomes more strongly than PA ( Blair et al., 2001 ; Williams, 2001 ; Myers et al., 2004 ). Previous studies have shown at least a 50% decrease in mortality among individuals with a high physical fitness level compared to those with a low physical fitness level ( Myers et al., 2004 ). In addition to serving as a prognostic and diagnostic health indicator in clinical settings, CRF has been used as an indicator of regular exercise ( Lin et al., 2015 ). Warburton et al. reported that the physiological functions of the human body and HRPF continuously decrease with aging, leading to an increased risk for chronic diseases ( Warburton et al., 2006 ). Among the HRPF components, the CRF index’s maximal oxygen uptake decreases by about 3–6% due to aging ( Fleg et al., 2005 ). High levels of HRPF maintained from adulthood can reduce musculoskeletal, cardiovascular, and metabolic diseases such as osteoporosis, sarcopenia, hypertension, and diabetes ( Carnethon et al., 2003 ; Katzmarzyk et al., 2004 ; Barry et al., 2014 ; Kim et al., 2019b ). 
HRPF is an indirect indicator of health, for which continuous care is important. Taken together, these findings establish the need to include HRPF testing in health condition monitoring systems ( Ortega et al., 2008 ). Furthermore, the World Health Organization suggested that regular physical fitness and PA testing should be treated as a public health priority ( World Health Organization [WHO], 2010 ). Preventing chronic diseases requires continuous healthcare, which in turn requires the evaluation of HRPF. However, measurements of HRPF are often not practical or feasible to perform in daily life. Laboratory methods can measure physical fitness accurately, but may not be feasible for entire populations owing to cost, time constraints, and the need for qualified technicians and sophisticated devices.

The American College of Sports Medicine suggested that physical health is a measurable result of an individual’s PA and exercise habits, which is why many healthcare providers value the accurate and precise measurement of HRPF ( Liguori and American College of Sports Medicine, 2020 ). Common HRPF tests include the isometric hand grip strength (HGS) test for measuring muscle strength ( Bäckman et al., 1995 ), the sit and reach test for flexibility ( Mier, 2011 ), the sit-up test for abdominal muscular endurance ( Chen et al., 2020b ), and the graded exercise test for cardiorespiratory endurance ( Beltz et al., 2016 ; Kim et al., 2019b ). The association between HRPF and health conditions has been established in several studies ( Mendes et al., 2016 ; Chrismas et al., 2019 ; Chen et al., 2020a ). Recently, technological advances in health care and sports science have provided coaches, athletes, healthcare professionals, and researchers with efficient, reliable, and economical means to record health-related and exercise performance data ( Seshadri et al., 2017 ; Aroganam et al., 2019 ; Kim et al., 2019a ; Ray et al., 2019 ). The connected gains of novel analytical techniques, portable and reliable devices, and comprehensive software programs suggest that research on health promotion will increase in the future ( Loncar-Turukalo et al., 2019 ). Several predictive equations have been developed to estimate HRPF to increase utility for field-based research ( Esco et al., 2008 ; Shenoy et al., 2012 ; Lopes et al., 2018 ; Zaccagni et al., 2020 ). These previous studies generally linked HRPF parameters to laboratory evaluations. However, there were differences in the equation’s estimation reliability due to sample size, the number of independent variables, differences in measurement methods, and statistical analysis methods.

Therefore, our study aimed to develop a multiple linear regression model to predict HRPF parameters (e.g., HGS, flexibility, muscular endurance, and CRF) using easy-to-measure independent variables [e.g., gender, age, body mass index (BMI), and percent body fat] in Korean adults.

Materials and Methods

The National Fitness Award (NFA) datasets of South Korea were used in this analysis. The NFA is a nationwide test conducted at 75 sites that assesses the physical fitness of the general population in South Korea. This study included males and females (age: 19–64 years) who participated in the NFA from 2015 to 2019. From a total of 457,942 adults, we excluded participants who had no data on the independent variables ( n = 640) or no data on the HRPF parameters ( n = 669). A total of 456,633 adults (male: n = 210,613, female: n = 246,020) were therefore included in the analysis. The data for males and females were randomly divided in a ratio of 7:3 using Bernoulli trials. Approximately 70% of the data (total: n = 319,643, male: n = 147,600, female: n = 172,043) were used to develop the HRPF estimation formula with gender, age, BMI, and percent body fat, and approximately 30% of the data (total: n = 136,990, male: n = 63,013, female: n = 73,977) were used for the validity test. A power analysis was performed using G*Power 3.1.9.2 (Franz Faul, University of Kiel, Kiel, Germany) with two tails, an H1 ρ² of 0.3, an H0 ρ² of 0, a significance level of 0.05 (α = 0.05), a power of 0.9, and four predictors for all statistical tests. G*Power indicated that a sample of 51 participants would provide sufficient power for this study. The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Institutional Review Board of Konkuk University (7001355-202101-E-132). All individuals provided informed consent before enrollment. The population characteristics are presented in Table 1 .
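The 7:3 division by Bernoulli trials can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `bernoulli_split`, the seed, and the stand-in records are assumptions. Each record is assigned independently to the development set with probability 0.7, so the realized proportions are only approximately 70% and 30%.

```python
import random

def bernoulli_split(records, p_train=0.7, seed=0):
    """Assign each record to the development set with probability p_train.

    Every record is an independent Bernoulli trial, so the split sizes
    are approximately (not exactly) p_train and 1 - p_train.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    train, holdout = [], []
    for rec in records:
        if rng.random() < p_train:
            train.append(rec)
        else:
            holdout.append(rec)
    return train, holdout

# Hypothetical stand-in for participant rows.
records = list(range(10_000))
train, holdout = bernoulli_split(records)
print(len(train), len(holdout))
```

Applying the function separately to the male and female records would keep the sex ratio similar in both subsets.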

Table 1. Characteristics of the study population.

Measurement of Independent Variables

Height was measured to the nearest 0.1 cm using a stadiometer (Seca, Seca Corporation, Columbia, MD, United States). Body weight and percent body fat were measured using bioelectrical impedance analysis equipment (Inbody 720, Inbody, Seoul, Korea) ( Jeong et al., 2020 ). BMI was calculated by dividing body weight (kg) by height squared (m²).

Health-Related Physical Fitness Parameters

All HRPF parameters were measured by certified health and physical fitness instructors. The HRPF assessment for adults included HGS, flexibility (sit and reach), muscular endurance (sit-ups), and CRF (estimated VO₂max). Descriptions of the tests are as follows:

HGS (kg): Isometric muscle strength was assessed using a hand dynamometer (GRIP-D 5101, Takei, Niigata, Japan). Participants held the dynamometer with their preferred hand and squeezed it as forcefully as possible. All participants were tested twice, and the best result was recorded to the nearest 0.1 kg.

Sit-and-reach (cm): The participants sat on a mat and placed their feet in front of the measurement board with their legs fully extended. Participants were directed to gradually reach forward with both hands overlapped and push the bar as far as possible, holding this position for approximately 3 s. The best score was recorded after two trials and recorded to the nearest 0.1 cm.

Sit-ups (number of times): The participants lay on a mat with their knees bent at 90° and their feet held down by a partner. After being instructed to begin, they raised their upper body until their elbows touched the knees, and then returned to the initial position where both shoulders were in contact with the mat. Their hands were required to remain placed crosswise on the chest during the test. The total number of accurately performed and complete sit-ups was recorded.

Estimated VO₂max (ml/kg/min): A graded exercise treadmill test with the Bruce protocol ( Bruce et al., 1973 ) was applied to estimate VO₂max. All participants began walking at a speed of 2.7 km/h, at an inclination of 10%. The speed was increased by 1.3–1.4 km/h at 3 min intervals, and the incline was increased by 2% with each stage. The graded exercise test was performed on a treadmill (TM55 treadmill, Quinton Cardiology Systems, Inc., Seattle, WA, United States). Heart rate was measured using a heart rate monitor (Quinton Q-Stress, Quinton Cardiology Systems, Inc., Bothell, WA, United States). The participants were expected to reach three of the following criteria: (1) heart rate reserve >85%; (2) heart rate did not increase even when the stage increased; (3) rating of perceived exertion >17 (range: 6–20); (4) request to stop by the participant. The VO₂max was calculated using the Bruce formula: VO₂max = 6.70 − 2.82 × gender (1: male, 2: female) + 0.056 × exercise maintaining time (s) ( Bruce et al., 1973 ).
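As a worked example of the Bruce formula quoted above, the short sketch below evaluates it for a given gender code and exercise time. The function name `estimated_vo2max` and the 600 s example duration are illustrative assumptions; the coefficients are those stated in the text.

```python
def estimated_vo2max(gender_code: int, exercise_time_s: float) -> float:
    """Estimated VO2max (ml/kg/min) from the formula quoted in the text.

    gender_code: 1 for male, 2 for female (coding used in the text).
    exercise_time_s: time the exercise was maintained, in seconds.
    """
    if gender_code not in (1, 2):
        raise ValueError("gender_code must be 1 (male) or 2 (female)")
    return 6.70 - 2.82 * gender_code + 0.056 * exercise_time_s

# Hypothetical participants who both maintained the test for 600 s.
male_value = estimated_vo2max(1, 600)    # 6.70 - 2.82 + 33.6
female_value = estimated_vo2max(2, 600)  # 6.70 - 5.64 + 33.6
print(round(male_value, 2), round(female_value, 2))
```

For equal exercise times, the formula gives a female estimate that is 2.82 ml/kg/min lower than the male estimate.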

Statistical Analysis

The mean and standard deviation were calculated for all measured parameters. The normality of distribution of all outcome variables was verified using the Kolmogorov–Smirnov test. In the multiple linear regression analysis, the β-value (the regression coefficient) was used to verify whether the independent variables had explanatory power ( Park et al., 2020 ). We used the stepwise mode of regression analysis, which is indicated when multiple independent variables are taken as predictors ( Shepperd and MacDonell, 2012 ; Bardsiri et al., 2014 ). The stepwise regression technique aims to maximize explanatory power with a minimum number of independent variables. Multiple linear regression analysis with the stepwise technique was used to predict the HRPF parameters (HGS, flexibility, muscular endurance, and CRF) from the independent variables (e.g., gender, age, body mass index, and percent body fat). In addition, we checked the basic assumptions of the regression model: linearity, independence, autocorrelation, homoscedasticity, continuity, normality, and outliers. Outliers in the multiple regression model were identified and removed when the absolute value of the studentized residual (SRE) was ≥2. The validity of the regression model was tested using approximately 30% of the total data, which had already been set aside through the Bernoulli trials and were not included in the development of the regression model. In the validation test, the predicted values of the HRPF parameters were calculated using the regression equations, and the mean error and standard error of estimation (SEE) were calculated using formulas 1 and 2. Two-tailed Pearson correlation analysis was performed to estimate the relationships between measured and predicted HRPF parameters. The Statistical Package for the Social Sciences (SPSS) version 25.0 (IBM Corporation, Armonk, NY, United States) was used for analysis, and the level of significance was set at 0.05.
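The outlier rule (remove observations whose absolute studentized residual is ≥2) can be illustrated with a dependency-free sketch. To keep it short, it is restricted to a single predictor, whereas the study used several; the internally studentized form of the residual, the function names, and the synthetic data are all assumptions made for illustration.

```python
import math
import random

def studentized_residuals(x, y):
    """Internally studentized residuals for simple linear regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation with n - 2 degrees of freedom.
    sigma = math.sqrt(sum(e * e for e in resid) / (n - 2))
    # Leverage of each observation in a one-predictor model.
    leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    return [e / (sigma * math.sqrt(1 - h)) for e, h in zip(resid, leverage)]

def outlier_mask(x, y, threshold=2.0):
    """True for observations to keep (|studentized residual| < threshold)."""
    return [abs(r) < threshold for r in studentized_residuals(x, y)]

# Synthetic data (hypothetical): linear trend plus noise, one gross outlier.
rng = random.Random(0)
x = [rng.uniform(18, 35) for _ in range(300)]
y = [10 + 1.0 * xi + rng.gauss(0, 1) for xi in x]
y[0] += 25  # inject an obvious outlier
keep = outlier_mask(x, y)
x_clean = [xi for xi, k in zip(x, keep) if k]
y_clean = [yi for yi, k in zip(y, keep) if k]
print(len(x), len(x_clean))
```

After screening, the model would be refitted on `x_clean` and `y_clean`. With several predictors the same rule applies, but the leverages come from the diagonal of the hat matrix.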

Formula 1. The calculation formula for the mean error

Formula 2. The calculation formula for the standard errors of estimation.
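For orientation, the sketch below computes a mean percentage error and a standard error of estimation under commonly used definitions. These definitions are assumptions and may differ from formulas 1 and 2: the mean error is taken as the average of (predicted − measured) / measured × 100, and SEE as the square root of the summed squared differences divided by n − 2. The function names and data are hypothetical.

```python
import math

def mean_error_percent(measured, predicted):
    """Average signed percentage error (assumed definition).

    Pairs with a measured value of zero are skipped to avoid division by zero.
    """
    pairs = [(m, p) for m, p in zip(measured, predicted) if m != 0]
    return sum((p - m) / m * 100 for m, p in pairs) / len(pairs)

def standard_error_of_estimate(measured, predicted):
    """SEE with an n - 2 denominator (assumed definition)."""
    n = len(measured)
    sse = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(sse / (n - 2))

# Hypothetical measured and predicted hand grip strengths (kg).
measured = [30.0, 40.0, 50.0, 20.0]
predicted = [33.0, 38.0, 50.0, 22.0]
me = mean_error_percent(measured, predicted)
see = standard_error_of_estimate(measured, predicted)
print(round(me, 2), round(see, 3))
```

Under this sign convention, a negative mean error means that the equation underestimates the measured values on average.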

For each multiple regression model developed, the F-test was used to validate the significance of the model. The multiple regression analyses showed that the regression coefficients for the selected independent variables were statistically significant. Multiple regression analyses for each model included coefficients of determination ( R² ), adjusted coefficients of determination (adjusted R² ), and SEE. The correlations between the independent variables and HRPF parameters are shown in Table 2 .

Table 2. Correlation coefficients between independent variables and HRPF parameters for the estimating regression model.

Performance Evaluation of Regression Models and Regression Equations

The detailed results of the multiple regression analysis using HRPF parameters are shown in Table 3 . The estimated explanatory power of the HGS regression model was 71.0%, and SEE was 5.60 kg ( F = 194,597.062, P < 0.001). The explanatory power of the sit and reach regression model was 15.5%, and SEE was 8.60 cm ( F = 14,568.080, P < 0.001). The explanatory power of the sit-ups regression model was 55.5%, and SEE was 10.63 repetitions ( F = 98,806.560, P < 0.001). In addition, the explanatory power of the estimated VO₂max regression model was 72.0%, and SEE was 3.56 ml/kg/min ( F = 131,291.452, P < 0.001).

Table 3. Estimated regression equations predicting HRPF parameters.
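The explanatory power, adjusted R², and F statistic reported for a model are linked by standard identities for ordinary least squares with an intercept. The sketch below states them; the sample size and number of predictors in the example are illustrative assumptions and are not taken from the tables.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for n observations and p predictors (intercept included)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

def f_statistic(r2: float, n: int, p: int) -> float:
    """Overall F statistic of the regression, with (p, n - p - 1) df."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))

# Hypothetical model: R^2 = 0.71, 1,000 observations, 4 predictors.
adj = adjusted_r2(0.71, n=1000, p=4)
f_value = f_statistic(0.71, n=1000, p=4)
print(round(adj, 4), round(f_value, 1))
```

For a fixed R², the F statistic grows roughly in proportion to the sample size, which is why very large samples yield very large F values even when the explanatory power is modest.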

Performance Evaluation of Regression Models and Regression Equations Without Outlier Data

Table 4 shows the results of the multiple regression analysis using HRPF parameters without outlier data. The explanatory power of the HGS regression model (SRE 27: n = 253,339) was 87.0%, and SEE was 3.27 kg ( F = 422,009.836, P < 0.001). The explanatory power of the sit and reach regression model (SRE 31: n = 263,737) was 29.8%, and SEE was 5.64 cm ( F = 28,019.748, P < 0.001). The explanatory power of the sit-ups regression model (SRE 34: n = 268,182) was 75.1%, and SEE was 7.44 repetitions ( F = 202,721.241, P < 0.001). In addition, the explanatory power of the estimated VO₂max regression model (SRE 44: n = 151,314) was 88.5%, and SEE was 1.77 ml/kg/min ( F = 290,332.119, P < 0.001).

Table 4. Estimated regression equations predicting HRPF parameters without outlier data.

Regression Model Validity

The validity of the developed regression models was assessed using data not included in the multiple regression analyses. Across the regression models of HRPF parameters, the mean error ranged from −38.13 to 3.36% (HGS: −4.33%, sit and reach: −14.92%, sit-ups: −38.13%, and estimated VO₂max: 3.36%), and SEE was higher than that of the developed regression model ( Table 5 ).

Table 5. Validity of estimating the regression model.

Relationship Between Measured and Predicted HRPF Parameters

Table 6 displays the relationship between the measured and predicted HRPF parameters. Measured and predicted values were positively correlated for HGS ( r = 0.841, P < 0.01), sit and reach ( r = 0.391, P < 0.01), sit-ups ( r = 0.746, P < 0.01), and estimated VO₂max ( r = 0.848, P < 0.01), as seen in Figure 1 .

Table 6. Relationship between measured and predicted HRPF parameters.

Figure 1. Relationship between measured or estimated, and predicted HRPF. (A) Hand grip strength. (B) Sit and reach. (C) Sit-ups. (D) VO₂max. Significant correlation between measured or estimated and predicted variables, **P < 0.01.
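The Pearson correlation between measured and predicted values can be computed as in the sketch below. The function name `pearson_r` and the sample values are hypothetical and serve only to show the calculation.

```python
import math

def pearson_r(a, b):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical measured and predicted values for one HRPF parameter.
measured = [30.0, 40.0, 50.0, 20.0]
predicted = [33.0, 38.0, 50.0, 22.0]
r = pearson_r(measured, predicted)
print(round(r, 3))
```

A high r shows that predictions rank participants similarly to the measurements, but it does not capture systematic bias, which is why the mean error and SEE are reported separately.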

Over the years, the components of HRPF have been established in various ways in scientific research ( Meredith and Welk, 2010 ). Previous studies describe HRPF as having a multidimensional structure despite the many different definitions ( Meredith and Welk, 2010 ). Some European studies consider HRPF to include body composition, musculoskeletal fitness, CRF, and skill-related fitness (agility, speed, and coordination) ( Artero et al., 2011 ; Ruiz et al., 2011 ; Secchi et al., 2014 ). Other studies consider only body composition, CRF, musculoskeletal fitness, and flexibility ( Pillsbury et al., 2013 ); or body composition, CRF, muscle strength, and flexibility as components of HRPF ( Castillo-Garzón et al., 2006 ). However, the American College of Sports Medicine recommends five factors: body composition, flexibility, muscular strength, muscular endurance, and CRF ( Liguori and American College of Sports Medicine, 2020 ). Therefore, in this study, multiple regression analysis using the stepwise technique was used to predict the HRPF parameters of the American College of Sports Medicine criteria (HGS, flexibility, muscular endurance, and CRF) from independent variables (e.g., gender, age, body mass index, and percent body fat).

Many researchers have conducted studies to evaluate health conditions and exercise performance using HRPF, while assuming that the HRPF parameter is a reliable healthcare index. For healthcare, the development of tools or equipment that can easily measure and evaluate HRPF in daily life will be useful. Previous studies developed equations with relatively small sample sizes or samples with limited age ranges (Esco et al., 2008; Shenoy et al., 2012; Lopes et al., 2018; Zaccagni et al., 2020). This study aimed to develop a multiple regression model for estimating the HRPF parameters in Korean adults using easy-to-measure independent variables. Before performing multiple regressions to estimate HRPF parameters, it is essential to eliminate outliers because they increase predictive errors. The absolute value of the studentized residual was used to eliminate outliers in this study. The coefficient of determination of the HRPF parameters in the developed multiple regression models was high, except for flexibility. The mean explanatory power of the sit and reach regression model in our study was 29.8%.
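The outlier screening described above can be illustrated with a small sketch. The text does not state the cutoff or the software that was used, so everything here is an assumption made for illustration: the threshold of 3, the function names, and the simplification to a single predictor (for which the leverage has a simple closed form). The residuals computed are internally studentized residuals.

```python
import math


def studentized_residuals(x, y):
    """Internally studentized residuals for a simple (one-predictor) OLS fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    # Residual standard error with n - 2 degrees of freedom
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    # Leverage of each observation in simple linear regression
    leverage = [1 / n + (a - mean_x) ** 2 / sxx for a in x]
    return [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]


def drop_outliers(x, y, threshold=3.0):
    """Keep observations whose |studentized residual| <= threshold (assumed cutoff)."""
    r = studentized_residuals(x, y)
    keep = [i for i, value in enumerate(r) if abs(value) <= threshold]
    return [x[i] for i in keep], [y[i] for i in keep]


# Hypothetical data: a near-perfect line with one grossly deviating point at x = 10
x = list(range(1, 21))
y = [2 * a + 1 + (0.1 if a % 2 else -0.1) for a in x]
y[9] += 15
x_clean, y_clean = drop_outliers(x, y)
print(len(x), len(x_clean))
```

With several predictors the same idea applies, but the leverages come from the diagonal of the hat matrix rather than from the closed-form expression used here.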

The HGS used to evaluate total muscle strength measures the ability of hand muscles to produce force (tension) using a hand dynamometer (Mitsionis et al., 2009). The relevance of HGS measurements continues to grow due to their clinical and epidemiological application for sarcopenia diagnosis, as suggested by the European Working Group (Cruz-Jentoft et al., 2010), or as a nutrition status indication and their association with morbidity and mortality (Norman et al., 2011). HGS has been studied in relation with various anthropometric factors (Alahmari et al., 2017; Eidson et al., 2017; Lopes et al., 2018; Zaccagni et al., 2020). In the current study, the mean explanatory power of the HGS regression model (37.138 − (10.190 × gender [male = 1; female = 2]) + (0.988 × BMI) − (0.457 × percent body fat) − (0.042 × age)) was 87.0% (adjusted R²). Alahmari et al. (2017) showed that three variables (i.e., age, hand length, and forearm circumference) predicted 42.7% (adjusted R²) of what constitutes the HGS of healthy adult males (aged: 20–74 years; n = 116) in Saudi Arabia. Furthermore, Zaccagni et al. (2020) reported that the independent variables sex, upper arm muscle area, arm fat index, fat mass, and fat free mass accounted for 74.6% (adjusted R²) of the variance of HGS in young adults (aged: 18–30 years; total: n = 544; male: n = 356; female: n = 188). Lopes et al. (2018) showed that 71% (adjusted R²) of the variability in the dominant HGS could be explained by gender, forearm circumference, and hand length (−15.490 + (10.787 × gender [male = 1; female = 0]) + (0.558 × forearm circumference) + (1.763 × hand length)). In addition, 70% (adjusted R²) of the variability in the nondominant HGS was explained by gender and hand length (−9.887 + (12.832 × gender [male = 1; female = 0]) + (2.028 × hand length)) in young adult and middle-aged participants (aged: 20–60 years; total: n = 80; male: n = 40; female: n = 40).
Our results suggest that the regression model developed in this study has higher explanatory power than those of previous studies and is more straightforward to apply.

The two most important trunk muscle abilities have been presented as trunk muscle strength and muscular endurance in both the athletic and general populations (Granacher et al., 2013). Trunk muscle strength and muscular endurance testing in clinical fields have been important in injury rehabilitation and prevention programs (Jackson et al., 1998; del Pozo-Cruz et al., 2013). Sit-up tests are known to evaluate strength and muscular endurance in the abdomen (Morrow et al., 2015; Liguori and American College of Sports Medicine, 2020). Esco et al. (2008) showed that 63.7% (R²) of the variability in sit-ups could be explained by height, push-ups, skinfolds at the thigh, and skinfolds at the subscapularis (1.651 + (0.368 × push-ups) + (0.495 × height) − (0.277 × skinfolds at the thigh) − (0.336 × skinfolds at the subscapularis)) in healthy adults (aged: 18–48 years; total: n = 100; male: n = 40; female: n = 60). In our study, the mean explanatory power of the sit-ups regression model (62.443 − (1.015 × percent body fat) − (0.392 × age) + (0.783 × BMI) − (5.287 × gender [male = 1; female = 2])) was 75.1% (adjusted R²).

Cardiorespiratory fitness is an essential component of health and physical fitness, and is affected by the respiratory, cardiovascular, and skeletal muscle systems (Liguori and American College of Sports Medicine, 2020). The gold standard measurement of CRF is VO2max when performing a maximum graded exercise test (Liguori and American College of Sports Medicine, 2020). However, while VO2max is the most accurate way to evaluate CRF, testing requires expensive equipment, space to accommodate equipment, and trained personnel. Previous studies developed a method to predict VO2max without exercise using multiple regression analysis (Bradshaw et al., 2005; Shenoy et al., 2012). The non-exercise regression equations provide convenient estimates of CRF without performing maximum or submaximal exercise tests (Bradshaw et al., 2005). Shenoy et al. (2012) showed that 79.9% (adjusted R²) of the variability in VO2max could be explained by gender, perceived functional ability, and body surface area (−1.541 + (1.096 × gender [male = 1; female = 0]) + (0.081 × perceived functional ability) + (1.084 × body surface area)) in healthy young Indian adults (aged: 18–27 years; total: n = 120; male: n = 60; female: n = 60). Bradshaw et al. (2005) showed that 87% (R²) of the variability in VO2max could be explained by gender, age, BMI, perceived functional ability, and PA rating (48.073 + (6.178 × gender [male = 1; female = 0]) − (0.246 × age) − (0.619 × BMI) + (0.712 × perceived functional ability) + (0.671 × PA rating)) in adults (aged: 18–65 years; total: n = 100; male: n = 50; female: n = 50). In our study, the mean explanatory power of the estimated VO2max regression model (61.068 − (0.197 × percent body fat) − (5.920 × gender [male = 1; female = 2]) − (0.133 × age) − (0.305 × BMI)) was 88.5% (adjusted R²).
Accordingly, we obtained a similar or higher coefficient of determination than previous studies while using independent variables that are easier to measure and a larger sample size. Therefore, we consider the regression models in this study to be both straightforward and accurate.
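The three regression equations reported above for this study can be written as small functions. The coefficients and the gender coding (male = 1, female = 2) are copied from the text. The function names, the argument names, and the example participant are hypothetical, and the sketch assumes inputs in the same units the study used.

```python
def predict_hgs(gender, bmi, percent_body_fat, age):
    """Hand grip strength from the reported model (gender: male = 1, female = 2)."""
    return (37.138 - 10.190 * gender + 0.988 * bmi
            - 0.457 * percent_body_fat - 0.042 * age)


def predict_sit_ups(gender, bmi, percent_body_fat, age):
    """Sit-ups from the reported model (gender: male = 1, female = 2)."""
    return (62.443 - 1.015 * percent_body_fat - 0.392 * age
            + 0.783 * bmi - 5.287 * gender)


def predict_vo2max(gender, bmi, percent_body_fat, age):
    """Estimated VO2max from the reported model (gender: male = 1, female = 2)."""
    return (61.068 - 0.197 * percent_body_fat - 5.920 * gender
            - 0.133 * age - 0.305 * bmi)


# Hypothetical participant: male, BMI 24, 20% body fat, 30 years old
person = dict(gender=1, bmi=24.0, percent_body_fat=20.0, age=30)
print(round(predict_hgs(**person), 3))
print(round(predict_sit_ups(**person), 3))
print(round(predict_vo2max(**person), 3))
```

Because the models were fitted on adults aged 19 to 64, inputs outside that age range fall outside the data the equations were derived from.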

Limitations

This study had some limitations. The role of HRPF and nutrition in decreasing the progression of chronic diseases is growing more important (Gil et al., 2015). Nutrition was described as a major modifiable behavior, and HRPF has also been defined as an essential health-related indication (Camões and Lopes, 2008). Previous studies have shown that improvements in HRPF and nutritional factors could prevent functional limitations related to aging and lead to healthier and more independent aging (Strandberg et al., 2017; Wickramasinghe et al., 2020). However, the association between nutrition and HRPF parameters could not be evaluated because the NFA database did not provide nutrition information. In addition, we only included adults between the ages of 19 and 64 in our analysis. Therefore, the multiple regression equation developed in the present study does not apply to older adults. In the future, a study developing a multiple regression equation will be necessary to predict the functional physical fitness of older adults.

This study demonstrated that the variability of HGS, muscular endurance, and CRF in healthy adults could be explained by gender, age, BMI, and percent body fat. A multi-regression equation could be developed based on these demographic and anthropometric variables. Since this multi-regression equation requires only a simple parameter measurement, it could be time-efficient, inexpensive, and realistic for large groups in clinical practice. The prediction equation will allow coaches, athletes, healthcare professionals, researchers, and the general public to better estimate the expected HRPF in order to improve the data interpretation.

Data Availability Statement

Ethics statement.

The studies involving human participants were reviewed and approved by the Institutional Review Board of Konkuk University. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

Author Contributions

S-WK, HJ, JL, and H-YP: conception and study design. S-WK, HJ, and H-YP: statistical analysis. H-YP: investigation. S-WK and H-YP: data interpretation and writing–review and editing. S-WK: writing–original draft preparation. KL: supervision. All authors have read and approved the final manuscript.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Funding. This research was supported by the Sports Promoting Fund of the Korea Sports Promotion Foundation (KSPO) from the Ministry of Culture, Sports and Tourism and Konkuk University (KU) Research Professor Program.

  • Alahmari K. A., Silvian S. P., Reddy R. S., Kakaraparthi V. N., Ahmad I., Alam M. M. (2017). Hand grip strength determination for healthy males in Saudi Arabia: a study of the relationship with age, body mass index, hand length and forearm circumference using a hand-held dynamometer. J. Int. Med. Res. 45 540–548. 10.1177/0300060516688976
  • Aroganam G., Manivannan N., Harrison D. (2019). Review on wearable technology sensors used in consumer sport applications. Sensors (Basel) 19:1983. 10.3390/s19091983
  • Artero E. G., España-Romero V., Castro-Piñero J., Ortega F. B., Suni J., Castillo-Garzon M. J., et al. (2011). Reliability of field-based fitness tests in youth. Int. J. Sports Med. 32 159–169. 10.1055/s-0030-1268488
  • Bäckman E., Johansson V., Häger B., Sjöblom P., Henriksson K. G. (1995). Isometric muscle strength and muscular endurance in normal persons aged between 17 and 70 years. Scand. J. Rehabil. Med. 27 109–117.
  • Bardsiri V. K., Jawawi D. N. A., Hashim S. Z. M., Khatibi E. (2014). A flexible method to estimate the software development effort based on the classification of projects and localization of comparisons. Empiric. Softw. Eng. 19 857–884. 10.1007/s10664-013-9241-4
  • Barry V. W., Baruth M., Beets M. W., Durstine J. L., Liu J., Blair S. N. (2014). Fitness vs. fatness on all-cause mortality: a meta-analysis. Prog. Cardiovasc. Dis. 56 382–390. 10.1016/j.pcad.2013.09.002
  • Beltz N. M., Gibson A. L., Janot J. M., Kravitz L., Mermier C. M., Dalleck L. C. (2016). Graded exercise testing protocols for the determination of VO(2)max: historical perspectives, progress, and future considerations. J. Sports Med. (Hindawi Publ. Corp.) 2016:3968393. 10.1155/2016/3968393
  • Blair S. N., Brodney S. (1999). Effects of physical inactivity and obesity on morbidity and mortality: current evidence and research issues. Med. Sci. Sports Exerc. 31(Suppl 11) S646–S662. 10.1097/00005768-199911001-00025
  • Blair S. N., Cheng Y., Holder J. S. (2001). Is physical activity or physical fitness more important in defining health benefits? Med. Sci. Sports Exerc. 33(Suppl 6) S379–S399. 10.1097/00005768-200106001-00007
  • Bradshaw D. I., George J. D., Hyde A., LaMonte M. J., Vehrs P. R., Hager R. L., et al. (2005). An accurate VO2max nonexercise regression model for 18-65-year-old adults. Res. Q. Exerc. Sport 76 426–432. 10.1080/02701367.2005.10599315
  • Bruce R. A., Kusumi F., Hosmer D. (1973). Maximal oxygen intake and nomographic assessment of functional aerobic impairment in cardiovascular disease. Am. Heart J. 85 546–562. 10.1016/0002-8703(73)90502-4
  • Camões M., Lopes C. (2008). Dietary intake and different types of physical activity: full-day energy expenditure, occupational and leisure-time. Public Health Nutr. 11 841–848. 10.1017/s1368980007001309
  • Carnethon M. R., Gidding S. S., Nehgme R., Sidney S., Jacobs D. R., Jr., Liu K. (2003). Cardiorespiratory fitness in young adulthood and the development of cardiovascular disease risk factors. JAMA 290 3092–3100. 10.1001/jama.290.23.3092
  • Castillo-Garzón M. J., Ruiz J. R., Ortega F. B., Gutiérrez A. (2006). Anti-aging therapy through fitness enhancement. Clin. Interv. Aging 1 213–220. 10.2147/ciia.2006.1.3.213
  • Chen H. L., Lee P. F., Chang Y. C., Hsu F. S., Tseng C. Y., Hsieh X. Y., et al. (2020a). The association between physical fitness performance and subjective happiness among taiwanese adults. Int. J. Environ. Res. Public Health 17:3774. 10.3390/ijerph17113774
  • Chen P. H., Chen W., Wang C. W., Yang H. F., Huang W. T., Huang H. C., et al. (2020b). Association of physical fitness performance tests and anthropometric indices in taiwanese adults. Front. Physiol. 11:583692. 10.3389/fphys.2020.583692
  • Chrismas B. C. R., Majed L., Kneffel Z. (2019). Physical fitness and physical self-concept of male and female young adults in Qatar. PLoS One 14:e0223359. 10.1371/journal.pone.0223359
  • Cruz-Jentoft A. J., Baeyens J. P., Bauer J. M., Boirie Y., Cederholm T., Landi F., et al. (2010). Sarcopenia: European consensus on definition and diagnosis: report of the European working group on Sarcopenia in older people. Age Ageing 39 412–423. 10.1093/ageing/afq034
  • del Pozo-Cruz B., Gusi N., Adsuar J. C., del Pozo-Cruz J., Parraca J. A., Hernandez-Mocholí M. (2013). Musculoskeletal fitness and health-related quality of life characteristics among sedentary office workers affected by sub-acute, non-specific low back pain: a cross-sectional study. Physiotherapy 99 194–200. 10.1016/j.physio.2012.06.006
  • Eidson C. A., Jenkins G. R., Yuen H. K., Abernathy A. M., Brannon M. B., Pung A. R., et al. (2017). Investigation of the relationship between anthropometric measurements and maximal handgrip strength in young adults. Work 57 3–8. 10.3233/wor-172537
  • Erikssen G. (2001). Physical fitness and changes in mortality: the survival of the fittest. Sports Med. 31 571–576. 10.2165/00007256-200131080-00001
  • Esco M. R., Olson M. S., Williford H. (2008). Relationship of push-ups and sit-ups tests to selected anthropometric variables and performance results: a multiple regression study. J. Strength Cond. Res. 22 1862–1868. 10.1519/JSC.0b013e318181fd03
  • Fleg J. L., Morrell C. H., Bos A. G., Brant L. J., Talbot L. A., Wright J. G., et al. (2005). Accelerated longitudinal decline of aerobic capacity in healthy older adults. Circulation 112 674–682. 10.1161/circulationaha.105.545459
  • Gil Á., Martinez de Victoria E., Olza J. (2015). Indicators for the evaluation of diet quality. Nutr. Hosp. 31(Suppl 3) 128–144. 10.3305/nh.2015.31.sup3.8761
  • Granacher U., Gollhofer A., Hortobágyi T., Kressig R. W., Muehlbauer T. (2013). The importance of trunk muscle strength for balance, functional performance, and fall prevention in seniors: a systematic review. Sports Med. 43 627–641. 10.1007/s40279-013-0041-1
  • Jackson A. W., Morrow J. R., Jr., Brill P. A., Kohl H. W., III, Gordon N. F., Blair S. N. (1998). Relations of sit-up and sit-and-reach tests to low back pain in adults. J. Orthop. Sports Phys. Ther. 27 22–26. 10.2519/jospt.1998.27.1.22
  • Jeong D., Park S., Kim H., Kwon O. (2020). Association of carotenoids concentration in blood with physical performance in Korean adolescents: the 2018 National fitness award project. Nutrients 12:1821. 10.3390/nu12061821
  • Katzmarzyk P. T., Church T. S., Blair S. N. (2004). Cardiorespiratory fitness attenuates the effects of the metabolic syndrome on all-cause and cardiovascular disease mortality in men. Arch. Intern. Med. 164 1092–1097. 10.1001/archinte.164.10.1092
  • Kim J., Campbell A. S., de Ávila B. E., Wang J. (2019a). Wearable biosensors for healthcare monitoring. Nat. Biotechnol. 37 389–406. 10.1038/s41587-019-0045-y
  • Kim S. W., Jung S. W., Seo M. W., Park H. Y., Song J. K. (2019b). Effects of bone-specific physical activity on body composition, bone mineral density, and health-related physical fitness in middle-aged women. J. Exerc. Nutr. Biochem. 23 36–42. 10.20463/jenb.2019.0030
  • Liguori G., and American College of Sports Medicine (2020). ACSM’s Guidelines For Exercise Testing and Prescription. Baltimore, MD: Lippincott Williams & Wilkins.
  • Lin X., Zhang X., Guo J., Roberts C. K., McKenzie S., Wu W. C., et al. (2015). Effects of exercise training on cardiorespiratory fitness and biomarkers of cardiometabolic health: a systematic review and meta-analysis of randomized controlled trials. J. Am. Heart Assoc. 4:e002014. 10.1161/jaha.115.002014
  • Loncar-Turukalo T., Zdravevski E., Machado da Silva J., Chouvarda I., Trajkovik V. (2019). Literature on wearable technology for connected health: scoping review of research trends, advances, and barriers. J. Med. Internet Res. 21:e14017. 10.2196/14017
  • Lopes J., Grams S. T., da Silva E. F., de Medeiros L. A., de Brito C. M. M., Yamaguti W. P. (2018). Reference equations for handgrip strength: normative values in young adult and middle-aged subjects. Clin. Nutr. 37 914–918. 10.1016/j.clnu.2017.03.018
  • Mendes R., Sousa N., Themudo-Barata J., Reis V. (2016). Impact of a community-based exercise programme on physical fitness in middle-aged and older patients with type 2 diabetes. Gac. Sanit. 30 215–220. 10.1016/j.gaceta.2016.01.007
  • Meredith M. D., Welk G. (2010). Fitnessgram and Activitygram Test Administration Manual-Updated, 4th Edn. Champaign, IL: Human Kinetics.
  • Mier C. M. (2011). Accuracy and feasibility of video analysis for assessing hamstring flexibility and validity of the sit-and-reach test. Res. Q. Exerc. Sport 82 617–623. 10.1080/02701367.2011.10599798
  • Mitsionis G., Pakos E. E., Stafilas K. S., Paschos N., Papakostas T., Beris A. E. (2009). Normative data on hand grip strength in a Greek adult population. Int. Orthop. 33 713–717. 10.1007/s00264-008-0551-x
  • Morrow J. R., Jr., Mood D., Disch J., Kang M. (2015). Measurement and Evaluation in Human Performance, 5E. Champaign, IL: Human Kinetics.
  • Myers J., Kaykha A., George S., Abella J., Zaheer N., Lear S., et al. (2004). Fitness versus physical activity patterns in predicting mortality in men. Am. J. Med. 117 912–918. 10.1016/j.amjmed.2004.06.047
  • Norman K., Stobäus N., Gonzalez M. C., Schulzke J. D., Pirlich M. (2011). Hand grip strength: outcome predictor and marker of nutritional status. Clin. Nutr. 30 135–142. 10.1016/j.clnu.2010.09.010
  • Ortega F. B., Ruiz J. R., Castillo M. J., Sjöström M. (2008). Physical fitness in childhood and adolescence: a powerful marker of health. Int. J. Obes. (Lond.) 32 1–11. 10.1038/sj.ijo.0803774
  • Park H. Y., Jung W. S., Hwang H., Kim S. W., Kim J., Lim K. (2020). Predicting the resting metabolic rate of young and middle-aged healthy Korean adults: a preliminary study. Phys Act Nutr 24 9–13. 10.20463/pan.2020.0002
  • Pillsbury L., Oria M., Pate R. (2013). Fitness Measures and Health Outcomes in Youth. Washington, DC: The National Academies Press.
  • Ray T., Choi J., Reeder J., Lee S. P., Aranyosi A. J., Ghaffari R., et al. (2019). Soft, skin-interfaced wearable systems for sports science and analytics. Curr. Opin. Biomed. Eng. 9 47–56. 10.1016/j.cobme.2019.01.003
  • Ruiz J. R., Castro-Piñero J., España-Romero V., Artero E. G., Ortega F. B., Cuenca M. M., et al. (2011). Field-based fitness assessment in young people: the ALPHA health-related fitness test battery for children and adolescents. Br. J. Sports Med. 45 518–524. 10.1136/bjsm.2010.075341
  • Secchi J. D., García G. C., España-Romero V., Castro-Piñero J. (2014). Physical fitness and future cardiovascular risk in argentine children and adolescents: an introduction to the ALPHA test battery. Arch. Argent Pediatr. 112 132–140. 10.5546/aap.2014.132
  • Seshadri D. R., Drummond C., Craker J., Rowbottom J. R., Voos J. E. (2017). Wearable devices for sports: new integrated technologies allow coaches, physicians, and trainers to better understand the physical demands of athletes in real time. IEEE Pulse 8 38–43. 10.1109/mpul.2016.2627240
  • Shenoy S., Tyagi B., Sandhu J., Sengupta D. (2012). Development of non-exercise based VO2max prediction equation in college-aged participants in India. J. Sports Med. Phys. Fitness 52 465–473.
  • Shepperd M., MacDonell S. (2012). Evaluating prediction systems in software project estimation. Inform. Softw. Technol. 54 820–827. 10.1016/j.infsof.2011.12.008
  • Strandberg T., Levälahti E., Ngandu T., Solomon A., Kivipelto M., Lehtisalo J., et al. (2017). Health-related quality of life in a multidomain intervention trial to prevent cognitive decline (FINGER). Eur. Geriatr. Med. 8 164–167. 10.1016/j.eurger.2016.12.005
  • Warburton D. E., Nicol C. W., Bredin S. S. (2006). Health benefits of physical activity: the evidence. CMAJ 174 801–809. 10.1503/cmaj.051351
  • Wickramasinghe K., Mathers J. C., Wopereis S., Marsman D. S., Griffiths J. C. (2020). From lifespan to healthspan: the role of nutrition in healthy ageing. J. Nutr. Sci. 9:e33. 10.1017/jns.2020.26
  • Williams P. T. (2001). Physical fitness and activity as separate heart disease risk factors: a meta-analysis. Med. Sci. Sports Exerc. 33 754–761. 10.1097/00005768-200105000-00012
  • World Health Organization [WHO] (2010). Global Recommendations on Physical Activity for Health. Geneva: World Health Organization.
  • Zaccagni L., Toselli S., Bramanti B., Gualdi-Russo E., Mongillo J., Rinaldo N. (2020). Handgrip strength in young adults: association with anthropometric variables and laterality. Int. J. Environ. Res. Public Health 17:4273. 10.3390/ijerph17124273

LEARN STATISTICS EASILY

How to Report Results of Multiple Linear Regression in APA Style

You will learn how to report the results of multiple linear regression in APA style, including coefficients, significance levels, and assumption checks.

Introduction

Multiple linear regression is a fundamental statistical method to understand the relationship between one dependent variable and two or more independent variables. This approach allows researchers and analysts to predict the dependent variable’s outcome based on the independent variables’ values, providing insights into complex relationships within data sets. The power of multiple linear regression lies in its ability to control for various confounding factors simultaneously, making it an invaluable tool in fields ranging from social sciences to finance and health sciences.

Reporting the results of multiple linear regression analyses requires precision and adherence to established guidelines, such as those provided by the American Psychological Association (APA) style. The importance of reporting in APA style cannot be overstated, as it ensures clarity, uniformity, and comprehensiveness in research documentation. Proper reporting includes:

  • Detailed information about the regression model used.
  • The significance of the predictors.
  • The fit of the model.
  • Any assumptions or conditions that were tested.

Adhering to APA style enhances the readability and credibility of research findings, facilitating their interpretation and application by a broad audience.

This guide will equip you with the knowledge and skills to report multiple linear regression results in APA style, so that your research communicates its findings clearly.

  • Detail assumption checks, such as multicollinearity assessed with VIF scores.
  • Report the adjusted R-squared to express model fit.
  • Identify significant predictors with t-values and p-values in your regression model.
  • Include confidence intervals for a comprehensive understanding of predictor estimates.
  • Explain model diagnostics with residual plots for validity.

Step-by-Step Guide with Examples

1. Objective of Regression Analysis

Begin by clearly stating the purpose of your multiple linear regression (MLR) analysis. For example, you might explore how environmental factors (X1, X2, X3) predict plant growth (Y). Example: “This study aims to assess the impact of sunlight exposure (X1), water availability (X2), and soil quality (X3) on plant growth rate (Y).”

2. Sample Size and Power

Discuss the significance of your sample size. A larger sample provides greater power for a robust MLR analysis.  Example:  “With a sample size of 200 plants, we ensure sufficient power to detect significant predictors of growth, minimizing type II errors.”

*Because the power of the statistical test matters, calculating the sample size in advance is a crucial step: it determines how many observations are needed to detect the expected relationship.

3. Checking and Reporting Model Assumptions

  • Linearity: Verify that each independent variable’s relationship with the dependent variable is linear. Example: “Scatterplots of sunlight exposure, water availability, and soil quality against plant growth revealed linear trends.”
  • Normality of Residuals: Assess using the Shapiro-Wilk test. Example: “The Shapiro-Wilk test indicated no significant departure from normality in the residuals, W = .98, p = .15.”
  • Homoscedasticity: Evaluate with the Breusch-Pagan test. Example: “The Breusch-Pagan test showed no evidence of heteroscedasticity, χ² = 5.42, p = .14.”
  • Independence of Errors: Use the Durbin-Watson statistic. Example: “The Durbin-Watson statistic of 1.92 suggests no autocorrelation, indicating independent errors.”
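Of the checks listed above, the Durbin-Watson statistic is simple enough to compute by hand, so a minimal sketch may help show what the reported number means. The function name and the residual values below are hypothetical. The statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals given in observation order."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    # Squared differences between each residual and the one before it
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals from a fitted model
example_residuals = [0.4, -0.3, 0.1, -0.5, 0.6, -0.2, 0.3, -0.4]
print(round(durbin_watson(example_residuals), 2))
```

The statistic ranges from 0 to 4. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.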

4. Statistical Significance of the Regression Model

Present the F-statistic, its degrees of freedom, and its significance (p-value) to demonstrate the model’s overall fit. Example: “The model was significant, F(3, 196) = 12.57, p < .001, indicating that at least one predictor significantly affects plant growth.”

5. Coefficient of Determination

Report the adjusted R² to show the variance explained by the model. Example: “The model explains 62% of the variance in plant growth, with an adjusted R² of .62.”
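For readers who want to see how the adjusted R² and the overall F-statistic relate to the ordinary R², here is a minimal sketch for an ordinary least squares model with an intercept. The function names are hypothetical and the R² value in the example is illustrative, not taken from the examples above.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R² for a model with k predictors fitted to n observations."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


def overall_f(r_squared, n, k):
    """Overall F-statistic with (k, n - k - 1) degrees of freedom."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))


# Illustrative values: 200 observations, 3 predictors, R² = 0.45
n, k, r2 = 200, 3, 0.45
print(round(adjusted_r_squared(r2, n, k), 3))
print(f"F({k}, {n - k - 1}) = {overall_f(r2, n, k):.2f}")
```

The adjustment penalizes additional predictors, so the adjusted value is never larger than the unadjusted one. This is why it is the preferred figure to report for models with several predictors.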

6. Statistical Significance of Predictors

Detail each predictor’s significance through t-tests. Example: “Sunlight exposure was a significant predictor, t(196) = 5.33, p < .001, indicating a positive effect on plant growth.”
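APA style omits the leading zero for statistics that cannot exceed 1, such as p-values, and reports very small p-values as p < .001. A small formatting helper can apply these conventions consistently. The function names below are hypothetical, and the choice of three decimals for exact p-values is an assumption.

```python
def format_p(p):
    """Format a p-value APA-style: no leading zero, floor at p < .001."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if p < 0.001:
        return "p < .001"
    # Three decimals, then drop the leading zero (0.140 -> .140)
    return "p = " + f"{p:.3f}".lstrip("0")


def format_t(t, df, p):
    """Format a t-test result such as 't(196) = 5.33, p < .001'."""
    return f"t({df}) = {t:.2f}, {format_p(p)}"


print(format_t(5.33, 196, 0.00001))
print(format_p(0.15))
```

Generating the strings from the model output, rather than typing them, reduces transcription errors when a report contains many predictors.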

7. Regression Coefficients and Equation

Provide the regression equation with unstandardized coefficients.  Example:   “The regression equation was Y = 2.5 + 0.8X1 + 0.5X2 – 0.2X3, where each hour of sunlight (X1) increases growth by 0.8 units…”

8. Discussion of Model Fit and Limitations

Reflect on how well the model fits the data and its limitations. Example: “While the model fits well (adjusted R² = .62), it’s crucial to note that it does not prove causation, and external factors not included in the model may also affect plant growth.”

9. Additional Diagnostics and Visualizations

Incorporate diagnostics like VIF for multicollinearity and visual aids.  Example:   “VIF scores were below 5 for all predictors, indicating no multicollinearity concern. Residual plots showed random dispersion, affirming model assumptions.”
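To make the VIF figure concrete, here is a minimal sketch for the special case of two predictors, where the VIF of each predictor is 1 / (1 − r²) and r is the correlation between the two. With more predictors, r² is replaced by the R² from regressing one predictor on all the others. The function name and data are hypothetical.

```python
import math


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r²)."""
    n = len(x1)
    mean1 = sum(x1) / n
    mean2 = sum(x2) / n
    s12 = sum((a - mean1) * (b - mean2) for a, b in zip(x1, x2))
    s11 = sum((a - mean1) ** 2 for a in x1)
    s22 = sum((b - mean2) ** 2 for b in x2)
    r = s12 / math.sqrt(s11 * s22)
    return 1 / (1 - r ** 2)


# Hypothetical predictor values
sunlight = [1, 2, 3, 4, 5]
water = [2, 4, 5, 4, 5]
print(round(vif_two_predictors(sunlight, water), 2))
```

A VIF of 1 means the predictors are uncorrelated, and larger values indicate that a predictor's variance is inflated by its overlap with the others.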

“In our exploration of the determinants of final exam scores in a university setting, we employed a multiple linear regression model to assess the contributions of study hours (X1), class attendance (X2), and student motivation (X3). The model, specified as Y = β0 + β1X1 + β2X2 + β3X3 + ε, where Y represents final exam scores, aimed to provide a comprehensive understanding of how these variables collectively influence academic performance.

Assumptions Check: Before examining the predictive power of our model, a thorough assessment of its foundational assumptions was undertaken to affirm the integrity of our analysis. Scatterplots of each predictor against the dependent variable were examined for linearity and revealed no deviations from linear expectations. The Shapiro-Wilk test indicated no significant departure from normality in the residuals (W = .98, p = .15), thereby satisfying the normality criterion. Homoscedasticity, the uniform variance of residuals across the range of predicted values, was supported by the Breusch-Pagan test (χ² = 5.42, p = .14). Furthermore, the Durbin-Watson statistic stood at 1.92, suggesting no autocorrelation among residuals and supporting the independence of errors. The Variance Inflation Factor (VIF) for each predictor was well below the threshold of 5, dispelling multicollinearity concerns. Collectively, these diagnostic tests supported the key assumptions underpinning our multiple linear regression model, providing a solid groundwork for the subsequent analysis.

Model Summary: The overall fit of the model was statistically significant, F(3, 196) = 53.24, p < .001, suggesting that the model explains a significant portion of the variance in exam scores. The adjusted R² value of .43 further illustrates that our model accounts for approximately 43% of the variability in final exam scores, highlighting the included predictors’ substantial impact.

Coefficients and Confidence Intervals:

  • Intercept: The intercept, β0, was estimated at 50 points, the expected exam score when all predictors equal zero.
  • Study Hours (X1): Each additional hour of study was associated with a 2.5-point increase in exam scores (β1 = 2.5, 95% CI [1.9, 3.1]), holding the other predictors constant.
  • Class Attendance (X2): Each additional class attended was associated with a 1.8-point increase in exam scores (β2 = 1.8, 95% CI [1.1, 2.5]).
  • Student Motivation (X3): Each one-unit increase in motivation was associated with a 3.2-point increase in exam scores (β3 = 3.2, 95% CI [2.4, 4.0]).

Model Diagnostics: Residual analysis was consistent with the assumptions of linear regression. The residual plots showed no discernible patterns, supporting homoscedasticity and linearity.

In conclusion, our regression analysis elucidates the critical roles of study hours, class attendance, and student motivation in determining final exam scores. The robustness of the model, evidenced by the stringent checks and the significant predictive power of the included variables, provides compelling insights into effective academic strategies. These findings validate our initial hypotheses and offer valuable guidance for educational interventions to enhance student outcomes.

These results, particularly the point estimates and their associated confidence intervals, provide robust evidence supporting the hypothesis that study hours, class attendance, and student motivation are significant predictors of final exam scores. The confidence intervals offer a range of plausible values for the true effects of these predictors, reinforcing the reliability of the estimates.

In this comprehensive guide, we’ve navigated the intricacies of reporting multiple linear regression results in APA style, emphasizing the critical components that must be included to ensure clarity, accuracy, and adherence to standardized reporting conventions. Key points such as the importance of presenting a clear model specification, conducting thorough assumption checks, detailing model summaries and coefficients, and interpreting the significance of predictors have been highlighted to assist you in crafting a report that stands up to academic scrutiny and contributes valuable insights to your field of study.

Accurate reporting is paramount in scientific research. It conveys findings and upholds the integrity and reproducibility of the research process. By meticulously detailing each aspect of your multiple linear regression analysis, from the initial model introduction to the final diagnostic checks, you provide a roadmap for readers to understand and potentially replicate your study. This level of transparency is crucial for fostering trust in your conclusions and encouraging further exploration and discussion within the scientific community.

Moreover, the practical example is a template for effectively applying these guidelines, illustrating how theoretical principles translate into practice. By following the steps outlined in this guide, researchers can enhance the impact and reach of their studies, ensuring that their contributions to knowledge are recognized, understood, and built upon.

Frequently Asked Questions (FAQs)

Multiple linear regression extends simple linear regression by incorporating two or more predictors to explain the variance in a dependent variable, offering a more comprehensive analysis of complex relationships.

Use multiple linear regression when you want to understand how several independent variables jointly relate to a single outcome, including cases where those variables may interact in influencing the dependent variable.

Key steps include testing for linearity, examining residual plots for homoscedasticity and normality, checking VIF scores for multicollinearity, and using the Durbin-Watson statistic to assess the independence of residuals.
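As a rough illustration of the last of these checks, the Durbin-Watson statistic can be computed directly from the residuals as the sum of squared successive differences divided by the sum of squared residuals. The sketch below uses plain Python; the function name and the residual values are hypothetical and are not taken from the worked example.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    return numerator / denominator


# Hypothetical residuals, listed in the order the observations were collected.
example_residuals = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2]
dw = durbin_watson(example_residuals)
print(f"Durbin-Watson = {dw:.2f}")
```

The statistic ranges from 0 to 4. Values near 2 suggest little first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.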

Coefficients represent the expected change in the dependent variable for a one-unit change in the predictor, holding all other predictors constant. Positive coefficients indicate a direct relationship, while negative coefficients suggest an inverse relationship.
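To make the "holding all other predictors constant" reading concrete, the sketch below plugs the point estimates from the worked example (intercept 50, study hours 2.5, attendance 1.8, motivation 3.2) into the regression equation. The function name and the input values are hypothetical and chosen only for illustration.

```python
# Point estimates taken from the worked example in this guide.
COEFFICIENTS = {
    "intercept": 50.0,
    "study_hours": 2.5,
    "attendance": 1.8,
    "motivation": 3.2,
}


def predict_score(study_hours, attendance, motivation, coef=COEFFICIENTS):
    """Predicted exam score from the fitted linear equation."""
    return (
        coef["intercept"]
        + coef["study_hours"] * study_hours
        + coef["attendance"] * attendance
        + coef["motivation"] * motivation
    )


# Hypothetical student: 4 study hours, 5 classes attended, motivation score 2.
baseline = predict_score(4, 5, 2)
# Same student with one more study hour, everything else unchanged.
one_more_hour = predict_score(5, 5, 2)

print(f"baseline prediction: {baseline:.1f}")
print(f"change for +1 study hour: {one_more_hour - baseline:.1f}")
```

The difference between the two predictions equals the study-hours coefficient, which is exactly what the coefficient describes.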

Adjusted R-squared gives a more conservative measure of the model’s explanatory power by penalizing the number of predictors, which guards against overestimating the variance explained in models with many predictors.
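The adjustment can be sketched with the standard formulas, where n is the sample size and k is the number of predictors. The function names and the input values below are hypothetical. In the worked example, the reported degrees of freedom (3, 196) imply k = 3 and n = 200.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F statistic with (k, n - k - 1) degrees of freedom."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Illustrative values only: R^2 = .50 from 11 observations and 2 predictors.
r2, n, k = 0.50, 11, 2
print(f"adjusted R^2 = {adjusted_r2(r2, n, k):.3f}")
print(f"F({k}, {n - k - 1}) = {f_statistic(r2, n, k):.2f}")
```

With a small sample the penalty is visible: the adjusted value falls below the unadjusted R², and the gap shrinks as n grows relative to k.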

Confidence intervals offer a range of plausible values for each coefficient, providing insights into the precision of the estimates and the statistical significance of predictors.
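A confidence interval for a coefficient is the estimate plus or minus a critical t value times the standard error. The sketch below assumes a standard error of 0.30 and a critical value of 1.97 (roughly the two-sided 95% value for about 200 residual degrees of freedom). Both numbers are assumptions for illustration, since the worked example does not report standard errors.

```python
def confidence_interval(estimate, std_error, t_crit):
    """Return (lower, upper) bounds: estimate +/- t_crit * std_error."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin


def excludes_zero(lower, upper):
    """True when the interval lies entirely above or entirely below zero."""
    return lower > 0 or upper < 0


# Estimate from the worked example; standard error and t_crit are assumed.
lower, upper = confidence_interval(estimate=2.5, std_error=0.30, t_crit=1.97)
print(f"95% CI [{lower:.2f}, {upper:.2f}]")
print("excludes zero:", excludes_zero(lower, upper))
```

A 95% interval that excludes zero corresponds to a two-sided test of that coefficient with p < .05, which is why the interval speaks to both precision and significance.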

Consider combining highly correlated variables, removing some, or using techniques like principal component analysis to reduce multicollinearity without losing critical information.
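Before applying any of these remedies, it helps to quantify the problem. The sketch below computes the Variance Inflation Factor for the special case of two predictors, where VIF = 1 / (1 - r²) and r is their Pearson correlation. With more predictors, r² is replaced by the R² from regressing each predictor on all the others. The function names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a sequence has zero variance")
    return sxy / math.sqrt(sxx * syy)


def vif_from_r2(r2):
    """Variance Inflation Factor given the R^2 of a predictor on the others."""
    if r2 >= 1.0:
        raise ValueError("perfect collinearity: VIF is undefined")
    return 1.0 / (1.0 - r2)


# Hypothetical data for two predictors.
study_hours = [2, 4, 6, 8, 10]
attendance = [10, 9, 14, 12, 15]
r = pearson_r(study_hours, attendance)
vif = vif_from_r2(r ** 2)
print(f"r = {r:.2f}, VIF = {vif:.2f}")
```

A VIF of 1 means a predictor is uncorrelated with the others. The threshold of 5 used in the worked example corresponds to an R² of .80.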

Residual analysis can reveal patterns that suggest violations of linear regression assumptions, guiding modifications to the model, such as transforming variables or adding interaction terms.

P-values can be misleading in the presence of multicollinearity, when sample sizes are very large or small, or when data do not meet the assumptions of linear regression, emphasizing the importance of comprehensive diagnostic checks.

Ensure your graphs are clear, labeled accurately, and include necessary details like confidence intervals or regression lines. Follow APA guidelines for figure presentation to maintain consistency and readability in your report.

Supply chain socially sustainability practices and their impact on supply chain performance: a study from the Indian automobile industry

  • Original Research
  • Published: 29 April 2024

  • Satyendra Kumar Sharma 1 ,
  • Sajeev Abraham George 2 ,
  • Praveen Ranjan Srivastava   ORCID: orcid.org/0000-0001-7467-5500 3 ,
  • Fauzia Jabeen 4 &
  • Cisem Lafci 5  

While sustainability has been a well-researched area in academic literature, the performance impact of its social dimensions remains largely unexplored, especially in the context of emerging economies. The aim of this research paper is to test and validate the dimensions of supply chain social sustainability (SCSS) that firms should focus on and to examine the relationships between these practices and supply chain performance, both short-term and long-term. This paper adopts a questionnaire-based survey research approach in the context of the Indian automobile industry. Empirical validation of the conceptual model was carried out using Confirmatory Factor Analysis. Multiple regression was used to test the relationships between SCSS practices and supply chain performance. This study finds empirical support for the proposition that a firm’s initiatives on the SCSS dimensions of safety, labour rights, ethical practices and welfare initiatives for people and their communities provide performance benefits to the firm and to its partners in the supply chain. Regression analysis revealed that safety (0.339) and labour rights (0.601) contribute to both short-term and long-term performance for the supply chain. While ethical practices have a positive impact on short-term performance, welfare initiatives only provide long-term qualitative benefits. SCSS is an evolving concept, and adopting the right mix of factors can help firms to achieve sustainability in all three dimensions of the triple-bottom-line framework (People, Planet and Profit).

Author information

Authors and affiliations

  • Satyendra Kumar Sharma (Birla Institute of Technology and Science, Pilani, Pilani campus, Rajasthan, India)
  • Sajeev Abraham George (SP Jain Institute of Management & Research, Mumbai, India)
  • Praveen Ranjan Srivastava (IIM Rohtak: Indian Institute of Management Rohtak, Rohtak, Haryana, India)
  • Fauzia Jabeen (College of Business at Abu Dhabi University, Abu Dhabi, United Arab Emirates)
  • Cisem Lafci (International Logistics Management Department, Yasar University, Izmir, Turkey)

Corresponding author

Correspondence to Praveen Ranjan Srivastava.

About this article

Sharma, S.K., George, S.A., Srivastava, P.R. et al. Supply chain socially sustainability practices and their impact on supply chain performance: a study from the Indian automobile industry. Ann Oper Res (2024). https://doi.org/10.1007/s10479-024-05991-w

Received: 12 May 2022

Accepted: 08 April 2024

Published: 29 April 2024

DOI: https://doi.org/10.1007/s10479-024-05991-w

  • Social sustainability
  • Supply chains
  • Confirmatory factor analysis
  • Supply chain performance
PENN GLOBAL RESEARCH & ENGAGEMENT GRANT PROGRAM 2024 Grant Program Awardees

In 2024, Penn Global will support 24 new faculty-led research and engagement projects at a total funding level of $1.5 million.

The Penn Global Research and Engagement Grant Program prioritizes projects that bring together leading scholars and practitioners across the University community and beyond to develop new insight on significant global issues in key countries and regions around the world, a core pillar of Penn’s global strategic framework. 

PROJECTS SUPPORTED BY THE HOLMAN AFRICA RESEARCH AND ENGAGEMENT FUND

  • Global Medical Physics Training & Development Program  Stephen Avery, Perelman School of Medicine
  • Developing a Dakar Greenbelt with Blue-Green Wedges Proposal  Eugenie Birch, Weitzman School of Design
  • Emergent Judaism in Sub-Saharan Africa  Peter Decherney, School of Arts and Sciences / Sara Byala, School of Arts and Sciences
  • Determinants of Cognitive Aging among Older Individuals in Ghana  Irma Elo, School of Arts and Sciences;  Iliana Kohler, School of Arts and Sciences
  • Disrupted Aid, Displaced Lives Guy Grossman, School of Arts and Sciences
  • A History of Regenerative Agriculture Practices from the Global South: Case Studies from Ethiopia, Kenya, and Zimbabwe Thabo Lenneiye, Kleinman Energy Center / Weitzman School of Design
  • Penn Computerized Neurocognitive Battery Use in Botswana Public Schools Elizabeth Lowenthal, Perelman School of Medicine
  • Podcasting South African Jazz Past and Present Carol Muller, School of Arts and Sciences
  • Lake Victoria Megaregion Study: Joint Lakefront Initiative Frederick Steiner, Weitzman School of Design
  • Leveraging an Open Source Software to Prevent and Contain AMR Jonathan Strysko, Perelman School of Medicine
  • Poverty reduction and children's neurocognitive growth in Cote d'Ivoire Sharon Wolf, Graduate School of Education
  • The Impacts of School Connectivity Efforts on Education Outcomes in Rwanda  Christopher Yoo, Carey Law School

PROJECTS SUPPORTED BY THE INDIA RESEARCH AND ENGAGEMENT FUND

  • Routes Beyond Conflict: A New Approach to Cultural Encounters in South Asia  Daud Ali, School of Arts and Sciences
  • Prioritizing Air Pollution in India’s Cities Tariq Thachil, Center for the Advanced Study of India / School of Arts and Sciences
  • Intelligent Voicebots to Help Indian Students Learn English Lyle Ungar, School of Engineering and Applied Sciences

PROJECTS SUPPORTED BY THE CHINA RESEARCH AND ENGAGEMENT FUND

  • Planning Driverless Cities in China Zhongjie Lin, Weitzman School of Design

PROJECTS SUPPORTED BY THE GLOBAL ENGAGEMENT FUND 

  • Education and Economic Development in Nepal Amrit Thapa, Graduate School of Education
  • Explaining Climate Change Regulation in Cities: Evidence from Urban Brazil Alice Xu, School of Arts and Sciences
  • Nurse Staffing Legislation for Scotland: Lessons for the U.S. and the U.K.  Eileen Lake, School of Nursing
  • Pathways to Education Development & Their Consequences: Finland, Korea, US Hyunjoon Park, School of Arts and Sciences
  • Engaged Scholarship in Latin America: Bridging Knowledge and Action Tulia Falleti, School of Arts and Sciences
  • Organizing Migrant Communities to Realize Rights in Palermo, Sicily  Domenic Vitiello, Weitzman School of Design
  • Exploiting Cultural Heritage in 21st Century Conflict   Fiona Cunningham, School of Arts and Sciences
  • Center for Integrative Global Oral Health   Alonso Carrasco-Labra, School of Dental Medicine

This first-of-its-kind Global Medical Physics Training and Development Program (GMPTDP) seeks to give PSOM and SEAS graduate students an opportunity to enhance their clinical requirement with a global experience, introduce them to global career opportunities and to working effectively in different contexts, and strengthen partnerships for education and research between the US and Africa. This would also be an exceptional opportunity for pre-med/pre-health students and students interested in health tech to have a hands-on global experience with some of the leading professionals in the field. The project will include instruction in automated radiation planning through artificial intelligence (AI); this will increase access to quality cancer care by standardizing radiation planning to reduce inter-user variability and error, decreasing workload on the limited radiation workforce, and shortening time to treatment for patients. GMPTDP will offer a summer clinical practicum to Penn students during which time they will also collaborate with UGhana to implement and evaluate AI tools in the clinical workflow.

The proposal will address today’s pressing crises of climate change, land degradation, biodiversity loss, and growing economic disparities with a holistic approach that combines regional and small-scale actions necessary to achieve sustainability. It will also tackle a key issue found across sub-Saharan Africa, many emerging economies, and economically developed countries that struggle to control rapid unplanned urbanization that vastly outpaces the carrying capacity of the surrounding environment.

The regional portion of the project will create a framework for a greenbelt that halts the expansion of the metropolitan footprint. It will also protect the Niayes, an arable strip of land that produces over 80% of the country’s vegetables, from degradation. This partnership will also form a south-south collaboration to provide insights into best practices from a city experiencing similar pressures.

The small-scale portion of the project will bolster and create synergy with ongoing governmental and grassroots initiatives aimed at restoring green spaces currently being infilled or degraded in the capital. This will help to identify overlapping goals between endeavors, leading to collaboration and mobilizing greater funding possibilities instead of competing over the same limited resources. With these partners, we will identify and design Nature-based Solutions for future implementation.

The project will conduct fieldwork research to examine questions surrounding Jewish identity in Africa. Research will be presented in articles, photographic images, and films, as well as in a capstone book. In repeat site visits to Uganda, South Africa, Ghana, and Zimbabwe, we will conduct interviews with and take photographs of stakeholders from key communities in order to document their everyday lives and religious practices.

The overall aim of this project is the development of a nationally representative study on aging in Ghana. This goal requires expanding our network of Ghanaian collaborators and actively engaging them in research on aging. The PIs will build on existing institutional contacts in Ghana that include:

1) Current collaboration with the Navrongo Health Research Center (NHRC) on a pilot data collection on cognitive aging in Ghana (funded by an NIA supplement, which provides the matching funds for this Global Engagement Fund grant application);

2) Active collaboration with the Regional Institute for Population Studies (RIPS), University of Ghana. Elo has had a long-term collaboration with Dr. Ayaga Bawah, who is the current director of RIPS.

In collaboration with UNHCR, we propose studying the effects of a dramatic drop in the level of support for refugees, using a regression discontinuity design to survey 2,500 refugee households just above and 2,500 households just below the vulnerability score cutoff that determines eligibility for full rations. This study will identify the effects of aid cuts on the welfare of an important marginalized population, and on their livelihood adaptation strategies. As UNHCR faces budgetary cuts in multiple refugee-hosting contexts, our study will inform policymakers on the effects of funding withdrawal as well as contribute to the literature on cash transfers.
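The regression discontinuity idea described above can be sketched in a few lines. This is only an illustration, not the study's estimation method: it assumes a sharp design, fits a separate straight line on each side of the cutoff within a fixed bandwidth, and reads the effect as the gap between the two fitted values at the cutoff. The function names, the cutoff of 50, the bandwidth, and the noise-free data are all hypothetical, and which side of the cutoff receives full rations is not stated in the text, so the sign convention here is arbitrary.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def rd_estimate(scores, outcomes, cutoff, bandwidth):
    """Estimate the jump in the outcome at the cutoff (sharp design)."""
    # Center the score on the cutoff so that each fitted intercept
    # is the predicted outcome exactly at the cutoff.
    below = [(s - cutoff, y) for s, y in zip(scores, outcomes)
             if cutoff - bandwidth <= s < cutoff]
    above = [(s - cutoff, y) for s, y in zip(scores, outcomes)
             if cutoff <= s <= cutoff + bandwidth]
    a_below, _ = fit_line(*zip(*below))
    a_above, _ = fit_line(*zip(*above))
    return a_above - a_below


# Hypothetical, noise-free data: a welfare index that rises with the
# vulnerability score and jumps by 3.0 at an assumed cutoff of 50.
cutoff = 50.0
scores = [float(s) for s in range(30, 71)]
outcomes = [2.0 + 0.5 * (s - cutoff) + (3.0 if s >= cutoff else 0.0)
            for s in scores]
effect = rd_estimate(scores, outcomes, cutoff, bandwidth=10.0)
print(round(effect, 6))
```

With real survey data the two sides would be noisy, and the choice of bandwidth and functional form would need to be justified rather than fixed in advance as it is here.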

The proposed project, titled "A History of Regenerative Agriculture Practices from the Global South: Case Studies from Ethiopia, Kenya, and Zimbabwe," aims to delve into the historical and contemporary practices of regenerative agriculture in sub-Saharan Africa. Anticipated Outputs and Outcomes:

1. Research Paper: The primary output of this project will be a comprehensive research paper. This paper will draw from a rich pool of historical and contemporary data to explore the history of regenerative agriculture practices in Ethiopia, Kenya, and Zimbabwe. It will document the indigenous knowledge and practices that have sustained these regions for generations.

2. Policy Digest: In addition to academic research, the project will produce a policy digest. This digest will distill the research findings into actionable insights for policymakers, both at the national and international levels. It will highlight the benefits of regenerative agriculture and provide recommendations for policy frameworks that encourage its adoption.

3. Long-term Partnerships: The project intends to establish long-term partnerships with local and regional universities, such as Great Lakes University Kisumu, Kenya. These partnerships will facilitate knowledge exchange, collaborative research, and capacity building in regenerative agriculture practices. Such collaborations align with Penn Global's goal of strengthening institutional relationships with African partners.

The Penn Computerized Neurocognitive Battery (PCNB) was developed at the University of Pennsylvania by Dr. Ruben C. Gur and colleagues to be administered as part of a comprehensive neuropsychiatric assessment. Consisting of a series of cognitive tasks that help identify individuals’ cognitive strengths and weaknesses, it has recently been culturally adapted and validated by our team for assessment of school-aged children in Botswana. The project involves partnership with the Botswana Ministry of Education and Skills Development (MoESD) to support the rollout of the PCNB for assessment of public primary and secondary school students in Botswana. The multidisciplinary Penn-based team will work with partners in Botswana to guide the PCNB rollout, evaluate fidelity to the testing standards, and track student progress after assessment and intervention. The proposed project will strengthen a well-established partnership between Drs. Elizabeth Lowenthal and J. Cobb Scott from the PSOM and in-country partners. Dr. Sharon Wolf, from Penn’s Graduate School of Education, is an expert in child development who has done extensive work with the Ministry of Education in Ghana to support improvements in early childhood education programs. She is joining the team to provide the necessary interdisciplinary perspective to help guide interventions and evaluations accompanying this new use of the PCNB to support this key program in Africa.

This project will build on exploratory research completed by December 24, 2023 in which the PI interviewed about 35 South Africans involved in jazz/improvised music mostly in Cape Town: venue owners, curators, creators, improvisers.

  • Podcast series with 75-100 South African musicians interviewed with their music interspersed in the program.
  • 59-minute radio program with extended excerpts of music inserted into the interview itself.
  • Create a center of knowledge about South African jazz—its sound and its stories—building knowledge globally about this significant diasporic jazz community
  • Expand understanding of “jazz” into a more diffuse area of improvised music making that includes a wide range of contemporary indigenous music and art making
  • Partner with Lincoln Center Jazz (and South African Tourism) to host South Africans at Penn

This study focuses on the potential of a Megaregional approach for fostering sustainable development, economic growth, and social inclusion within the East African Community (EAC), with a specific focus on supporting the development of A Vision for An Inclusive Joint Lakefront across the 5 riparian counties in Kenya.

By leveraging the principles of Megaregion development, this project aims to create a unified socio-economic, planning, urbanism, cultural, and preservation strategy that transcends county boundaries and promotes collaboration further afield, among the EAC member countries surrounding the Lake Victoria Basin.

Anticipated Outputs and Outcomes:

1. Megaregion Conceptual Framework: The project will develop a comprehensive Megaregion Conceptual Framework for the Joint Lakefront region in East Africa. This framework, which different regions around the world have applied as a way of bridging local boundaries toward a unified regional vision, will give the Kisumu Lake region a path toward cooperative, multi-jurisdictional planning. The Conceptual Framework will be both broad and specific, including actionable strategies, projects, and initiatives aimed at sustainable development, economic growth, social inclusion, and environmental stewardship.

2. Urbanism Projects: Specific urbanism projects will be proposed for key urban centers within the Kenyan riparian counties. These projects will serve as tangible examples of potential improvements and catalysts for broader development efforts.

3. Research Publication: The findings of the study will be captured in a research publication, contributing to academic discourse and increasing Penn's visibility in the field of African urbanism and sustainable development.

Antimicrobial resistance (AMR) has emerged as a global crisis, causing more deaths than HIV/AIDS and malaria worldwide. By engaging in a collaborative effort with the Botswana Ministry of Health’s data scientists and experts in microbiology, human and veterinary medicine, and bioinformatics, we will aim to design new electronic medical record system modules that will:

Aim 1: Support the capturing, reporting, and submission of microbiology data from sentinel surveillance laboratories as well as pharmacies across the country

Aim 2: Develop data analytic dashboards for visualizing and characterizing regional AMR and AMC patterns

Aim 3: Submit AMR and AMC data to regional and global surveillance programs

Aim 4: Establish thresholds for alert notifications when disease activity exceeds expected incidence to serve as an early warning system for outbreak detection.
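As a rough illustration of the early-warning idea in Aim 4, the sketch below flags any reporting period whose count exceeds the expected incidence by a fixed margin. The rule used here (baseline mean plus two sample standard deviations) is an assumption made for the example; the project description does not say how thresholds will be set. The function names and the weekly counts are hypothetical.

```python
from statistics import mean, stdev


def alert_threshold(baseline_counts, z=2.0):
    """Expected incidence plus z sample standard deviations (assumed rule)."""
    return mean(baseline_counts) + z * stdev(baseline_counts)


def flag_alerts(baseline_counts, new_counts, z=2.0):
    """Return the indices of new periods whose count exceeds the threshold."""
    threshold = alert_threshold(baseline_counts, z)
    return [i for i, count in enumerate(new_counts) if count > threshold]


# Hypothetical weekly counts of one resistant isolate at one laboratory.
baseline = [4, 6, 5, 7, 5, 6, 4, 5]
recent = [5, 6, 12, 7]

threshold = alert_threshold(baseline)
alerts = flag_alerts(baseline, recent)
print(f"threshold = {threshold:.2f}, alert weeks = {alerts}")
```

A production surveillance system would likely need to account for seasonality, reporting delays, and small counts, none of which this simple rule handles.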

Using a novel interdisciplinary approach that bridges development economics, psychology, and neuroscience, the overall goal of this project is to improve children's development using a poverty-reduction intervention in Cote d'Ivoire (CIV). The project will directly measure the impacts of cash transfers (CTs) on neurocognitive development, providing a greater understanding of how economic interventions can support the eradication of poverty and ensure that all children flourish and realize their full potential. The project will examine causal mechanisms by which CTs support children’s healthy neurocognitive development and learning outcomes through the novel use of an advanced neuroimaging tool, functional Near Infrared Spectroscopy (fNIRS), direct child assessments, and parent interviews.

The proposed research, the GIGA initiative for Improving Education in Rwanda (GIER), will produce empirical evidence on the impact of connecting schools on education outcomes to enable Rwanda to better understand how to accelerate the efforts to bring connectivity to schools, how to improve instruction and learning among both teachers and students, and whether schools can become internet hubs capable of providing access to e-commerce and e-government services to surrounding communities. In addition to evaluating the impact of connecting schools on educational outcomes, the research would also help determine which aspects of the program are critical to success before it is rolled out nationwide.

Through historical epigraphic research, the project will test the hypothesis that historical processes and outcomes in the 14th century were precipitated by a series of related global and local factors and that, moreover, an interdisciplinary and synergistic analysis of these factors embracing climatology, hydrology, epidemiology, linguistics, and migration will explain the transformation of the cultural, religious and social landscapes of the time more effectively than the ‘clash of civilizations’ paradigm dominant in the field. Outputs include a public online interface for the epigraphic archive; a major international conference at Penn with colleagues from partner universities (Ghent, Pisa, Edinburgh and Penn) as well as the wider South Asia community; development of a graduate course around the research project, on multi-disciplinary approaches to the problem of Hindu-Muslim interaction in medieval India; and a public-facing presentation of our findings and methods to demonstrate the path forward for Indian history. Several Penn students, including a postdoc, will be actively engaged.

India’s competitive electoral arena has failed to generate democratic accountability pressures to reduce toxic air. This project seeks to understand broadly what prevents such pressures from developing, and how to overcome these barriers. In doing so, the project will provide the first systematic study of attitudes and behaviors of citizens and elected officials regarding air pollution in India. The project will 1) conduct in-depth interviews with elected local officials in Delhi, and a large-scale survey of elected officials in seven Indian states affected by air pollution, and 2) partner with relevant civil society organizations, international bodies like the United Nations Environment Program (UNEP), domain experts at research centers like the Public Health Foundation of India (PHFI), and local civic organizations (Janaagraha) to evaluate a range of potential strategies to address pollution apathy, including public information campaigns with highly affected citizens (PHFI), and local pollution reports for policymakers (Janaagraha).

The biggest benefit from generative AI such as GPT will be the widespread availability of tutoring systems to support education. The project will use this technology to build a conversational voicebot to support Indian students in learning English. The project will engage end users (Indian tutors and their students) in the project from the beginning. The initial prototype voice-driven conversational system will be field-tested in Indian schools and adapted. The project includes 3 stages of development:

1) Develop our conversational agent. Specify the exact initial use case and conduct preliminary user testing.

2) Fully localize to India, addressing issues identified in Phase 1 user testing.

3) Do comprehensive user testing with detailed observation of 8-12 students using the agent for multiple months; conduct additional assessments of other stakeholders.

The project partners with Ashoka University and Pratham over all three stages, including writing scholarly papers.

Through empirical policy analysis and data-based scenario planning, this project actively contributes to this global effort by investigating planning and policy responses to autonomous transportation in the US and China. In addition to publishing several research papers on this subject, the PI plans to develop a new course and organize a forum at PWCC in 2025. These initiatives are aligned with an overarching endeavor that the PI leads at the Weitzman School of Design, which aims to establish a Future Cities Lab dedicated to research and collaboration in the pursuit of sustainable cities.

This study aims to fill this gap through a more humanistic approach to measuring the impact of education on national development. Leveraging a mixed methods research design consisting of analysis of quantitative data for trends over time, observations of schools and classrooms, and qualitative inquiry via talking to people and hearing their stories, we hope to build a comprehensive picture of educational trends in Nepal and their association with intra-country development. Through this project we strive to better inform the efforts of state authorities and international organizations working to enhance sustainable development within Nepal, while concurrently creating space and guidance for further impact analyses. Among various methods of dissemination of the study’s findings, one key goal is to feed this information into writing a book on this topic.

Developing cities across the world have taken the lead in adopting local environmental regulation. Yet standard models of environmental governance begin with the assumption that local actors should have no incentives for protecting “the commons.” Given the benefits of climate change regulation are diffuse, individual local actors face a collective action problem. This project explores why some local governments bear the costs of environmental regulation while most choose to free-ride. The anticipated outputs of the project include qualitative data that illuminate case studies and the coding of quantitative spatial data sets for studying urban land-use. These different forms of data collection will allow me to develop and test a theoretical framework for understanding when and why city governments adopt environmental policy.

The proposed project will develop new insights on the issue of legislative solutions to the nurse staffing crisis, which will pertain to many U.S. states and U.K. countries. The PI will supervise the nurse survey data collection and meet with government and nursing association stakeholders to plan the optimal preparation of reports and dissemination of results. The anticipated outputs of the project are a description of variation throughout Scotland in hospital nursing features, including nurse staffing, nurse work environments, extent of adherence to the Law’s required principles, duties, and method, and nurse intent to leave. The outcomes will be the development of capacity for sophisticated quantitative research by Scottish investigators, where such skills are greatly needed but lacking.

The proposed project will engage multi-cohort, cross-national comparisons of educational-attainment and labor-market experiences of young adults in three countries that dramatically diverge in how they have developed college education over the last three decades: Finland, South Korea and the US. It will produce comparative knowledge regarding consequences of different pathways to higher education, which has significant policy implications for educational and economic inequality in Finland, Korea, the US, and beyond. The project also will lay the foundation for ongoing collaboration among the three country teams to seek external funding for sustained collaboration on educational analyses.

With matching funds from PLAC and CLALS, we will jointly fund four scholars from diverse LAC countries to participate in workshops to engage our community regarding successful practices of community-academic partnerships.

These four scholars and practitioners from Latin America, who are experts on community-engaged scholarship, will visit the Penn campus during the early fall of 2024. As part of their various engagements on campus, these scholars will participate after the workshops as key guest speakers in the 7th edition of the Penn in Latin America and the Caribbean (PLAC) Conference, held on October 11, 2024, at the Perry World House. The conference will focus on "Public and Community Engaged Scholarship in Latin America, the Caribbean, and their Diasporas."

Palermo, Sicily, has been a leading center of migrant rights advocacy and migrant civic participation in the twenty-first century. This project will engage an existing network of diverse migrant community associations and anti-mafia organizations in Palermo to take stock of migrant rights and support systems in the city. Our partner organizations, research assistants, and cultural mediators from different communities will design and conduct a survey and interviews documenting experiences, issues and opportunities related to various rights – to asylum, housing, work, health care, food, education, and more. Our web-based report will include recommendations for city and regional authorities and other actors in civil society. The last phase of our project will involve community outreach and organizing to advance these objectives. The web site we create will be designed as the network’s information center, with a directory of civil society and services, updating an inventory not current since 2014, which our partner Diaspore per la Pace will continue to update.

This interdisciplinary project has four objectives: 1) to investigate why some governments and non-state actors elevated cultural heritage exploitation (CHX) to the strategic level of warfare alongside nuclear weapons, cyberattacks, political influence operations and other “game changers”; 2) to determine which state or non-state actors (e.g. weak actors) use heritage for leverage in conflict and why; 3) to identify the mechanisms through which CHX coerces an adversary (e.g. catalyzing international involvement); and 4) to identify the best policy responses for non-state actors and states to address the challenge of CHX posed by their adversaries, based on the findings produced by the first three objectives.

Identify the capacity of dental schools and of organizations training oral health professionals and conducting oral health research to contribute to oral health policies (OHPs) in the WHO Eastern Mediterranean region, identify the barriers and facilitators to engaging in OHPs, and subsequently define research priority areas for the region in collaboration with the WHO, oral health academia, researchers, and other regional stakeholders.

3539 Locust Walk University of Pennsylvania Philadelphia, PA 19104


©2024 University of Pennsylvania, Philadelphia, PA 19104   



COMMENTS

  1. (PDF) Multiple Regression: Methodology and Applications

    This paper presented a multiple linear regression model and logistic regression model, according to assumptions of both models. The paper depended on logistic regression model because the ...

  2. A Study on Multiple Linear Regression Analysis

    In this study, the data for multilinear regression analysis come from Sakarya University Education Faculty students' lesson (measurement and evaluation, educational psychology, program development, counseling and instructional techniques) scores and their 2012 KPSS scores. Assumptions of multilinear regression analysis- normality, linearity, no ...

  3. Anxiety, Affect, Self-Esteem, and Stress: Mediation and ...

    Multiple linear regression analyses were used in order to examine moderation effects between anxiety, stress, self-esteem and affect on depression. The analysis indicated that about 52% of the variation in the dependent variable (i.e., depression) could be explained by the main effects and the interaction effects ($R^2$ = .55, adjusted $R^2$ = .51 ...

  4. Multiple linear regression

    When we use the regression sum of squares, $SSR = \sum_i (\hat{y}_i - \bar{Y})^2$, the ratio $R^2 = SSR/(SSR + SSE)$ is the amount of variation explained by the regression model and in multiple regression is ...

  5. multiple linear regression Latest Research Papers

    This paper describes the multiple linear regression algorithm involved in this research and the AHMES learning (AL) algorithm improved by the Q-learning algorithm. The simulation test results of the upgraded AHMES show the effectiveness of these algorithms. Download Full-text.

  6. PDF Multiple Linear Regression (2nd Edition) Mark Tranmer Jen Murphy Mark

    In both cases, we still use the term 'linear' because we assume that the response variable is directly related to a linear combination of the explanatory variables. The equation for multiple linear regression has the same form as that for simple linear regression but has more terms: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p$.

  7. PDF Fundamentals of Multiple Regression

    The value of t.025 is found in a t-table, using the usual df of t for assessing statistical significance of a regression coefficient (N - the number of X's - 1), and is the value that leaves a tail of the t-curve with 2.5% of the total probability. For instance, if df = 30, then t.025 = 2.042.

  8. Multiple Regression in L2 Research: A Methodological Synthesis and

    Consequently, and similar to quantitative traditions in sister-disciplines such as education and psychology (see Skidmore & Thompson, 2010), second language researchers have turned increasingly to multiple regression. The present study employs research synthetic techniques to describe and evaluate the use of this procedure in the field.

  9. Multiple Linear Regression

    The formula for a multiple linear regression is: $\hat{y}$ = the predicted value of the dependent variable. $\beta_0$ = the y-intercept (value of y when all other parameters are set to 0). $\beta_1 X_1$ = the regression coefficient ($\beta_1$) of the first independent variable ($X_1$) (a.k.a. the effect that increasing the value of the independent variable has on the predicted y value ...

  10. A Comprehensive Study of Regression Analysis and the Existing

    This paper examines and compares various regression models and machine learning algorithms. The selected techniques include multiple linear regression (MLR), ridge regression (RR), least absolute shrinkage and selection operator (LASSO) regression, multilayer perceptron (MLP), radial basis function (RBF), decision tree (DT), support vector ...

  11. Introduction to Multivariate Regression Analysis

    These questions can in principle be answered by multiple linear regression analysis. In the multiple linear regression model, Y has normal distribution with mean $\beta_0 + \beta_1 X_1 + \cdots + \beta_\rho X_\rho$. The model parameters $\beta_0, \beta_1, \ldots, \beta_\rho$ and $\sigma$ must be estimated from data. $\beta_0$ = intercept. $\beta_1, \ldots, \beta_\rho$ = regression coefficients.

  12. PDF Multiple Regression Analysis

    5A.4 Multiple Regression Research 5A.4.1 Research Problems Suggesting a Regression Approach If the research problem is expressed in a form that either specifies or implies prediction, multiple regression analysis becomes a viable candidate for the design. Here are some examples of research objectives that imply a regression design:

  13. Education, income inequality, and mortality: a multiple regression

    Objective: To test whether the relation between income inequality and mortality found in US states is because of different levels of formal education. Design: Cross sectional, multiple regression analysis. Setting: All US states and the District of Columbia (n=51). Data sources: US census statistics and vital statistics for the years 1989 and 1990. Main outcome measure: Multiple regression ...

  14. A Multiple Linear Regression Approach For Estimating the Market Value

    Abstract—In this paper, market values of the football players in the forward positions are estimated using multiple linear regression by including the physical and performance factors in 2017-2018 season. Players from 4 major leagues of Europe are examined, and by applying Breusch-Pagan test for homoscedasticity, a reasonable regression.

  15. Multiple regression model to analyze the total LOS for patients

    Multiple linear regression. In the last years, several data analytics methodologies have been proposed for supporting different applications [37, 38]. One of the most used is multiple linear regression, a statistical technique that uses several explanatory variables to predict the outcome of a response variable.

  16. Estimation of Health-Related Physical Fitness Using Multiple Linear

    To perform multiple linear regression analysis, the β-value (the regression coefficient) was used to verify if the independent variables had explanatory power (Park et al., 2020). In this work we used the stepwise mode of regression analysis, which is indicated when multiple independent variables are taken as predictors ( Shepperd and ...

  17. Applications of Multiple Regression in Psychological Research

    Applications of the multiple-regression model. The regression model can be used in one of two general ways, referred to by some (e.g., Pedhazur, 1997) as explanation and prediction. The distinction between these approaches is akin to the distinction between confirmatory and exploratory analyses.

  18. PDF Multiple Linear Regression

    1.1 Overview. A multiple linear regression analysis is carried out to predict the values of a dependent variable, Y, given a set of p explanatory variables (x1,x2,....,xp). In these notes, the necessary theory for multiple linear regression is presented and examples of regression analysis with census data are given to illustrate this theory.

  19. Multiple Regression Analysis

    A quantitative paper states a hypothesis and tests it using statistical tools ... the final sample size, and the goals of the research, a hierarchical multiple regression analysis was used. 1 Hierarchical multiple regression estimates the statistical relationship between a set of independent variables and individual or grouped dependent ...

  20. How to Report Results of Multiple Linear Regression

    Initiate by clearly stating the purpose of your multiple linear regression (MLR) analysis. For example, you might explore how environmental factors (X1, X2, X3) predict plant growth (Y). Example: "This study aims to assess the impact of sunlight exposure (X1), water availability (X2), and soil quality (X3) on plant growth rate (Y).".

  21. Supply chain socially sustainability practices and their ...

    This paper adopts a questionnaire-based survey research approach in the context of Indian automobile industry. Empirical validation of the conceptual model developed was carried out using Confirmatory Factor Analysis. Multiple regression was used to test the relationships between SCSS practices and supply chain performance.

  22. 2024 Grant Program Awardees

    In addition to publishing several research papers on this subject, the PI plans to develop a new course and organize a forum at PWCC in 2025. These initiatives are aligned with an overarching endeavor that the PI leads at the Weitzman School of Design, which aims to establish a Future Cities Lab dedicated to research and collaboration in the ...
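
Several of the snippets above state the multiple linear regression model and the ratio $R^2 = SSR/(SSR + SSE)$, and one of them uses an interaction term to test moderation. The sketch below ties these together under stated assumptions: it fits a model with two predictors and their product by solving the normal equations in plain Python, then computes $R^2$ from the two sums of squares. The function names, the predictors `x` and `z`, and the noise-free data are hypothetical and are not drawn from any of the cited studies.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    return beta, fitted


def r_squared(y, fitted):
    """R^2 = SSR / (SSR + SSE) for a least-squares fit with an intercept."""
    y_bar = sum(y) / len(y)
    ssr = sum((f - y_bar) ** 2 for f in fitted)
    sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    return ssr / (ssr + sse)


# Hypothetical noise-free data on a small grid: the outcome depends on
# x, on z, and on their product (the moderation/interaction term).
grid = [(float(x), float(z)) for x in range(4) for z in range(3)]
rows = [(x, z, x * z) for x, z in grid]
y = [1.0 + 2.0 * x - 1.0 * z + 0.5 * x * z for x, z in grid]

beta, fitted = fit_ols(rows, y)
r2 = r_squared(y, fitted)
print([round(b, 6) for b in beta], round(r2, 6))
```

Because the example data contain no noise, the fit recovers the generating coefficients and $R^2$ is essentially 1; with real data the coefficient on the product term is what a moderation analysis would test. Solving the normal equations directly is chosen only to keep the sketch dependency-free; statistical packages generally use more numerically stable decompositions.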