Development of a Frailty Ladder Using Rasch Analysis: If the Shoe Fits

Nancy E. Mayo, PhD1,2,3, Mylène Aubertin-Leheudre, PhD4,5, Kedar Mate, PhD1, Sabrina Figueiredo, PhD6, Julio Flavio Fiore Jr., PhD2,7,8, Mohammad Auais, PhD9, Susan C. Scott, MSc2, José A. Morais, MD1,2

1Department of Medicine, McGill University, Montreal, QC
2Centre for Outcomes Research and Evaluation, McGill University Health Centre Research Institute, Montreal, QC
3School of Physical and Occupational Therapy, McGill University, Montreal, QC
4Département des Sciences de l’activité physique, Faculté des Sciences, Université du Québec à Montréal, Montréal, QC
5Centre de recherche de l’institut universitaire de Gériatrie de Montréal, Montréal, QC
6Health Care Quality Program, The George Washington University, School of Medicine and Health Sciences, Washington, DC
7Department of Surgery, McGill University, Montreal, QC
8Steinberg-Bernstein Centre for Minimally Invasive Surgery and Innovation, McGill University, Montreal, QC
9School of Rehabilitation Therapy, Faculty of Health Sciences, Queen’s University, Kingston, ON




The current measurement approach to frailty is to create an index of frailty status, rather than measure it. The purpose of this study is to test the extent to which a set of items identified within the frailty concept fit a hierarchical linear model (e.g., Rasch model) and form a true measure reflective of the frailty construct.


A sample was assembled from three sources: a community organization for at-risk seniors (n=141); a colorectal surgery group assessed post-surgery (n=47); and a hip fracture group assessed post-rehabilitation (n=46). The 234 individuals (age 57 to 97) contributed 348 measurements. The frailty construct was defined according to the named domains within commonly used frailty indices, and items drawn to reflect frailty came from self-report measures and performance tests. The items were tested for the extent to which they fit the Rasch model.


Of the 68 items, 29 fit the Rasch model: 19 self-report items on physical function and 10 performance tests, including one for cognition. Patient reports of pain, fatigue, mood, and health did not fit, nor did body mass index (BMI) or any item representing participation.


Items that are typically identified as reflecting the frailty concept fit the Rasch model. The Frailty Ladder would be an efficient and statistically robust way of combining results of different tests into one outcome measure. It would also be a way of identifying which outcomes to target in a personalized intervention. The rungs of the ladder, the hierarchy, could be used to guide treatment goals.

Key words: frailty, Rasch analysis, measurement theory


Frailty is a topic of major importance for clinical care and research in aging, geriatric medicine, physical therapy, rehabilitation, and nursing. Interest covers the spectrum from biology, detection, prevention, to treatment.(1–8) But underlying all of these areas is uncertainty about whether frailty is a construct that can be measured or an entity that can only be classified.

There is strong evidence for a frailty syndrome or phenotype(9) that is caused by four converging situations in an older person: systemic illness (which is the accumulation of acute and chronic illness from those on the list for accumulation of deficits), poor genes, poor lifestyle, and poor environment.(1,10–14) When these converge with aging, they create the “perfect storm” that is frailty. As a syndrome or phenotype, it is a classification, and the most common way of classifying someone as frail or not is if they manifest three of five physical criteria: slowness, weakness, exhaustion, inactivity, and weight loss.(9) This approach renders frailty as a unidimensional, formative construct that is quantified by counting these physical manifestations.

The classification approach is not particularly useful, since being classified as “frail” does not trigger special treatments or services. By contrast, being classified as having cancer, cognitive impairment, or osteoporosis places people on special treatment and follow-up protocols or makes them eligible for specific medications.

Current definitions of frailty suggest an entity that can be measured. In 2010, Gobbens et al. defined frailty as: “A dynamic state affecting an individual who experiences losses in one or more domains of human functioning (physical, psychological, social), which is caused by the influence of a range of variables and which increases the risk of adverse outcomes.”(15) In 2013, Rodríguez-Mañas et al. defined frailty as “a multidimensional syndrome characterized by decreased reserve and diminished resistance to stressors.”(16) The most recent definition of frailty has been proposed by the Canadian Frailty Network:(17) “Frailty is a state of increased vulnerability, with reduced physical reserve and loss of function across multiple body systems. This reduces ability to cope with normal or minor stresses, which can cause rapid and dramatic changes in health.” Reserve can be measured directly using physiological measures such as muscle mass or tests that have normative data, distance walked in six minutes, for example.(18)

These definitions set up frailty as a reflective construct, something to be measured. Measurement was defined in 1978 by Nunnally(19) as “rules for assigning numbers to objects in such a way as to represent quantities of attributes”. Some hundred years earlier Lord Kelvin,(20) responsible for identifying the temperature which defined absolute zero, commented that: “To measure is to know” and “If you cannot measure it, you cannot improve it.” Since the 1960s there has been interest in measuring attributes of people on measurement scales that have similar biophysical properties to temperature.(21–23)

While frailty is considered distinct from disability,(16) 85% of the language used to describe frailty(24) is found within the World Health Organization’s International Classification of Functioning, Disability and Health (ICF).(25) The link between frailty and disability was illustrated through a recent review of 79 measures used to identify frailty.(26) Of the 25 single-domain frailty instruments, 24 could be mapped to the Body Function, and Activity and Participation components of the ICF. Of the 54 multi-domain frailty instruments, all could be mapped to the ICF with 67% (n = 35) of these instruments linking to at least three components of the ICF and 10% to all five. Van Damme et al., in a 2021 review,(27) identified that the 67 screening measures for frailty covered four domains and all of them included physical function; 73% covered psychological, 52% social, and 78% contextual factors. Almost half (43%) included all four domains.

Another conceptual model that is relevant in the field of frailty is the one proposed by Guaraldi and Milic on intrinsic capacity.(28) This is a positive construct and is therefore more actionable in terms of prevention, intervention, and measurement. Intrinsic capacity is something to strive for, whereas frailty is something to avoid. The foundation of intrinsic capacity is function in the broadest sense of this construct, as defined by the World Health Organization, of which the ICF is a core component.

A key observation from the 2016 review by Azzopardi et al.(26) is that virtually all of the 79 measures reviewed quantified frailty as a count of the number of detrimental health indicators such as co-morbidities, medications, abnormal results on blood tests, social situation, and specific functional limitations. The count was used to classify people into frailty categories. From a measurement perspective, the items were used to define the construct, frailty. This formative conceptual model underlies the creation of indices; the value of the index cannot change unless the items change.(29) Targeting the items will change the value on the index, but will do little for the underlying construct.

A reflective conceptual model underlies latent constructs, rather than indices that are formed by the items. In a reflective model, the construct is reflected in the items; many items could reflect the construct, and changing the construct will change the items but not the reverse.(29–31) Of these two models, formative models are useful for discrimination, to classify people into groups like frail, not frail, and to predict future health events. To evaluate change, true measures are required. For interventions targeting frailty, the measurement strategy has been to use multiple measures of single indicators of physical, emotional, and cognitive function and report on these separately,(3) with the inherent conceptual and statistical limitations to this approach. Many of these single indicators are those that form the frailty phenotype, with strength and gait speed as key examples.

Many interventions have attempted to change the building blocks of frailty such as strength and gait speed,(3,32,33) and those indicators that were targeted by the intervention did change but the impact on frailty classification was small or nil.

This paper addresses whether frailty could be considered under a reflective conceptual model, where the items related to the phenotypical indicators of frailty are seen as a reflection of frailty rather than forming it. A multitude of items could be chosen to reflect the frailty construct, and the value or quantity of frailty is inferred from observing how people interact with items known to reflect the frailty construct.

One necessary, but not sufficient, requirement of a reflective construct is that it fits a true latent model, such as the Rasch model, named for the Danish mathematician Georg Rasch (1901–1980), who provided a method for constructing a measure by transforming ordinal observations onto an interval scale. The outcome of a Rasch analysis, when the data fit the model, is a unidimensional measure on which items and people are organized hierarchically, by difficulty and ability respectively, on the same measurement scale in natural logarithm linear units or logits. Items that fit a Rasch model would form a measure with a total score that is sufficient to determine that person’s value on the underlying construct. The total score is a legitimate representation of the person’s “ability”, and can be subjected to mathematical transformations such as those needed to estimate change.(34) A feature of a reflective model is that changing the construct will change the items.(30) The evidence for this is plentiful as there are many examples of older persons who become ill or injured and deteriorate and accumulate many of the frailty indicators only to have these reversed when the source of the illness is identified and treated.(28) While therapy for the consequences of frailty may hasten the recovery of robustness, the fundamental change arises from treating the underlying causes of frailty. If these are not addressed, targeting the functional consequences will have some effect on function but not necessarily reduce frailty. However, that is not to suggest that targeting function will have no impact on frailty, as physical therapies can build reserve.(35,36)
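The dichotomous form of the Rasch model can be sketched in a few lines. This is an illustration only, not the analysis code used in this study; the function name `rasch_probability` and the example values are assumptions. It shows that the probability of passing an item depends only on the difference between the person's ability and the item's difficulty, both expressed in logits.

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability that a person passes a dichotomous item under the Rasch model.

    Both arguments are on the same logit scale (hypothetical values here).
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# A person located exactly at the item's difficulty has a 50% chance of passing.
p_equal = rasch_probability(0.0, 0.0)

# The same person is more likely to pass an easier item (lower logit value)
# than a harder one (higher logit value).
p_easy = rasch_probability(0.0, -2.0)
p_hard = rasch_probability(0.0, 2.0)
```

Because only the difference between ability and difficulty matters, people and items can be placed on one common scale.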

The purpose of this study is to test the extent to which a set of items identified within the frailty concept fit a hierarchical linear model and form a latent construct which is reflective of the frailty construct.


Cross-sectional and longitudinal data arising from 234 seniors (>55 years) with or at risk for age-related disability or receiving in- or out-patient care at a tertiary care health centre were analyzed for fit to the Rasch model. The sample was assembled from three sources: an at-risk community-based source (n=141);(37) a colorectal surgery group assessed post-surgery (n=47);(38) and a hip fracture group assessed post-discharge from rehabilitation (n=46, measured one to three times).(39) All projects had ethical approval from their respective organizations.

The frailty construct was defined according to the named domains within commonly used frailty indices,(26,40) including the domains of mobility, strength, activity, energy, nutritional status, mood, balance, incontinence, cognition, and independence, as examples. This sample of seniors had been assessed within the context of research projects addressing seniors’ health, and the assessments included items reflecting the frailty domains. Table 1 lists the measures that were the source of the frailty items along with their original scoring.

TABLE 1 Measures used for sources of frailty items


The analysis was carried out within the framework of Rasch measurement theory,(41) an experimental paradigm that tests the extent to which a set of items form a real measure. The results are reported as recommended by Pallant and Tennant.(42) The ordering of the categories of the rating scales is tested empirically, and if ordering is not met, further experimentation is required before inferences from the ratings are made. Rasch analysis was used to estimate the extent to which the frailty items fit the Rasch model. If the data fit the model, there is support for a reflective construct, but fit is not sufficient; theory must support that the construct causes the items, not the reverse.(43) Table 2 lists the iterative steps conducted through Rasch analysis. The partial credit model was used with the RUMM2030 software (


TABLE 2 Definition of steps taken to fit the data to the Rasch model


From the 234 people in this study, there were 348 valid measurements, as some people contributed more than one measurement. The mean age (SD) of the at-risk community sample was 71 (5.5) years; the mean ages of the hip fracture and colorectal surgery groups were 73 (6.3) and 77 (9.2) years, respectively. Age ranged from 57 to 97 years and, except for the colorectal surgery group with 45% women, approximately 70% were women. A total of 68 items were available for analysis. The 36 items from the RAND-36(44) were common to the at-risk community and colorectal surgery samples; the hip fracture sample had 12 of the 36 items plus all the items from the LEFS. Walking capacity was measured with the six-minute walk test (6MWT) in two of the data sets and with gait speed in the hip fracture data set. Items that differed across data sources generated missing data in the merged data set.

All ordinal variables were coded so that the higher value indicated more robustness (less frailty). All continuous items were categorized into as many ordinal categories as the distribution would support, usually 10, and rescored until completely ordered thresholds were obtained and with higher values indicating more robustness.
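The initial recoding of a continuous test into ordinal categories can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `to_ordinal`, the rank-based, roughly equal-frequency cut-points, and the example times are all hypothetical, since the text does not specify how cut-points were chosen. The subsequent rescoring to obtain ordered thresholds is not shown.

```python
def to_ordinal(values, n_categories=10, higher_is_better=True):
    """Recode continuous scores into ordinal categories 0..n_categories-1.

    Assumption: roughly equal-frequency bins based on rank; the study does not
    state how its cut-points were chosen.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    categories = [0] * n
    for rank, i in enumerate(order):
        categories[i] = rank * n_categories // n
    # Tied raw values must share a category: give each tie group its lowest one.
    lowest = {}
    for v, c in zip(values, categories):
        lowest[v] = min(c, lowest.get(v, c))
    categories = [lowest[v] for v in values]
    if not higher_is_better:
        # e.g. timed tests, where a shorter time means more robustness
        categories = [n_categories - 1 - c for c in categories]
    return categories

# Hypothetical Timed-Up-and-Go times in seconds (lower is better).
tug_seconds = [8.2, 14.5, 9.9, 22.0, 11.3, 30.1]
tug_ordinal = to_ordinal(tug_seconds, n_categories=3, higher_is_better=False)
```

With `higher_is_better=False`, the fastest times receive the highest category, matching the convention that higher values indicate more robustness.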

Of the 68 items tested (see Table 1), 29 fit the Rasch model: two of 10 items from the Physical Function Index of the RAND-36,(44) 17 of 20 items from the Lower Extremity Functional Scale (LEFS),(45) and 10 performance tests including one for cognition.(46) For the LEFS, no item retained the five response categories; one item retained four response categories (getting in and out of a car); two items supported three response categories (stairs and standing for one hour); the rest of the fitting items could only support a binary scale. Of the two items retained from the RAND-36, limitation in vigorous activities did not need to be rescored, but the item for moderate activities could only support a binary scale.

Of the 39 items that did not fit, 34 came from the RAND-36, including eight of the items from the Physical Function Index; none of the items from the other seven domains of General Health Perceptions, Pain, Vitality, Mental Health, Role Physical, Role Emotional, or Social Role fit. Only three of the 20 LEFS items did not fit, the ones indicating very high functioning (running). The two measures of mobility, fast and comfortable gait speed and Timed-Up-and-Go (TUG), were highly correlated and one super-item was made by combining the two. Leg strength (right and left leg) was redundant with functional measures of strength (Stair Test and Sit-to-Stand). Body mass index (BMI) did not fit.

Grip strength had residual correlations with the sit-to-stand test and with the self-report item from the RAND-36, limitation in doing moderate activities with examples such as moving a table, pushing a vacuum cleaner, bowling, or playing golf, all of which require hand strength; however, the equating tests did not indicate multidimensionality, so both items were retained.

There was differential item functioning (DIF) by sex for lean muscle mass and grip strength as any given value on these tests would indicate more robustness for women than for men. These items were split.
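Splitting an item for DIF is commonly done by replacing it with group-specific copies, each missing for people outside that group, so that a separate difficulty can be estimated per group. The sketch below illustrates that data step only; the function name `split_item_by_group`, the item names, and the scores are hypothetical, and this is not the procedure as implemented in RUMM2030.

```python
def split_item_by_group(responses, groups, item, group_values=("F", "M")):
    """Replace one item with group-specific copies; the other group's copy is missing.

    responses: list of dicts mapping item name -> ordinal score (None = missing).
    groups: list of group labels, one per person.
    """
    split = []
    for person, group in zip(responses, groups):
        # Keep every item except the one being split.
        row = {k: v for k, v in person.items() if k != item}
        for g in group_values:
            row[f"{item}_{g}"] = person.get(item) if group == g else None
        split.append(row)
    return split

# Hypothetical ordinal scores for two people with the same grip category.
people = [{"grip": 3, "tug": 2}, {"grip": 3, "tug": 1}]
sex = ["F", "M"]
split_people = split_item_by_group(people, sex, "grip")
```

After the split, the same raw category on grip strength can map to different locations on the latent scale for women and men.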

Fit to the model was demonstrated by a chi-square test for goodness of fit associated with a non-significant p value of .283 and a person-separation index of 0.907. Unidimensionality was also supported, although the usual method of testing for unidimensionality (see Table 2) could not be carried out when there are missing data, as in this analysis of three merged data sets. Instead, we tested unidimensionality within two groups of variables without missing data and found that fewer than 5% of these equating tests were significant.
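The person-separation index is commonly described as the proportion of observed variance in person estimates that is not attributable to measurement error. A minimal sketch under that assumption follows; the function name `person_separation_index` and the example numbers are hypothetical, and the computation inside RUMM2030 may differ in detail.

```python
from statistics import mean, variance

def person_separation_index(estimates, standard_errors):
    """Share of observed person variance not attributable to estimation error.

    estimates: person locations in logits; standard_errors: their standard errors.
    """
    observed = variance(estimates)  # sample variance of person locations
    error = mean(se ** 2 for se in standard_errors)  # mean squared standard error
    return (observed - error) / observed

# Hypothetical values, not data from this study.
psi = person_separation_index([-2.0, -1.0, 0.0, 1.0, 2.0], [0.5] * 5)
```

Values close to 1 indicate that the items separate people along the construct far more than measurement error would.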

Table 3 provides item locations averaged over thresholds, indicating average difficulty of the items. Those items with a logit score less than 0 would be considered “easy” and failing these items would indicate more frailty; items with a logit score of >1 would be considered “difficult” and passing these would indicate more robustness or less frailty.

TABLE 3 Mean (over all thresholds) location order of each frailty indicator


Figure 1 shows the threshold map of the 29 items. Along the x-axis is the value of the latent frailty construct; the colored bars represent, for each item, the threshold that has to be passed for a given score (on a logit scale) on the latent construct.
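The thresholds in such a map can be read through the partial credit model, in which each threshold is the point on the latent scale where two adjacent response categories are equally likely. The sketch below computes category probabilities for a single item; the function name `pcm_category_probabilities` and the threshold values are hypothetical and are not estimates from this study.

```python
import math

def pcm_category_probabilities(ability, thresholds):
    """Category probabilities for one polytomous item under the partial credit model.

    thresholds[k] is the (hypothetical) logit location of the step from
    category k to category k + 1, so m thresholds give m + 1 categories.
    """
    # Cumulative sums of (ability - threshold); category 0 has an empty sum of 0.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (ability - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(exponents)
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

# A three-category item with thresholds at -1 and +1 logits.
probs = pcm_category_probabilities(0.0, [-1.0, 1.0])
at_threshold = pcm_category_probabilities(-1.0, [-1.0, 1.0])
```

For a person located between the two thresholds, the middle category is the most likely; at a threshold, the two adjacent categories are equally likely.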



FIGURE 1 Threshold map for Frailty Ladder items

At the most frail end of the scale (−6 logits) would be people who endorse having extreme difficulty getting in and out of a car and who have more than a little difficulty moving in bed (easy items). On the next rung of the ladder, at around −4 logits, are people who report any degree of difficulty sitting for one hour. The self-report items are all at the lower end of the scale, and the first of the performance items appear at approximately −1 logits: the slowest values on the TUG and a forced vital capacity (FVC) of 30–59% predicted. At the most robust end of the scale (difficult items) are the longest 6MWT distance, the strongest grip strength for either men or women, and confirmation of having only a little or no difficulty running on even ground.

Figure 2 shows the person-threshold map indicating targeting of the items to the people. The pink bars on the top indicate the distribution of the study subjects along the latent construct, and the blue bars represent the location of the item thresholds along the same construct. The mean for the subjects without extremes was 0.734 (with extremes: 0.65), which is close to the optimal of 0, with a standard deviation (SD) of 2.189 (with extremes: 2.183), larger than the desired SD of 1.0. The distribution of the people ranged from <−6.0 to >+5; the distribution of the items covered a similarly wide range. The items and people are not optimally targeted, as many more people were at the more robust end of the frailty scale and fewer at the frailer end, yielding a skewed distribution.



FIGURE 2 Item-threshold map for frailty items showing targeting of the items and persons

A simple scoring algorithm, based on the ordinal structure of the data, correlated highly with the logit scale. This provided a more interpretable scoring algorithm with a theoretical range from 0 to 53. This scoring system and the levels of the items for Version 1 of the Frailty Ladder are given in Table 4.

TABLE 4 The Frailty Ladder Version 1a



The analysis found that the items that are typically identified as reflecting the frailty concept fit the Rasch model, satisfying one of the necessary criteria for a reflective model. The items that fit the Rasch model covered the majority of domains identified from frailty indices, including self-report physical limitations, performance of physical tests, and cognition. These findings suggest that disability is a necessary, but not a sufficient, reflection of frailty,(47) as not all people with observable disability (from stroke, multiple sclerosis, or arthritis, for example) are frail. Another necessary, but not sufficient, contributor to frailty would be age, but there was no consensus on a particular age cut-off.(16) The observation that the patient-reported outcomes related to fatigue, mood, and health perception did not fit the model supports the view that these are not necessary for frailty. Indeed, there are many older people who report fatigue, or low mood, or low health perception, but who are not “frail”. The same argument would apply to participation: it is not necessary for frailty; some quite robust people may not engage in activities and some frail people may still be engaged. Continence was not tested in this model but would not be necessary for frailty, as there are many reasons for incontinence beyond frailty. Nutritional status was not tested in this model beyond BMI, and so further testing of markers of nutritional status is warranted. The observation that a cognitive item fit with physical function and performance items suggests that the latent frailty construct is not purely physical function. An extensive review by Bortone et al.(48) summarized how measures of gait fit with other frailty constructs, including cognition. Here the Montreal Cognitive Assessment (MoCA)(46) fit as a total score, so some unpacking of the cognitive domain is warranted to see which cognitive items fit.

The observation that both self-report and performance-based measures fit together suggests that the source of the information does not define the construct. This was also demonstrated by Theou et al.(49) who observed, using data from some 4,900 people enrolled in The Irish LongituDinal study on Ageing (TILDA), that the best prediction of health outcomes came when both self-report and performance-based measures were combined.

The other necessary component for a reflective model is evidence of a causal relationship from the construct to the items. Figure 3 presents such a model linking frailty causes to frailty consequences, which can now be measured with the Frailty Ladder. The causes of frailty are systemic illnesses, unhealthy lifestyle, genes, and the environment.(1,14,50) If these causes were prevented or treated, then frailty would be avoided or reversed, and this reversal would be reflected in the items included in Version 1 of the Frailty Ladder. The reverse is not true: targeting the items will not necessarily change frailty, but physical capacity and performance would likely improve.



FIGURE 3 Theoretical model of frailty causes and consequences

Measurement of the effects of frailty interventions is a challenge. A systematic review of 47 studies of exercise interventions targeting frailty(3) found that each study used a multitude of outcome indicators, analyzed separately.

This preliminary work indicates that the Frailty Ladder would be an efficient and statistically robust way of combining the results of different frailty tests into one outcome measure. It would also be a way of identifying what outcomes to target in a personalized intervention targeting frailty. The rungs of the Frailty Ladder, the hierarchy, could be used to guide treatment goals.

The practicality for a clinical setting is that any questionnaire items or tests could be used to identify where a patient is located on the ladder. Not all tests need to be administered. For example, one test that covers a wide range of the linear continuum is the TUG, administered at a comfortable pace and as fast as possible. A typical procedure with measures such as this is to test people on the middle item (logit 0), which is grip strength. If the person’s grip strength surpasses this threshold, a harder test can be administered. If muscle mass is available, this can also be used to situate the person on the ladder. Questionnaire items are only useful up to a certain point (logit 0). If a person has extreme or quite a bit of difficulty going up or down 10 stairs or more than a little difficulty standing for an hour (items from the LEFS), then the person is at the mid-point of the Frailty Ladder. To situate the person higher up, performance tests are needed. A feature of the Rasch model is that, when the data fit the model, any subset of items can provide the same information as all.

Interestingly, the components of the most widely used index of frailty, Fried’s frailty phenotype,(9) (weight loss, exhaustion, slow gait, weak grip, and low level of activity) are represented in the Frailty Ladder. Still, all of the items are at the lowest end of the scale, situated at 7.5 to <27 out of a total score of 53 (−3.5 to 0 on the logit scale). The level of activity that best fit the model of frailty shown here was <240 min of physical activity per week. The range of activity considered “sedentary” is <150 min of moderate or higher intensity activity per week.(51) Thus, Fried’s frailty phenotype, while highly sensitive, may be detecting people who may not be able to rebound out of frailty. Detecting people who are descending into frailty and intervening to stop this descent would be a better physical therapeutic and public health approach, and this ladder provides a way of identifying people entering this trajectory.

Strengths and Limitations

There are both strengths and limitations associated with this study. Having three data sets was a strength in that a wide range of abilities was covered, which is a requirement for creating a measure. We particularly selected data sets that had a variety of functional items, including a data set with tests of pulmonary function which are rarely collected in the geriatric context. In this way we made a unique contribution to the field by showing that pulmonary function and cognition fit with the frailty construct. The main limitation of this approach is that no one data source had all the necessary data, but by using Rasch analysis, we were able to create mathematical links between the data sets because of common items and a hierarchy among the items. This was a proof-of-concept study taking advantage of rich data sets with a wide variety of measures. Replication of the model in independent data sets is required.


Decades of research on frailty have consolidated our understanding of frailty, but measurement has not kept up with modern psychometric developments. This study bridges the measurement gap by enhancing our capacity to develop and test effective interventions to prevent and reverse frailty. Based on the results of this study, we propose that frailty is: “A dynamic health state experienced by an older person who, in the face of systemic illness or health threat, fails to maintain physiological and functional benchmarks expected for age and sex.”


Not applicable.


We have read and understood the Canadian Geriatrics Journal’s policy on conflicts of interest disclosure and declare there are none.


This study was funded by the Canadian Frailty Network.


1 Vina J, Tarazona-Santabalbina FJ, Perez-Ros P, Martinez-Arnau FM, Borras C, Olaso-Gonzalez G, et al. Biology of frailty: Modulation of ageing genes and its importance to prevent age-associated loss of function. Mol Aspects Med. 2016;50:88–108.

2 Cesari M, Costa N, Hoogendijk EO, Vellas B, Canevelli M, Perez-Zepeda MU. How the Frailty Index may support the allocation of health care resources: an example from the INCUR Study. J Am Med Dir Assoc. 2016;17(5):448–50.

3 Theou O, Stathokostas L, Roland KP, Jakobi JM, Patterson C, Vandervoort AA, et al. The effectiveness of exercise interventions for the management of frailty: a systematic review. J Aging Res. 2011;2011:569194.

4 Theou O, Jones GR, Jakobi JM, Mitnitski A, Vandervoort AA. A comparison of the relationship of 14 performance-based measures with frailty in older women. Appl Physiol Nutr Metab. 2011;36(6):928–38.

5 Cesari M, Prince M, Thiyagarajan JA, De Carvalho IA, Bernabei R, Chan P, et al. Frailty: An Emerging Public Health Priority. J Am Med Dir Assoc. 2016;17(3):188–92.

6 Cesari M, Nobili A, Vitale G. Frailty and sarcopenia: from theory to clinical implementation and public health relevance. Eur J Intern Med. 2016;35:1–9.

7 Vina J, Salvador-Pascual A, Tarazona-Santabalbina FJ, Rodriguez-Manas L, Gomez-Cabrera MC. Exercise training as a drug to treat age associated frailty. Free Radic Biol Med. 2016;98:159–64.

8 Marzetti E, Calvani R, Tosato M, Cesari M, Di BM, Cherubini A, et al. Physical activity and exercise as countermeasures to physical frailty and sarcopenia. Aging Clin Exp Res. 2017;29(1):35–42.

9 Fried LP, Tangen CM, Walston J, Newman AB, Hirsch C, Gottdiener J, et al. Frailty in older adults: evidence for a phenotype. J Gerontol A Biol Sci Med Sci. 2001;56(3):M146–M57.

10 Andrew MK, Mitnitski AB, Rockwood K. Social vulnerability, frailty and mortality in elderly people. PLoS One. 2008;3(5):e2232.

11 Brinkman S, Voortman T, Kiefte-de Jong JC, van Rooij FJA, Ikram MA, Rivadeneira F, et al. The association between lifestyle and overall health, using the frailty index. Arch Gerontol Geriatr. 2018;76:85–91.

12 Livshits G, Ni Lochlainn M, Malkin I, Bowyer R, Verdi S, Steves CJ, et al. Shared genetic influence on frailty and chronic widespread pain: a study from TwinsUK. Age Ageing. 2018;47(1):119–25.

13 Lang PO, Michel JP, Zekry D. Frailty syndrome: a transitional state in a dynamic process. Gerontology. 2009;55(5):539–49.

14 Levasseur M, Genereux M, Bruneau JF, Vanasse A, Chabot E, Beaulac C, et al. Importance of proximity to resources, social support, transportation and neighborhood security for mobility and social participation in older adults: results from a scoping study. BMC Public Health. 2015;15(1):503.

15 Gobbens RJ, Luijkx KG, Wijnen-Sponselee MT, Schols JM. In search of an integral conceptual definition of frailty: opinions of experts. J Am Med Dir Assoc. 2010;11(5):338–43.

16 Rodriguez-Manas L, Feart C, Mann G, Vina J, Chatterji S, Chodzko-Zajko W, et al. Searching for an operational definition of frailty: a Delphi method based consensus statement: the frailty operative definition-consensus conference project. J Gerontol A Biol Sci Med Sci. 2013;68(1):62–67.

17 Canadian Frailty Network. What is frailty? Available from:

18 Enright PL, McBurnie MA, Bittner V, Tracy RP, McNamara R, Arnold A, et al. The 6-min walk test - A quick measure of functional status in elderly adults. Chest. 2003;123(2):387–98.

19 Nunnally JC. Psychometric theory, 2nd edition. New York: McGraw-Hill; 1978.

20 Saxon D. In praise of Lord Kelvin [feature]. Physics World. Published Dec. 17, 2007. Available from:

21 Rasch G. Probabilistic models for some intelligence and attainment tests. Chicago: University of Chicago Press; 1980.

22 Masters GN. The key to objective measurement in the psychosocial sciences. Australian Council for Educational Research; 2001.

23 Conrad KJ, Smith EV. International Conference on objective measurement: applications of Rasch analysis in health care. Med Care. 2004;42(1).

24 Hogan DB, MacKnight C, Bergman H. Models, definitions, and criteria of frailty. Aging Clin Exp Res. 2003;15(3 Suppl):1–29.

25 World Health Organization. International classification of functioning, disability and health. Geneva: WHO; 2001.

26 Azzopardi RV, Vermeiren S, Gorus E, Habbig AK, Petrovic M, Van Den Noortgate N, et al. Linking frailty instruments to the International Classification of Functioning, Disability, and Health: a systematic review. J Am Med Dir Assoc. 2016;17(11):1066.

27 Van Damme JK, Lemmon K, Oremus M, Neiterman E, Stolee P. Understanding frailty screening: a domain mapping exercise. Can Geriatr J. 2021;24(2):154–61.

28 Guaraldi G, Milic J. The interplay between frailty and intrinsic capacity in aging and HIV infection. AIDS Res Hum Retroviruses. 2019;35(11–12):1013–22.

29 Stenner AJ, Burdick DS, Stone MH. Formative and reflective models: can a Rasch analysis tell the difference? Rasch Measure Transact. 2008;22(1):1152–53.

30 Stenner AJ. Point-biserial fit indices. Rasch Measure Transact. 1995;9(1):416.

31 Peterson CH, Gischlar KL, Peterson NA. Item construction using reflective, formative, or Rasch measurement models: implications for group work. J Special Group Work. 2017;42(1):17–32.

32 Suikkanen S, Soukkio P, Aartolahti E, Kääriä S, Kautiainen H, Hupli MT, et al. Effect of 12-month supervised, home-based physical exercise on functioning among persons with signs of frailty: a randomized controlled trial. Arch Phys Med Rehabil. 2021;102(12):2283–90.

33 Daryanti Saragih I, Yang YP, Saragih IS, Batubara SO, Lin CJ. Effects of resistance bands exercise for frail older adults: a systematic review and meta-analysis of randomised controlled studies. J Clin Nurs. 2022;31(1–2):43–61.

34 Hobart J, Cano S. Improving the evaluation of therapeutic interventions in multiple sclerosis: the role of new psychometric methods. Health Technol Assess. 2009;13(12):1–200.

35 Mayo NE, Moriello C, Scott SC, Dawes D, Auais M, Chasen M. Pedometer-facilitated walking intervention shows promising effectiveness for reducing cancer fatigue: a pilot randomized trial. Clin Rehabil. 2014;28(12):1198–209.

36 Carli F, Charlebois P, Stein B, Feldman L, Zavorsky G, Kim DJ, et al. Randomized clinical trial of prehabilitation in colorectal surgery. Br J Surg. 2010;97(8):1187–97.

37 Barbat-Artigas S, Pion CH, Leduc-Gaudet JP, Rolland Y, Aubertin-Leheudre M. Exploring the role of muscle mass, obesity, and age in the relationship between muscle quality and physical function. J Am Med Dir Assoc. 2014;15(4):303.

38 Fiore JF, Jr., Castelino T, Pecorelli N, Niculiseanu P, Balvardi S, Hershorn O, et al. Ensuring early mobilization within an enhanced recovery program for colorectal surgery: a randomized controlled trial. Ann Surg. 2017;266(2):223–31.

39 Auais M, Morin SN, Finch L, Ahmed S, Mayo N. Toward a meaningful definition of recovery after hip fracture: comparing two definitions for community-dwelling older adults. Arch Phys Med Rehabil. 2018;99(6):1108–15.

40 Theou O, Brothers TD, Mitnitski A, Rockwood K. Operationalization of frailty using eight commonly used scales and comparison of their ability to predict all-cause mortality. J Am Geriatr Soc. 2013;61(9):1537–51.

41 Andrich D. Rasch models for ordered response categories. Enc Stat Behav Sci. 2005;4:1698–707.

42 Pallant JF, Tennant A. An introduction to the Rasch measurement model: an example using the Hospital Anxiety and Depression Scale (HADS). Br J Clin Psychol. 2007;46(Pt 1):1–18.

43 de Vet HC, Terwee CB, Mokkink LB, Knol DL. Measurement in medicine: a practical guide. Cambridge, UK: Cambridge University Press; 2011.

44 Hays RD, Morales LS. The RAND-36 measure of health-related quality of life. Ann Med. 2001;33(5):350–57.

45 Binkley JM, Stratford PW, Lott SA, Riddle DL, North American Orthopaedic Rehabilitation Research Network. The Lower Extremity Functional Scale (LEFS): scale development, measurement properties, and clinical application. Phys Ther. 1999;79(4):371–83.

46 Nasreddine ZS, Phillips NA, Bedirian V, Charbonneau S, Whitehead V, Collin I, et al. The Montreal Cognitive Assessment, MoCA: a brief screening tool for mild cognitive impairment. J Am Geriatr Soc. 2005;53(4):695–99.

47 Mayo NE, Bronstein D, Scott SC, Finch LE, Miller S. Necessary and sufficient causes of participation post-stroke: practical and philosophical perspectives. Qual Life Res. 2014;23(1):39–47.

48 Bortone I, Sardone R, Lampignano L, Castellana F, Zupo R, Lozupone M, et al. How gait influences frailty models and health-related outcomes in clinical-based and population-based studies: a systematic review. J Cachexia Sarcopenia Muscle. 2021;12(2):274–97.

49 Theou O, O’Connell MD, King-Kallimanis BL, O’Halloran AM, Rockwood K, Kenny RA. Measuring frailty using self-report and test-based health measures. Age Ageing. 2015;44(3):471–77.

50 Lang PO, Michel JP, Zekry D. Frailty syndrome: a transitional state in a dynamic process. Gerontology. 2009;55(5):539–49.

51 Owen N, Healy GN, Matthews CE, Dunstan DW. Too much sitting: the population health science of sedentary behavior. Exerc Sport Sci Rev. 2010;38(3):105–13.

52 Nightingale EJ, Pourkazemi F, Hiller CE. Systematic review of timed stair tests. J Rehabil Res Dev. 2014;51(3):335–50.

53 Bohannon RW. Reference values for the five-repetition sit-to-stand test: a descriptive meta-analysis of data from elders. Percept Motor Skills. 2006;103(1):215–22.

54 Bohannon RW, Larkin PA, Cook AC, Gear J, Singer J. Decrease in timed balance test scores with aging. Phys Ther. 1984;64(7):1067–70.

55 Bohannon RW. Reference values for the timed up and go test: a descriptive meta-analysis. J Geriatr Phys Ther. 2006;29(2):64–68.

56 Bohannon RW. Test-retest reliability of hand-held dynamometry during a single session of strength assessment. Phys Ther. 1986;66(2):206–09.

Correspondence to: Nancy E Mayo, BSc(PT), MSc, PhD, Department of Medicine School of Physical and Occupational Therapy, McGill University, Division of Clinical Epidemiology Division of Geriatrics, McGill University Health Center (MUHC), Center for Outcomes Research and Evaluation (CORE), MUHC-Research Institute, 5252 de Maisonneuve, Office 2B:43, Montreal, QC, H4A 3S5,

Canadian Geriatrics Journal, Vol. 26, No. 1, March 2023