

Evaluation of studies of prognosis
  Ellen Fineout-Overholt, RN, PhD
  Bernadette Mazurek Melnyk, RN, PhD, CPNP, FAAN
  Center for Nursing Research and Evidence-Based Practice, University of Rochester School of Nursing, Rochester, New York, USA


    When patients first receive a diagnosis of a disease or condition, their initial questions often focus on “what can be done?”—that is, questions of treatment. Patients also want to know what will happen to them in the short and long term, in terms of disease progression, survival, and quality of life. These are questions of prognosis. For example, the family of a patient who has had a first ischaemic stroke will want to know if the patient will die, if current disabilities such as paralysis or aphasia will continue and for how long, what kind of life the patient can expect to have after discharge from hospital, and whether the patient is likely to have a recurrent stroke. The answers to some of these questions will likely influence decision making about treatment. If a patient is likely to die in the short term, families may be unwilling to initiate invasive treatments or those associated with pain or other adverse effects. Similarly, some conditions, such as the common cold, are self limiting and will resolve in time without treatment. In such cases, patients will often forgo treatment, especially if it is costly or has unpleasant side effects.

    Nurses, in various contexts, may be faced with questions of prognosis. It is therefore important for nurses to understand how to assess and interpret evidence related to disease prognosis. This users’ guide will focus on the critical appraisal of studies of prognosis. The specific questions that will guide this appraisal, initially outlined by Laupacis et al,1 are summarised below.

    Questions to help critically appraise studies of prognosis

    Are the results valid?

    1. Was there a representative and well defined sample of patients at a similar point in the course of the disease?

    2. Was follow up sufficiently long and complete?

    3. Were objective and unbiased outcome criteria used?

    4. Did the analysis adjust for important prognostic factors?

    What are the results?

    1. How large is the likelihood of the outcome event(s) in a specified period of time?

    2. How precise are the estimates of likelihood?

    Will the results help me in caring for my patients?

    1. Were the study patients similar to my own?

    2. Will the results lead directly to selecting or avoiding therapy?

    3. Are the results useful for reassuring or counselling patients?


    Prognosis refers to the expected outcomes of a disease or condition and the probability with which they are likely to occur.1,2 Expanding the definition further, prognosis includes the effects of a disease or condition over time and the estimated chance of recovery or ongoing associated morbidity, given a set of variables, which are called prognostic factors or prognostic indicators. Prognostic factors are variables that predict which patients are likely to do better or worse over time.2 For example, the Perth Community Stroke Study examined the factors that predicted death and disability at 5 years in patients with a first ever stroke who survived the first 30 days.3 Patients were assessed at baseline for 26 variables. At 5 years, 45% of patients had died, and 36% had new disabilities. Factors that predicted death or disability (ie, prognostic factors) included age, moderate or severe hemiparesis, and disability at baseline. More specifically, patients who had moderate hemiparesis at baseline were almost 3 times more likely to die or be disabled at 5 years, whereas those with severe hemiparesis were over 4 times more likely to die or be disabled. Thus, prognostic factors can help us to predict which patients are more or less likely to experience a given outcome.


    Questions of prognosis can be addressed by case-control studies or cohort studies. As well, randomised controlled trials implicitly address questions of prognosis, as each arm of a trial (treatment and control) can be seen as a cohort study.2

    Let’s consider the following question: which patients are most likely to die within 30 days of a first acute myocardial infarction (MI)? A case-control study design might involve identifying a group of patients with a first MI who had died (cases) and a group who had survived (controls), and then identifying the characteristics (prognostic factors) that distinguish between the 2 groups (eg, age or sex). Limitations of case-control studies include the risk that selection of cases and controls may be biased such that the groups differ systematically in unknown ways.1 Furthermore, retrospective collection of data on prognostic factors relies on the accuracy of people’s memories or the accuracy of medical charts.1 Such limitations decrease the strength of the evidence in guiding clinical decision making.2 Prognostic questions are best addressed using cohort study designs, which are not subject to the same problems as case-control studies. In our example, a cohort study would involve identifying a group of patients (cohort) at the time of their first MI, collecting baseline data on various characteristics that might be associated with the outcome (mortality), and then following up the cohort over time to see which patients die and which survive. Cohort studies may also include a control group. In our example, the control group could include people who have not had an MI and are followed up over the same time period.


    You work in a paediatric primary care facility. Your first patient of the day is an 8 month old girl, Amy, who was recently discharged from hospital after an episode of meningitis. Amy has now been brought to your clinic for follow up. Her parents are concerned about whether Amy is likely to have any developmental problems or disabilities as a result of the meningitis. You don’t know the answer, but offer to find out.

    You begin by formulating a question as a basis for your search: are young children who have meningitis likely to have long term neurological, cognitive, behavioural, or developmental sequelae? To save time, you decide to search Evidence-Based Nursing online because the content includes only studies and reviews that meet specific methodological criteria. You begin searching by typing the terms “meningitis” and “prognosis” into the “Word(s) Anywhere in Article” field and identify an abstract4 of a study by Bedford et al5 on infants in England and Wales who had meningitis and were followed up for 5 years. As you begin to read the article, you use the questions summarised in the box to assess the quality of the study and the relevance of the findings to your question.


    Are the results valid?

    Was there a representative and well defined sample of patients at a similar point in the course of the disease?

    It is important to have a representative sample of patients in order to minimise bias. Bias refers to systematic differences from the truth.2 In a prognosis study, bias can lead to systematic overestimates or underestimates of the likelihood of specific outcomes.2 For example, if patients were recruited from tertiary care centres, which typically deal with patients who have rare or severe conditions, the sample would not likely be representative of patients presenting in primary care settings.2 Authors should clearly indicate how a sample was selected and the criteria used to diagnose the condition.2

    It is also important that patients included in a prognosis study all have a similar prognostic risk so that meaningful conclusions can be drawn about the expected outcomes. Prognosis study samples should comprise an inception cohort of patients who are at a similar, clearly described point in the disease process. Inception cohorts often include patients with a first onset of a disease or condition (eg, a first ever MI) or those who have recently been diagnosed. The stage of a disease will clearly influence outcomes. For example, studies of 5 year mortality rates in women with breast cancer could include women diagnosed with different stages of breast cancer. We might expect higher 5 year mortality rates for women diagnosed with advanced stage cancer than for those diagnosed at an earlier stage. When it is not possible to achieve a homogeneous sample (eg, participants are at disparate points in their illness trajectories), the authors should report the data by disease stage or some other indicator of severity (eg, APACHE II scores).

    Let’s return to our clinical example and consider the study by Bedford et al5 on 5 year follow up of infants with meningitis. The sample comprised 1717 children (index children) who had survived an episode of acute meningitis during their first year and 1485 age and sex matched controls identified from the general practices of each index child. An earlier report of this study described the identification and selection of the sample.6 Monthly cards were sent to 566 consultant paediatricians asking if they had managed any cases of acute infantile meningitis during the previous month. Clinical and laboratory information was collected on standard forms sent to the paediatricians and consultant microbiologist involved in the management of the case. Death certificates for all infants recorded as having died of meningitis were obtained from the Office of Population Census and Surveys. The initial inclusion criterion for the study was “intention to treat” by the paediatrician. Infants who had viable bacteria, viruses, or detectable bacterial antigen in the cerebrospinal fluid (CSF) or white cell counts in the CSF >20 × 10⁶/l were included. Infants who had clinical conditions that were highly suggestive of meningitis but were too ill to have a lumbar puncture were also included. Infants with spina bifida and ventricular shunt infections were excluded.

    Thus, the original sample included children with confirmed meningitis (defined by objective criteria) except for those too ill for lumbar puncture. The children were initially identified by treating paediatricians and followed up through their general practitioners. The control group of age and sex matched children was selected from the general practices attended by index children. This suggests that the sample is representative of children in general practice settings. Although it was not explicitly stated that only infants with a first case of meningitis were included (inception cohort), this was probably the case given that all infants had contracted the disease before the age of 1 year.

    Was follow up sufficiently long and complete?

    The follow up period should be long enough to detect the outcomes of interest.1 That is, the appropriate length of follow up will depend on the outcome of interest. For example, to determine the risk of severe disability in patients with rheumatoid arthritis, a 10 year follow up period would yield more meaningful results than a 6 month follow up. In contrast, severity of West Nile infection after a bite from an infected mosquito will be evident within a few days. In addition to the length of follow up, readers of prognosis studies need to consider the completeness of follow up.1 If a large percentage of patients from the original sample are not available for follow up, the likelihood of bias may increase. That is, participants who are not available for follow up may have systematically higher or lower risks of particular outcomes than those who are available for follow up.2 Study participants may become unavailable during follow up because they move to different geographic locations, lose interest in participating in the study, or die. Study authors need to account for all patients included in the original sample and to provide information about the characteristics of patients who are lost to follow up. Patients who die need to be identified through death certificates or health databases. Excluding patients who die, or who are lost to follow up for other reasons, could lead to underestimates or overestimates of the likelihood of the outcomes of disease.

    Applying these criteria to the study by Bedford et al, we see that infants who had meningitis in their first year were followed up to 5 years of age. The outcomes of interest, cognitive or behavioural disabilities, are likely to be identifiable by this time, particularly with the onset of formal schooling. Indeed, Bedford et al5 included information on type of schooling to determine degree of disability. The initial sample included 1880 children with meningitis, of whom 163 died, and 1485 children in the control group. Data were available at 5 year follow up for 1584 of the 1717 surviving children (92%) who had meningitis and 1391 of 1485 children in the control group (94%). The authors accounted for all children included in the original sample, specifying the reasons for missing data in both the index children and the control group. Reasons included emigration, loss to follow up, and lack of response by both parents and general practitioners to questionnaires. The high follow up rates of 92% and 94% help to minimise the possibility of bias resulting from large numbers of participants not being included in the analysis. (As an aside, Evidence-Based Nursing only abstracts prognosis studies that include ⩾80% follow up in order to minimise the possibility of bias.)
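
    The follow up percentages above can be rechecked from the reported counts. The short Python sketch below is only an illustration: the function name `follow_up_rate` is our own, and the 80% threshold is the abstraction criterion mentioned in the aside.

```python
def follow_up_rate(followed: int, eligible: int) -> float:
    """Return the percentage of eligible participants with outcome data."""
    if eligible <= 0:
        raise ValueError("eligible must be positive")
    return 100.0 * followed / eligible


# Counts as reported in the text for the study by Bedford et al.
survivors = 1880 - 163                      # index children who survived
index_rate = follow_up_rate(1584, survivors)
control_rate = follow_up_rate(1391, 1485)

print(f"index children followed up: {index_rate:.1f}%")
print(f"controls followed up: {control_rate:.1f}%")
# Illustrative check against an 80% follow up threshold.
print("both groups meet 80%:", min(index_rate, control_rate) >= 80)
```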

    Were objective and unbiased outcome criteria used?

    Outcomes should be defined at the beginning of a prognosis study, and objective measures should be used when possible.2 Objectivity of outcomes can be described along a continuum of judgment.1 Some outcomes, such as death, are objective; they are easily measured and require no judgment—a person is either dead or alive. Other outcomes, such as disability or quality of life, are more difficult to quantify, and their measurement may be subject to considerable judgment by outcome assessors.1 The assessment of these more subjective outcomes could be influenced by knowledge of which prognostic factors were present at baseline. For example, a person assessing disability in patients with rheumatoid arthritis may be influenced by knowledge of the patient’s previous activity level, believing that those who were less active are more likely to have severe disability. To minimise the possibility of bias, it is especially important that those people assessing more subjective outcomes be blinded to the prognostic indicators of participants or that self administered questionnaires be used.1 Blinding of outcome assessors may not be needed when outcomes are objective and unequivocal (eg, death).1

    Returning to the study by Bedford et al,5 we note that the main outcome of interest was disability. Data were collected using questionnaires completed by general practitioners and families of participants. General practitioners were asked about the child’s neuromotor development, learning, vision, hearing, speech and language, behaviour, and seizure disorders. Parents reported on their child’s health, development, and schooling. The questionnaires were specifically developed for this study, and no information was given about testing of the reliability or validity of the questionnaires.

    Obviously, general practitioners and parents were aware of whether the child had had an episode of meningitis and of the presence of specific prognostic factors, and this knowledge could have influenced their responses to questions about certain outcomes.

    The authors used data from both general practitioner and parent questionnaires to assign each child to 1 of 4 categories of disability based on an existing model: no disability (no developmental problems); mild disability (middle ear disease, strabismus, febrile convulsions, and behavioural problems); moderate disability (mild neuromotor disabilities, intellectual impairment, moderate sensorineural hearing loss, mild to moderate visual impairment, treatment controlled epilepsy, and uncomplicated hydrocephalus); and severe disability (severe neuromotor and intellectual impairment, severe seizure disorders, and severe visual or auditory impairment). The authors did not report whether the person(s) responsible for assigning levels of severity were blinded to knowledge about whether a given child had had meningitis or the presence of specific prognostic factors. Again, such knowledge could have influenced decisions about assigning a severity level, especially in areas where considerable judgement was needed to interpret the responses on the questionnaires. Thus, a possibility exists that bias may have influenced reporting by general practitioners and parents and the determination of levels of disability by study personnel.

    Did the analysis adjust for important prognostic factors?

    As previously stated, studies of prognosis usually collect data on several prognostic factors that are thought to influence the outcome of interest. Decisions about which prognostic factors are most relevant are usually based on clinical experience and an understanding of the biology of the disease.2 When analysing the results of a prognosis study, authors usually identify different groups of patients based on these prognostic factors, and adjust for these different factors in the analysis.1 Such adjustment is important to identify which factors best predict outcomes. For example, Camfield et al followed up an inception cohort of 692 children with epilepsy for up to 22 years to identify factors associated with all cause mortality.7 The mortality rate was 6% at 20 years after onset compared with a rate of 0.88% in the general population. Initial analyses seemed to indicate that children who had onset at birth and those with secondary generalised epilepsy were more likely to die after 20 years. However, these differences disappeared when the analyses adjusted for the presence of severe neurological disorder. That is, children with epilepsy who had severe neurological deficits had a substantially increased risk of death after 20 years (an increase of 210%) compared with the general population; children with epilepsy who did not have neurological deficits had a similar risk of death to that of the general population. Onset at birth and type of epilepsy were not, in fact, associated with differential mortality rates. Without the inclusion of “neurological disorder” as a prognostic factor in the analysis, one could have mistakenly assumed that children who developed epilepsy at birth and those with secondary generalised epilepsy had an increased risk of mortality.
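
    To see how adjustment can make an apparent association disappear, consider the following Python sketch. All counts are hypothetical and were chosen only to illustrate the point; they are not data from Camfield et al. The sketch compares a crude RR with a Mantel-Haenszel RR stratified by a third factor. The Mantel-Haenszel method is our choice for illustration and is not necessarily the method used in the study.

```python
from typing import List, Tuple

# Each stratum: (deaths_exposed, n_exposed, deaths_unexposed, n_unexposed).
# "Exposed" stands for onset at birth. All counts are HYPOTHETICAL.
Stratum = Tuple[int, int, int, int]

strata: List[Stratum] = [
    (20, 100, 10, 50),   # severe neurological disorder present
    (1, 50, 8, 400),     # severe neurological disorder absent
]


def crude_rr(strata: List[Stratum]) -> float:
    """RR ignoring the stratifying factor (all strata pooled)."""
    a = sum(s[0] for s in strata)
    n1 = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    n0 = sum(s[3] for s in strata)
    return (a / n1) / (c / n0)


def mantel_haenszel_rr(strata: List[Stratum]) -> float:
    """RR adjusted for the stratifying factor (Mantel-Haenszel weights)."""
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return num / den


crude = crude_rr(strata)
adjusted = mantel_haenszel_rr(strata)
print(f"crude RR: {crude:.2f}")
print(f"adjusted RR: {adjusted:.2f}")
```

    In this invented example the crude RR is 3.5, but within each stratum the risk of death is the same with or without onset at birth, so the adjusted RR is 1.0. The crude association arises only because onset at birth is more common among children with a severe neurological disorder.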

    Treatments administered to patients can also modify outcomes, and thus may be considered when adjusting for prognostic indicators. Although interventions are not considered to be prognostic factors per se, differential application or receipt of treatments in patients may influence outcomes.1

    In their analyses, Bedford et al5 included age of onset of infection (neonatal period or later), organism associated with the infection, birth weight, and gestational age as prognostic factors.


    What are the results?

    The results of a prognosis study quantify the number of events that occur over a period of time.2 These results can be expressed in different ways, which are described below.

    How large is the likelihood of the outcome event(s) in a specified period of time?

    Most simply, the outcome of a prognosis study can be expressed as a percentage.1 For example, a study of infants born with HIV infection found that 26% had died at a median follow up of 5.8 years.8 Thus, one could say that an infant born with HIV infection has a 26% chance of dying at 5.8 years.

    We know, however, that the risk of a particular outcome may vary in patients with different prognostic factors. Estimates of risk in patients with different prognostic factors are often presented as relative risks (RRs) or odds ratios (ORs). The RR is the risk of patients with a specific prognostic factor experiencing the outcome divided by the risk of patients without the specific prognostic factor experiencing the outcome. (An RR can also be used to represent the risk of patients with the disease experiencing the outcome divided by the risk of patients without the disease [control group] experiencing the outcome.) If the risk of the outcome is the same in patients with and without the prognostic factor, the RR will be 1.0. If the RR is <1.0, the risk of the outcome is reduced in patients with the specific prognostic factor when compared with patients without the prognostic factor. If the RR is >1.0, the risk of the outcome is increased in patients with the prognostic factor when compared with those without the prognostic factor. The further away the RR is from 1.0, the greater the strength of the association between the prognostic factor and the outcome.9 For various statistical reasons, some studies will express the outcome as the odds of the event rather than the risk of the event. The OR is the odds of the outcome in the patients with a specific prognostic factor divided by the odds of the outcome in patients without the prognostic factor.9 The interpretation of ORs of 1, <1, and >1 is similar to that for RRs.9

    Sometimes, we will be interested in determining whether the risk of a particular outcome changes over time. For example, we know that the risk of death after a myocardial infarction is highest immediately after the event and decreases thereafter.2 To address changes in the risk of a particular outcome over time, authors often use survival analysis and represent the results as a survival curve or Kaplan-Meier curve.2 A survival curve is a graph of the number of events (or freedom from events) over time.
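
    To make the idea of a survival curve concrete, the following Python sketch computes Kaplan-Meier estimates by hand. The follow up times and outcomes are hypothetical and are not taken from any study cited here, and the function name `kaplan_meier` is our own. Patients who leave the study without having the event are treated as censored: they count as at risk up to the time they leave.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, estimated survival probability) at each event time.

    events[i] is True if patient i had the event at times[i],
    False if the patient was censored (left the study) at times[i].
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group everyone whose follow up ended at time t.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# HYPOTHETICAL follow up (months) for 6 patients; False means censored.
months = [1, 2, 2, 4, 5, 6]
had_event = [True, True, False, True, False, True]

curve = kaplan_meier(months, had_event)
for t, s in curve:
    print(f"month {t}: estimated survival {s:.2f}")
```

    Each step multiplies the running survival estimate by the proportion of patients at risk who did not have the event at that time, which is why censored patients still contribute information up to the point they leave.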

    In the study by Bedford et al,5 we find that 247 children (16%) who had meningitis had severe or moderate disabilities at 5 years of age, whereas only 21 children in the control group (1.5%) had such disabilities. The RR of 10.33 means that children who had meningitis in their first year of life were over 10 times more likely to have moderate or severe disabilities by age 5 years than children who did not have meningitis. You will recall that the authors considered age at infection, organism, birth weight, and gestational age as prognostic factors. Bedford et al5 found that children who had meningitis within the first month of life were more likely to have moderate disabilities at 5 years than those who had meningitis after the first month; the percentage of children with severe disabilities did not differ by age of onset. As well, rates of severe or moderate disability differed by the type of infecting organism. After controlling for birth weight and gestational age, children who had had meningitis still had a 7 fold increase in the risk of severe or moderate disability (weighted RRs of 7.11 and 7.64, respectively).
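
    The RR reported above can be reproduced from the counts in the text. The Python sketch below assumes that the denominators are the children with 5 year follow up data (1584 index children and 1391 controls); this summary does not state the denominators explicitly, so treat that as an assumption. The OR is calculated alongside for comparison, and the function names are illustrative.

```python
def relative_risk(events_a: int, total_a: int, events_b: int, total_b: int) -> float:
    """Risk in group A divided by risk in group B."""
    return (events_a / total_a) / (events_b / total_b)


def odds_ratio(events_a: int, total_a: int, events_b: int, total_b: int) -> float:
    """Odds of the outcome in group A divided by odds in group B."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Moderate or severe disability at 5 years; denominators are ASSUMED to be
# the numbers of children with follow up data.
rr = relative_risk(247, 1584, 21, 1391)
odds = odds_ratio(247, 1584, 21, 1391)

print(f"RR: {rr:.2f}")
print(f"OR: {odds:.2f}")
```

    With these counts the RR is about 10.3, matching the reported value, and the OR is about 12. The OR lies further from 1 than the RR because the outcome is not rare in the index children.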

    How precise are the estimates of likelihood?

    Studies can provide only estimates of the true risk of an outcome.1 Thus, it is important to determine the precision of estimates of risk. An RR provides an estimate of the risk of a given outcome for the study sample. Readers, however, need to be fairly certain that the estimated RR is close to the true population RR. Confidence intervals (CIs) are the most accurate means of showing precision.1 The 95% CI is the range of risks within which we can be 95% sure that the true value for the whole population lies.2

    Returning to the study by Bedford et al,5 we see that the RR of 10.33 for moderate or severe disability at 5 years had a 95% CI of 6.60 to 16.0. This means that we can be 95% certain that the true population RR is between 6.6 and 16.0. The weighted RRs and 95% CIs for severe or moderate disability after controlling for birth weight and gestational age were 7.11 (4.30 to 11.7) and 7.64 (4.56 to 12.79), respectively. Thus, we see that the RRs are associated with a moderate degree of precision.
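
    For readers who want to see where such an interval comes from, the Python sketch below calculates an approximate 95% CI for an RR using the common log transformation method. This is an assumption on our part: the method used by Bedford et al is not described in this summary, and the denominators (1584 and 1391 children with follow up data) are also assumed.

```python
import math
from typing import Tuple


def rr_with_ci(events_a: int, total_a: int, events_b: int, total_b: int,
               z: float = 1.96) -> Tuple[float, float, float]:
    """Return (RR, lower limit, upper limit) using the log method."""
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR) for two independent proportions.
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Counts from the text; denominators are ASSUMED (children with follow up data).
rr, lower, upper = rr_with_ci(247, 1584, 21, 1391)
print(f"RR {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

    Under these assumptions the interval is roughly 6.7 to 16.0, which is close to, but not identical with, the published 6.60 to 16.0. The small difference may reflect a different method or rounding.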


    Will the results help me in caring for my patients?

    Were the study patients similar to my own?

    Generalisability of findings is a primary concern for researchers and users of evidence. In a study report, the sample must be described in sufficient detail so that clinicians can compare the sample to their own patients. As with any research, the more similar the study sample is to a clinician’s patients, the more certain she can be about applying the findings in clinical decisions.

    Based on the study report by Bedford et al,5 we don’t know much about the demographic characteristics of the sample. Information relating to the age, sex, and perhaps economic background of the sample might have been helpful for readers attempting to discern similarities or differences with their own patients. We do know, however, that this was a national study done in England and Wales. Readers from other countries should consider whether differences in their own settings could substantially alter the findings (eg, differences in health care or disease rates). We also know that the children with meningitis and those in the control group were followed up through their general practitioners, a context similar to the primary care setting in which you see Amy as a patient.

    Will the results lead directly to selecting or avoiding therapy?

    The study by Bedford et al5 found that children with meningitis were more than 10 times more likely to have moderate to severe disabilities by age 5 years than children who did not have meningitis. We also know that prognostic factors such as age of onset, birth weight, and gestational age do not really differentiate between children who are more or less likely to develop these disabilities. Type of infecting organism was, however, associated with the likelihood of moderate to severe disability. Obviously, none of this information will provide you, or Amy’s parents, with a definitive answer about what to do. Together, you will need to decide how to deal with the increased risk of disability. Decisions may relate to assessment—that is, what types of assessment can help to identify disabilities and when (and how frequently) should these assessments be done. You will likely need to gather more information on the specific types of disabilities that may occur and whether any treatments are effective in preventing, delaying, or overcoming these disabilities.

    Are the results useful for reassuring or counselling patients?

    As suggested previously, treating a patient is not always the desired goal. Sometimes, evidence from a prognostic study can assist practitioners or families to determine whether interventions should be initiated, especially if the likelihood of adverse outcomes is high. Similarly, some diseases have good prognoses, and patients and families may decide to forgo treatment because a positive outcome is likely. Patients and families should be involved in clinical decisions and provide their views on the risks and benefits of any assessments or treatments given the likelihood of outcomes of interest.

    Your interpretation of the results of the study by Bedford et al5 suggests that the risk of disability, while increased, does not preclude consideration of assessment or other intervention. With this in mind, you discuss with Amy’s parents their views about the value of assessing Amy over the next couple of years or doing nothing.


    You meet with Amy’s parents to discuss what you have learned from the study by Bedford et al.5 You note that you do not know much about the demographics of the study sample or how meningitis was treated. You also note that the authors followed up over 90% of children up to 5 years of age, which increases your confidence in the findings. It is clear that about 1 in 6 children who have meningitis in the first year of life (ie, 16%) will have moderate to severe developmental disabilities at 5 years of age. The risk is about 10 times that of children generally, but differs depending on the type of infective agent. In Amy’s case, the infective agent was Neisseria meningitidis, which is associated with a 9.4% risk of severe to moderate disability. This risk is about 6 times that of children generally. If Amy had been infected by Group B streptococcus, a somewhat rarer infective agent, she would have had a 30% risk of moderate to severe disability, which is about 20 times that of children generally.
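
    When counselling parents, it can help to restate percentages as "1 in N" frequencies and as multiples of the risk in children generally. The Python sketch below does this for the percentages quoted above, using the 1.5% risk in the control group as the comparison. The function name is our own, and because the inputs are rounded percentages, the ratios are approximate (for example, 16% divided by 1.5% gives about 10.7 rather than the reported RR of 10.33).

```python
from typing import Tuple

BASELINE_RISK = 1.5  # % of control children with moderate or severe disability


def describe_risk(risk_percent: float, baseline_percent: float = BASELINE_RISK) -> Tuple[int, float]:
    """Return (N for a '1 in N' statement, ratio to the baseline risk)."""
    one_in_n = round(100 / risk_percent)
    ratio = risk_percent / baseline_percent
    return one_in_n, ratio


# Rounded percentages as quoted in the text.
risks = {
    "any meningitis in first year": 16.0,
    "Neisseria meningitidis": 9.4,
    "Group B streptococcus": 30.0,
}

for label, risk in risks.items():
    one_in_n, ratio = describe_risk(risk)
    print(f"{label}: {risk}% (about 1 in {one_in_n}), "
          f"about {ratio:.1f} times the risk in children generally")
```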

    Based on this information, you and Amy’s parents begin to discuss Amy’s likely needs.


    Studies of prognosis can provide clinicians with useful information about the expected outcomes of a disease or condition and the probability with which they are likely to occur. Assessment of relevant prognostic factors can help to identify which patients are more or less likely to experience a given outcome, and can serve as a basis for clinical decisions about treatment. Some key considerations when appraising studies of prognosis include the sample (an inception cohort, where patients have a similar prognostic risk), inclusion of relevant prognostic factors in data collection and analysis, sufficient length of follow up with respect to the outcomes of interest, percentage of patients followed up (higher percentages help to minimise bias), and objectivity of outcomes (more objective outcomes and blinding of outcome assessors help to minimise bias).

