Critical appraisal of cost-effectiveness and cost-utility studies in health care
Increasingly, cost-effectiveness and cost-utility analyses (broadly referred to as cost-effectiveness analyses [CEAs]) are used to inform resource allocation decisions in health care. National bodies, such as the UK National Institute for Health and Clinical Excellence (NICE), prefer CEAs to cost–benefit and cost-minimisation studies.
CEAs assess value for money by relating the costs and consequences (effects) of 2 or more health technologies. Ascertaining value for money is important since healthcare sectors have fixed budgets. This means that a decision made in favour of one technology is associated with an opportunity cost in the form of the health technologies now forgone. In our previous Notebook,1 we gave an overview of what economic evaluation involves and why such research is important.
Like all forms of research evidence, CEAs need to be critically appraised for validity, clinical importance, and applicability before they are given a “weight” in decision-making processes. The aim of this Notebook is to guide readers through the stages of evaluating published CEAs. Broadly speaking, these studies take 2 forms: analysis of a single source of data collected in the context of primary research (eg, a randomised controlled trial [RCT]) or synthesis of data from multiple sources in a decision-analytic model. Decision-analytic models represent a simplified version of “real world” decision making.
As with other research designs, appraisal of CEAs is facilitated by applying a series of tailored questions to the research study. The seminal guide to critical appraisal of economic evaluations was developed by Drummond et al.2 In this Notebook, we draw on the Drummond checklist as well as the UK requirements for economic evaluation.3 We present a series of questions under 4 main headings: defining and presenting the decision problem, measurement and data, analysis, and discussion (table). Broadly, this guide can be applied to CEAs done as part of primary research and modelling studies. More detailed information on each can be found elsewhere.4 5
1. DEFINING AND PRESENTING THE DECISION PROBLEM
1.1 Is there a clearly phrased, answerable research question?
A CEA study should clearly outline its aim(s). Although this information is often included in other sections of the study, presenting an overview allows readers to assess quickly how relevant the research is and how well the design addresses the question posed. A well-phrased research question can address points (a) to (d) below.
(a) Does the study compare the cost and consequences of alternative health technologies?
By definition, a CEA must compare 2 or more health technologies, evaluating their costs and consequences. The health technologies of interest should be clearly identified at the outset.
(b) Is the patient population clear?
Study findings are useful only if the patients to whom they apply are clearly identified. Readers should be provided with a clear description of the disease(s) or condition(s) being evaluated, along with a clear description of the patient population.
(c) Which clinical policies were explored?
There may be varying clinical policies directing the use of a new technology. For example, in their recent CEA, Gillies et al6 modelled different policies for screening and prevention of type 2 diabetes. One policy involved one-time screening at 45 years of age and another involved up to 2 additional screenings at 50 and 60 years of age.
In all CEAs, readers should assess how the evaluated policies relate to practice. Important details may include how, when, and where the health technologies were delivered and at what frequency. This information tells us how the health technologies relate to current standard practice. Importantly, when a single clinical policy is evaluated, internal validity (ie, quality of the data) is often high, but external validity (ie, generalisability of the evidence) can be limited.
(d) From whose perspective is the study being conducted?
A good CEA will state the perspective from which the evaluation is being conducted. The perspective establishes the relevance of the study for specific decision makers since it dictates which costs and consequences should be measured (see Measurement and data).
1.2 Are relevant alternatives clearly described and assessed?
A CEA should evaluate health technologies relevant to current practice. Comparing a new technology with an outdated or unsuitable comparator limits the conclusions that can be applied to current practice. Ideally, for comprehensive decision making, a CEA should compare the use of the new technology with all relevant strategies.7 When appraising a study, readers must assess whether any important comparators were excluded. Finally, evidence of clinical effectiveness should be cited within the CEA.
2. MEASUREMENT AND DATA
2.1 What type of study has been conducted, and why?
In any CEA, the choice of study design must be appropriate for the underlying research question. The growing demand for cost-effectiveness information about health technologies means that CEAs are increasingly being conducted within RCTs. A well-conducted trial-based CEA provides useful data with high internal validity—but there are also limitations. Where multiple strategies for treating a disease or illness exist, individual RCTs provide only partial information. Time, expense, and feasibility limit the number of health technologies that can be assessed in a single study. These same factors can also limit trial duration. Additionally, basing decisions on a trial-based CEA alone means that other relevant evidence may be excluded.8
Optimal decisions regarding allocation of health technologies require the use of all available evidence on all relevant competing strategies. If evidence comes from multiple data sources, it must be synthesised using a decision-analytic model.9
2.2 Have all important and relevant costs and consequences been identified?
(a) Are all important costs included?
Important costs are those incurred when applying the health technology and those associated with its impact. Thus, decisions about which costs to include require a clear understanding of the health technologies being delivered as well as the epidemiology of the relevant condition. As highlighted above, a full description of the health technologies and the perspective taken will help readers judge the completeness of included costs.
It is unlikely that a CEA will have measured all costs associated with alternative technologies—a CEA is not the same as a burden of illness (or cost of illness) study.10 As CEAs evaluate the differences between health technologies in costs and consequences, some cost categories can legitimately be excluded. A CEA need not consider costs incurred due to future, unrelated illness occurring in years of life unaffected by the current technologies of interest.
(b) Do the costs included reflect the relevant perspective?
The study perspective will dictate the relevant costs to be assessed. If the cost perspective is that of the healthcare provider, resources may include staff time, drugs prescribed, and length of hospital stay. NICE guidance suggests that resource use should be measured from the perspective of the NHS and personal social services, whereas outcomes should comprehensively measure changes in the health of individuals. When a societal perspective is taken, the measurement of all costs, regardless of who bears them, is required. To ensure a comprehensive and useful assessment of cost-effectiveness, studies that do not take a societal perspective for the main analysis can still present relevant societal data for reference. For example, in a cost-utility analysis conducted alongside an RCT investigating physiotherapy for back and neck pain, the perspective for the main analysis was that of the NHS.11 However, data on private treatment costs incurred by trial participants and the costs of usual activities missed were also presented for completeness.
(c) Do the outcomes measured relate to the research question?
A CEA should clearly state the consequences assessed and how these were measured. If natural units were used (eg, survival time, disease-free time), these should be relevant to the decision maker, and fully justified. Moreover, effectiveness outcomes must capture all health gains and losses associated with use of the technology (eg, adverse events). From a policy perspective, a cost-utility approach is arguably the most valuable because it allows comparison of value for money across multiple diseases and health technologies. This is useful when considering the adoption of a new technology since comparisons can be made with benefits of the technologies being superseded.
2.3 Were costs and consequences measured accurately?
(a) Over what time period are costs and consequences being assessed?
Costs and consequences must be considered over an appropriate and clearly stated time period. The defined time period should be capable of capturing the differential costs and consequences of the health technology. If the time horizon is shorter than the potential time frame of costs incurred and/or benefits gained from the technologies being evaluated, the conclusions of the study are limited. If data for the required time period are not available, the available data can be extrapolated over a longer period.
(b) How were costs and consequences measured?
After evaluating whether relevant costs and consequences were identified, readers must appraise how well they were measured. This includes assessment of what data were collected, how they were collected, when they were collected and how they were synthesised.
Identifying all important costs and consequences should help dictate what data should be collected. How and when data were collected relate to the study type. In studies such as RCTs, data on resource use and effectiveness can be prospectively collected from health professionals and participants at appropriate times.
When appraising a decision-analytic model, one should evaluate whether systematic searches were done to capture all relevant research.12 Included data should be synthesised systematically using recognised evidence synthesis techniques, such as meta-analysis and indirect treatment comparisons.13 14
In decision-analytic model studies, relative treatment effects should be estimated from at least 1 relevant and robust RCT, if not from multiple RCTs. As RCT data may be limited or unavailable, supplemental high-quality observational studies will often be required (eg, long-term outcome data, including mortality, adverse events, or unanticipated benefits). Other data sources might include cohort studies for parameters relating to the natural history of the condition and cross-sectional surveys for resource use and costs. When there is no evidence to inform aspects of the CEA, expert opinion may be used with caution. All sources of included data should be provided.
Modelling studies should be explicit about the limitations of included data (eg, use of observational data for effectiveness parameter estimation). Attempts to overcome these limitations (eg, quasi-experimental studies)15 should be described and the impact of these quantified.
2.4 Were the costs and consequences valued credibly?
A common method for estimating costs in CEAs is to measure units of resource use associated with the health technologies. Prices are then assigned to each unit (unit cost). Unit costs can be obtained directly from the source (eg, hospital finance department). If this is the case, the mechanism by which the costs used in the study were calculated should be explained. In practice, limited time and data mean that unit costs are often taken from publicly available sources. These unit costs are useful tools for CEAs; they are widely used and arguably increase the comparability of results across studies.3 UK sources of unit costs include the British National Formulary,16 NHS reference costs,17 and national datasets of NHS and social services staff and services costs.18 Such sources should be clearly referenced. It is acknowledged that these unit costs can be imperfect tools since market prices and real values differ in many circumstances.19 Unit costs should be reported separately from the volume of resource use.20 Finally, in any CEA it is important for readers to know the monetary unit used (eg, Euros, pounds sterling, US dollars) and the year to which the unit costs relate.
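The costing approach described above (resource use multiplied by unit cost, with the two reported separately) can be sketched in a few lines. This is a minimal illustration only: the item names, quantities, and prices are hypothetical and are not drawn from any of the sources cited above.

```python
# Illustrative only: item names, quantities, and prices are hypothetical,
# not taken from the British National Formulary or NHS reference costs.
CURRENCY = "GBP"
PRICE_YEAR = 2007

UNIT_COSTS = {
    "gp_visit": 30.0,
    "nurse_visit": 10.0,
    "inpatient_day": 250.0,
}


def cost_breakdown(resource_use, unit_costs):
    """Return the cost of each resource item (quantity x unit cost) and the total."""
    missing = [item for item in resource_use if item not in unit_costs]
    if missing:
        raise KeyError(f"No unit cost available for: {missing}")
    per_item = {item: qty * unit_costs[item] for item, qty in resource_use.items()}
    return per_item, sum(per_item.values())


# Resource use for one hypothetical patient, kept separate from the prices.
patient_use = {"gp_visit": 3, "nurse_visit": 5, "inpatient_day": 2}
per_item, total = cost_breakdown(patient_use, UNIT_COSTS)
print(f"Total cost: {total:.2f} {CURRENCY} ({PRICE_YEAR} prices)")
```

Keeping quantities and prices in separate structures mirrors the reporting recommendation: readers can substitute their own local unit costs without re-collecting resource use data.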
When conducting a cost-utility analysis, one must value the “life lived” by using a utility index. Frequently, a health state descriptor is administered to the population of interest, and a predefined algorithm is applied to the results to obtain a single index. NICE recommends the use of validated generic health state descriptors (eg, EQ-5D) and choice-based utility conversion algorithms (based on preference elicitation) from a population sample, such as in Dolan et al.21 Where patient preferences have been measured directly in a study, the methods should be described clearly.
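In cost-utility analyses, utility indices are usually combined with time to give quality-adjusted life years (QALYs). One common way of doing this, sketched below, is to take the area under the utility curve between measurement points. The function name, the measurement times, and the utility values are assumptions made for illustration; the straight-line interpolation between measurements is itself an assumption that a study should state.

```python
def qalys_area_under_curve(times_years, utilities):
    """Approximate QALYs as the area under the utility-time curve (trapezium rule).

    times_years: measurement times in years, in increasing order.
    utilities:   utility index at each time (1 = full health, 0 = dead).
    """
    if len(times_years) != len(utilities) or len(times_years) < 2:
        raise ValueError("Need at least two matched time and utility values")
    total = 0.0
    for i in range(1, len(times_years)):
        interval = times_years[i] - times_years[i - 1]
        if interval <= 0:
            raise ValueError("Times must be strictly increasing")
        # Assume utility changes linearly between two measurements.
        total += interval * (utilities[i] + utilities[i - 1]) / 2
    return total


# Hypothetical patient measured at baseline, 6 months, and 12 months.
qalys = qalys_area_under_curve([0.0, 0.5, 1.0], [0.6, 0.7, 0.8])
print(f"QALYs over one year: {qalys:.3f}")
```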
2.5 Were costs and consequences discounted where required?
In economic terms, we generally prefer to spend money in the future but obtain benefits in the present. CEAs must adjust their values to reflect this preference. This process is called discounting. Discounting costs accounts for the fact that there is an opportunity cost to spending money now rather than in the future.
It is now generally accepted that benefits should also be discounted.7 Again, there is a time preference to obtaining health benefits, and we prefer to have benefits now rather than in the future.22 23 The values used for discounting in a CEA should be clearly stated.
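As a rough sketch of how discounting works, each future value is divided by (1 + r) raised to the number of years until it occurs. The 3.5% rate and the yearly values below are illustrative assumptions, as is the convention that the first year is left undiscounted; a CEA should state the rates and conventions it actually used.

```python
def present_value(yearly_values, annual_rate):
    """Discount a stream of yearly values (costs or benefits) to the present.

    yearly_values[0] is assumed to occur now (year 0) and is left undiscounted;
    yearly_values[t] is divided by (1 + annual_rate) ** t.
    """
    if annual_rate <= -1:
        raise ValueError("annual_rate must be greater than -1")
    return sum(v / (1 + annual_rate) ** year for year, v in enumerate(yearly_values))


# Hypothetical 3-year stream of costs and of health benefits, same rate for both.
RATE = 0.035  # illustrative discount rate
discounted_cost = present_value([100.0, 100.0, 100.0], RATE)
discounted_benefit = present_value([0.8, 0.8, 0.8], RATE)
print(f"Discounted cost: {discounted_cost:.2f}")
print(f"Discounted benefit: {discounted_benefit:.3f}")
```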
3. ANALYSIS
3.1 Was the analytic methodology appropriate?
CEA is a young and rapidly developing methodology. Important analytic developments take place regularly and can be complex. Yet for all CEAs, the mean costs and consequences are always of interest. Below, we outline some of the main issues in assessing how trial-based costs and consequences are calculated. These are (where relevant) whether the analysis accounted for censoring and missing data, whether any distributional assumptions were made, whether correlation between costs and effects was accounted for, and how multinational data were handled.
Not accounting for censoring and missing data can lead to biased mean estimates. Missing values can be imputed.24 For censoring, other analytical methodologies can be used (eg, restricted mean approach or inverse probability weighting approach).25 26
Often costs and consequences are assumed to be normally distributed, but this assumption may not hold. Recently, both frequentist and Bayesian methods have applied non-normal distributions (such as the Gamma distribution) to cost-effectiveness estimation.
It is important to consider the correlation between costs and consequences for accurate estimates. Bootstrapping is frequently used; however, both frequentist and Bayesian bivariate regression are alternative methodologies.26 27
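To illustrate how bootstrapping can preserve the correlation between costs and consequences, the sketch below resamples whole patients, so that each patient's cost and effect stay paired. It is a simplified, non-parametric illustration using only the Python standard library; the function name and the patient data are hypothetical, and it does not address censoring, missing data, or covariate adjustment.

```python
import random
from statistics import mean


def bootstrap_increments(control, treatment, n_replicates=1000, seed=0):
    """Bootstrap the incremental mean cost and effect (treatment minus control).

    Each arm is a list of (cost, effect) tuples, one per patient. Patients are
    resampled with replacement within each arm, keeping cost and effect paired.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_replicates):
        sample_c = rng.choices(control, k=len(control))
        sample_t = rng.choices(treatment, k=len(treatment))
        d_cost = mean(c for c, _ in sample_t) - mean(c for c, _ in sample_c)
        d_effect = mean(e for _, e in sample_t) - mean(e for _, e in sample_c)
        replicates.append((d_cost, d_effect))
    return replicates


# Hypothetical trial data: (cost, effect) per patient.
control_arm = [(900.0, 0.60), (1200.0, 0.55), (700.0, 0.70), (1500.0, 0.50)]
treatment_arm = [(1400.0, 0.70), (1600.0, 0.65), (1100.0, 0.80), (2000.0, 0.60)]
pairs = bootstrap_increments(control_arm, treatment_arm, n_replicates=500)
print(f"{len(pairs)} bootstrap replicates of (incremental cost, incremental effect)")
```

The replicates can be plotted on a cost-effectiveness plane or used to build an acceptability curve.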
Models with interaction terms or hierarchical models can be used to explore between-country variation in multinational trials.28
Additionally, there is increasing recognition in CEA about the importance of adjusting statistical analyses for important prognostic and baseline variables.27 This is particularly important if there is evidence of baseline imbalances in a trial.
As the purpose of models is to simplify complex relations, selecting the appropriate design and structure is vital.29 30 When appraising a decision-analytic model, readers must assess if the model structure and design describe the problem of interest and if the model is computationally viable. The structure of the model must include all important stages of the condition of interest so that corresponding costs and consequences can be assigned and the transition of patients between these states modelled. However, the model must also be practical and parsimonious.
The most common designs for decision-analytic models are decision trees, state transition models (eg, Markov models),31 and simulation models.32 Decision-tree models do not explicitly account for time and thus are not usually adequate for modelling chronic disease patient pathways. Markov models can model time, but more complex scenarios sometimes require the use of simulation models. All models require assumptions to be made; these must be documented because the internal and external validity of the model rely on the reasonableness of the assumptions. Additional details on appraising decision-analytic models can be found in Philips et al.4
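A state transition model of the kind described above can be sketched as a cohort moving between health states in fixed cycles, with costs and utilities assigned to each state. The example below is deliberately minimal: the states, transition probabilities, costs, and utilities are hypothetical, and it omits discounting and half-cycle correction, which a published model would normally need to address.

```python
def run_markov_cohort(transitions, start, state_costs, state_utilities, n_cycles):
    """Run a simple cohort Markov model with one-year cycles.

    transitions[s][t] is the probability of moving from state s to state t in
    one cycle. Costs and utilities accrue according to where the cohort is at
    the start of each cycle.
    """
    states = list(start)
    for s in states:
        if abs(sum(transitions[s][t] for t in states) - 1.0) > 1e-9:
            raise ValueError(f"Transition probabilities from '{s}' must sum to 1")
    cohort = dict(start)
    total_cost = 0.0
    total_qalys = 0.0
    for _ in range(n_cycles):
        total_cost += sum(cohort[s] * state_costs[s] for s in states)
        total_qalys += sum(cohort[s] * state_utilities[s] for s in states)
        # Move the cohort forward one cycle.
        cohort = {t: sum(cohort[s] * transitions[s][t] for s in states) for t in states}
    return total_cost, total_qalys, cohort


# Hypothetical three-state model: well, sick, dead (absorbing).
P = {
    "well": {"well": 0.85, "sick": 0.10, "dead": 0.05},
    "sick": {"well": 0.00, "sick": 0.80, "dead": 0.20},
    "dead": {"well": 0.00, "sick": 0.00, "dead": 1.00},
}
cost, qalys, final = run_markov_cohort(
    P,
    start={"well": 1.0, "sick": 0.0, "dead": 0.0},
    state_costs={"well": 200.0, "sick": 3000.0, "dead": 0.0},
    state_utilities={"well": 0.9, "sick": 0.6, "dead": 0.0},
    n_cycles=10,
)
print(f"Cost per patient: {cost:.0f}; QALYs per patient: {qalys:.2f}")
```

Running the same model once per strategy, with strategy-specific transition probabilities or costs, gives the inputs for an incremental analysis.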
3.2 Was subgroup analysis conducted, if relevant?
Heterogeneity is as relevant to CEAs as it is to RCTs and systematic reviews; that is, cost-effectiveness outcomes might differ between patient subgroups.33 Not considering heterogeneity can bias cost-effectiveness results or lead to imprecise estimates. The identification of patient subgroups for whom the technology might potentially be cost effective is one of the NICE goals for CEAs.3 For example, teriparatide (a drug aimed at reducing risk of osteoporotic fracture) was found to be cost effective but only in patients with a very high fracture risk;34 this finding contributed to national guidance on bisphosphonate use.35 Readers should use their knowledge to judge whether subgroup analyses would potentially be important.
3.3 Was an incremental analysis conducted?
A CEA aims to evaluate the incremental difference in costs and consequences between health technologies. This comparison results in 1 of 4 outcomes:
the new health technology is more costly and less effective than the comparator (new health technology is dominated)
the new health technology is less costly and more effective than the comparator (new health technology dominates)
the new health technology is more costly and more effective than the comparator
the new health technology is less costly and less effective than the comparator
When the last 2 options occur, a decision rule is required and the incremental cost-effectiveness ratio, or the net benefit, must be calculated.36 There may also be a fifth option where there is no difference between health technologies, but this situation is not common.
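The outcomes listed above, together with the incremental cost-effectiveness ratio (ICER) and net benefit, can be illustrated with a short sketch. The function name, the input values, and the willingness-to-pay threshold are hypothetical; net benefit is computed here in monetary terms as threshold x incremental effect minus incremental cost, and ties on cost or effect are treated as weak dominance, which is a simplifying assumption.

```python
def incremental_analysis(cost_new, effect_new, cost_old, effect_old, threshold):
    """Compare a new technology with a comparator.

    Returns the incremental cost and effect, the outcome category, the ICER
    (only where a decision rule is needed), and the net monetary benefit.
    """
    d_cost = cost_new - cost_old
    d_effect = effect_new - effect_old
    if d_cost == 0 and d_effect == 0:
        outcome, icer = "no difference", None
    elif d_cost >= 0 and d_effect <= 0:
        outcome, icer = "new technology is dominated", None
    elif d_cost <= 0 and d_effect >= 0:
        outcome, icer = "new technology dominates", None
    elif d_cost > 0:
        outcome, icer = "more costly and more effective", d_cost / d_effect
    else:
        outcome, icer = "less costly and less effective", d_cost / d_effect
    net_benefit = threshold * d_effect - d_cost
    return {
        "incremental_cost": d_cost,
        "incremental_effect": d_effect,
        "outcome": outcome,
        "icer": icer,
        "net_monetary_benefit": net_benefit,
    }


# Hypothetical mean cost and effect per patient; threshold is illustrative.
result = incremental_analysis(12000.0, 6.0, 10000.0, 5.5, threshold=20000.0)
print(result["outcome"], result["icer"], result["net_monetary_benefit"])
```

In this hypothetical case the ICER is 4000 per unit of effect, below the illustrative threshold, and the net monetary benefit is positive; both decision rules point the same way.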
3.4 Was uncertainty adequately assessed?
When evaluating health technologies, there is uncertainty about the extent to which the estimate reflects the “true value.” Evaluating uncertainty in parameter estimates should be done through probabilistic sensitivity analysis, with all parameters considered jointly.3 Where distributions are used to compute the probabilistic analysis, these should be spelled out. Uncertainty should be expressed using cost-effectiveness acceptability curves (alongside cost-effectiveness planes or frontier curves). These help to assess the probability of making the correct decision.37 38 Uncertainty evaluation can be complemented using sensitivity analysis, where characteristics of the main analysis are varied to assess their impact. For example, the VenUS I trial investigated the cost effectiveness of 2 compression bandaging systems (4-layer bandaging versus short stretch) in the treatment of venous leg ulcers.39 A sensitivity analysis was conducted to assess whether costing the 4-layer bandaging as commercially available kits rather than using their constituent parts (base case) affected the cost effectiveness. It did not.
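A cost-effectiveness acceptability curve can be sketched as follows: for each willingness-to-pay threshold, it reports the proportion of simulated results in which the new technology has a positive net monetary benefit. The distributions and their parameters below are hypothetical, and for brevity the incremental cost and effect are drawn directly and independently; a real probabilistic sensitivity analysis would sample the underlying model parameters jointly and rerun the model for each draw.

```python
import random


def simulate_increments(n_draws=5000, seed=1):
    """Draw hypothetical (incremental cost, incremental effect) pairs."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        d_cost = rng.gauss(2000.0, 800.0)  # hypothetical mean and SD
        d_effect = rng.gauss(0.10, 0.05)   # hypothetical mean and SD
        draws.append((d_cost, d_effect))
    return draws


def acceptability_curve(draws, thresholds):
    """Probability that the new technology is cost effective at each threshold.

    A draw counts as cost effective when its net monetary benefit
    (threshold x incremental effect - incremental cost) is positive.
    """
    curve = {}
    for threshold in thresholds:
        favourable = sum(
            1 for d_cost, d_effect in draws if threshold * d_effect - d_cost > 0
        )
        curve[threshold] = favourable / len(draws)
    return curve


ceac = acceptability_curve(simulate_increments(), [0, 10000, 20000, 30000, 50000])
for threshold, probability in ceac.items():
    print(f"Threshold {threshold}: probability cost effective = {probability:.2f}")
```

The same function can be applied to bootstrap replicates from a trial-based analysis in place of the simulated draws.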
4. DISCUSSION OF RESULTS
Finally, a good study should present its findings in light of research that has gone before it, outlining how this new piece of work adds to the current evidence base. No research study is perfect, and authors should be open about the limitations of their work. Limitations include issues already discussed, such as assumptions made, limited generalisability, and analytical issues such as how missing data were managed. The authors should also suggest how cost-effective health technologies could be implemented in current practice. While all of this information is useful, the final conclusions about the quality and relevance of the work should be made by readers after they have appraised the study.
NHS ECONOMIC EVALUATION DATABASE
This Notebook has described how to assess whether a high-quality CEA has been conducted and reported. There is an important, publicly available resource that can facilitate the appraisal process. The NHS Economic Evaluation Database (www.crd.york.ac.uk/crdweb/) is funded by the Department of Health’s NHS Research and Development Programme and housed at the Centre for Reviews and Dissemination (University of York, UK). At the end of December 2007, the database contained 6768 abstracts of economic evaluations. Each abstract is a detailed structured record describing the economic evaluation and is supplemented by a critical commentary that provides a summary of reliability and generalisability issues and discusses implications for the NHS.