Systematic Reviews: Methods and Procedures
George A. Wells
Editor, Cochrane Musculoskeletal Review Group
Department of Epidemiology and Community Medicine
University of Ottawa, Ottawa, Ontario, Canada

Meta-analysis
• Meta-analysis is a statistical analysis of a collection of studies
• Meta-analysis methods focus on contrasting and comparing results from different studies in anticipation of identifying consistent patterns and sources of disagreement among these results
• Primary objective:
  – Synthetic goal (estimation of summary effect) vs
  – Analytic goal (estimation of differences)
• Systematic Review: the application of scientific strategies that limit bias to the systematic assembly, critical appraisal and synthesis of all relevant studies on a specific topic
• Meta-Analysis: a systematic review that employs statistical methods to combine and summarize the results of several studies

Features of narrative reviews and systematic reviews
                  NARRATIVE                               SYSTEMATIC
QUESTION          Broad                                   Focused
SOURCES/SEARCH    Usually unspecified; possibly biased    Comprehensive; explicit
SELECTION         Unspecified; biased?                    Criterion-based; uniformly applied
APPRAISAL         Variable                                Rigorous
SYNTHESIS         Usually qualitative                     Quantitative
INFERENCE         Sometimes evidence-based                Usually evidence-based

Steps of a Cochrane Systematic Review
• Clearly formulated question
• Comprehensive data search
• Unbiased selection and extraction process
• Critical appraisal of data
• Synthesis of data
• Perform sensitivity and subgroup analyses if appropriate and possible
• Prepare a structured report

• What is the study objective?
  – to validate results in a large population
  – to guide new studies
• Pose the question in both biologic and health care terms, specifying with operational definitions:
  – population
  – intervention
  – outcomes (both beneficial and harmful)

Inclusion Criteria
• Study design
• Population
• Interventions
• Outcomes

• Need a well formulated and co-ordinated effort
• Seek guidance from a librarian
• Specify language constraints
• Requirements for comprehensiveness of the search depend on the field and the question to be addressed
• Possible sources include: computerized bibliographic databases, review articles, abstracts, conference proceedings, dissertations, books, experts, granting agencies, trial registries, industry, journal handsearching
• Procedure:
  – usually begin with searches of bibliographic reports (citation indexes, abstract databases)
  – publications are retrieved and the references therein are searched for more references

Published Reports (publication bias, i.e. the tendency to publish statistically significant results)
• As a step towards eliminating publication bias, information from unpublished research is needed:
  – databases of unpublished reports
  – clinical research registries
  – clinical trial registries
  – unpublished theses
  – conference indexes

Study Selection
• 2 independent reviewers select studies
• Selection of studies addressing the question posed is based on a priori specification of the population, intervention, outcomes and study design
• Level of agreement: kappa
• Differences resolved by consensus
• Specify reasons for rejecting studies

Data Extraction
• 2 independent reviewers extract data using predetermined forms:
  – Patient characteristics
  – Study design and methods
  – Study results
  – Methodologic quality
• Level of agreement: kappa
• Differences resolved by consensus
• Be explicit, unbiased and reproducible
• Include all relevant measures of benefit and harm of the intervention
• Contact investigators of the studies for clarification of published methods etc.
• Extract individual patient data when published data do not answer questions about: intention-to-treat analyses, time-to-event analyses, subgroups, dose-response relationships

Description of Studies
• Size of study
• Characteristics of study patients
• Details of specific interventions used
• Details of outcomes assessed

Methodologic Quality Assessment
• Can use as:
  – threshold for inclusion
  – possible explanation for heterogeneity
• Base quality assessments on the extent to which bias is minimized
• Make quality assessment scoring systems transparent and parsimonious
• Evaluate reproducibility of quality assessment
• Report quality scoring system used

Quality Assessment: Example
Study            Random   Blinding   Dropouts
Adami 1995       +        +          +
Black 1996       ++       +          +
Bone 1997        +        +          --
Chestnut 1995    +        +          +
Hosking 1998     +        --         +
Liberman 1995    +        +          +
McClung 1998     +        +          +
++ indicates that randomization was appropriate (e.g. random numbers were computer generated)

Outcome
• Discrete (event): Odds Ratio (OR), Relative Risk (RR), Risk Difference (RD)
• Continuous (measured): Mean Difference (MD), Standardized Mean Difference (SMD)
• In both cases the basic data from each study are combined into an overall estimate, using either a fixed-effects or a random-effects model

Effect
measures: discrete data
$P_1$ = event rate in experimental group; $P_2$ = event rate in control group
• RD = risk difference = $P_2 - P_1$ (on this slide taken as control minus experimental, so that a reduction in events is positive; the later slides define RD as $P_e - P_c$)
• RR = relative risk = $P_1 / P_2$
• RRR = relative risk reduction = $(P_2 - P_1)/P_2$
• OR = odds ratio = $[P_1/(1-P_1)] / [P_2/(1-P_2)]$
• NNT = number needed to treat = $1/(P_2 - P_1)$

Example
Experimental event rate = 0.3; control event rate = 0.4
• RD = 0.4 - 0.3 = 0.1
• RR = 0.3 / 0.4 = 0.75
• RRR = (0.4 - 0.3) / 0.4 = 0.25
• OR = (0.3/0.7) / (0.4/0.6) = 0.64
• NNT = 1 / (0.4 - 0.3) = 10

Discrete - Odds Ratio (OR)
                Event   No event   Total
Experimental    a       b          n_e
Control         c       d          n_c
$P_e = a/n_e$, $P_c = c/n_c$
Odds: (number of patients experiencing event) / (number of patients not experiencing event)
Odds ratio: (odds in experimental group) / (odds in control group)
$$OR = \frac{P_e/(1-P_e)}{P_c/(1-P_c)} = \frac{ad}{bc}$$
Basic data: $a/n_e$ and $c/n_c$

Discrete - Odds Ratio - Example
                Event   No event   Total
Experimental    13      33         46
Control         7       31         38
$P_e = 13/46$, $P_c = 7/38$
$$OR = \frac{13 \times 31}{7 \times 33} = 1.745$$

Discrete - Relative Risk (RR)
Same 2x2 layout, with $P_e = a/n_e$ and $P_c = c/n_c$
Risk: (number of patients experiencing event) / (number of patients)
Risk ratio: (risk in experimental group) / (risk in control group)
$$RR = \frac{P_e}{P_c} = \frac{a(c+d)}{(a+b)c}$$

Discrete - Relative Risk - Example
$$RR = \frac{P_e}{P_c} = \frac{13/46}{7/38} = 1.534$$

Discrete - Risk Difference (RD)
Risk difference: (risk in experimental group) - (risk in control group)
$$RD = P_e - P_c = \frac{a}{a+b} - \frac{c}{c+d}$$

Discrete - Risk Difference - Example
$$RD = P_e - P_c = 13/46 - 7/38 = 0.098$$

Discrete - Odds Ratio (O): estimation
Estimator: $o = \dfrac{\hat p_e/(1-\hat p_e)}{\hat p_c/(1-\hat p_c)}$, with $L_o = \ln(o)$
Standard error:
$$s_{L_o} = \left[\frac{1}{n_e p_e(1-p_e)} + \frac{1}{n_c p_c(1-p_c)}\right]^{1/2}$$
100(1-α)% CI: $\exp(L_o \pm Z_{\alpha/2}\, s_{L_o})$

Discrete - Relative Risk (R): estimation
Estimator: $r = \hat p_e/\hat p_c$, with $L_r = \ln(r)$
Standard error:
$$s_{L_r} = \left[\frac{1-p_e}{n_e p_e} + \frac{1-p_c}{n_c p_c}\right]^{1/2}$$
100(1-α)% CI: $\exp(L_r \pm Z_{\alpha/2}\, s_{L_r})$

Discrete - Risk Difference (D): estimation
Estimator: $d = \hat p_e - \hat p_c$
Standard error:
$$s_d = \left[\frac{p_e(1-p_e)}{n_e} + \frac{p_c(1-p_c)}{n_c}\right]^{1/2}$$
100(1-α)% CI: $d \pm Z_{\alpha/2}\, s_d$

When to use OR / RR / RD
Association     OR       RR       RD
Range           (0,∞)    (0,∞)    (-1,1)
'Decreased'     <1       <1       <0
None            1        1        0
'Increased'     >1       >1       >0

OR vs RR
• Odds ratio ≈ relative risk if the event occurs infrequently (i.e. a and c small relative to b and d):
$$RR = \frac{a(c+d)}{(a+b)c} \approx \frac{ad}{bc} = OR$$
• If the event occurs frequently, the odds ratio lies further from 1 than the relative risk (OR > RR when the risk is increased)

RD vs RR
• Use RD when interpretation in terms of absolute difference is better than in relative terms (e.g. interest in absolute reduction in adverse events)

PROPERTIES OF RISK DIFFERENCE (RD), RELATIVE RISK (RR) AND ODDS RATIO (OR)
                                                                            RD    RR    OR
Simple measure?                                                             Yes   Yes   No
Symmetric (measure unaffected by labelling of study groups)?                Yes   No    Yes
Predicted event rates restricted to [0,1] if measure is assumed constant?   No    No    Yes
Unbiased estimate available?                                                Yes   No    No
Efficient estimation in small samples?                                      No    No    Yes
Motivating biological model available?                                      Yes   Yes   Yes

Continuous Data - Mean Difference (MD)
                Number   Mean          Standard deviation
Experimental    n_e      $\bar x_e$    s_e
Control         n_c      $\bar x_c$    s_c
Mean difference: $MD = \bar x_e - \bar x_c$
$$se(\bar x_e - \bar x_c) = \sqrt{\frac{s_e^2}{n_e} + \frac{s_c^2}{n_c}}$$
100(1-α)% CI: $(\bar x_e - \bar x_c) \pm Z_{\alpha/2}\, se(\bar x_e - \bar x_c)$

Continuous Data - Standardized Mean Difference (SMD)
$$d = f\,\frac{\bar x_e - \bar x_c}{s}$$
where
$$s = \sqrt{\frac{(n_e-1)s_e^2 + (n_c-1)s_c^2}{n_e+n_c-2}}, \qquad f = \frac{4(n_e+n_c-2)-4}{4(n_e+n_c-2)-1}$$
$$se(d) = \left[\frac{n_e+n_c}{n_e n_c} + \frac{d^2}{2(n_e+n_c)}\right]^{1/2}$$
100(1-α)% CI: $d \pm Z_{\alpha/2}\, se(d)$

When to use MD / SMD
Mean Difference
• When studies have comparable outcome measures (i.e. same scale, probably same length of follow-up)
• A meta-analysis using MDs is known as a weighted mean difference (WMD)
Standardized Mean Difference
• When studies use different outcome measurements which address the same clinical outcome (e.g. different scales)
• Converts each scale to a common scale: number of standard deviations

Example: Combining different scales for Swollen Joint Count
             Experimental           Control
Study        Mean    SD     N       Mean    SD     N      MD      SMD
Andersen     6.9     5.2    12      19.4    12.2   12     -12.5   -1.287
Furst        18.0    11.0   17      27.0    15.0   16     -9.0    -0.671
Pinheiro     --      --     --      --      --     --     --      --
Weinblatt    20.0    7.75   15      23.0    8.0    16     -3.0    -0.371
Williams     17.0    12.6   56      25.0    13.4   48     -8.0    -0.612

Sources of Variation over Studies
• "True" inter-study variation may exist (fixed/random-effects model)
• Sampling error may vary among studies (sample size)
• Characteristics may differ among studies (population, intervention)

Modelling Variation
• Parameter of interest: θ (quantifies average treatment effect)
• Number of independent studies: k
• Summary statistic: $Y_i$ (i = 1, 2, …, k)
• Large sample size: asymptotic normal distribution
• Fixed-effects model vs random-effects model

Fixed-Effects Model
• Outcome $Y_i$ from study i is a sample from a distribution with mean θ (i.e.
common mean across studies)
• $Y_i$ are independently distributed as $N(\theta, s_i^2)$ (i = 1, 2, …, k), where $s_i^2 = Var(Y_i)$ and $E(Y_i) = \theta$ is assumed
[Figure: Fixed-Effects Model]

Random-Effects Model
• Outcome $Y_i$ from study i is a sample from a distribution with mean $\theta_i$ (i.e. study-specific means)
• $Y_i$ are independently distributed as $N(\theta_i, s_i^2)$ (i = 1, 2, …, k), where $s_i^2 = Var(Y_i)$ and $E(Y_i) = \theta_i$ is assumed
• $\theta_i$ is a realization from a distribution of 'effects' with mean θ
• $\theta_i$ are independently distributed as $N(\theta, \tau^2)$ (i = 1, 2, …, k), where
  – $\tau^2 = Var(\theta_i)$ is the inter-study variation
  – θ is the average treatment effect
[Figure: Random-Effects Model]

Random-Effects Model (continued)
Estimating Average Study Effect θ
• after averaging over study-specific effects, the distribution of $Y_i$ is $N(\theta, s_i^2 + \tau^2)$
• although θ is the parameter of interest, $\tau^2$ must be considered and estimated
Estimating Study-Specific Effects $\theta_i$
• the distribution of $\theta_i$ conditional on the observed data, θ and $\tau^2$ is $N(F_i\theta + (1-F_i)Y_i,\; s_i^2(1-F_i))$
• where $F_i = s_i^2/(s_i^2 + \tau^2)$ is the shrinkage factor for the ith study

Modelling Variation
• Studies are stratified and then combined to account for differences in sample size and study characteristics
• A weighted average of estimates from each study is calculated
• The question of whether a common or study-specific parameter is to be estimated remains
Procedure:
• perform a test of homogeneity
• if there is no significant difference, use a fixed-effects model
• otherwise identify study characteristics that stratify studies into subsets with homogeneous effects, or use a random-effects model

Fixed Effects Model
• Require from each study:
  – effect estimate; and
  – standard error of effect estimate
• Combine these using a weighted average:
$$\text{pooled estimate} = \frac{\sum(\text{estimate} \times \text{weight})}{\sum \text{weights}}, \qquad \text{weight} = \frac{1}{\text{variance of estimate}}$$
• Assumes a common underlying effect behind every trial

Fixed-Effects Model: General Scheme
Study    Measure   Std Error   Weight
1        Y_1       s_1         W_1
2        Y_2       s_2         W_2
...      ...       ...         ...
k        Y_k       s_k         W_k
(no association: $Y_i = 0$)
Overall measure:
$$\hat\theta_{mle} = \frac{\sum_i W_i Y_i}{\sum_i W_i}, \qquad se(\hat\theta) = \frac{1}{\sqrt{\sum_i W_i}}, \qquad W_i = \frac{1}{s_i^2}$$
100(1-α)% CI: $\hat\theta \pm Z_{\alpha/2}\, se(\hat\theta)$

Chi-Square Tests
$$\chi^2_{total} = \chi^2_{assoc} + \chi^2_{homog}, \qquad \text{df: } k = 1 + (k-1)$$
$$\chi^2_{total} = \sum_{i=1}^{k} W_i Y_i^2 \sim \chi^2_k$$
$$\chi^2_{assoc} = \frac{\left(\sum_i W_i Y_i\right)^2}{\sum_i W_i} \sim \chi^2_1 \quad (\text{equivalently } \sqrt{\chi^2_{assoc}} \sim N(0,1))$$
$$\chi^2_{homog} = \sum_{i=1}^{k} W_i (Y_i - \hat\theta)^2 \sim \chi^2_{k-1} \quad (\text{Cochran's Q test})$$
• If $\chi^2_{assoc}$ is 'large': association
• If $\chi^2_{homog}$ is 'large': heterogeneity

Features in Graphic Display
• For each trial:
  – estimate (square)
  – 95% confidence interval (CI) (line)
  – size (square) indicates weight allocated
• Solid vertical line of 'no effect'
  – if CI crosses line then effect not significant (p > 0.05)
• Horizontal axis
  – arithmetic: RD, MD, SMD
  – logarithmic: OR, RR
• Diamond represents combined estimate and 95% CI
• Dashed line plotted vertically through combined estimate

Odds Ratio
Three methods for combining:
(1) Mantel-Haenszel method
(2) Peto's method
(3) Maximum likelihood method

[Figures: example forest plots for Relative Risk, Risk Difference, Peto Odds Ratio, Mantel-Haenszel Odds Ratio, Weighted Mean Difference and Standardized Mean Difference]

Heterogeneity
• Define the meaning of heterogeneity for each review
• Define a priori the important degree of heterogeneity (in large data sets trivial heterogeneity may be statistically significant)
• If heterogeneity exists, examine potential sources (differences in study quality, participants, intervention specifics or outcome measurement/definition)
• If heterogeneity exists across studies, consider using a random effects model
• If heterogeneity can be explained using a priori hypotheses, consider presenting results by these subgroups
• If heterogeneity cannot be explained, proceed with caution with further statistical aggregation and subgroup analysis

Heterogeneity: How to Identify it
• Common sense
  – are the patients, interventions and outcomes in each of the included studies sufficiently similar?
• Exploratory analysis of study-specific estimates
• Statistical
tests

Heterogeneity: How to deal with it
[Figure: strategies for dealing with heterogeneity (Lau et al. 1997)]

Heterogeneity: Exploring it
• Subgroup analyses
  – subsets of trials
  – subsets of patients
  – SUBGROUPS SHOULD BE PRE-SPECIFIED TO AVOID BIAS
• Meta-regression
  – relate size of effect to characteristics of the trials
[Figures: Exploring Heterogeneity: subgroup analysis (two examples)]

Random Effects Model
• Assume true effects really vary across studies
• Two sources of variation:
  – within studies (between patients)
  – between studies (heterogeneity)
• What the software does:
  – revises weights to take into account both components of variation: weight = 1 / (variance + heterogeneity)
• When heterogeneity exists we get:
  – a different pooled estimate (but not necessarily) with a different interpretation
  – a wider confidence interval
  – a larger p-value

Random Effects Model
If $\tau^2$ is known, then the MLE of θ is
$$\hat\theta(\tau)_{mle} = \frac{\sum_i W_i(\tau) Y_i}{\sum_i W_i(\tau)}, \qquad W_i(\tau) = \frac{1}{s_i^2 + \tau^2}$$
If $\tau^2$ is unknown, three common methods of inference can be used:
• Restricted Maximum Likelihood (REML)
• Bayesian
• Method of Moments (MOM)

Method of Moments (Random effects model)
$$\tau_w^2 = \max\left\{0,\; \frac{\chi^2_{homog} - (k-1)}{\sum_i W_i - \sum_i W_i^2 / \sum_i W_i}\right\}$$
Study    Measure   Weight (FE)   Weight (RE)
1        Y_1       W_1           $W_1^* = (W_1^{-1} + \tau_w^2)^{-1}$
2        Y_2       W_2           $W_2^* = (W_2^{-1} + \tau_w^2)^{-1}$
...      ...       ...           ...
k        Y_k       W_k           $W_k^* = (W_k^{-1} + \tau_w^2)^{-1}$
Overall measure:
$$\hat\theta^* = \frac{\sum_i W_i^* Y_i}{\sum_i W_i^*}, \qquad se(\hat\theta^*) = \frac{1}{\sqrt{\sum_i W_i^*}}$$
100(1-α)% CI: $\hat\theta^* \pm Z_{\alpha/2}\, se(\hat\theta^*)$

Effect of model choice on study weights
• Larger studies receive proportionally less weight in the RE model than in the FE model
[Figures: Fixed Effects vs Random Effects weights; Fixed vs Random Effects: Discrete Data; Fixed vs Random Effects: Continuous Data; Omission of Outlier - Chestnut Study]

Analysis
• Include all relevant and clinically useful measures of treatment effect
• Perform a narrative, qualitative summary when data are too sparse, of too low quality or too heterogeneous to proceed with a meta-analysis
• Specify whether a fixed or random effects model is used
• Describe the proportion of patients used in the final analysis
• Use confidence intervals
• Include a power analysis
• Consider cumulative meta-analysis (by order of publication date, baseline risk, study quality) to assess the contribution of successive studies

Subgroup Analyses
• Pre-specify hypothesis-testing subgroup analyses and keep them few in number
• Label all a posteriori subgroup analyses
• When subgroup differences are detected, interpret in light of whether they are:
  – established a priori
  – few in number
  – supported by plausible causal mechanisms
  – important (qualitative vs quantitative)
  – consistent across studies
  – statistically significant (adjusted for multiple testing)

Sensitivity Analyses
• Test robustness of results relative to key features of the studies and key assumptions and decisions
• Include tests of bias due to the retrospective nature of systematic reviews (e.g. with/without studies of lower methodologic quality)
• Consider fragility of results by determining the effect of small shifts in the number of events between groups
•
Consider cumulative meta-analysis to explore the relationship between effect size and study quality, control event rates and other relevant features
• Test a reasonable range of values for missing data from studies with uncertain results

Funnel Plot
• Scatterplot of effect estimates against sample size
• Used to detect publication bias
• If no bias, expect a symmetric, inverted funnel
• If bias, expect an asymmetric or skewed shape (suggestion of missing small studies)
[Figure: schematic symmetric and asymmetric funnel plots]

Funnel Plot Example 1: Prophylaxis of NSAID induced Gastric Ulcers
[Figure: sample size (0 to 700) against effect size (RR, 0.0 to 1.2); intervention: H2-Blockers]

Funnel Plot Example 2: Alendronate for Postmenopausal Osteoporosis
[Figure: sample size (0 to 2500) against weighted mean difference (0 to 10); WMD of % change in lumbar bone mineral density]

Presentation of Results
• Include a structured abstract
• Include a table of the key elements of each study
• Include summary data from which the measures are computed
• Employ informative graphic displays representing confidence intervals, group event rates, sample sizes etc.
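The point about including summary data can be made concrete: with the 2x2 counts tabulated, a reader can recompute every discrete effect measure and its confidence interval. The following is a minimal sketch in Python, assuming the standard error formulas given on the estimation slides and the counts from the worked example (13/46 events in the experimental group, 7/38 in the control group). The function name `two_by_two_measures` and the returned dictionary layout are illustrative choices, not part of the original slides.

```python
import math


def two_by_two_measures(a, b, c, d, z=1.96):
    """Effect measures from a 2x2 table (hypothetical helper).

    a, b: events / non-events in the experimental group
    c, d: events / non-events in the control group
    z:    normal critical value (1.96 for a 95% CI)
    Returns {measure: (estimate, lower, upper)}.
    """
    n_e, n_c = a + b, c + d
    p_e, p_c = a / n_e, c / n_c

    odds_ratio = (a * d) / (b * c)
    rel_risk = p_e / p_c
    risk_diff = p_e - p_c

    # Standard errors as on the estimation slides (log scale for OR and RR).
    se_log_or = math.sqrt(1 / (n_e * p_e * (1 - p_e)) + 1 / (n_c * p_c * (1 - p_c)))
    se_log_rr = math.sqrt((1 - p_e) / (n_e * p_e) + (1 - p_c) / (n_c * p_c))
    se_rd = math.sqrt(p_e * (1 - p_e) / n_e + p_c * (1 - p_c) / n_c)

    def log_ci(estimate, se):
        # CI built on the log scale, then back-transformed with exp.
        centre = math.log(estimate)
        return (estimate, math.exp(centre - z * se), math.exp(centre + z * se))

    return {
        "OR": log_ci(odds_ratio, se_log_or),
        "RR": log_ci(rel_risk, se_log_rr),
        "RD": (risk_diff, risk_diff - z * se_rd, risk_diff + z * se_rd),
    }


result = two_by_two_measures(13, 33, 7, 31)
for name, (est, lo, hi) in result.items():
    print(f"{name}: {est:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

The point estimates agree with the worked examples (OR 1.745, RR 1.534, RD 0.098). The sketch does not handle zero cells, which would need a continuity correction or a different method.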
Interpretation of Results
• Interpret results in the context of current health care
• State methodologic limitations of the studies and of the review
• Consider the size of effect in studies and review, their consistency and the presence of a dose-response relationship
• Consider interpreting results in the context of temporal cumulative meta-analysis
• Interpret results in light of other available evidence
• Make recommendations clear and practical
• Propose a future research agenda (clinical and methodological requirements)

Generic Inferential Framework
(1) Conceptually, think of a 'generic' effect size statistic T
(2) corresponding effect size parameter θ
(3) associated standard error SE(T), the square root of the variance
(4) for some effect sizes, a suitable transformation may be needed to make inference based on normal distribution theory

(A) Fixed-Effects Model (FEM):
– Assume a common effect size
– Obtain the average effect size as a weighted mean (unbiased)
  • Optimal weight is the reciprocal of the variance (inverse variance weighted method)
• Variances are inversely proportional to within-study sample sizes
  – what is the effect of larger studies in calculating weights?
  – may also weight by a 'quality' index, q, scaled from 0 to 1
• The average effect size has a conditional variance (a function of the conditional variances of each effect size, quality index, …)
  – e.g. V = 1 / total weight
• Multiply the resulting standard error by the appropriate critical value (1.96, 2.58, 1.645)
• Construct a confidence interval and/or test statistic
• Test the homogeneity assumption using a weighted sum of squares of deviations of the effect sizes, Q
• If Q exceeds the critical value of chi-square at k-1 d.f. (k = number of studies), then the observed between-study variance is significantly greater than what would be expected under the null hypothesis
• When within-study sample sizes are very large, homogeneity may be rejected by the Q test even when individual effect size estimates do not differ much
• One can take different courses of action when homogeneity is rejected
• Methodologic choices in dealing with 'heterogeneous' data
[Figure: methodologic choices in dealing with heterogeneous data]

(B) Random-Effects Model (REM):
– Total variability of an observed study effect size reflects within- and between-study variance (extra variance component)
– If the between-studies variance is zero, the equations of the REM reduce to those of the FEM
– Presence of a variance component which is significantly different from zero may be indicative of REM
• Once significance of the variance component is established (e.g. Q test for homogeneity of effect size):
  – its magnitude should be estimated
  – variance components can be estimated in many ways; the most commonly used is the DerSimonian-Laird method, which is based on a method-of-moments approach
  – compute the random effects weighted mean as an estimate of the average of the random effects in the population
  – construct confidence intervals and conduct hypothesis tests as before (new variance and thus new weights)

Example: Correlation coefficient
• A measure of association more popular in cross-sectional observational studies than in RCTs is Pearson's correlation coefficient, r, given by
$$r = \frac{\sum (X - \bar X)(Y - \bar Y)}{\sqrt{\sum (X - \bar X)^2 \sum (Y - \bar Y)^2}}$$
• X and Y must be continuous (e.g. blood pressure and weight)
• r lies between -1 and 1
• not available in RevMan / MetaView at this time

Correlation coefficient (cont'd)
• Following the generic framework discussed earlier:
  – the effect size statistic is r
  – the corresponding effect size parameter is the underlying population correlation coefficient, ρ
  – in this case, a suitable transformation is needed to achieve approximate normality of the effect size
  – inference is conducted on the scale of the transformed variable and final results are back-transformed to the original scale
• Assuming X and Y have a bivariate normal distribution, the Fisher's Z transformed variable
$$Z = \frac{1}{2}\log\frac{1+r}{1-r}$$
has, for large samples, an approximate normal distribution with mean and variance
$$\zeta = \frac{1}{2}\log\frac{1+\rho}{1-\rho}, \qquad Var(Z) = \frac{1}{n-3}$$
• Hence, the weighting factor associated with Z is W = 1/Var(Z) = n - 3
• The meta-analysis is carried out on the Z-transformed measures and final results are transformed back to the scale of correlation using
$$r = \frac{e^{2Z} - 1}{e^{2Z} + 1}$$

Numerical Example
• Source: Fleiss J., Statistical Methods in Medical Research 1993; 2: 121-145.
• Correlation coefficients reported by 7 independent studies in education are included in the meta-analysis
• Comparison: association between a characteristic of the teacher and the mean measure of his or her students' achievement

Example: Fleiss (1993)
Study   n     r        Z*       W**   WZ       WZ²
1       15    -0.073   -0.073   12    -0.876   0.064
2       16    0.308    0.318    13    4.134    1.315
3       15    0.481    0.524    12    6.288    3.295
4       16    0.428    0.457    13    5.941    2.715
5       15    0.180    0.182    12    2.184    0.397
6       17    0.290    0.299    14    4.186    1.252
7       15    0.400    0.424    12    5.088    2.157
Sum                             88    26.945   11.195
* Z = Fisher's Z-transformation of r
** W = n - 3

$$Q = \sum W_i (Z_i - \bar Z)^2 = \sum W_i Z_i^2 - \frac{\left(\sum W_i Z_i\right)^2}{\sum W_i} = 11.195 - \frac{(26.945)^2}{88} = 2.94$$
Q = 2.94 on 6 df is not statistically significant.

Results and discussions
• No evidence for heterogeneous association across studies
• Fixed effect analysis may be undertaken
• Questions:
  – Would a random effect analysis as shown earlier produce a different numerical value for the combined correlation coefficient?
  – How would the weights be modified to carry out a REM?
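One way to explore the two questions above is to recompute the example numerically. The sketch below is a minimal Python illustration, assuming the tabulated n and r values, weights W = n - 3, and the DerSimonian-Laird method-of-moments estimate of the between-study variance described in the generic framework. All variable names are illustrative, and small differences from the slide values come from rounding in the table.

```python
import math

# Data from the Fleiss (1993) example table.
n = [15, 16, 15, 16, 15, 17, 15]
r = [-0.073, 0.308, 0.481, 0.428, 0.180, 0.290, 0.400]
k = len(n)

# Fisher's Z transformation: atanh(r) = 0.5 * log((1 + r) / (1 - r)).
z = [math.atanh(ri) for ri in r]
w = [ni - 3 for ni in n]  # fixed-effects weights, W = 1 / Var(Z)
sum_w = sum(w)

# Fixed-effects pooled Z and Cochran's Q.
z_bar = sum(wi * zi for wi, zi in zip(w, z)) / sum_w
q = sum(wi * (zi - z_bar) ** 2 for wi, zi in zip(w, z))

# Method-of-moments (DerSimonian-Laird) between-study variance, truncated at 0.
tau2 = max(0.0, (q - (k - 1)) / (sum_w - sum(wi ** 2 for wi in w) / sum_w))

# Random-effects weights: W* = 1 / (1/W + tau^2).
w_re = [1 / (1 / wi + tau2) for wi in w]
z_bar_re = sum(wi * zi for wi, zi in zip(w_re, z)) / sum(w_re)

# 95% CI on the Z scale, back-transformed to the correlation scale with tanh.
se = 1 / math.sqrt(sum_w)
ci_z = (z_bar - 1.96 * se, z_bar + 1.96 * se)
ci_rho = tuple(math.tanh(x) for x in ci_z)

print(f"Q = {q:.2f} on {k - 1} df, tau^2 = {tau2:.3f}")
print(f"pooled Z (FE) = {z_bar:.3f}, pooled Z (RE) = {z_bar_re:.3f}")
print(f"pooled r = {math.tanh(z_bar):.3f}, 95% CI {ci_rho[0]:.3f} to {ci_rho[1]:.3f}")
```

Because Q is smaller than its degrees of freedom here, the moment estimate of the between-study variance is truncated to zero, the random-effects weights equal the fixed-effects weights, and both models give the same combined correlation for these data. With a positive variance estimate the weights would become more nearly equal across studies and the confidence interval would widen.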
Results and discussions (cont'd)
• The weighted mean of Z is
$$\bar Z = \frac{\sum W_i Z_i}{\sum W_i} = \frac{26.945}{88} = 0.306$$
• The approximate standard error of the combined mean is
$$SE(\bar Z) = \frac{1}{\sqrt{\sum W_i}} = \frac{1}{\sqrt{88}} = 0.107$$
• The test of significance is carried out using
$$z = \frac{\bar Z}{SE(\bar Z)} = \frac{0.306}{0.107} = 2.86$$
  – this value exceeds the critical value 1.96 (corresponding to the 5% level of significance), so we conclude that the average value of Z (hence the average correlation) is statistically significant
• The 95% confidence interval for ζ is $\bar Z \pm 1.96 \times SE(\bar Z)$:
$$0.096 \le \zeta \le 0.516$$
• Transforming back to the original scale, a 95% CI for the parameter of interest, ρ, is
$$0.096 \le \rho \le 0.474$$
  – again confirming a significant association

Critical Appraisal of a Systematic Review
(A) The Message
• Does the review set out to answer a precise question about patient care?
  – Should be different from an uncritical encyclopedic presentation

(B) The Validity
• Have studies been sought thoroughly:
  – Medline and other relevant bibliographic databases
  – Cochrane controlled clinical trials register
  – Foreign language literature
  – "Grey literature" (unpublished or un-indexed reports: theses, conference proceedings, internal reports, non-indexed journals, pharmaceutical industry files)
  – Reference chaining from any articles found
  – Personal approaches to experts in the field to find unpublished reports
  – Hand searches of the relevant specialized journals

Validity (cont'd)
• Have inclusion and exclusion criteria for studies been stated explicitly, taking account of the patients in the studies, the interventions used, the outcomes recorded and the methodology?
• Have the authors considered the homogeneity of the studies: the idea that the studies are sufficiently similar in their design, interventions and subjects to merit combination?
  – this is done either by eyeballing graphs like the forest plot or by application of chi-square tests (Q test)

(C) The Utility
• The various studies may have used patients of different ages or social classes, but if the treatment effects are consistent across the studies, then generalisation to other groups or populations is more justified.

Utility (cont'd)
• Be wary of sub-group analyses where the authors attempt to draw new conclusions by comparing the outcomes for patients in one study with the patients in another study
  – Be wary of "data-dredging" exercises, testing multiple hypotheses against the data, especially if the hypotheses were constructed after the study had begun data collection.
• One may also want to ask:
  – Were all clinically important outcomes considered?
  – Are the benefits worth the harms and costs?