C2 Training: May 9 – 10, 2011
Data Analysis and Interpretation: Computing effect sizes
The Campbell Collaboration
www.campbellcollaboration.org

A brief introduction to effect sizes
Meta-analysis expresses the results of each study using a quantitative index of effect size (ES). ESs are measures of the strength or magnitude of a relationship of interest. Because ESs estimate the same quantity in every study, they are comparable across studies and can therefore be summarized in the meta-analysis. They are also relatively independent of sample size.

C2 Training Materials – Oslo – May 2011 www.campbellcollaboration.org

Effect Size Basics
• Effect sizes can be expressed in many different metrics
  – d, r, odds ratio, risk ratio, etc.
• So be sure to be specific about the metric!
• Effect sizes can be unstandardized or standardized
  – Unstandardized = expressed in the original measurement units
  – Standardized = expressed in standardized measurement units

Unstandardized Effect Sizes
• Examples
  – 5 point gain in IQ scores
  – 22% reduction in repeat offending
  – €600 savings per person
• Unstandardized effect sizes are helpful in communicating intervention impacts
  – But they are often unusable in systematic reviews, since not all studies operationalize the dependent variable in the same way

Standardized Effect Sizes
• Some standardized effect sizes are relatively easy to interpret
  – Correlation coefficient
  – Risk ratio
• Others are not
  – Standardized mean difference (d)
  – Odds ratio, logged odds ratio

Types of effect size
Most reviews use effect sizes from one of three families of effect sizes:
• the d family, including the standardized mean difference,
• the r family, including the correlation coefficient, and
• the odds ratio (OR) family, including
proportions and other measures for categorical data.

Effect size computation
• Compute a measure of the “effect” of each study as our outcome
• Range of effect sizes:
  – Differences between two groups on a continuous measure
  – Relationship between two continuous measures
  – Differences between two groups on frequency or incidence

Types of effect sizes
• Standardized mean difference
• Correlation coefficient
• Odds ratios

Standardized mean difference
• Used when we are interested in two-group comparisons using means
• Groups could be two experimental groups or, in an observational study, two groups of interest such as boys versus girls.

Notation for study-level statistics
  Group 1         Group 2
  X_{11}          X_{21}
  X_{12}          X_{22}
  ⋮               ⋮
  X_{1 n_{G1}}    X_{2 n_{G2}}
n is sample size

Notation for study-level statistics
Group means: $\bar{X}_{G1}, \bar{X}_{G2}$
Group sample sizes: $n_{G1}, n_{G2}$
Total sample size: $N = n_{G1} + n_{G2}$
Group standard deviations: $s_{G1}, s_{G2}$

Standardized mean difference
$$ ES_{sm} = \frac{\bar{X}_{G1} - \bar{X}_{G2}}{s_p} $$
where $s_p$ is the pooled sample standard deviation.

Pooled sample standard deviation
$$ s_p = \sqrt{\frac{(n_{G1}-1)s_{G1}^2 + (n_{G2}-1)s_{G2}^2}{(n_{G1}-1) + (n_{G2}-1)}} $$

Correction to ESsm
$$ ES'_{sm} = \left(1 - \frac{3}{4N - 9}\right) ES_{sm} $$
where N is the total sample size.

Standard error of standardized mean difference
$$ SE_{sm} = \sqrt{\frac{n_{G1} + n_{G2}}{n_{G1} n_{G2}} + \frac{(ES'_{sm})^2}{2(n_{G1} + n_{G2})}} $$

Example
• Table 1 from: Henggeler, S. W., Melton, G. B.
& Smith, L. A. (1992). Family preservation using multisystemic therapy: An effective alternative to incarcerating serious juvenile offenders. Journal of Consulting and Clinical Psychology, 60(6), 953–961.

Note: Text of paper (p. 954) indicates that MST n = 43, usual services n = 41.

Computing pooled sd
$$ s_p = \sqrt{\frac{(43-1)(13.9)^2 + (41-1)(19.1)^2}{(43-1) + (41-1)}} = 16.6 $$

Computing ESsm
$$ ES_{sm} = \frac{5.8 - 16.2}{16.6} = -0.63 $$

Computing unbiased ESsm
$$ ES'_{sm} = \left(1 - \frac{3}{4(43+41) - 9}\right)(-0.63) = (0.99)(-0.63) = -0.62 $$

Computing SEsm
$$ SE_{sm} = \sqrt{\frac{43+41}{43 \cdot 41} + \frac{(-0.62)^2}{2(43+41)}} = 0.22 $$

95% Confidence interval for ES’sm
$$ -0.62 \pm 1.96(0.22) = [-1.06, -0.18] $$
(endpoints computed from unrounded values)

The 95% confidence interval for the standardized mean difference in weeks of incarceration ranges from -1.06 sds to -0.18 sds. Given that the pooled sd of weeks is 16.6, juveniles in MST were incarcerated, on average, between 0.18 × 16.6 = 3.0 and 1.06 × 16.6 = 17.6 fewer weeks than juveniles in the standard treatment. In weeks, the confidence interval is [-17.6, -3.0].
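The weeks-of-incarceration example above can be reproduced with a short script. This is a minimal sketch rather than part of the original training materials: the function names `pooled_sd` and `smd_summary` are illustrative, and the inputs are the means, sds and sample sizes used in the slides. Because the script carries full precision instead of rounding at each step, intermediate values can differ from the slides in the second decimal place.

```python
import math


def pooled_sd(n1, s1, n2, s2):
    """Pooled sample standard deviation of two groups."""
    numerator = (n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2
    return math.sqrt(numerator / ((n1 - 1) + (n2 - 1)))


def smd_summary(m1, s1, n1, m2, s2, n2, z=1.96):
    """Standardized mean difference, bias-corrected value, SE and CI.

    Hypothetical helper: it simply chains the four slide formulas.
    """
    sp = pooled_sd(n1, s1, n2, s2)
    es = (m1 - m2) / sp                                # ES_sm
    es_adj = (1 - 3 / (4 * (n1 + n2) - 9)) * es        # ES'_sm
    se = math.sqrt((n1 + n2) / (n1 * n2) + es_adj ** 2 / (2 * (n1 + n2)))
    ci = (es_adj - z * se, es_adj + z * se)
    return {"sp": sp, "es": es, "es_adj": es_adj, "se": se, "ci": ci}


# Weeks of incarceration: MST (group 1) versus usual services (group 2).
weeks = smd_summary(m1=5.8, s1=13.9, n1=43, m2=16.2, s2=19.1, n2=41)
for name, value in weeks.items():
    print(name, value)
```

Multiplying the two interval endpoints by `weeks["sp"]` converts the interval back into weeks, as in the interpretation above.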
Practice computations
• Compute effect size for number of arrests
• Compute effect size with bias correction
• Compute 95% confidence interval for effect size
• Interpret the effect size

Pooled sd for arrests
$$ s_p = \sqrt{\frac{(43-1)(1.34)^2 + (41-1)(1.55)^2}{(43-1) + (41-1)}} = \sqrt{2.09} = 1.45 $$

ESsm for arrests
$$ ES_{sm} = \frac{0.87 - 1.52}{1.45} = \frac{-0.65}{1.45} = -0.45 $$

Computing unbiased ESsm
$$ ES'_{sm} = \left(1 - \frac{3}{4(43+41) - 9}\right)(-0.45) = (0.99)(-0.45) \approx -0.44 $$

Computing SEsm
$$ SE_{sm} = \sqrt{\frac{43+41}{43 \cdot 41} + \frac{(-0.44)^2}{2(43+41)}} = 0.22 $$

95% Confidence interval for ES’sm
$$ -0.44 \pm 1.96(0.22) = [-0.87, -0.01] $$

The 95% confidence interval for the standardized mean difference in number of arrests ranges from -0.87 sds to -0.01 sds. Given that the pooled sd of arrests is 1.45, juveniles in MST had, on average, between 0.01 × 1.45 = 0.01 and 0.87 × 1.45 = 1.26 fewer arrests than juveniles in the standard treatment. In arrests, the confidence interval is [-1.26, -0.01].

Computing standardized mean differences
The first steps in computing d effect sizes involve assessing what data are available and what’s missing. You will look for:
• Sample size and unit information
• Means and SDs or SEs for treatment and control groups
• ANOVA tables
• F or t tests in text, or
• Tables of counts

Sample sizes
Regardless of exactly what you compute you will need to get sample sizes (to correct for bias and compute variances).
Sample sizes can vary within studies, so check initial reports of n against (1) the n for each test or outcome or (2) the df associated with each test.

Standardized Mean Differences
• Means, standard deviations, and sample sizes are the most direct method
• Without individual group sample sizes (n1 and n2), assume equal group n’s
• Can compute standardized mean differences from a t-statistic or from a one-way F-statistic

ESsm from t-tests
$$ ES_{sm} = t \sqrt{\frac{n_{G1} + n_{G2}}{n_{G1} n_{G2}}} $$

Standardized mean difference from t-test
$$ ES_{sm} = 0.46 \sqrt{\frac{282 + 270}{282 \cdot 270}} = 0.039 $$

Standardized mean difference from means and sds
$$ ES_{sm} = \frac{203.24 - 202.3}{24.14} = 0.039 $$

ESsm from F-tests (one-way)
$$ ES_{sm} = \sqrt{\frac{F_{between}\,(n_{G1} + n_{G2})}{n_{G1} n_{G2}}} $$
Note that you have to decide the direction of the effect given the results.
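The two conversions above can be sketched in Python as follows. The function names `smd_from_t` and `smd_from_f` and the `sign` argument are illustrative assumptions, not part of the original materials. The t example uses the values shown above (t = 0.46, n = 282 and 270), and the F example uses the arrests result discussed next (F = 3.94, n = 43 and 41). Since F is always non-negative, the caller has to supply the direction of the effect.

```python
import math


def smd_from_t(t, n1, n2):
    """Standardized mean difference from an independent-groups t statistic."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))


def smd_from_f(f, n1, n2, sign=1):
    """Standardized mean difference from a one-way, two-group F statistic.

    `sign` (+1 or -1) is chosen by the reviewer from the reported group
    means, because F carries no information about direction.
    """
    if sign not in (1, -1):
        raise ValueError("sign must be +1 or -1")
    return sign * math.sqrt(f * (n1 + n2) / (n1 * n2))


d_from_t = smd_from_t(0.46, 282, 270)
d_from_f = smd_from_f(3.94, 43, 41, sign=-1)  # fewer arrests in the MST group
print(round(d_from_t, 3), round(d_from_f, 2))
```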
Standardized mean difference from F-test
$$ ES_{sm} = -\sqrt{\frac{3.94\,(43 + 41)}{43 \cdot 41}} = -0.43 $$
Note that we choose a negative effect size since the number of arrests is less for the MST group than for the control group.

From means and sds from before
$$ ES_{sm} = \frac{0.87 - 1.52}{1.45} = -0.45 $$

Correlational data
  1    X_{11}    X_{12}
  ⋮    ⋮         ⋮
  n    X_{n1}    X_{n2}

Correlation data
$$ ES_r = r $$
$$ ES_{Zr} = 0.5 \log_e\left(\frac{1 + ES_r}{1 - ES_r}\right) $$

Standard error of z-transform
$$ SE_{Zr} = \frac{1}{\sqrt{n - 3}} $$

Example
$$ ES_r = 0.39 $$
$$ ES_{Zr} = 0.5 \log_e\left(\frac{1 + 0.39}{1 - 0.39}\right) = 0.41 $$

Standard error of z-transform
$$ SE_{Zr} = \frac{1}{\sqrt{100 - 3}} = 0.10 $$

95% confidence interval for z
$$ 0.41 \pm 1.96 \times 0.10 = [0.21, 0.61] $$

To translate back to r-metric
$$ r = \frac{e^{2 ES_{Zr}} - 1}{e^{2 ES_{Zr}} + 1} $$

Confidence interval in r-metric
[0.21, 0.54]

Outcomes of one study
Drummond et al.
(1990)
               Success   Failure   TOTAL
  Treatment       5         14       19
  Comparison      6         12       18
  TOTAL          11         26       37

Odds of improving, ΩTrt
$$ \Omega_{Trt} = \frac{\text{Prob(Success|Treatment)}}{\text{Prob(Failure|Treatment)}} = \frac{\text{Prob(S|Trt)}}{1 - \text{Prob(S|Trt)}} $$

Odds of improving, ΩTrt
Estimate ΩTrt by OE
$$ OE = \frac{\#\,\text{successes} / \text{total}\ \#\,\text{trt}}{\#\,\text{failures} / \text{total}\ \#\,\text{trt}} = \frac{5/19}{14/19} = \frac{5}{14} $$

Odds of improving, ΩCntl
Estimate ΩCntl by OE’
$$ OE' = \frac{\#\,\text{successes} / \text{total}\ \#\,\text{cntl}}{\#\,\text{failures} / \text{total}\ \#\,\text{cntl}} = \frac{6/18}{12/18} = \frac{6}{12} $$

Odds ratio, ω
$$ \omega = \frac{\Omega_{Trt}}{\Omega_{Cntl}} $$
estimated by
$$ o = \frac{OE}{OE'} = \frac{\#\,\text{trt successes} / \#\,\text{trt failures}}{\#\,\text{cntl successes} / \#\,\text{cntl failures}} = \frac{\#\,\text{trt successes} \times \#\,\text{cntl failures}}{\#\,\text{trt failures} \times \#\,\text{cntl successes}} $$

Example
$$ \text{Odds ratio, } o = \frac{5 \times 12}{6 \times 14} = \frac{60}{84} = 0.71 $$

Outcomes of one study
Frequencies
               Success   Failure
  Treatment       a         b
  Comparison      c         d

Odds ratio, o or ESOR
$$ ES_{OR} = \frac{ad}{bc} $$

Interpretation of ESOR
• ESOR = 1, Treatment & Control equally effective
• ESOR > 1, Treatment successes more likely than Control successes
• 0 < ESOR < 1, Treatment successes less likely than Control successes

ESLOR, log-odds ratio
$$ ES_{LOR} = \log_e\left(\frac{ad}{bc}\right) $$

Standard Error of ESLOR
$$ SE_{LOR} = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} $$

Interpretation of ESLOR
• ESLOR = 0, No difference between Treatment and Control
• ESLOR > 0, Treatment successes more likely than control successes
• ESLOR < 0, Treatment successes less
likely than control successes

Information for a 2 x 2 table
• MST n = 92
• IT (Control) n = 84
• 26.1% of MST group re-arrested
• 71.4% of IT group re-arrested

2 x 2 Table
          Not arrested      Re-arrested
  MST     92 – 24 = 68      26.1% of 92 = 24
  IT      84 – 60 = 24      71.4% of 84 = 60

Log-odds ratio
$$ ES_{LOR} = \log_e\left(\frac{68 \times 60}{24 \times 24}\right) = 1.96 $$

SE of log-odds ratio
$$ SE_{LOR} = \sqrt{\frac{1}{68} + \frac{1}{24} + \frac{1}{24} + \frac{1}{60}} = 0.34 $$

95% Confidence interval
$$ 1.96 \pm 1.96 \times 0.34 = [1.29, 2.62] $$

2 x 2 Table
                                  In home placement       Out of home placement
  MST                             90.6% of 59 = 53.45     9.4% of 59 = 5.55
  Usual child welfare services    58.1% of 37 = 21.5      41.9% of 37 = 15.5

Log-odds ratio
$$ ES_{LOR} = \log_e\left(\frac{53.45 \times 15.5}{5.55 \times 21.5}\right) = \log_e\left(\frac{828.48}{119.325}\right) = \log_e(6.94) \approx 1.94 $$
Here 6.94 is the odds ratio itself; its natural log, about 1.94, is the log-odds ratio.
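The 2 x 2 table computations above can be sketched in Python as follows. The function name `log_odds_ratio` is an illustrative assumption, not part of the original materials. The cell labels follow the a, b, c, d layout used earlier, with the first column holding the outcome counted as a "success" (not arrested, or in home placement). The script keeps full precision, so results may differ slightly from slide values that were rounded at each step.

```python
import math


def log_odds_ratio(a, b, c, d, z=1.96):
    """Log-odds ratio, its standard error, and a confidence interval.

    a, b = treatment successes and failures;
    c, d = comparison successes and failures.
    """
    lor = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return lor, se, (lor - z * se, lor + z * se)


# Re-arrest example: MST (68 not arrested, 24 re-arrested) versus IT (24, 60).
lor, se, ci = log_odds_ratio(68, 24, 24, 60)
print(round(lor, 2), round(se, 2), [round(x, 2) for x in ci])

# Placement example, using the fractional cell counts from the table.
lor_place, se_place, ci_place = log_odds_ratio(53.45, 5.55, 21.5, 15.5)
print(round(math.exp(lor_place), 2), round(lor_place, 2))
```

Exponentiating a log-odds ratio, or either endpoint of its interval, returns the result to the odds ratio metric.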