
CPSC 531: Output Data Analysis

Instructor: Anirban Mahanti
Office: ICT 745
Email: mahanti@cpsc.ucalgary.ca
Class Location: TRB 101
Lectures: TR 15:30 – 16:45 hours

Slides primarily adapted from: “The Art of Computer Systems Performance Analysis” by Raj Jain, Wiley 1991. [Chapters 12, 13, and 25]

CPSC 531: Data Analysis

Outline
- Measures of Central Tendency
  - Mean, Median, Mode
- How to Summarize Variability?
- Comparing Systems Using Sample Data
- Comparing Two Alternatives
- Transient Removal

Measures of Central Tendency (1)
- Sample mean: sum of all observations divided by the total number of observations
  - Always exists and is unique
  - Gives equal weight to all observations
  - Is strongly affected by outliers
- Sample median: list the observations in increasing order; the observation in the middle of the list is the median
  - Even number of observations: mean of the middle two values
  - Always exists and is unique
  - Resistant to outliers (compared to the mean)

Measures of Central Tendency (2)
- Sample mode: plot a histogram of the observations and find the bucket with the peak frequency; the midpoint of this bucket is the mode
  - The mode may not exist (e.g., all samples have equal weight)
  - More than one mode may exist (e.g., bimodal)
  - If there is only one mode, the distribution is unimodal
[Figure: three example PDFs f(x) plotted against x, with the mode(s) of each marked]

Measures of Central Tendency (3)
- Is the data categorical?
  - Yes: use the mode (e.g., most used resource in a system)
- Is the total of interest?
  - Yes: use the mean (e.g., total response time for Web requests)
- Is the distribution skewed?
  - Yes: use the median (the median is less influenced by outliers than the mean)
  - No: use the mean. Why?

Common Misuses of Means (1)
- Usefulness of the mean depends on the number of observations and the variance. E.g.
  two response time samples: 10 ms and 1000 ms. The mean is 505 ms: a correct index, but useless.
- Using the mean without regard to skewness:

                  System A             System B
  Observations    10, 9, 11, 10, 10    5, 5, 5, 4, 31
  Mean            10                   10
  Mode            10                   5
  Min, Max        [9, 11]              [4, 31]

Common Misuses of Means (2)
- Computing the mean of a product by multiplying means
  - The mean of a product equals the product of the means if the two random variables are independent
  - If x and y are correlated, $E(xy) \neq E(x)E(y)$
  - Example: the average number of users in the system is 23 and the average number of processes per user is 2. Is the average number of processes in the system 46? No! The number of processes spawned by users depends on the load.

Summarizing Variability
- Summarizing by a single number is rarely enough
- Given two systems with the same mean, we generally prefer the one with less variability
[Figure: two response-time frequency plots, each with mean = 2 s. In one, 80% of responses take 1.5 s and 20% take 4 s; in the other, 60% take ~0.001 s and 40% take ~5 s]
- Indices of dispersion
  - Range, variance, 10- and 90-percentiles, semi-interquartile range, and mean absolute deviation

Range
- Easy to calculate: range = max - min
- In many scenarios, not very useful:
  - Min may be zero
  - Max may be an “outlier”
  - With more samples, max may keep increasing and min may keep decreasing, so there is no “stable” point
- Range is useful if the system's performance is bounded

Variance and Standard Deviation
- Given a sample of n observations {x1, x2, …, xn}, the sample variance is calculated as:

  $$ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2, \quad \text{where } \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i $$

- Sample variance: $s^2$ (in the square of the unit of observation)
- Sample standard deviation: $s$ (in the unit of observation)
- Note the (n-1) in the variance computation
  - Only (n-1) of the n differences are independent
  - Given (n-1) differences, the nth difference can be computed
  - The number of independent terms is the
    degrees of freedom (df)

Standard Deviation (SD)
- Standard deviation and mean have the same units, so SD is preferred!
  - E.g. a) Mean = 2 s, SD = 2 s; high variability?
  - E.g. b) Mean = 2 s, SD = 0.2 s; low variability?
- Another widely used measure: C.O.V (coefficient of variation)
  - C.O.V = ratio of standard deviation to mean
  - C.O.V does not have any units
  - C.O.V shows the magnitude of variability
  - C.O.V in (a) is 1 and in (b) is 0.1

Percentiles, Quantiles, Quartiles
- Lower and upper bounds expressed in percents or as fractions
  - 90-percentile → 0.9-quantile
  - α-quantile: sort and take the [(n-1)α+1]th observation
    - [] means round to the nearest integer
- Quartiles divide the data into parts at 25%, 50%, 75% → quartiles (Q1, Q2, Q3)
  - 25% of the observations ≤ Q1 (the first quartile)
  - The second quartile Q2 is also the median
- The range (Q3 - Q1) is the interquartile range
- (Q3 - Q1)/2 is the semi-interquartile range (SIQR)

Mean Absolute Deviation
- Mean absolute deviation is calculated as:

  $$ \frac{1}{n} \sum_{i=1}^{n} \left| x_i - \bar{x} \right| $$

Influence of Outliers
- Range: considerably
- Sample variance: considerably, but less than range
- Mean absolute deviation: less than variance
  - Doesn't square (i.e., magnify) the outliers
- SIQR: very resistant
- Use SIQR as the index of dispersion whenever the median is used as the index of central tendency

Comparing Systems Using Sample Data
- The words “sample” and “example” have a common root: “essample” (French)
- One sample does not prove a theory; a sample is just an example
- The point is that a definite statement cannot be made about the characteristics of all systems.
- However, probabilistic statements about the range of most systems can be made
- The confidence interval concept is the building block

Sample versus Population
- Generate 1 million random numbers with mean $\mu$ and SD $\sigma$ and put them in an urn
- Draw a sample of n observations
  - {x1, x2, …, xn} has mean $\bar{x}$ and standard deviation $s$
- $\bar{x}$ is likely different from $\mu$!
- The population mean $\mu$ is unknown or impossible to obtain in many real-world scenarios
- Therefore, obtain an estimate of $\mu$ from $\bar{x}$

Confidence Interval for the Mean
- Define bounds $c_1$ and $c_2$ such that: $\mathrm{Prob}\{c_1 < \mu < c_2\} = 1 - \alpha$
  - $(c_1, c_2)$ is the confidence interval
  - $\alpha$ is the significance level
  - $100(1-\alpha)$ is the confidence level
- Typically a small $\alpha$ is desired: confidence level 90%, 95% or 99%
- One approach: take k samples, find the sample means, sort them, and take the [1+0.05(k-1)]th as $c_1$ and the [1+0.95(k-1)]th as $c_2$

Central Limit Theorem
- We do not need many samples. Confidence intervals can be determined from one sample because $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$
- The SD of the sample mean, $\sigma/\sqrt{n}$, is called the standard error
- Using the CLT, a $100(1-\alpha)$% confidence interval for a population mean is

  $$ \left( \bar{x} - z_{1-\alpha/2}\, \frac{s}{\sqrt{n}},\ \ \bar{x} + z_{1-\alpha/2}\, \frac{s}{\sqrt{n}} \right) $$

  - $z_{1-\alpha/2}$ is the $(1-\alpha/2)$-quantile of a unit normal variate (and is obtained from a table!)
  - $s$ is the sample SD

Confidence Interval Example
- CPU times obtained by repeating an experiment 32 times. The sorted set consists of
  {1.9, 2.7, 2.8, 2.8, 2.8, 2.9, 3.1, 3.1, 3.2, 3.2, 3.3, 3.4, 3.6, 3.7, 3.8, 3.9, 3.9, 4.1, 4.1, 4.2, 4.2, 4.4, 4.5, 4.5, 4.8, 4.9, 5.1, 5.1, 5.3, 5.6, 5.9}
- Mean = 3.9, standard deviation (s) = 0.95, n = 32
- For a 90% confidence interval $z_{1-\alpha/2}$ = 1.645, and we get $3.90 \pm (1.645)(0.95)/\sqrt{32}$ = (3.62, 4.17)

Meaning of Confidence Interval
- What does this mean? With 90% confidence, we can say the population mean is within the above bounds; that is, the chance of error is 10%.
- E.g., take 100 samples and construct a CI from each. In about 10 cases, the interval will not contain the population mean.
[Figure: interval from $\bar{x} - c$ to $\bar{x} + c$; 90% chance that this interval contains $\mu$]

Length of Confidence Interval
- Let $c = z_{1-\alpha/2}\, s/\sqrt{n}$
- Then $z_{1-\alpha/2} = c\sqrt{n}/s$
- Larger s implies a wider confidence interval
- Larger n implies a shorter confidence interval
  - With more observations, we are better able to predict the population mean
  - The square-root-of-n relationship implies that increasing the number of observations by a factor of 4 only cuts the confidence interval by a factor of 2
- The confidence interval computation, as described here, works for n ≥ 30

What if n is not large?
- For smaller samples, confidence intervals can be constructed only if the observations come from a normally distributed population:

  $$ \left( \bar{x} - t_{[1-\alpha/2;\,n-1]}\, \frac{s}{\sqrt{n}},\ \ \bar{x} + t_{[1-\alpha/2;\,n-1]}\, \frac{s}{\sqrt{n}} \right) $$

- $t_{[1-\alpha/2;\,n-1]}$ is the $(1-\alpha/2)$-quantile of a t-variate with (n-1) degrees of freedom

Testing for a Zero Mean
- Check whether a measured value is significantly different from zero
  - Determine the confidence interval
  - Then check whether zero is inside the interval
- The procedure is applicable to any other value a
[Figure: confidence intervals around the mean relative to 0; an interval that includes 0 means "mean is zero", an interval that excludes 0 means "mean is nonzero"]

Comparing Two Alternatives
- Often interested in comparing systems
  - “naïve” VOD vs. “batching” VOD (assignment 3)
  - “SJF” vs. “FIFO” request scheduling (assignment 1)
- Statistical techniques for such comparison:
  - Paired Observations
  - Unpaired Observations (we will omit this!)
  - Approximate Visual Test
- Did you use any of these in your assignments?

Paired Observations (1)
- n experiments with one-to-one correspondence
  between the test on system A and the test on system B
  - No correspondence => unpaired
  - This test uses the zero mean idea…
- Treat the two samples as one sample of n pairs
- For each pair, compute the difference
- Construct a confidence interval for the difference
- CI includes zero => systems not significantly different

Paired Observations (2)
- Six similar workloads used on two systems:
  {(5.4, 19.1), (16.6, 3.5), (0.6, 3.4), (1.4, 2.5), (0.6, 3.6), (7.3, 1.7)}
  Is one system better?
- The performance differences are {-13.7, 13.1, -2.8, -1.1, -3.0, 5.6}
- Sample mean = -0.32, sample SD = 9.03
- CI = $-0.32 \pm t\sqrt{81.62/6} = -0.32 \pm t(3.69)$
- The 0.95-quantile of t with 5 degrees of freedom is 2.015
- 90% confidence interval = (-7.75, 7.11)
- The systems are not significantly different, because the interval includes zero

Approximate Visual Test
- Compute the confidence interval for each mean
- If the CIs don't overlap, one system is better than the other
[Figure: three pairs of confidence intervals around their means]
  - CIs do not overlap => alternatives are different
  - CIs overlap and the mean of one is in the CI of the other => not significantly different
  - CIs overlap but the mean of one is not in the CI of the other => need more testing

Determining Sample Size
- Goal: find the smallest sample size n that provides the desired confidence in the results
- Method:
  - Take a small set of preliminary measurements
  - Estimate the variance from the measurements
  - Use the estimate to determine the sample size needed for the desired accuracy
- r% accuracy => ±r% at $100(1-\alpha)$% confidence:

  $$ \bar{x} \pm z\, \frac{s}{\sqrt{n}} = \bar{x}\left(1 \pm \frac{r}{100}\right) \quad\Rightarrow\quad n = \left( \frac{100\, z\, s}{r\, \bar{x}} \right)^2 $$

Transient Removal
- In many simulations, we are interested in steady-state performance
  - Remove the initial transient state
- However, defining exactly what constitutes the end of the transient state is difficult!
- Several heuristics have been developed:
  - Long runs
  - Proper initialization
  - Truncation
  - Initial data deletion
  - Moving average of replications
  - Batch means

Long Runs
- Use very long runs
- The impact of the transient state becomes negligible
- Wasteful use of resources
- How long is “long enough”?
- The Raj Jain text recommends that this method not be used in isolation

Batch Means
- Run the simulation for a long duration
- Divide the observations (N) into m batches, each of size n
- Compute the variance of the batch means using the procedure shown, for n = 2, 3, 4, 5 …
- Plot variance vs. batch size

  1) Compute the batch means:
     $$ \bar{x}_i = \frac{1}{n} \sum_{j=1}^{n} x_{ij}, \quad i = 1, 2, \ldots, m $$
  2) Compute the overall mean:
     $$ \bar{\bar{x}} = \frac{1}{m} \sum_{i=1}^{m} \bar{x}_i $$
  3) Compute the variance of the batch means:
     $$ \mathrm{Var}(\bar{x}) = \frac{1}{m-1} \sum_{i=1}^{m} \left( \bar{x}_i - \bar{\bar{x}} \right)^2 $$

[Figure: variance of batch means plotted against batch size n, annotated "Ignore" and "Transient interval"]
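
The three-step batch means procedure can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference implementation: the function names, the choice to drop leftover observations that do not fill a complete batch, and the sample data are all assumptions made here.

```python
from statistics import mean, variance


def batch_mean_variance(observations, batch_size):
    """Variance of the batch means for one batch size n."""
    m = len(observations) // batch_size  # number of complete batches
    if m < 2:
        raise ValueError("need at least two complete batches")
    # Step 1: mean of each batch. Leftover observations that do not fill
    # a complete batch are dropped (an assumption of this sketch).
    batch_means = [
        mean(observations[i * batch_size:(i + 1) * batch_size])
        for i in range(m)
    ]
    # Steps 2 and 3: statistics.variance subtracts the mean of the batch
    # means (the overall mean) and divides by (m - 1).
    return variance(batch_means)


def variance_by_batch_size(observations, max_batch_size):
    """Map each batch size n = 2..max_batch_size to the variance of its batch means."""
    return {
        n: batch_mean_variance(observations, n)
        for n in range(2, max_batch_size + 1)
    }


# Made-up observations, for illustration only.
observations = [1.0, 3.0, 5.0, 7.0, 2.0, 4.0, 6.0, 8.0]
curve = variance_by_batch_size(observations, max_batch_size=4)
for n, var in curve.items():
    print(n, var)
```

Plotting the values in `curve` against the batch size gives the variance-versus-batch-size curve described on the slide.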

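The indices of dispersion from the "Summarizing Variability" slides can be computed directly. The sketch below applies them to the System A and System B observations from "Common Misuses of Means (1)". The function names are hypothetical, and rounding halves upward in the quantile position is an assumption, since the slides only say "round to nearest integer".

```python
from math import floor
from statistics import mean, stdev, variance


def quantile(observations, alpha):
    """alpha-quantile: the [(n-1)*alpha + 1]-th sorted observation (1-based)."""
    ordered = sorted(observations)
    # Round to the nearest integer; halves round up (assumption of this sketch).
    position = floor((len(ordered) - 1) * alpha + 1 + 0.5)
    return ordered[position - 1]


def dispersion_summary(observations):
    """Range, variance, SD, C.O.V, mean absolute deviation, and SIQR."""
    x_bar = mean(observations)
    s = stdev(observations)  # uses the (n - 1) divisor
    q1 = quantile(observations, 0.25)
    q3 = quantile(observations, 0.75)
    return {
        "range": max(observations) - min(observations),
        "variance": variance(observations),
        "std_dev": s,
        "cov": s / x_bar,  # undefined when the mean is zero
        "mean_abs_dev": mean([abs(x - x_bar) for x in observations]),
        "siqr": (q3 - q1) / 2,
    }


system_a = [10.0, 9.0, 11.0, 10.0, 10.0]
system_b = [5.0, 5.0, 5.0, 4.0, 31.0]
summary_a = dispersion_summary(system_a)
summary_b = dispersion_summary(system_b)
print(summary_a)
print(summary_b)
```

Both systems have a mean of 10, but the single outlier in System B inflates its range and variance while leaving its SIQR unchanged, which matches the ordering on the "Influence of Outliers" slide.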

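The large-sample confidence interval from the "Central Limit Theorem" slide can be checked against the CPU-time example. This is a minimal sketch using only the summary statistics quoted on the slide (mean 3.9, s = 0.95, n = 32); the function name is hypothetical, and the normal quantile comes from Python's standard library rather than a table.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sample_sd, n, confidence=0.90):
    """CI for the population mean using the unit normal quantile (n >= 30)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{1 - alpha/2}
    half_width = z * sample_sd / sqrt(n)         # z * standard error
    return sample_mean - half_width, sample_mean + half_width


# Summary statistics from the CPU-time example.
low, high = z_confidence_interval(3.9, 0.95, 32, confidence=0.90)
print(f"90% CI: ({low:.2f}, {high:.2f})")
```

Because the inputs are already rounded, the bounds agree with the interval on the slide only to within about 0.01.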

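The paired-observations example (six workloads on two systems) can be reproduced with a short sketch. The function name is hypothetical. Python's standard library has no t-distribution, so the t quantile is passed in from a table, using the value 2.015 quoted on the slide.

```python
from math import sqrt
from statistics import mean, stdev


def paired_confidence_interval(pairs, t_quantile):
    """CI for the mean difference between paired observations (a, b)."""
    diffs = [a - b for a, b in pairs]
    n = len(diffs)
    d_bar = mean(diffs)
    # Half-width: t quantile times the standard error of the mean difference.
    half_width = t_quantile * stdev(diffs) / sqrt(n)
    return d_bar - half_width, d_bar + half_width


pairs = [(5.4, 19.1), (16.6, 3.5), (0.6, 3.4), (1.4, 2.5), (0.6, 3.6), (7.3, 1.7)]
# 0.95-quantile of t with 5 degrees of freedom, taken from the slide.
low, high = paired_confidence_interval(pairs, t_quantile=2.015)
includes_zero = low <= 0.0 <= high
print(f"90% CI: ({low:.2f}, {high:.2f}), includes zero: {includes_zero}")
```

If `includes_zero` is true, the two systems are not significantly different at this confidence level, which is the zero-mean test applied to the differences.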

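The sample-size formula from "Determining Sample Size", n = (100 z s / (r x̄))², translates directly into code. This is a sketch under stated assumptions: the function name is hypothetical, and the preliminary mean, standard deviation, and accuracy target in the usage lines are illustrative values that do not come from the slides.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(prelim_mean, prelim_sd, accuracy_percent, confidence=0.95):
    """Smallest n giving +/- accuracy_percent of the mean at the given confidence."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n = (100.0 * z * prelim_sd / (accuracy_percent * prelim_mean)) ** 2
    return ceil(n)  # round up: n must be a whole number of observations


# Illustrative preliminary measurements (not from the slides):
# mean 20, SD 5, target accuracy of +/- 5% at 95% confidence.
n_needed = required_sample_size(20.0, 5.0, accuracy_percent=5.0, confidence=0.95)
print(n_needed)
```

Halving the target accuracy percentage roughly quadruples the required sample size, which follows from the square in the formula.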