Stats Review Lecture 3 - Random Variables 08.29.12
Discrete Random Variable
• Let X denote the return of the S&P 500 tomorrow, rounded to the nearest percent
• What are the possibilities, e.g. 0, 1%, …
• What is the probability of each of the above possibilities
• Probability distribution function: f(x) = P(X=x)

[Figure: Probability Distribution; x-axis: Return, y-axis: Probability]
[Figure: Cumulative Distribution Function (CDF); x-axis: Return]

Discrete Random Variable
• Expectation: E(X) = Σ x f(x)
• Variance: Var(X) = E[(X − E(X))²] = Σ (x − E(X))² f(x)

Which Distribution Has Higher Mean?
[Figure: two discrete probability distributions side by side; x-axis: -4 to 4]

Which Distribution Has Higher Variance?
[Figure: two discrete probability distributions side by side; x-axis: -4 to 4]

Expectation of a Function of a R.V.
• Function g(X):
– What is the expectation E(g(X))?
• General result: E(g(X)) = Σ g(x) f(x)
• Example – call option on the S&P 500

Binomial Distribution
• Bernoulli distribution
– A r.v. X has two possible outcomes, 0 or 1
• Binomial distribution
– Number of successes that occur in n trials
• Example: Ch. 4, 6b

Poisson
• A r.v. X takes on values 0, 1, 2, ....
• Poisson distribution if for some λ > 0, P(X = i) = e^(−λ) λ^i / i!
• The Poisson r.v. is an approximation for binomial with λ = np.
• Example: how many days in a year will the S&P500 drop more than 1%?
• Example 7b

Geometric
• Independent trials with prob. of success p
– How many trials until a success occurs?
• What happens when n goes to infinity?
• Example: how many days until we get a stock market drop of 2% or more?

Negative Binomial
• Independent trials with prob. of success p
– How many trials until r successes occur?
• What happens when n goes to infinity?
• Example: how many days until we get three stock market drops of 2% or more (not necessarily consecutive)?
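The discrete-variable slides above define f(x) = P(X=x) and ask for the expectation, the variance, and the expectation of a function g(X) such as a call option payoff. A minimal Python sketch of those three calculations follows. The probabilities in `pmf` and the strike of 0% are hypothetical values chosen for illustration; they are not read off the lecture's chart.

```python
# Hypothetical pmf for the rounded S&P 500 return, in percent.
# These probabilities are illustrative assumptions, not the lecture's data.
pmf = {-2: 0.05, -1: 0.20, 0: 0.40, 1: 0.25, 2: 0.10}


def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum over x of g(x) * f(x); g defaults to the identity."""
    return sum(g(x) * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = E[(X - E[X])^2]."""
    mu = expectation(pmf)
    return expectation(pmf, lambda x: (x - mu) ** 2)


def call_payoff(x, strike=0):
    """Payoff of a call option; the strike (in percent) is a hypothetical choice."""
    return max(x - strike, 0)


mean = expectation(pmf)                    # E[X]
var = variance(pmf)                        # Var(X)
expected_call = expectation(pmf, call_payoff)  # E[g(X)] with g = call payoff

print(mean, var, expected_call)
```

Because the call payoff is zero whenever the return is at or below the strike, only the outcomes above the strike contribute to E(g(X)), which is why the expected payoff differs from the mean return.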
Hyper-Geometric
• Choose n balls out of N, without replacement
– m white, N – m black
– X = number of white balls selected
• Example 8i
• What happens if you choose the n balls with replacement?

Continuous Random Variable
• Let X denote the return of the S&P 500 tomorrow, no rounding
• What are the possibilities
• What is the probability of each of the above possibilities
• Probability density function: f(x) such that P(a ≤ X ≤ b) = ∫ f(x) dx over [a, b]

[Figure: Probability Density Function (pdf); x-axis: Return]
[Figure: Cumulative Distribution Function (cdf); x-axis: Return]

Continuous Random Variable
• Expectation: E(X) = ∫ x f(x) dx
• Variance: Var(X) = E[(X − E(X))²] = ∫ (x − E(X))² f(x) dx
• Example Ch5, 1a, 1b, 2a

Continuous Random Variable
• For any real-valued function g and continuous r.v. X: E(g(X)) = ∫ g(x) f(x) dx
• Example: payoff on a call option, 2b

Which Distribution Has Higher Mean?
[Figure: two pdfs; x-axis: -4 to 4]

Which Distribution Has Higher Mean?
[Figure: two pdfs; x-axis: -4 to 4]

Which Distribution Has Higher Variance?
[Figure: two pdfs; x-axis: -4 to 4]

Skewness
[Figure: two pdfs; x-axis: -4 to 4]

Kurtosis

Additional Sample Questions
• Given a discrete probability function (pdf) (i.e., all possible outcomes and their probabilities), compute the mean and the variance
• Given a graph of several discrete or continuous pdfs, estimate which one has the highest mean, variance, skewness, kurtosis
• Given two random variables, guess whether they have positive or negative covariance and/or correlation

The Uniform Distribution
[Figure: uniform pdf; x-axis: -4 to 4]
Example 3b

The Normal Distribution
[Figure: normal pdf; x-axis: -4 to 4]

The Normal Distribution
Example Ch 5, 4b, 4e

Properties of the Normal Distribution
[Figure: pdf of x~N[.5,1] and of the linear transformation 1+.5x; x-axis: -4 to 4]

Normal is an Approximation to Binomial
• Sn = number of successes in n independent trials with individual prob. of success p.
• The DeMoivre-Laplace limit theorem: for any a < b, P(a ≤ (Sn − np)/√(np(1 − p)) ≤ b) → Φ(b) − Φ(a) as n → ∞, where Φ is the standard normal cdf

Normal is an Approximation to Binomial

Lognormal Distribution
• What is the distribution of the S&P 500 index tomorrow?
• If the (continuously compounded) return on the S&P500 is normally distributed, the index itself is lognormally distributed

Lognormal Distribution

Chi-squared Distribution
• Sum of squared independent standard normal variables

F distribution
• Ratio of two independent chi-squared variables, each divided by its degrees of freedom, n1 and n2

t distribution
• Very important for hypothesis testing

Normal vs. t distribution

Exponential Distribution
• PDF: f(x) = λ e^(−λx) for x ≥ 0
• CDF: F(x) = 1 − e^(−λx) for x ≥ 0
• Exercise:

Joint Distributions of R. V.
• Joint probability distribution function: f(x,y) = P(X=x, Y=y)
• Example Ch 6, 1c, 1d

Independence
• Two variables are independent if, for any two sets of real numbers A and B, P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B)
• Operationally: two variables are independent iff their joint pdf can be “separated” for any x and y: f(x,y) = fX(x) fY(y)

Joint Distributions of R. V.
• The expectation of a sum equals the sum of the expectations: E(X + Y) = E(X) + E(Y)
• The variance of a sum is more complicated: Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
• If independent, then the variance of a sum equals the sum of the variances

Sum of Normally Distributed RV
[Figure: pdfs of x~N[.5,1], y~N[1,1], and x+y~N[1.5,2]; x-axis: -4 to 4]

Additional Sample Questions
• Find the distribution of a transformation of two or more normal random variables
• By looking at a graph of a pdf, guess whether it is normal, log-normal, or t distribution
• What normally distributed random variables do you need to construct an F distribution with 3 and 5 degrees of freedom?

Conditional Distributions (Discrete)
• For any two events, E and F, P(E|F) = P(EF)/P(F), provided P(F) > 0
• Conditional pdf: f(x|y) = P(X=x | Y=y) = f(x,y)/fY(y)
• Examples Ch 6, 4a, 4b

Conditional Distributions (Discrete)
• Conditional cdf: F(x|y) = P(X ≤ x | Y=y)

Conditional Distributions (Discrete)
• Example: what is the probability that the TSX is up, conditional on the S&P500 being up?

Conditional Distributions (Continuous)
• Conditional pdf: f(x|y) = f(x,y)/fY(y)
• Conditional cdf: F(a|y) = P(X ≤ a | Y=y) = ∫ f(x|y) dx over x ≤ a
• Example 5b

Conditional Distributions (Continuous)
• Example: what is the probability that the TSX is up, conditional on the S&P500 being up 3%?

Joint PDF of Functions of R.V.
• f(x1, x2) = joint pdf of X1 and X2
• Equations y1 = g1(x1, x2) and y2 = g2(x1, x2) can be uniquely solved for x1 and x2, with solutions given by x1 = h1(y1, y2) and x2 = h2(y1, y2)
• The functions g1 and g2 have continuous partial derivatives, with a nonzero Jacobian determinant J(x1, x2)

Joint PDF of Functions of R.V.
• Under the conditions on previous slide,
• Insert eq. 7.1, p275
• Example: You manage two portfolios of TSX and S&P500:
– Portfolio 1: 50% in each
– Portfolio 2: 10% TSX, 90% S&P 500
• What is the probability that both of those portfolios experience a loss tomorrow?

Joint PDF of Functions of R.V.
• Example 7a – uniform and normal cases

Estimation
• Given limited data we make educated guesses about the true parameters
• Estimation of the mean
• Estimation of the variance
• Random sample

Population vs. Sample
• Population parameter describes the true characteristics of the whole population
• Sample parameter describes characteristics of the sample
• Statistics is all about using sample parameters to make inferences about the population parameters

Distribution of the Sample Mean
• The standardized sample mean follows a t-distribution with n − 1 degrees of freedom: (X̄ − μ)/(s/√n) ~ t(n−1)

Confidence Intervals
• We can estimate the mean, but we’d like to know how accurate our estimate is
• We’d like to put upper and lower bounds on our estimate
• We might need to know whether the true mean is above a certain value, e.g. zero

Constructing Confidence Intervals
• We already know the distribution of our estimate of the mean
• To construct a 95% confidence interval, for instance, just find the values that contain 95% of the distribution

Constructing Confidence Intervals
[Figure: distribution of X̄/(s/√n); the statistic falls between the two critical values 95% of the time, with 2.5% of the distribution in each tail]

Confidence Intervals and Hypothesis Testing
• The critical values are available from a table or in Matlab
>> tinv(.975, n-1)
• If the confidence interval includes the hypothesized population mean (e.g. zero), then the sample mean is not statistically different from the population mean we are testing
• One-sided vs. two-sided tests

Example
• Are the returns on the S&P 500 significantly above zero?
– Sample mean = .23
– Sample standard deviation = .59
– Sample size = 128
• Compute the test: t = .23 / (.59 / √128) ≈ 4.41
• At 95% the critical value is 1.98
• Therefore, we reject the hypothesis that the mean return is zero

Distribution of S&P500 Returns
• The direct use of historical data requires the following assumptions:
– The true distribution of returns is constant through time and will not change in the future
– Each period represents an independent draw from this distribution

Distribution of Stock Returns
[Figure: histogram of S&P 500 returns]

Distribution of Stock Returns
[Figure: histogram of TSE 200 returns]

Distribution of Stock Returns
[Figure: histogram of DAX returns]

Linear Regression (Harvey 1989)
[Figure: time series of Growth and Spread, O-54 to J-09]

Harvey 1989
[Figure: scatter plot of GNP Growth (y-axis) against Spread (x-axis)]

Harvey 1989
• Regression line: Growth(t+1:t+5) = a + b (Spread)(t) + u(t+5)
[Figure: scatter plot of GNP Growth against Spread with the fitted regression line]

Regression
• Minimize the squared residuals: choose a and b to minimize Σ u²

Regression in Matrix Form
• Regression equation: y = Xβ + u
• Minimize the squared residuals: choose β to minimize u'u = (y − Xβ)'(y − Xβ)
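To make "minimize the squared residuals" concrete, here is a minimal Python sketch of the one-regressor case, using the standard closed-form least-squares solution for the intercept a and slope b. The `spread` and `growth` numbers are hypothetical values made up for illustration; they are not Harvey's (1989) data, and the function names are likewise only illustrative.

```python
# Hypothetical (spread, growth) observations, made up for illustration only.
# They are NOT Harvey's (1989) data.
spread = [-0.02, -0.01, 0.00, 0.01, 0.02, 0.03]
growth = [0.01, 0.00, 0.03, 0.02, 0.05, 0.04]


def ols_fit(x, y):
    """Return (a, b) minimizing sum((y_i - a - b * x_i) ** 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope = sample covariance of x and y divided by sample variance of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    # The fitted line passes through the point of sample means.
    a = y_bar - b * x_bar
    return a, b


def sum_squared_residuals(x, y, a, b):
    """Objective being minimized: the sum of squared residuals u_i."""
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))


a, b = ols_fit(spread, growth)
ssr = sum_squared_residuals(spread, growth, a, b)
print(a, b, ssr)
```

At the minimizing a and b the residuals sum to zero and are uncorrelated with the regressor; moving either coefficient away from the fitted value can only raise the sum of squared residuals.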