Commodities Futures Price Prediction: An Artificial Intelligence Approach
Thesis Defense

Commodities Markets
• Commodity – a good that can be processed and resold
  – Examples: corn, rice, silver, coal
• Spot market
• Futures market

Futures Markets
• Origin
• Motivation
  – Hedgers
    • Producers
    • Consumers
  – Speculators
• Size and scope
  – CBOT (2002)
    • 260 million contracts
    • 47 different products

Profit in the Futures Market
• Information
  – Supply
    • Optimal production
    • Weather
    • Labor
    • Pest damage
  – Demand
    • Industrial
    • Consumer
• Time series analysis

Time Series Analysis – Background
• Time series examples
  – River flow and water levels
  – Electricity demand
  – Stock prices
  – Exchange rates
  – Commodities prices
  – Commodities futures prices
• Patterns

Time Series Analysis – Methods
• Linear regression
• Non-linear regression
• Rule-based systems
• Artificial neural networks
• Genetic algorithms

Data
• Daily price data for soybean futures
• Chicago Board of Trade
• Jan. 1, 1980 – Jan. 1, 1990
• Source: Datastream
• Normalized (preprocessing sketched after this outline)

Why Use an Artificial Neural Network (ANN)?
• Excellent pattern recognition
• Other uses of ANNs in financial time series analysis
  – Estimating a generalized option pricing formula
  – Standard & Poor's 500 index futures day-trading system
  – Standard & Poor's 500 futures options prices

ANN Implementation
• Stuttgart Neural Network Simulator, version 4.2
• Resilient propagation (RPROP)
  – Improvement over standard backpropagation
  – Uses only the sign of the error derivative (update rule sketched after this outline)
• Weight decay
• Parameters
  – Number of inputs: 10 and 100
  – Number of hidden nodes: 5, 10, 100
  – Weight decay: 5, 10, 20
  – Initial weight range: ±1.0, 0.5, 0.25, 0.125, 0.0625

ANN Data Sets
• Training set: Jan. 1, 1980 – May 2, 1983
• Testing set: May 3, 1983 – Aug. 29, 1986
• Validation set: Sept. 2, 1986 – Jan. 1, 1990

ANN Results
• Mean error, in cents per bushel
  – 100-input network: 12.00, 24.93
  – 10-input network: 10.62, 25.88

Why Evolve the Parameters of an ANN?
• Selecting preferred parameters is a difficult, poorly understood task
• The search space is different for each task
• Trial and error is time-consuming
• Evolutionary techniques provide powerful search capabilities for finding acceptable network parameters

Genetic Algorithm – Implementation
• GAlib, version 4.5 (MIT)
• Custom code to implement RPROP with weight decay
• Real-number representation
  – Number of input nodes (1 – 100)
  – Number of hidden nodes (1 – 100)
  – Initial weight range (0.0625 – 2.0)
  – Initial step size (0.0625 – 1.0)
  – Maximum step size (10 – 75)
  – Weight decay (0 – 20)

Genetic Algorithm – Implementation (continued)
• Roulette wheel selection
• Single-point crossover
• Gaussian random mutation
• High mutation rate
(these three operators are sketched after this outline)

Evaluation Function
• Decode the parameters and instantiate a network using them
• Train the ANN for 1000 epochs
• Report the lowest total sum of squared error across the training and testing data sets
• Fitness equals the inverse of the total error reported
(sketched after this outline)
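
A minimal Python sketch of the preprocessing implied by the Data and ANN Data Sets slides. The deck states that the series was normalized but not how, so min-max scaling is an assumption; the split dates are the ones given above.

    from datetime import date

    def min_max_normalize(prices):
        # Assumption: scale to [0, 1]; the deck does not name the method.
        lo, hi = min(prices), max(prices)
        return [(p - lo) / (hi - lo) for p in prices]

    def split_by_date(series):
        # Chronological split using the exact date ranges from the deck;
        # `series` is a list of (datetime.date, price) pairs.
        train = [(d, p) for d, p in series if d <= date(1983, 5, 2)]
        test  = [(d, p) for d, p in series
                 if date(1983, 5, 3) <= d <= date(1986, 8, 29)]
        valid = [(d, p) for d, p in series if d >= date(1986, 9, 2)]
        return train, test, valid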
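
RPROP adapts a separate step size per weight from the sign of the error derivative alone, which is why it improves on magnitude-sensitive standard backpropagation. Below is a minimal sketch of one update with weight decay folded into the gradient; the growth/shrink factors and step bounds are the usual RPROP defaults rather than values from the thesis, and the variant shown simply skips the update after a sign flip.

    import numpy as np

    def rprop_step(w, grad, prev_grad, step, decay=0.0,
                   eta_plus=1.2, eta_minus=0.5,
                   step_min=1e-6, step_max=50.0):
        # Weight decay enters as an extra gradient term.
        grad = grad + decay * w
        sign_change = np.sign(grad) * np.sign(prev_grad)
        # Grow the step while the gradient sign is stable ...
        step = np.where(sign_change > 0,
                        np.minimum(step * eta_plus, step_max), step)
        # ... and shrink it when the sign flips (a minimum was overshot).
        step = np.where(sign_change < 0,
                        np.maximum(step * eta_minus, step_min), step)
        grad = np.where(sign_change < 0, 0.0, grad)  # skip update after a flip
        w = w - np.sign(grad) * step                 # only the sign is used
        return w, grad, step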
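
The thesis used GAlib's C++ operators; the sketch below is only an illustrative Python rendering of the three listed on the "continued" slide: roulette wheel selection (which assumes non-negative fitnesses), single-point crossover over list genomes, and Gaussian mutation clamped to each gene's range.

    import random

    def roulette_select(population, fitnesses):
        # Pick one individual with probability proportional to fitness.
        r = random.uniform(0, sum(fitnesses))
        acc = 0.0
        for individual, fit in zip(population, fitnesses):
            acc += fit
            if acc >= r:
                return individual
        return population[-1]

    def single_point_crossover(mom, dad):
        # Swap the gene tails at one random cut point.
        cut = random.randint(1, len(mom) - 1)
        return mom[:cut] + dad[cut:], dad[:cut] + mom[cut:]

    def gaussian_mutate(genome, rate, sigma, bounds):
        # Perturb each gene with probability `rate`, clamped to its range.
        out = []
        for gene, (lo, hi) in zip(genome, bounds):
            if random.random() < rate:
                gene = min(hi, max(lo, gene + random.gauss(0.0, sigma)))
            out.append(gene)
        return out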
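
A sketch of the evaluation function described above. Note that build_network and train_epoch are hypothetical stand-ins for the thesis's custom RPROP-with-weight-decay code, named here only for illustration.

    def evaluate(genome, train_set, test_set):
        # Decode the six real-valued genes into network parameters.
        n_inputs, n_hidden, w_range, step0, step_max, decay = genome
        net = build_network(int(n_inputs), int(n_hidden),     # hypothetical helper
                            w_range, step0, step_max, decay)
        best = float("inf")
        for _ in range(1000):                          # train for 1000 epochs
            train_sse = train_epoch(net, train_set)    # hypothetical helper
            test_sse = net.sse(test_set)               # hypothetical method
            best = min(best, train_sse + test_sse)     # lowest total SSE seen
        return 1.0 / best          # fitness = inverse of the total error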
Parameter Evolution – Results
• GANN mean error: 10.82
• NN mean error: 10.62
• Conclusions
  – GANN performance is close, and it outperforms the majority of networks generated via trial and error
  – Genotype/phenotype issue
  – Other, possibly better GA techniques
    • Multipoint crossover
    • Tournament selection

Evolving the Weights of an ANN
• Avoid local minima
• Avoid a tedious trial-and-error search for learning parameters
• Search a broad, poorly understood solution space and maximize the values of the function parameters

Weight Evolution – Implementation
• GAlib, version 4.5 (MIT)
• Custom-written neural network code
• Real-number representation
• Gaussian mutation
• Two-point crossover
• Roulette wheel selection

Weight Evolution – Objective Function
• Instantiate a neural network with the weight vector (i.e., the individual)
• Feed one epoch of the training data
• Fitness equals the inverse of the sum of the squared network error returned
(sketched after this outline)

Weight Evolution – Keeping the Best Individual
• The fitness function evaluates against the training set only
• The objective function evaluates against the testing set as well, but only for retention of the candidate best network
• Meta-fitness, or the meta-elite individual
(bookkeeping sketched after this outline)

Weight Evolution – Results
• Mean error
  – GANN-Weight: 10.67
  – GANN: 10.82
  – NN: 10.61
• Much faster
• Fewer man-hours

Summary
• The pure ANN approach is very man-hour intensive, and expert experience is valuable
• Evolving the network parameters requires few man-hours but many hours of computational resources
• Evolving the network weights provides most of the performance at a smaller cost in both human and computer time
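
A numpy sketch of the weight-evolution objective: unpack the flat genome into layer matrices, make one pass over the training epoch, and invert the summed squared error. The single-hidden-layer topology and sigmoid activation are assumptions; the deck specifies only a custom network and real-number genomes.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def weight_fitness(weights, X, y, n_in, n_hid):
        # `weights` is one GA individual: a flat vector of every weight.
        w1 = weights[: n_in * n_hid].reshape(n_in, n_hid)
        w2 = weights[n_in * n_hid :].reshape(n_hid, 1)
        hidden = sigmoid(X @ w1)          # one epoch = one forward pass here
        pred = hidden @ w2
        sse = float(np.sum((pred - y.reshape(-1, 1)) ** 2))
        return 1.0 / (sse + 1e-12)        # epsilon guards division by zero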
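
A sketch of the meta-elite bookkeeping: selection pressure still comes from training-set fitness, while the retained candidate best network is judged on the testing set as well. It reuses weight_fitness from the sketch above; combining the two scores by simple addition is an assumption.

    def track_meta_elite(population, X_tr, y_tr, X_te, y_te, n_in, n_hid,
                         meta_elite=None, meta_score=-1.0):
        # Returns the GA fitnesses plus the best-so-far meta-elite.
        fitnesses = []
        for individual in population:
            fit = weight_fitness(individual, X_tr, y_tr, n_in, n_hid)
            fitnesses.append(fit)            # drives selection (training only)
            test_fit = weight_fitness(individual, X_te, y_te, n_in, n_hid)
            combined = fit + test_fit        # assumption: simple sum of scores
            if combined > meta_score:        # retained outside the GA itself
                meta_elite, meta_score = individual, combined
        return fitnesses, meta_elite, meta_score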