Predicting the Stock Market


**by John Herbert and Gregory Potts**


**Introduction**

The purpose of this project is to address the age-old question, “Can the stock market be predicted?”, a question whose answer could create tremendous wealth for anyone capable of predicting the unknown. Every January, hordes of highly paid experts attempt to predict what the economy and world markets will do in the coming year. By the end of that year, nearly all of the forecasts turn out to be wrong. Analysts typically base their forecasts on inflation and interest rates, earnings, oil and energy prices, and political instability or unrest. Even with millions of lines of information available to analyze, nobody has been able to put a finger on exactly why the stock market acts the way it does. As a group, we are determined to test commonly known economic indicators such as oil price, agricultural output, and gas supply and demand for their ability to predict the stock market accurately. Credible sources like The Wall Street Journal [1] and financial guru J. Welles Wilder [2] suggest that an investor study sentiment or “mood” surveys as a market signal, or use a momentum oscillator that measures the speed of change in price movements. While there are a multitude of theories on stock market prediction, there is really only one thing we can be sure of when making stock predictions: nothing is certain.

Stock market indicators can be grouped into four major market forces:

1) //Governments//
> Governments hold much sway over the free markets. Fiscal and monetary policy have a profound effect on the financial marketplace. By increasing and decreasing interest rates the government and Federal Reserve can effectively slow or attempt to speed up growth within the country. This is called monetary policy. If government spending increases or contracts, this is known as fiscal policy, and can be used to help ease unemployment and/or stabilize prices. By altering interest rates and the amount of dollars available on the open market, governments can change how much investment flows into and out of the country. (Learn more in our //Federal Reserve Tutorial//.)

2) //International Transactions//
> The flow of funds between countries impacts the strength of a country's economy and its currency. The more money that is leaving a country, the weaker the country's economy and currency. Countries that predominantly export, whether physical goods or services, are continually bringing money into their countries. This money can then be reinvested and can stimulate the financial markets within those countries.

3) //Speculation and Expectation//
> Speculation and expectation are integral parts of the financial system. Where consumers, investors and politicians believe the economy will go in the future impacts how we act today. Expectation of future action is dependent on current acts and shapes both current and future trends. Sentiment indicators are commonly used to gauge how certain groups are feeling about the current economy. Analysis of these indicators as well as other forms of fundamental and technical analysis can create a bias or expectation of future price rates and trend direction. (Read more on this closely watched economic indicator; see //Understanding the Consumer Confidence Index// and //Investors Intelligence Sentiment Index//.)

4) //Supply and Demand//
> Supply and demand for products, currencies and other investments creates a push-pull dynamic in prices. Prices and rates change as supply or demand changes. If something is in demand and supply begins to shrink, prices will rise. If supply increases beyond current demand, prices will fall. If supply is relatively stable, prices can fluctuate higher and lower as demand increases or decreases. (See more on this subject in //Economics Basics: Demand and Supply// and //Monetarism: Printing Money To Curb Inflation//.) [3]

As a group, we will analyze a data set containing different stock market indicators drawn from the four categories above to test which indicators are the most accurate and effective predictors. In our analysis we determined that logistic regression and neural nets may be the most effective models to work with. We focused on neural nets because of their ability to implicitly detect complex nonlinear relationships between independent and dependent variables, as well as possible interactions between predictor variables. Logistic regression suits our data set and project because the final goal of many regression analyses is to produce a mathematical function, and our research suggests that many who claim they can “predict the stock market” base that claim on an algorithm or equation they created. For example, Long-Term Capital Management was a hedge fund that became famous for using statistical models and predictors to choose its investments. The fund did well for a number of years, but even with PhDs on staff it could not calculate the full risk of the market, and it ended up failing. Therefore, any model should be taken with a grain of salt, and a qualitative understanding of current markets should be used in conjunction with the statistical model.

**Purpose of Model**

It has become evident to us that an inordinate amount of time is spent trying to “game” the stock market and predict the future in hopes of getting rich. When it comes down to it, the only reason people invest in the stock market is to make money. Period. Spending some time and energy to gain a greater understanding of the stock market therefore has the potential to yield incredible results and is well worth the blood, sweat and tears. Drawing on multiple articles and research studies, the group will test each hypothesis on the accuracy of its predictions to determine which theory is the best predictor of the market.


**Literature Review**

**1.)** //Use These Market Indicators to Predict Stock Moves// by Michael Sincere [4]

After speaking with stock traders and analyzing their strategies, Sincere uses two broad sets of data: sentiment surveys and trading volume. The sentiment surveys include the Investors Intelligence Sentiment Survey and the Consumer Confidence Index. The trading volume indicators are the Arms Index, which tracks overbought or oversold conditions that can signal potential bubbles or troughs in the market, and the VIX, which measures volatility in the options market. Sincere assumes that investor psychology, along with trends in volatility and trading volume, indicates market movements, and that macroeconomic factors do not necessarily do so.

**2.)** //Uncertainty Makes the Stock Market Crazy// by Ken Little [5]

According to Ken Little, seven factors affect the stock market: inflation, interest rates, company earnings, oil and energy prices, war and terrorism, crime and fraud, and serious domestic political unrest. These factors include both short-term and long-term influences. Little’s view also assumes that markets are efficient and that investor psychology does not play as big a role as Sincere seems to believe. Little assumes that uncertainty is the key driver of price drops.

**3.)** //Investment Perspectives, March 29, 2012//, a Morgan Stanley Research Report [6]

Morgan Stanley uses macroeconomics to determine the conditions that will affect the market in upcoming months. These macroeconomic factors include real GDP, the Consumer Price Index, the unemployment rate, Treasury yields, gold, oil, yen and euro exchange rates, and the VIX volatility index. According to Morgan Stanley, these indices accurately reflect the current economic health of the US. The report states that current market conditions are out of line with the bull market currently in effect; it therefore expects a drop in the stock market because of the mismatch between the data and current prices.

**4.)** //Super Bowl Stock Market Predictor Still a Winner// by George Kester [7]

George Kester, the Martel Professor of Finance at Washington and Lee University, tests the age-old theory that the outcome of the Super Bowl predicts the stock market. The theory states that if the winning team originates from the original National Football League, the market will rise, and if the winning team originates from the old American Football League, the market will decline. The psychology is that investors see a win by a team from the old American Football League as a sign that something is amiss in America and tend to make bearish trades. Kester has continued to research this subject (which began in the late 1960s) and shows a 91% success rate.

**Hypothesis**

The group’s hypothesis is that each of the first three articles contributes part of the best model for predicting market changes. Taken together, the articles touch on each of the four market forces mentioned in the introduction; therefore a mixture of predictors from the four categories of government, international transactions, speculation and expectation, and supply and demand will yield the most accurate model. While the research on Super Bowl outcomes as a predictor is compelling, the group predicts that this model is irrational and will not hold up in the future. The article also mentions that the evolution of the National Football League (NFL) is making it harder to determine which historical league each team comes from, so analysts suspect that this theory will eventually break down.

**Data Collection**

Data was collected from Yahoo! Finance for index prices of the S&P 500, the NASDAQ Composite Index, and the Dow Jones Industrial Average. Each of these indices contains a diversified collection of stocks from multiple industries, which we assume gives a generalized representation of the overall US market. Yahoo! Finance was also used to pull data for the VIX index. All economic indicators are pulled from the Federal Reserve Bank of St. Louis’s “FRED Economic Data.” These data sets are broken down into seven categories: money, banking, and finance; national accounts; population, employment and labor markets; production and business activity; prices; international data; and U.S. regional data. All data for the Morgan Stanley research article can be found on this site, as well as information on oil and energy prices and multiple interest rates. Inflation rate data is extracted from www.inflationdata.com, which gives monthly US inflation data from 1914 to 2012. The consumer confidence index is extracted from the “Understanding Dairy Markets” web site, which gives monthly data from 1977 to 2012. The Super Bowl outcome data can be retrieved from www.superbowlhistory.net. Other non-financial data, such as crime, terrorism, and fraud, will be drawn from the FBI crime report and the US Census web sites.

**Data Preparation**

Data preparation will consist of discarding outliers and blank data rows, normalizing the data by transforming them into z-scores, and partitioning the data into training and testing data sets. In addition, each predictor needs to be lagged in one-month increments for four months, giving a total of five versions of each predictor (the current value plus four lags). For the Super Bowl model, dummy variables will need to be prepared for the win/loss outcomes of the game.
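The preparation steps can be sketched in Python with pandas. This is a minimal illustration under stated assumptions, not the group's actual workflow: the column names (`oil_price`, `vix`, `sp500`), the 80/20 chronological split, and the randomly generated numbers are all hypothetical placeholders, and outlier removal is left out for brevity.

```python
import numpy as np
import pandas as pd


def prepare(df: pd.DataFrame, target: str, n_lags: int = 4, train_frac: float = 0.8):
    """Drop blank rows, z-score the predictors, add lagged copies, split by time."""
    df = df.dropna().copy()
    predictors = [c for c in df.columns if c != target]

    # z-score: subtract the column mean, divide by the column standard deviation
    z = (df[predictors] - df[predictors].mean()) / df[predictors].std(ddof=0)

    # current value (lag 0) plus lags of 1..n_lags months for every predictor
    lagged = {
        f"{col}_lag{k}": z[col].shift(k)
        for col in predictors
        for k in range(n_lags + 1)
    }
    features = pd.DataFrame(lagged, index=df.index)
    features[target] = df[target]

    # the first n_lags months have no full history, so they are dropped
    features = features.dropna()

    # chronological split: earlier months train, later months test
    cut = int(len(features) * train_frac)
    return features.iloc[:cut], features.iloc[cut:]


# Synthetic monthly data for illustration only (not real market data)
rng = np.random.default_rng(0)
months = pd.period_range("2000-01", periods=24, freq="M")
raw = pd.DataFrame(
    {
        "oil_price": rng.normal(60, 10, 24),   # hypothetical predictor
        "vix": rng.normal(20, 5, 24),          # hypothetical predictor
        "sp500": rng.normal(1300, 50, 24),     # hypothetical target
    },
    index=months,
)

train, test = prepare(raw, target="sp500")
print(train.shape, test.shape)  # (16, 11) (4, 11)
```

Each of the two predictors yields five columns (lags 0 through 4), and the four earliest months are lost to lagging. For simplicity the sketch computes the z-score statistics on the full data set; computing them on the training partition only would keep information from the test months out of the training features.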

//Models 1-3// Three different targets will be selected for each model: the S&P 500, the Dow Jones Industrial Average, and the NASDAQ Composite Index, and each predictor will be tested against each target. Three models will be created: linear regression, neural net, and CHAID. In addition, to avoid multicollinearity, a PCA factor analysis will be run to try to extract the primary drivers of the stock market. Multiple models will be developed for each theory to try to improve accuracy and stability.
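One way to see how PCA sidesteps multicollinearity is to regress the target on principal component scores instead of on the raw predictors. The sketch below uses only NumPy and covers just the linear regression case; the function name `pca_regression` and the synthetic data are assumptions made for illustration, and the neural net and CHAID models are not shown.

```python
import numpy as np


def pca_regression(X: np.ndarray, y: np.ndarray, n_components: int):
    """Project standardized predictors onto principal components, then fit OLS."""
    # standardize so that no predictor dominates because of its units
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)

    # rows of Vt are the principal directions, ordered by explained variance
    _, s, Vt = np.linalg.svd(Xz, full_matrices=False)
    explained = s**2 / np.sum(s**2)

    # component scores are mutually uncorrelated by construction
    scores = Xz @ Vt[:n_components].T

    # ordinary least squares on an intercept plus the component scores
    A = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, explained[:n_components], scores


# Synthetic data for illustration only: two nearly identical predictors
rng = np.random.default_rng(1)
rate = rng.normal(size=120)
X = np.column_stack([
    rate,
    rate + 0.05 * rng.normal(size=120),  # nearly a copy of the first column
    rng.normal(size=120),                # unrelated predictor
])
y = 2.0 * rate + rng.normal(scale=0.5, size=120)

coef, explained, scores = pca_regression(X, y, n_components=2)
corr = np.corrcoef(scores, rowvar=False)
print("share of variance per component:", explained)
print("correlation between component scores:", corr[0, 1])
```

The two nearly collinear columns collapse into a single component, so the regression no longer has to split one effect between two almost identical predictors.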

//Model 4// For this model, the group is primarily concerned with whether a win or a loss in the Super Bowl leads to a bull or bear market for the year. Therefore, each year of data will be assigned a 1 for a bull market or a 0 for a bear market, and a 1 for a win or a 0 for a loss. Next, the group will partition the data and run Bayes net, logistic regression, and decision tree models.
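The logistic regression for this model can be sketched with a single dummy predictor. The example below is a minimal illustration, assuming that a “win” is coded 1 when the Super Bowl winner comes from the original NFL; the ten yearly observations are made up for the example and are not real outcomes, and the hand-written Newton's method fit stands in for whatever modeling tool the group uses.

```python
import numpy as np


def fit_logistic(x: np.ndarray, y: np.ndarray, n_iter: int = 25) -> np.ndarray:
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton's method; returns [b0, b1]."""
    A = np.column_stack([np.ones(len(x)), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))           # current fitted probabilities
        gradient = A.T @ (y - p)                      # slope of the log-likelihood
        hessian = A.T @ (A * (p * (1 - p))[:, None])  # curvature (negated)
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta


def predict(beta: np.ndarray, x: float) -> float:
    """Probability of a bull year for a given dummy value."""
    return float(1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * x))))


# Made-up yearly dummies for illustration only (not real Super Bowl or market data)
nfl_win = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0], dtype=float)  # 1 = win, 0 = loss
bull = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 1], dtype=float)     # 1 = bull, 0 = bear

beta = fit_logistic(nfl_win, bull)
p_win = predict(beta, 1.0)
p_loss = predict(beta, 0.0)
print(f"P(bull | win) = {p_win:.3f}, P(bull | loss) = {p_loss:.3f}")
```

With one binary predictor, the fitted probabilities simply reproduce the observed share of bull years in each group (5 of 6 and 1 of 4 in this made-up sample), which makes the model easy to check by hand.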

//Model 5// The hypothesis states that a mixture of economic and psychological factors will produce the highest accuracy. Therefore, multiple predictors will be tested with linear regression, neural net, and CHAID.


**Resources**

1.) http://articles.marketwatch.com/2011-02-21/investing/30765000_1_sentiment-surveys-stock-market-traders
2.) http://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:relative_strength_index_rsi
3.) http://stocks.about.com/od/whatmovesthemarket/a/Whatmovesmarket.htm http://www.investopedia.com/articles/trading/09/what-factors-create-trends.asp#axzz1rrHu26PP
4.) []
5.) []
6.) //Investment Perspectives//, Morgan Stanley & Co LLC, March 29, 2012
7.) []