Unemployment+Factors+and+Prediction+Proposal

**Unemployment Factors and Prediction**
Project Proposal for GSM 672: Data Mining
Team E: Mohammad Al Fayez, Sathana Valli Sathya Moorthi, Shirley Rodriguez

Spring 2011

=I. PURPOSE=

Introduction
In recent years, the unemployment rate has been one of the most discussed topics in analyses of the country’s economic recovery. The unemployment rate, the percentage of the labor force that is unemployed, is closely watched by the government, investors, economists, and citizens in general when making decisions. Being able to predict it accurately would therefore support better and more confident decisions. This project aims to develop a predictive model that identifies the most important economic variables for forecasting the unemployment rate.

Literature review
Secondary research shows that the economy is growing again and creating some jobs, but for many years those new positions will not be enough to replace the millions of jobs lost during the recession [1]. People who are actively searching for a job might find some motivation if they had access to a promising prediction of the unemployment rate. An article retrieved through Questia emphasizes the importance of forecasting unemployment during contractionary periods, because such forecasts carry important information about the duration and turning points of these cycles. This suggests that any forecasting gains in predicting the unemployment rate during contractionary periods will be valued more than those made during expansionary periods [2]. Predicting unemployment has been an active field of study for many years. Past attempts have found that nonlinear models might improve the ability to forecast the unemployment series during economic contractions [3]. However, studies that tried a nonlinear extension, a cyclical trend model with smooth transitions in the underlying parameters, did not achieve notably better forecast accuracy than previous studies [4]. Another method developed to predict the unemployment rate was an autoregressive model that used past values of unemployment; the main goal of that research was to forecast the unemployment rate with a model that accounts for both spatial and temporal correlations [5]. Various economic variables have been considered as predictors of the unemployment rate. Some of them, such as the minimum wage, seem to be relevant, but there is no assurance that such a variable on its own will yield an accurate prediction [6]. A more accurate prediction is expected when several variables are used together. To achieve this goal, it is important to consider variables directly related to the unemployment rate.
Good examples are consumer sentiment, which is a crucial economic indicator for projecting inflation, retail sales, unemployment, and other factors within the economy [7], and monetary policies that affect real interest rates, which in turn shape demand and ultimately output, employment, and inflation [8].

=II. Variable Control within the Purpose=

To achieve the goal of this project, several economic variables need to be analyzed and used to predict the unemployment rate. During the analysis, some of those variables may prove to be more relevant and more strongly correlated with unemployment than others. The following table lists the initial selection of variables:
|| Variable || Description ||
|| Unemployment rate || The dependent variable that will be predicted. The rate is used instead of the number of unemployed people because the model relies on historical data, and a rate is easier to compare across periods than a raw count. ||
|| Inflation rate || Measures the change in prices over time. ||
|| Consumer Price Index (CPI) || A measure of the change in prices paid by consumers; it is used to calculate inflation. ||
|| Minimum wage || The US federal minimum wage, in US dollars per hour. ||
|| Consumer sentiment || University of Michigan Consumer Sentiment (UMCSENT): an index that measures consumer spending confidence, based on about 500 telephone interviews. The index is updated monthly. ||
|| Gold price || The gold price can be an indication of consumer confidence and the overall economic condition. ||
|| Oil prices || Oil is the most important source of energy, and an increase in its price can slow economic growth and therefore employment. ||
|| Import and export price index || A monthly index that measures the average change in prices of imported or exported goods and services. ||
|| Producer price index || An index that measures the average change in prices received by US producers for their finished goods or services. ||
|| Gross domestic product (GDP) || The value of all the goods and services produced by a country during a given period. ||
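
Since the table notes that the CPI is used to calculate inflation, the relationship can be illustrated with a minimal sketch. It assumes the inflation rate is taken as the percent change of the index against its value a fixed number of periods earlier (12 for monthly data, year over year). The function name `percent_change_inflation` and the sample index values are hypothetical and are not part of the project data.

```python
def percent_change_inflation(cpi, lag=12):
    """Percent change of each CPI value versus the value `lag` periods earlier.

    The first `lag` entries have no earlier value to compare against,
    so they are returned as None.
    """
    rates = []
    for i, value in enumerate(cpi):
        if i < lag:
            rates.append(None)
        else:
            base = cpi[i - lag]
            rates.append((value - base) / base * 100.0)
    return rates


# Made-up index values, with a short lag so the example stays small.
sample_cpi = [100.0, 101.0, 102.0, 104.0]
print(percent_change_inflation(sample_cpi, lag=2))
```

Because the inflation rate is derived directly from the CPI, the two variables carry overlapping information, which is worth keeping in mind when both are offered to a model as predictors.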

=III. Hypotheses=

The unemployment rate is expected to rise beyond its normal tolerance level, and the model explored here should identify the indicators that need to be watched closely in order to study and understand such increases.

Premise 1: The unemployment rate rose 14.7% after the recession ended in 1991 and by almost exactly the same amount after the 2001 recession [9].

Premise 2: Economic downturns are rarely predicted in advance, which makes identifying the drivers of unemployment important for the general public. For example, no one predicted the 2008 recession after the post-9/11 downturn [10].

=IV. Data Collection Procedure=

All the data needed for the project is publicly available online from different sources. After collecting the data, the team will proceed with the following steps:
 * 1) Standardize the data frequency: The data needed for the project is released at different frequencies (daily, monthly, quarterly, and yearly). Since the unemployment numbers are released monthly, all the data should be brought to a monthly frequency. If only quarterly or yearly data is available, each value must be repeated to fill the months within that period.
 * 2) Adjust the numbers for inflation where needed (for example, oil and gold prices) in order to get more accurate results.
 * 3) Add the data to the main dataset, combining the data files into one single file or dataset.
 * 4) Clean the data by handling missing values and outliers.
 * 5) Normalize the data if required.
 * 6) Rescale the data to the range 0 to 1 if the model requires it or if doing so improves the model's predictive accuracy.
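
The frequency alignment, inflation adjustment, and rescaling steps above can be sketched as follows. This is only an illustration under stated assumptions: periods are keyed as `(year, quarter)` or `(year, month)` tuples, inflation adjustment is done by deflating with the CPI against a chosen base value, and rescaling is plain min-max scaling. The helper names and all numbers are hypothetical; the team's actual tooling may differ.

```python
def quarterly_to_monthly(quarterly):
    """Step 1: repeat each quarterly value for the three months of its quarter."""
    monthly = {}
    for (year, quarter), value in quarterly.items():
        first_month = 3 * (quarter - 1) + 1
        for month in range(first_month, first_month + 3):
            monthly[(year, month)] = value
    return monthly


def deflate(nominal, cpi, base_cpi):
    """Step 2: express nominal prices in base-period prices using the CPI.

    Periods that have no CPI value are skipped rather than guessed.
    """
    return {
        period: price * base_cpi / cpi[period]
        for period, price in nominal.items()
        if period in cpi
    }


def min_max_scale(values):
    """Step 6: rescale values to the range 0 to 1.

    A constant series has no spread, so it is mapped to all zeros.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up numbers for illustration only.
gdp_quarterly = {(2010, 1): 14.6, (2010, 2): 14.7}
print(quarterly_to_monthly(gdp_quarterly))
print(deflate({(2010, 1): 80.0}, {(2010, 1): 200.0}, base_cpi=100.0))
print(min_max_scale([2.0, 4.0, 6.0]))
```

Repeating a quarterly value across its months keeps the dataset aligned with the monthly unemployment figures, at the cost of introducing flat stretches that carry no new information within a quarter.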

=V. Data Interpretation Plan=

To build our model for predicting unemployment factors, we grouped several social and economic data sets. Several fields, such as Series Id and period, need to be converted to numerical values or left out of consideration. Years and other variables are also recorded on different measurement scales and need to be normalized. Once several models have been built and analyzed, we will select the best model based on accuracy. We also understand that more than one factor may turn out to be important, so it is critical to apply our enterprise knowledge and finance expertise when making a decision. In general, the nature of the problem allows some tolerance for misclassification error. However, classifying a factor that is relevant to unemployment as irrelevant will have more effect on the model than classifying an irrelevant factor as relevant to unemployment. Correlation between several predictors is another area of importance. If it is present, we will need to deal with it and look into issues such as over-fitting.

Moreover, we understand that to come up with an efficient model we will need to discuss our progress and the steps involved with data mining experts. To that end, we might approach Paul Dwyer, Mike Hand, and IBM Cognos expert Brain.

Finally, if successful, this project could be used by economists to understand unemployment, by government agencies (DHS) to provide support related to unemployment, by members of the public seeking employment, and by investors trying to judge the mode and source of investment at a given time.
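
The concern about correlation between predictors raised in the interpretation plan can be checked before modeling. The following is a minimal sketch that assumes Pearson correlation as the measure and flags pairs whose absolute correlation reaches a cutoff. The names `pearson` and `correlated_pairs`, the 0.9 threshold, and the sample series are hypothetical choices for illustration, not results from the project data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(predictors, threshold=0.9):
    """Return (name_a, name_b, r) for each pair with |r| >= threshold."""
    names = sorted(predictors)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(predictors[a], predictors[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged


# Made-up series for illustration only.
sample = {
    "cpi": [1.0, 2.0, 3.0, 4.0],
    "ppi": [2.0, 4.0, 6.0, 8.0],
    "gold": [1.0, -1.0, 1.0, -1.0],
}
for a, b, r in correlated_pairs(sample):
    print(a, b, round(r, 3))
```

A flagged pair does not have to be dropped automatically; it marks a place where domain judgment is needed to decide which variable to keep or whether to combine them.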

=References=

 * Consumer Sentiment: The Economy’s Crystal Ball? Talla Saghafi, Fall 2009. http://dspace.wrlc.org/bitstream/1961/9409/1/Saghafi,%20Talla%20-%20Fall%20'09.pdf
 * Econedlink. Economic and personal finance resources. http://www.econedlink.org/lessons/index.php?lid=984&type=student
 * Federal Reserve Bank of San Francisco.
 * Forecasting Unemployment with Spatial Correlation. University of Missouri.
 * Questia. Trusted Online Research. www.questia.com/googleScholar.qst?docId=5002290105
 * Science Direct. www.sciencedirect.com/science?_ob=MImg&_imagekey=B6V8V-472JRC1-2D-4&_cdi=5880&_user=4330739&_pii=S016794730200230X&_origin=gateway&_coverDate=03%2F28%2F2003&_sk=999579996&view=c&wchp=dGLbVzW-zSkWb&md5=92e94e7edc131105cc10bcf4e02bd781&ie=/sdarticle.pdf
 * The effect of the minimum wage on employment and unemployment. Journal of Economic Literature, Volume XX, June 1982.

[1] Econedlink. Economic and personal finance resources.
[2] Questia. Trusted Online Research.
[3] Questia. Trusted Online Research.
[4] Science Direct: www.sciencedirect.com/science
[5] Forecasting Unemployment with Spatial Correlation. University of Missouri.
[6] The effect of the minimum wage on employment and unemployment. Journal of Economic Literature, Volume XX, June 1982.
[7] Consumer Sentiment: The Economy’s Crystal Ball?
[8] Federal Reserve Bank of San Francisco.
[9]
[10] http://www.economicpopulist.org/content/unemployment-realistic-forecast