Regression for Army Recruiting by Sandra Jackson

Sandra completed her report in May 2015.  She studied the Army's ability to recruit as a function of socio-economic indicators.

Complete Thesis

Jackson, S. Y. (2015).  Utilizing Socio-Economic Factors to Evaluate Recruiting Potential for a US Army Recruiting Company (Masters Report).  The University of Texas at Austin.

Executive Summary

To maintain military strength, the United States Army must continually recruit new Soldiers. Currently the Army evaluates its recruiting capacity by calculating a weighted average of the previous four years of recruiting data. This report analyzes an alternative, statistical approach to computing recruiting capacity. Specifically, the study analyzes the effectiveness of multi-linear regression and Poisson regression models for computing recruiting capacity. The statistical analysis for these models is based on United States Army Recruiting Command data with 10,323 observations, encompassing four years of recruiting from 2011 to 2014. The data describes recruiting performance for each recruiting company, for each month, along with several other factors such as the number of recruiters in the company, the unemployment rate of the target region, and demographic descriptions of the target region.

The study analyzes two separate regression problems: predicting recruits, and predicting recruiter rates. For each of these problems the study constructs both multi-linear regressions and Poisson regressions based on different subsets of explanatory variables, and evaluates model performance on out-of-sample data. Out-of-sample evaluation increases confidence in statistical models because it demonstrates a level of performance on data that was not used to create the model.
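The out-of-sample comparison described above can be sketched in a few lines of Python. This is a minimal illustration and not the report's code: the data is synthetic, the variable names and coefficients are hypothetical, the 300/100 split is arbitrary, and the linear model's likelihood is assumed to be Gaussian with constant variance (the summary does not state the exact definition used). Both models are fit on the training rows only and scored by negative log-likelihood on the held-out rows.

```python
import math
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares; also returns the maximum-likelihood noise variance."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = float(resid @ resid) / len(y)
    return beta, sigma2

def gaussian_nll(X, y, beta, sigma2):
    """Negative log-likelihood under an assumed Gaussian error model."""
    resid = y - X @ beta
    return float(0.5 * len(y) * math.log(2 * math.pi * sigma2)
                 + (resid @ resid) / (2 * sigma2))

def fit_poisson(X, y, iters=50, tol=1e-10):
    """Poisson regression with a log link, fit by Newton's method.
    Column 0 of X is assumed to be the intercept."""
    beta = np.zeros(X.shape[1])
    beta[0] = math.log(y.mean())          # start at the intercept-only fit
    for _ in range(iters):
        mu = np.exp(X @ beta)
        grad = X.T @ (y - mu)             # score vector
        hess = X.T @ (X * mu[:, None])    # Fisher information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def poisson_nll(X, y, beta):
    """Negative log-likelihood of counts y under the fitted Poisson model."""
    eta = X @ beta
    log_factorial = np.array([math.lgamma(v + 1.0) for v in y])
    return float(np.sum(np.exp(eta) - y * eta + log_factorial))

# Synthetic stand-in for company-month records (hypothetical, not USAREC data).
rng = np.random.default_rng(0)
n = 400
recruiters = rng.integers(3, 12, size=n).astype(float)
unemployment = rng.uniform(3.0, 10.0, size=n)
recruits = rng.poisson(0.9 * recruiters + 0.2 * unemployment).astype(float)

X = np.column_stack([np.ones(n), recruiters, unemployment])
train, test = slice(0, 300), slice(300, n)   # arbitrary hold-out split

beta_lin, sigma2 = fit_linear(X[train], recruits[train])
beta_pois = fit_poisson(X[train], recruits[train])

# Score both models on rows that were not used for fitting.
nll_lin = gaussian_nll(X[test], recruits[test], beta_lin, sigma2)
nll_pois = poisson_nll(X[test], recruits[test], beta_pois)
print(f"out-of-sample NLL  linear: {nll_lin:.1f}  Poisson: {nll_pois:.1f}")
```

Which model scores lower here depends entirely on how the synthetic data was generated, so the printed numbers say nothing about the Army data. The sketch only shows the mechanics: fit on one subset, score on another, and compare the two models on the same held-out rows.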

Surprisingly, even though essentially all previous literature on recruiting suggests Poisson regression to model recruiting arrival rates, we show strong empirical evidence that multi-linear regression is a better modeling tool than Poisson regression for the recruiting data. On out-of-sample tests involving 32 competing models, the negative log-likelihood for the multi-linear regression models is, on average over all the models, 11% smaller than that of the corresponding Poisson regression model. On out-of-sample tests involving an additional 20 models, the negative log-likelihood for the multi-linear regression is on average 85% smaller than that of the corresponding Poisson regression.

When the number of recruits is the dependent variable, for both the multi-linear and Poisson regression models, the best individual socio-economic factor to predict the number of recruits is the number of qualified military-aged persons, followed by the number of micro zip codes. However, the explanatory variable with the most predictive power is the number of recruiters, which is not a socio-economic factor but a measure of the resources the Army devotes to recruiting. Overall, the multi-linear regression models have the most predictive power, and among them the model that includes the number of recruiters and five socio-economic factors has the most explanatory power.

When recruiter rate is the dependent variable, a constant is, surprisingly, a strong predictive model. Socio-economic factors, specifically the unemployment rate, do add explanatory power, particularly for the multi-linear regression models. The statistical analysis of recruiter rate suggests there is great potential for recruiting capacity because socio-economic factors do not limit the number of recruits. In other words, the results suggest that if the Army wants to increase recruits, one additional recruiter results in an additional 0.89 recruits.
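A constant model for recruiter rate is simple enough to show directly. The sketch below assumes that recruiter rate means recruits per recruiter in a company-month, which the summary does not define explicitly. The six records and the train/test split are hypothetical; only the 0.89 figure comes from the summary.

```python
import numpy as np

# Hypothetical company-month records (illustrative values, not USAREC data).
recruiters = np.array([8, 10, 12, 9, 11, 7], dtype=float)
recruits = np.array([7, 9, 11, 8, 10, 6], dtype=float)

# Recruiter rate, assumed here to be recruits per recruiter per company-month.
rate = recruits / recruiters

# Constant model: predict every rate with the mean of the training rates.
train, test = slice(0, 4), slice(4, 6)
constant_rate = rate[train].mean()

# Out-of-sample error of the constant model on the held-out records.
rmse = float(np.sqrt(np.mean((rate[test] - constant_rate) ** 2)))

def expected_recruits(n_recruiters, rate_per_recruiter=0.89):
    """Recruits implied by a constant rate; 0.89 is the figure quoted in the summary."""
    return rate_per_recruiter * n_recruiters

# Under a constant rate, each added recruiter adds the same number of recruits.
extra = expected_recruits(11) - expected_recruits(10)
print(f"constant rate: {constant_rate:.3f}  held-out RMSE: {rmse:.3f}  extra: {extra:.2f}")
```

Under a constant rate, predicted recruits scale linearly with the number of recruiters, which is the sense in which one more recruiter corresponds to a fixed increment in recruits.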

Future work should include more years of historical data to compensate for possible homogeneity in the time period of this study’s data set. Furthermore, the study only includes five socio-economic factors. Other socio-economic factors, such as the vast array of factors collected by the American Community Survey (American Community Survey, 2015), require additional exploration. Another avenue of future study is to apply regression models to recruiting within different PRISM segments, a system that demographically splits the population.

© Copyright 2004-2020 - Ned Dimitrov