
Patricia Mokhtarian: Abstract and background reading



SHORT ABSTRACT: Declining survey response rates make it increasingly critical for survey designers across disciplines to use mechanisms that save respondents time. In practice, this often means that questionnaires are shortened, yielding higher response rates but fewer variables available for modeling and forecasting. We address this challenge with data-driven approaches such as machine learning, set within the rapidly growing big data landscape, to develop and apply a transfer learning-based framework for integrating and enriching surveys, thereby expanding the amount of information available for use.


READ MORE: Evidence confirms that survey response rates have been falling steadily for over half a century, and researchers agree that the field may be approaching a critical point at which the validity of survey findings is increasingly called into question (Lohr & Raghunathan, 2017; National Research Council, 2013; PTV NuStats, 2011). Theories of survey response hold that respondents fail to complete surveys for many reasons, critical among them increased concerns over intrusions on time and privacy (Goyder, Boyer, & Martinelli, 2006). As demands upon individuals’ time continue to grow, the Social Exchange Theory of survey response suggests that the perceived benefit relative to the “cost” of response time is decreasing (Dillman, Smyth, & Christian, 2014). This is supported by empirical evidence showing that collective attention span is decreasing due to an overload of content that exhausts attention resources (Lorenz-Spreen, Mønsted, Hövel, & Lehmann, 2019). Accordingly, it is increasingly important to save respondents time. This has led to widespread efforts by survey designers to shorten survey questionnaires, thereby improving response and completion rates but simultaneously reducing the amount of information obtained.

The implications of reducing survey length are particularly pertinent within fields like transportation, where engineers and planners depend upon long-form travel diary and survey data to forecast evolving infrastructure needs. Already, the poor performance of travel demand forecasting models is well documented (Bain, 2009; Hartgen, 2013; Nicolaisen & Driscoll, 2014; Parthasarathi & Levinson, 2010; Voulgaris, 2019; Welde & Odeck, 2011), with current models often operating at less than 10% explanatory power and requiring subjective alterations to improve performance. Such poor model performance is partially attributable to the lack of diverse variables such as attitudes, preferences, perceptions, social and personal values, and other such system user traits (i.e., psychometric data) for use within forecasting models. Furthermore, the data/variables needed to answer complex research questions are seldom available through a single survey dataset (Sivakumar & Polak, 2009). With the increasing need to shorten questionnaires, this lack of diverse variables promises to be a growing challenge.

Addressing this challenge will require a broad range of approaches centered on improving data quality and richness. Research in this domain has typically focused on the use of novel non-survey-based data sources to support transportation modeling (Shaw, Wang, Mokhtarian, & Watkins, 2020). However, forecasting travel behavior still depends on household and individual-level survey data, due largely to the user-verified, self-reported nature of survey responses, alongside their ability to capture domain-specific data that is often not (easily) available through other data streams. As such, in this paper, we focus on developing a flexible framework for expanding the data available from surveys by enriching/integrating survey datasets (“recipient surveys”) with survey variables outside of the original survey domain (i.e., from “donor” surveys). To make use of all available tools, the framework uses novel big data sources alongside data-driven machine learning (ML) algorithms; however, we also show that the essence of the data transfer framework can be applied even in the absence of these tools (i.e., within a simpler context).



Bain, R. (2009). Error and optimism bias in toll road traffic forecasts. Transportation, 36, 469-482. doi:10.1007/s11116-009-9199-7

Dillman, D. A., Smyth, J. D., & Christian, L. M. (2014). Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method. New York: John Wiley & Sons, Inc.

Hartgen, D. T. (2013). Hubris or humility? Accuracy issues for the next 50 years of travel demand modeling. Transportation, 40(6), 1133-1157. doi:10.1007/s11116-013-9497-y

Lohr, S. L., & Raghunathan, T. E. (2017). Combining survey data with other data sources. Statistical Science, 32(2), 293-312. doi:10.1214/16-STS584

Lorenz-Spreen, P., Mønsted, B. M., Hövel, P., & Lehmann, S. (2019). Accelerating dynamics of collective attention. Nature Communications, 10(1), 1759. doi:10.1038/s41467-019-09311-w

National Research Council. (2013). Nonresponse in Social Science Surveys: A Research Agenda. Washington, DC: The National Academies Press.

Nicolaisen, M. S., & Driscoll, P. A. (2014). Ex-Post Evaluations of Demand Forecast Accuracy: A Literature Review. Transport Reviews, 34(4), 540-557. doi:10.1080/01441647.2014.926428

Parthasarathi, P., & Levinson, D. (2010). Post-construction evaluation of traffic forecast accuracy. Transport Policy, 17(6), 428-443. doi:10.1016/j.tranpol.2010.04.010

PTV NuStats. (2011). Regional Travel Survey: Final Report. Atlanta, Georgia. Retrieved from:

Voulgaris, C. T. (2019). Crystal Balls and Black Boxes: What Makes a Good Forecast? Journal of Planning Literature, 34(3), 286-299. doi:10.1177/0885412219838495

Welde, M., & Odeck, J. (2011). Do Planners Get it Right? The Accuracy of Travel Demand Forecasting in Norway. European Journal of Transport and Infrastructure Research, 11(1). doi:10.18757/ejtir.2011.11.1.2913