Reducing simulation bias in mixed logit model estimation.

Author(s)
Bastin, F. & Cirillo, C.
Year
Abstract

Mixed logit models are flexible forms of discrete choice models; their choice probabilities are not in closed form and are currently estimated with simulations using random draws from the mixing distributions. In these cases the log-likelihood function to be maximized is the sum of the logarithms of the expected choice probabilities; bias is then introduced since each choice probability is the argument of a logarithmic operator. Expectations are usually estimated using Monte Carlo (MC) and quasi-Monte Carlo (QMC) simulations; both methods produce unbiased estimators of these choice probabilities. In particular, QMC approaches can be viewed as variance reduction techniques, and as such, allow the reduction of the error for a fixed number of draws. Halton sequences and scrambled Halton sequences have been tested for mixed logit applications and their use was found to be vastly superior to standard Monte Carlo methods. Both MC and QMC simulations introduce approximation errors that affect the value of the likelihood function at the optimum and the final estimates of the parameters. Although one can argue that simulated scores could be adopted to produce unbiased estimators or to compute the maximum likelihood estimators, it should be recognized that log-likelihood maximization is by far the most popular approach amongst researchers and practitioners. A Taylor expansion is proposed for the explicit estimation of bias in mixed logit models. The bias is of order O(1/N) for each choice probability, where N is the sample size per individual (i.e. the number of draws used to devise the approximate objective). The standard deviation is of order O(1/√N); for large N, the standard deviation error exceeds the bias error. For moderate sample sizes, and therefore for moderate numerical cost, both errors should be taken into account.
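The O(1/N) order of the bias can be seen from a second-order Taylor expansion of the logarithm. The following is a sketch only, not the authors' derivation: the notation is assumed here, with P the true choice probability, \hat{P}_N the average of N independent draws p_1, ..., p_N of the conditional probability, and \sigma^2 the variance of a single draw.

```latex
\begin{align*}
\hat{P}_N &= \frac{1}{N}\sum_{r=1}^{N} p_r,
  \qquad E[\hat{P}_N] = P,
  \qquad \operatorname{Var}[\hat{P}_N] = \frac{\sigma^2}{N}, \\
\ln \hat{P}_N &\approx \ln P + \frac{\hat{P}_N - P}{P}
  - \frac{(\hat{P}_N - P)^2}{2P^2}, \\
E[\ln \hat{P}_N] - \ln P &\approx -\frac{\sigma^2}{2NP^2} = O(1/N), \\
\sqrt{\operatorname{Var}[\ln \hat{P}_N]} &\approx \frac{\sigma}{P\sqrt{N}} = O(1/\sqrt{N}).
\end{align*}
```

Under these assumptions the simulated log-probability is biased downward, and replacing \sigma^2 and P by their sample counterparts gives an estimate of the bias from the same draws.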
Moreover, it is known that bias is in O(I), where I is the population size, while the standard deviation is in O(1/I), so that if the population size is increased, the bias can occupy a significant part in the total estimation error. It should be noted that these orders are valid for the objective function, but nothing is known about the parameters, which makes it difficult to assess the estimation error for a given value of N. For practical values of N it is therefore reasonable to consider the bias effect, especially when an estimation of the bias can be cheaply obtained, as in Monte Carlo approximations. The method could be extended to quasi-Monte Carlo techniques as long as standard deviations are computed. This estimation is used to correct the log-likelihood objective during the maximization process, and it is shown that significant error reduction is then obtained on the final objective value but also on the optimal parameters, which is an important concern for practitioners. These findings are illustrated using both synthetic and real data, and the consequences are examined when the value of time is considered or when the model is used to make predictions. For the covering abstract see ITRD E145999.
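The kind of bias correction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single binary logit choice with one normally distributed coefficient, a fixed parameter value rather than a full maximization, and a second-order Taylor estimate of the bias, -var/(2 N P^2), computed from the same Monte Carlo draws. All function names and default values are hypothetical.

```python
import math
import random


def logit_prob(beta, x):
    """Binary logit probability for taste coefficient beta and attribute difference x."""
    return 1.0 / (1.0 + math.exp(-beta * x))


def reference_log_prob(x, mu, sigma, n_grid=20001, width=8.0):
    """Log of E[logit_prob] for beta ~ N(mu, sigma^2), by trapezoidal integration."""
    h = 2.0 * width / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        t = -width + k * h
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        density = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
        total += weight * logit_prob(mu + sigma * t, x) * density
    return math.log(total * h)


def simulated_log_prob(x, mu, sigma, n_draws, rng):
    """Return (naive, corrected) simulated log-probabilities from n_draws MC draws."""
    probs = [logit_prob(rng.gauss(mu, sigma), x) for _ in range(n_draws)]
    p_hat = sum(probs) / n_draws
    var_hat = sum((p - p_hat) ** 2 for p in probs) / (n_draws - 1)
    naive = math.log(p_hat)
    # Assumed second-order Taylor term: E[ln p_hat] - ln P is roughly
    # -var / (2 N P^2), so subtracting the estimate adds a positive amount.
    bias_hat = -var_hat / (2.0 * n_draws * p_hat ** 2)
    return naive, naive - bias_hat


def compare(x=2.0, mu=0.5, sigma=2.0, n_draws=20, n_reps=20000, seed=12345):
    """Average error of the naive and corrected estimators over n_reps replications."""
    rng = random.Random(seed)
    ref = reference_log_prob(x, mu, sigma)
    naive_sum = corrected_sum = 0.0
    for _ in range(n_reps):
        naive, corrected = simulated_log_prob(x, mu, sigma, n_draws, rng)
        naive_sum += naive - ref
        corrected_sum += corrected - ref
    return naive_sum / n_reps, corrected_sum / n_reps


if __name__ == "__main__":
    naive_err, corrected_err = compare()
    print(f"mean error, naive:     {naive_err:+.5f}")
    print(f"mean error, corrected: {corrected_err:+.5f}")
```

In this toy setting the naive simulated log-probability is on average below the reference value, and the corrected one should lie closer to it. In a full estimation the correction would be summed over individuals and applied inside the optimizer's objective, which this sketch does not attempt.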

Publication

Library number
C 49348 (In: C 49291 [electronic version only]) /72 / ITRD E146058
Source

In: Proceedings of the European Transport Conference ETC, Leeuwarden, The Netherlands, 6-8 October 2008, 12 p.
