International orientation on methodologies for modelling developments in road safety

Abstract (EN)

This report gives an overview of the models developed in countries other than the Netherlands to evaluate past developments in road traffic safety and to estimate future developments. These models include classical linear regression and loglinear models as applied in Great Britain, and the ARIMA and DRAG models used in Belgium, Canada, France and Sweden.

The linear regression models for Great Britain were used to forecast the number of road crash casualties of different severities in a future year (2000 and 2010). In the model used to predict the number of casualties in 2000, the year 1983 played an important role: in that year compulsory seat-belt wearing was introduced, which turned out to have a great influence on the number of casualties. To predict the number of casualties in 2010, the effect of three road safety measures was first examined, and the number of casualties over the years was estimated as it would have been had these measures not been introduced. Based on these results a prognosis was made for 2010. A totally different model, i.e. not a linear regression model, was used to forecast the number of crashes and casualties for drivers older than 60 over a period of 20 years. This model consisted of three submodels, describing the predicted number of older drivers, the predicted number of crashes involving older drivers, and the number of casualties in crashes involving older drivers, respectively.

An important problem with classical linear and loglinear regression applied to time series data is the assumption that the observations are independent. Repeated observations over time are usually not independent at all, since last year’s number of casualties is often quite a good predictor of the current year’s number of casualties. In a classical linear regression this is reflected in residuals that are serially correlated.
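The serial correlation described above can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the report: it generates a hypothetical declining casualty series whose errors follow a first-order autoregressive process (the values of `n`, `phi` and the trend are assumptions), fits a classical linear trend by ordinary least squares, and then computes the lag-1 autocorrelation and the Durbin-Watson statistic of the residuals.

```python
import numpy as np


def lag1_autocorrelation(residuals):
    """Correlation between each residual and the one before it."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    return float(np.sum(r[1:] * r[:-1]) / np.sum(r * r))


def durbin_watson(residuals):
    """Durbin-Watson statistic: near 2 for independent residuals,
    well below 2 when residuals are positively serially correlated."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(r) ** 2) / np.sum(r * r))


def simulate_and_fit(n=200, phi=0.8, seed=0):
    """Simulate a trend with AR(1) errors and return the OLS residuals.

    All numbers here are hypothetical and chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n, dtype=float)
    shocks = rng.normal(0.0, 1.0, size=n)
    errors = np.zeros(n)
    for i in range(1, n):
        # Each error carries over a fraction phi of the previous error.
        errors[i] = phi * errors[i - 1] + shocks[i]
    y = 100.0 - 0.2 * t + errors
    # Classical linear regression of y on an intercept and time.
    X = np.column_stack([np.ones(n), t])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


residuals = simulate_and_fit()
r1 = lag1_autocorrelation(residuals)
dw = durbin_watson(residuals)
print(f"lag-1 autocorrelation of residuals: {r1:.2f}")
print(f"Durbin-Watson statistic: {dw:.2f}")
```

A lag-1 autocorrelation clearly above zero, together with a Durbin-Watson statistic well below 2, indicates that the independence assumption of the regression does not hold for such a series.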
Serially correlated residuals result in statistical tests whose standard errors are too small, and therefore in overoptimistic conclusions about the relations between variables that evolve over time. This in turn results in flawed forecasts. ARIMA and DRAG models, on the other hand, do take the dependencies between observations into account.

All the DRAG models that have been developed have basically the same structure: they consist of several layers. The first layer describes the road demand, expressed either as the total road mileage or as the total fuel sales. The next layer is dedicated to explaining the number of crashes and victims. Finally, the last layer explains the severity of the crashes, where severity is expressed as the number of persons injured per crash with bodily injury and the number of persons killed in a crash with bodily injury. Each layer consists of one or more models, each of them containing a large number of explanatory variables, varying from weather conditions to economic activities.

DRAG models have several disadvantages. First, because they are extended ARIMA models, the observations should be stationary. This means that they must have a constant mean and variance over time. Since this is usually not the case for observed time series, the observations must be filtered first. Another disadvantage is the large number of explanatory variables. For forecasting purposes the future values of all these variables need to be modelled separately.

SWOV uses structural time series models to describe, explain and forecast developments in Dutch road traffic safety. This type of model does not have the disadvantages described above: structural time series models do not require stationarity, and the explanatory and dependent variables are modelled simultaneously.
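The stationarity requirement mentioned above for ARIMA-type models, and the need to filter the observations first, can be illustrated with a short sketch. It is not taken from the report: the series is a simulated random walk with a downward drift (the values of `n`, `drift` and the starting level are hypothetical), and first differencing is used as one common example of such a filter.

```python
import numpy as np


def first_difference(series):
    """Return the period-to-period changes of a series."""
    return np.diff(np.asarray(series, dtype=float))


def half_means(series):
    """Mean of the first and second half of a series, as a crude
    check on whether the mean is constant over time."""
    s = np.asarray(series, dtype=float)
    mid = len(s) // 2
    return float(s[:mid].mean()), float(s[mid:].mean())


# Hypothetical series: a random walk with a downward drift.
rng = np.random.default_rng(1)
n = 400
drift = -2.0
steps = drift + rng.normal(0.0, 1.0, size=n)
level = 5000.0 + np.cumsum(steps)

level_first, level_second = half_means(level)
diff_first, diff_second = half_means(first_difference(level))

print(f"level means by half:      {level_first:.1f}, {level_second:.1f}")
print(f"difference means by half: {diff_first:.2f}, {diff_second:.2f}")
```

In this sketch the mean of the original series differs strongly between the two halves, whereas the mean of the differenced series is roughly the same in both, which is what a constant mean over time requires.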

SWOV publication