Road Safety Data, Collection, Transfer and Analysis DaCoTa. Workpackage 6, Driver Behaviour Monitoring through Naturalistic Driving: Deliverable 6.2: Part B: Sampling techniques and naturalistic driving study designs.

Abstract

In this document we provide an overview of sampling and estimation methods that can be used to obtain population values of risk exposure data and safety performance indicators based on naturalistic driving study designs. More specifically, we discuss how to determine the sample size required to estimate such population values from a probabilistic sample of that same population with a predefined level of precision. Examples of population values of interest are the mean or the total number of kilometres travelled by all car drivers in a country, and the percentage of these same drivers wearing a seat belt. We restrict ourselves to probabilistic sampling techniques because non-probabilistic techniques such as convenience and snowball sampling do not lend themselves to the evaluation of the statistical properties of parameter estimates from a sample, and are therefore unfit for the estimation of sample size. In Chapters 2 and 3, where simple random sampling and stratified random sampling are introduced, respectively, we assume that a complete sampling frame is available for all units in the population of interest. In Chapter 4 we discuss how to sample from a population with unequal probabilities, and why this can be useful. In Chapter 5 we extend the discussion to the situation where a complete sampling frame is not available, and present multi-stage sampling. In Chapter 6 we present two alternative methods of estimating population parameters from a sample: the ratio estimator and the regression estimator. In Chapter 7 we consider the possibilities and implications of repeated sampling of the same population. In all these chapters we are only concerned with the quantification of sampling error, and its consequences for the estimation of sample size. However, there are other types of potential errors as well, and these are discussed in Chapter 8. Finally, considering all these aspects of sampling techniques, in Chapter 9 we provide a list of recommendations for the design of studies that collect data based on naturalistic driving observations.
As will become clear in the following chapters, a key concept in deciding on the size of a sample is precision. Precision quantifies how closely the sample estimate of a population parameter (such as a mean, a total or a percentage) corresponds to the actual value of the population parameter. Keeping everything else fixed, the following rule applies: given a certain sampling strategy, the higher the precision we impose on our estimate, the larger the sample should be. If we only tolerate an absolute 1% error in our estimate of a population characteristic such as the percentage of car drivers in a country wearing a seat belt, for example, a larger sample will be required than if we can settle for an absolute 5% error. Supposing that the true percentage of seat belt wearing is 80%, the latter, more liberal precision implies that we can expect the estimated percentage to be within the range of 80±5% (i.e., somewhere between 75% and 85%), while the former, more conservative precision yields an estimated percentage within the range of 80±1% (i.e., between 79% and 81%). With a precision of only 5% it will therefore not be possible to detect effects of road safety measures on seat belt wearing that are smaller than 10%, while the more conservative precision of 1% allows us to detect much smaller effects, if any. The results presented in this document are based on the classic textbook of Cochran (1977), on Moors and Muilwijk (1975) and Hays (1970), and on several internet sources. (Author/publisher)
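The relation between precision and sample size in the seat belt example can be illustrated with a minimal sketch. It assumes simple random sampling, the standard normal-approximation formula for a proportion, n0 = z^2 p(1 - p) / d^2, and an optional finite population correction. The 95% confidence level, the function name and its parameters are assumptions made for illustration; the abstract does not state them, and the deliverable itself may use a different formulation.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, margin, confidence=0.95, population=None):
    """Sample size needed to estimate a proportion under simple random sampling.

    p          -- anticipated population proportion (e.g. 0.80 for 80%)
    margin     -- tolerated absolute error (e.g. 0.05 for +/- 5 percentage points)
    confidence -- two-sided confidence level (assumed 0.95; not given in the text)
    population -- population size N for the finite population correction,
                  or None to treat the population as effectively infinite
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if margin <= 0:
        raise ValueError("margin must be positive")

    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    # Sample size ignoring the finite population correction.
    n0 = z ** 2 * p * (1 - p) / margin ** 2

    # Finite population correction: matters when n0 is not small relative to N.
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)

    return math.ceil(n0)


if __name__ == "__main__":
    # Seat belt example: true wearing rate assumed to be 80%.
    for margin in (0.05, 0.01):
        n = sample_size_for_proportion(0.80, margin)
        print(f"absolute error {margin:.0%}: n = {n}")
```

Under these assumptions, tolerating an absolute error of 5% requires 246 drivers, whereas tightening the tolerance to 1% requires 6147, which is about 25 times as many because the required sample size grows with the inverse square of the tolerated error.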

Publication

SWOV publication