Uncertainty Analysis in a Forensic Practice, Part One
By Arthur Croft, DC, MS, MPH, FACO
It is not uncommon in forensic settings to find estimates of unknown quantities expressed with an unlikely degree of assuredness. For example, in previous articles, I have written that in accident reconstruction, better termed auto crash reconstruction (ACR), reconstructionists frequently report estimated crash velocity to two decimal places.[1] This practice may be driven by the desire to provide compelling testimony or expert opinion. But whether it is the result of unbridled and uninformed zeal, or whether it is patently and intentionally deceptive, we all must be watchful for it, because it is all too common. Uncertainty is a factor in many other forensic issues as well.
One response to this problem of uncertainty within the scientific community was the development of an international experimental uncertainty standard, published in 1993 by the International Organization for Standardization (ISO).[2] Some time later, a very similar guide was developed by the National Conference of Standards Laboratories.[3] The American National Standards Institute (ANSI), the American Society of Mechanical Engineers (ASME) and the American Institute of Aeronautics and Astronautics (AIAA) also have adopted standards for uncertainty expression that closely follow the ISO Guide.[4]
Readers will be familiar with authors reporting the degree of uncertainty in the medical literature in terms of p values. This statistical metric tells the reader how probable results at least as extreme as those obtained would be if chance alone were at work. The most common cutoff value used is p=0.05; results with p values above it are generally considered nonsignificant. Loosely speaking, a result below this cutoff would be expected to arise by chance alone less than 5 percent of the time. Another way of expressing this uncertainty, and one that is really preferable, is the confidence interval (CI). Results are typically reported as a value with the 95 percent CI given as a range. Understanding these simple relationships is crucial for physicians who read the medical literature, but one need not take a course in statistics to understand this sometimes arcane language. To students and my doctor clients who are unfamiliar with terms such as negative predictive value, sensitivity and specificity, or with how to recognize confounding and bias, I usually recommend a helpful little book called Studying a Study and Testing a Test, by Riegelman, as a painless way to learn.
But statistics will only take you so far in determining the precision of findings. It is incumbent upon researchers themselves to provide the reader with a gauge of the study's level of certainty. In peer-reviewed literature, this is usually but not always a requirement. For example, in the engineering literature, many authors do not subject their work to statistical analysis, reporting instead only measures of central tendency such as mean, median or mode, or relative percentage differences between groups, leaving the reader to wonder whether the findings might actually have been significant. While this omission occurs occasionally in the biomedical literature and frequently in other fields, it is virtually de rigueur in the field of forensic medicine. Thus, while the practical application of ACR is founded on classical Newtonian physics and similarly indisputable principles of mathematics, it frequently requires the estimation of two or more unknown variables. The more estimations that must be made, the greater the uncertainty.
What follows is a fairly simple example of how uncertainty can enter an analysis and how it can be handled. Although I am using an ACR example, this method could apply to any scenario involving two unknowns. The methods we'll look at below range from fairly simple calculations, which can be easily made on a pocket calculator, to somewhat more rigorous mathematical approaches. In part two of this paper, we'll explore more sophisticated statistical solutions. I'll give an example of how Bayes' Theorem allows one to calculate an inverse probability: a conditional probability that lets us calculate the probability of a "before" event, knowing the probability of an "after" event. We also will look at how Monte Carlo statistics can provide us with one of the most powerful ways of estimating the effects of uncertainty by introducing randomness into computerized models and simulations. If the reader gains nothing more from these brief editorials than a new appreciation of the potential degree of uncertainty in forensic practices, I have accomplished my goal.
I will start with a classical example of a false-start rear-end automobile collision in which the defendant- and plaintiff-hired ACRs come up with very different numbers for crash velocities. We'll apply an uncertainty analysis and report a more meaningful range of velocities.
As a common example of how uncertainty can slip into an ACR, take the case of a false-start crash in which the bullet vehicle driver is stopped some distance behind the stopped target vehicle at a traffic signal. As the light turns green, the target vehicle driver proceeds forward, but has to stop again suddenly for a pedestrian making a late crossing. The bullet vehicle driver, who does not see the pedestrian, does not anticipate the stop and strikes the target vehicle from the rear. The usual method to employ here is first to use a vehicle dynamics equation to estimate the closing velocity of the bullet vehicle. Equation 1 is used for this purpose:
$$v_e = \sqrt{v_i^2 + 2ad} \qquad (1)$$
where $v_e$ is the ending velocity for the bullet, $v_i$ is its initial velocity, $a$ is its rate of acceleration, and $d$ is the distance it accelerates before making contact. Here we must make some estimations, since it is impossible to know with any precision what the actual distance between the two vehicles was, and since the actual acceleration rate of the bullet is unknown. The most common method of obtaining the distance value is to use the witnesses' statements, but witnesses are known to be unreliable judges of distances when sitting in their vehicles.
The actual distance to use in the calculation is the distance between the final stopped position of the target vehicle and the initial stopped position of the bullet vehicle, which might confuse things, since the driver of the bullet may be thinking in terms of the distance between the two vehicles when they were stopped before the collision (which would likely have been less). Of course, the bullet driver might also have applied the brakes moments before impact, and this must be accounted for by another estimation. As for acceleration, textbooks can be consulted that give a normal acceleration value for a passenger vehicle accelerating away from a stop in traffic as 4.8 ft/s².[5] However, a recent report of measured acceleration by the editor of Accident Investigation Quarterly suggests the average take-off acceleration from intersections for vehicles, light trucks, vans and SUVs is actually closer to 8.6 ft/s², and frequently is higher than 10 ft/s².[6]
The following equations assume only uniform acceleration and are based on a method described by Brach and Brach.[7] First we take what we believe to be the upper and lower boundaries of the two independent variables a (acceleration in ft/s²) and d (distance in ft), which will affect our dependent variable, v (velocity). This solution takes the form of v = V ± dv, where the quantity v is equal to the sum of the reference value, V, and a variation, dv. Assuming we choose lower and upper boundaries for a of 4.8 ft/s² and 8.6 ft/s², and for d of 4 ft and 12 ft, these calculations take this form:
$$4.8 \le a \le 8.6\ \text{ft/s}^2, \qquad 4 \le d \le 12\ \text{ft}$$
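Our dependent variable, v, is calculated using equation 1 (with the $v_i^2$ term eliminated to simplify, since it is 0) and falls between the values obtained from the minima and maxima of the independent variables:
$$v_{min} = \sqrt{2\,a_{min}\,d_{min}} \;\le\; v \;\le\; \sqrt{2\,a_{max}\,d_{max}} = v_{max}$$
The uncertainty, dv, is half the difference between these bounds:
$$dv = \frac{v_{max} - v_{min}}{2}$$
and the reference value, V, is the average of the upper and lower values of v:
$$V = \frac{v_{max} + v_{min}}{2}$$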
In this case, the minimal values give 6.2 fps and the maximal values give 14.4 fps. The reference value is 10.3 fps and dv is 4.1 fps. Thus, the value of v is 10.3 fps ± 4.1 fps (7.0 mph ± 2.8 mph). This is the best way to present your calculations in a forensic setting, because it allows others to gauge the degree of uncertainty present in the calculations.
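The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function names (`closing_velocity`, `bounds_uncertainty`) and the conversion constant are assumptions introduced here for illustration, while the bounds are the ones chosen above. It relies on the fact that, with a zero initial velocity, v increases with both a and d, so the extreme values of v occur at the extreme values of the inputs.

```python
import math

FPS_TO_MPH = 3600.0 / 5280.0  # 1 ft/s expressed in mph

def closing_velocity(a, d, v_i=0.0):
    """Equation 1: ending velocity (ft/s) for uniform acceleration a (ft/s^2)
    over distance d (ft), starting from initial velocity v_i (ft/s)."""
    return math.sqrt(v_i ** 2 + 2.0 * a * d)

def bounds_uncertainty(a_min, a_max, d_min, d_max):
    """Return (V, dv) in ft/s: the midpoint and half-range of v
    between the lower-bound and upper-bound inputs."""
    v_min = closing_velocity(a_min, d_min)
    v_max = closing_velocity(a_max, d_max)
    V = (v_max + v_min) / 2.0   # reference value
    dv = (v_max - v_min) / 2.0  # variation about the reference value
    return V, dv

V, dv = bounds_uncertainty(4.8, 8.6, 4.0, 12.0)
print(f"v = {V:.1f} ± {dv:.1f} fps")                             # v = 10.3 ± 4.1 fps
print(f"v = {V * FPS_TO_MPH:.1f} ± {dv * FPS_TO_MPH:.1f} mph")   # v = 7.0 ± 2.8 mph
```

Changing the upper acceleration bound to 10 ft/s² in the call shows how quickly the reported range widens when a less conservative input is admitted.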
In the above case, we used independent variables to determine V and dv in order to arrive at a value, v. In other words, v is a function of any number of variables, $x_i$:
$$v = f(x_1, x_2, \ldots, x_n)$$
Another method is to use differential calculus in the form of a Taylor series. This more involved method will give only slightly narrower ranges of speed, but it is best to use the method above, both for its simplicity and its somewhat more liberal (and, I would hasten to add, realistic) ranges. Note, for example, that the values calculated above could still underestimate the actual crash speed, since we did not consider the largest reported acceleration value of 10 ft/s². The Taylor series calculation will yield a marginally tighter value of 6.9 mph ± 2.3 mph.
Naturally, with more than two variables, these calculations can become rather tedious. Another way to assess the reliability of results derived from multiple calculations, each of which requires potentially uncertain input, is to take partial derivatives of the constitutive equations involved in the analysis. Using these derivatives, the overall uncertainty of a calculation can be estimated using the root-sum-of-squares (RSS) method. Unfortunately, because of the complicated, nonlinear nature of the equations used in reconstruction, this is impractical. Monte Carlo simulations, which are used frequently in statistics, provide a means of quantifying probability distributions and, hence, the reliability of estimations used in ACR. Many modern computer statistical programs can run Monte Carlo simulations given the proper input data. These can then produce graphic probability distributions. We'll do this in part two.
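Ahead of that fuller treatment, the basic idea of a Monte Carlo run can be sketched in a few lines of Python using only the standard library. Everything here beyond equation 1 and the bounds already chosen is an assumption made for illustration: in particular, the choice to draw a and d independently and uniformly between their bounds, the number of trials, the fixed seed, and the helper names `simulate_closing_speeds` and `percentile`. A real analysis would need to justify the input distributions.

```python
import math
import random

FPS_TO_MPH = 3600.0 / 5280.0  # 1 ft/s expressed in mph

def simulate_closing_speeds(n_trials, a_bounds, d_bounds, seed=0):
    """Draw a and d uniformly within their bounds (an assumption) and
    return the resulting closing speeds in mph, sorted ascending."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    speeds = []
    for _ in range(n_trials):
        a = rng.uniform(*a_bounds)  # acceleration, ft/s^2
        d = rng.uniform(*d_bounds)  # distance, ft
        speeds.append(math.sqrt(2.0 * a * d) * FPS_TO_MPH)
    return sorted(speeds)

def percentile(sorted_values, p):
    """Simple percentile of an already sorted list (p from 0 to 100),
    taken at the nearest index without interpolation."""
    index = round(p / 100.0 * (len(sorted_values) - 1))
    return sorted_values[index]

speeds = simulate_closing_speeds(20_000, (4.8, 8.6), (4.0, 12.0))
mean_speed = sum(speeds) / len(speeds)
low, high = percentile(speeds, 2.5), percentile(speeds, 97.5)
print(f"mean {mean_speed:.1f} mph; central 95% of trials: {low:.1f} to {high:.1f} mph")
```

Unlike the simple bounds calculation, the sorted list of simulated speeds describes how the plausible values are distributed within the range, not just where the range ends.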
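Another simple method is to use a log-differential. To do this, we first write the equation in logarithmic form. Note that equation 1, with $v_i = 0$, can be rewritten as such:
$$v = (2ad)^{1/2}$$
and then as:
$$\ln v = \tfrac{1}{2}\left(\ln 2 + \ln a + \ln d\right)$$
Differentiating, we get:
$$\frac{\mathrm{d}v}{v} = \frac{1}{2}\left(\frac{\mathrm{d}2}{2} + \frac{\mathrm{d}a}{a} + \frac{\mathrm{d}d}{d}\right)$$
where the term for the constant 2 is assigned a value of 0, since there is no uncertainty attached to its value.
Then we can rewrite the equation as:
$$\frac{\mathrm{d}v}{v} = \frac{1}{2}\left(\frac{\mathrm{d}a}{a} + \frac{\mathrm{d}d}{d}\right)$$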
In the case above, the two acceleration values were 4.8 and 8.6. Thus, the mean is 6.7. The uncertainty is half the difference between them, (8.6 - 4.8)/2 = 1.9, so the fractional uncertainty is about ±1.9/6.7 = 0.3.
Similarly, we can find a fractional uncertainty in the distance values of ± 4/8=0.5. Thus, with just two variables to be concerned with, we have a total fractional uncertainty of (1/2)[0.3+0.5]=0.4 or 40%. So, the realistic range of values for velocity, given these two variables, would again be equal to the average velocity of 7 mph ± 2.8 mph (4.2 to 9.8 mph).
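The same log-differential arithmetic can be written as a short Python sketch. The helper name `fractional_uncertainty` is an assumption introduced for illustration, and the 7 mph reference speed is taken from the text above. Carrying the unrounded value for acceleration (about 0.28 rather than 0.3) gives a total of about 0.39, which rounds to the 40 percent figure used above.

```python
def fractional_uncertainty(lower, upper):
    """Half-range of a bounded quantity divided by its midpoint."""
    midpoint = (upper + lower) / 2.0
    half_range = (upper - lower) / 2.0
    return half_range / midpoint

frac_a = fractional_uncertainty(4.8, 8.6)   # 1.9 / 6.7, about 0.28
frac_d = fractional_uncertainty(4.0, 12.0)  # 4 / 8 = 0.5

# For v = (2ad)^(1/2), the fractional uncertainty in v is half the sum
# of the fractional uncertainties in a and d.
frac_v = 0.5 * (frac_a + frac_d)            # about 0.39

v_ref_mph = 7.0                             # reference speed from the text
dv_mph = frac_v * v_ref_mph
print(f"fractional uncertainty in v: {frac_v:.2f}")
print(f"range: {v_ref_mph - dv_mph:.1f} to {v_ref_mph + dv_mph:.1f} mph")
```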
It is important to remember that in reconstructing crashes of higher speeds and for the more traditional purpose of attributing culpability, uncertainties of ± 5 mph or even higher are generally acceptable since it may only be necessary to prove that driver A was traveling in excess of the posted speed limit in order to assign fault. Thus, if the calculations in our sensitivity analysis show him to have been traveling at a speed of between 55 mph and 65 mph - a total range of uncertainty of 10 mph - and the posted speed was 45 mph, we have managed to accomplish our goal in spite of that uncertainty. He was certainly speeding. In low-velocity collisions, where the crash speeds are often 10 mph or less, such a degree of uncertainty would clearly be unacceptable.
In the above examples, we considered only two independent variables. In real crashes, as in many forensic situations, there are likely to be more. We might, for example, consider friction coefficients and impulses based on the type of roadway and the relative braking of the two vehicles. We might also consider bumper contact duration which, as it turns out, is covariable with braking effects and the coefficient of restitution. And the coefficient of restitution varies with crash speed. Thus, we could easily have five or more variables to consider. All of them carry uncertainty; this uncertainty propagates throughout the calculations because of their interdependence and/or covariance.
The range of uncertainty given in the earlier example (4.2 mph to 9.8 mph) might be unable to resolve certain issues in litigated cases because the risk for injury at one extreme may be somewhat unlikely (taking into account other risk factors), whereas the risk for injury at the other might be very high.
Yet it is rare, in my experience, for ACRs to provide these uncertainty analyses in cases of low-velocity collision, perhaps in order to avoid calling attention to just how much uncertainty really exists. Instead, it is common practice to consider the so-called "worst-case scenario" and then report results in unjustifiably precise terms, with more significant figures than are justified by the mathematics.
One often sees velocity figures given not as a range but as a single value with three significant figures, such as a delta V of 2.34 mph.
In part two of this article, we'll look at some of the more sophisticated ways of estimating uncertainty.