Evaluation of the WSR-88D Build 10 Tornado Detection Algorithm over Southwest Virginia and Northwest North Carolina
Kenneth A. Kostura, Heath E. Hockenberry, and Stephen J. Keighton
NOAA/National Weather Service Office, Blacksburg, Virginia
1. INTRODUCTION
The current WSR-88D radar software load (Build 10) includes a new Tornado Detection Algorithm (TDA) developed primarily by the National Severe Storms Laboratory (NSSL) (Mitchell et al. 1998). The new algorithm is designed to increase the probability of detection (POD) relative to the Build 9 tornado algorithm (Matson 1998), while maintaining a reasonably low false alarm ratio (FAR). It accomplishes this by searching for any strong azimuthally adjacent (gate-to-gate) shears, including those not associated with previously identified mesocyclone signatures. The Build 10 TDA also attempts to differentiate between a Tornadic Vortex Signature (TVS) and an Elevated TVS (ETVS). Adaptable parameter settings include criteria for the minimum depth required, the minimum low-level velocity difference, and the minimum velocity difference anywhere within the three-dimensional signature. Mitchell et al. (1998) provide a detailed discussion of the Build 10 TDA, including a comparison with the Build 9 algorithm.
With a set of 10 severe weather cases spanning the four-year period 1995-1998 (Table 1), we are evaluating the performance of the new WSR-88D TDA specifically for the effective range of the Roanoke, VA WSR-88D (KFCX). Verification statistics are computed from archived Level II digital data for these events, processed through the WSR-88D Algorithm Testing and Display System (WATADS), and compared against Storm Data. The verification method, described in more detail below, was chosen to closely resemble that of Witt et al. (1998).
This study consists of two main steps. First, the performance of the TDA "default" and "minimized" adaptable parameter sets is being evaluated within 150 km of the KFCX WSR-88D, both to compare the verification scores for our local area with those published by NSSL (Mitchell et al. 1998) and to determine which of the two settings is best suited for our operations. The minimized (or conservative) settings were chosen by the Operational Support Facility (OSF) to resemble the performance of the Build 9 tornado algorithm (OSF, WSR-88D Build 10 Training Notes). Second, we will examine other variations of the three adaptable parameters the OSF has approved for local change, in order to determine whether algorithm performance can be improved for our County Warning Area (which includes the southern Appalachian mountains and foothills and the western Piedmont of Virginia and North Carolina).
At the time of this writing, the second of these steps is still in progress; its results will appear in the poster presentation. The parameter variations will be chosen to reflect climatologically favored differences between tornadic storms of this region and those of the Plains and Midwest (i.e., supercells that are generally smaller both vertically and horizontally, and many non-supercell storms occurring within convective lines).
2. DATA AND METHODOLOGY
Our first step was to define the data set: the individual events and the times over which to evaluate the statistics (see Table 1). Our intention was to evaluate the data for at least an hour before and an hour after some form of severe weather occurred over the Blacksburg County Warning Area (CWA). However, due to gaps in the archived Level II radar data, the period of evaluation varied with each event. The events were defined using quality-controlled Storm Data. Witt et al. (1998) provide a thorough examination of discrepancies found within Storm Data and issues relating to algorithm scoring. The county and exact location of each report were noted for consideration of population density; to prevent a "miss bias," storms occurring in counties with fewer than 25 people per square mile, based on the 1990 census, were excluded from the data set. Event time windows were chosen around each actual tornado occurrence, extending, as closely as possible, from 20 minutes prior to the beginning of the tornado, or series of tornadoes, to one volume scan (6 minutes) after the end of the tornado. In addition, TVS and ETVS signatures were treated as one entity in producing the statistics.
Similar to Mitchell et al. (1998), the following definitions were applied. A hit (H) was a TDA detection (per volume scan) during the tornado event time window for the storm cell associated with the tornado. A false alarm (FA) was a TDA detection (per volume scan) outside the tornado event time window, or for any cell not associated with a tornado during the archived Level II playback. A miss (M) was no TDA detection (per volume scan) during the tornado event time window. A null (Null) was no detection (per volume scan) outside the tornado event time window.
The following scoring statistics were used (similar to Wilks 1995):
probability of detection (POD) = H/(H + M),
false alarm ratio (FAR) = FA/(H + FA),
critical success index (CSI) = H/(H + FA + M), and
Heidke skill score (HSS) = [2(H*Null - FA*M)] / [(H + M)(M + Null) + (H + FA)(FA + Null)].
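The four scores follow directly from the four contingency counts. A minimal implementation (a sketch of our own; `None` stands in for the division-by-zero "ERR" entries of Table 1) might look like:

```python
def scores(H, FA, M, Null):
    """Compute POD, FAR, CSI, and HSS from contingency-table counts."""
    def ratio(num, den):
        return num / den if den else None  # None marks an 'ERR' case

    return {
        "POD": ratio(H, H + M),
        "FAR": ratio(FA, H + FA),
        "CSI": ratio(H, H + FA + M),
        "HSS": ratio(2 * (H * Null - FA * M),
                     (H + M) * (M + Null) + (H + FA) * (FA + Null)),
    }
```

For example, the Franklin Co. counts in Table 1 (H=0, FA=2, M=0, Null=578) yield FAR = 1 and CSI = HSS = 0 with POD undefined, matching that row of the table.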
We began the study by running the Level II data through the WSR-88D Build 10 TDA supplied with WATADS version 10.0, using two separate TDA adaptable parameter sets for our local area: the "default" and the "minimized." The OSF has actually delegated authority for four TDA parameter sets: (1) "default," (2) "minimized" (conservative), (3) "squall line," and (4) "tropical." At the time of this writing, the squall line and tropical cyclone sets have yet to be tested for our area, but the results may be available for the poster. The parameters local offices have the authority to adjust, within the limits of the four sets above, are the minimum gate-to-gate delta velocity at the base of the circulation (LADV), the minimum gate-to-gate delta velocity anywhere in the circulation (MXDV), and the minimum circulation DEPTH. The "default" values are 25 m/s for LADV, 36 m/s for MXDV, and 1.5 km for DEPTH; the "minimized" (conservative) settings are 56 m/s for LADV, 74 m/s for MXDV, and 5.0 km for DEPTH.
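For reference, the two parameter sets tested so far can be written as a simple configuration table. This is a sketch using the abbreviations above, not actual WATADS file syntax, and the `passes` check is a hypothetical illustration of how the thresholds act:

```python
# Threshold values a candidate circulation must meet or exceed to be
# declared a TVS; the "minimized" (conservative) set is stricter.
# Units: m/s for LADV and MXDV, km for DEPTH (values from the text).
TDA_SETTINGS = {
    "default":   {"LADV": 25.0, "MXDV": 36.0, "DEPTH": 1.5},
    "minimized": {"LADV": 56.0, "MXDV": 74.0, "DEPTH": 5.0},
}

def passes(circulation, settings):
    """Hypothetical check: does a circulation meet all three thresholds?"""
    return (circulation["ladv"] >= settings["LADV"]
            and circulation["mxdv"] >= settings["MXDV"]
            and circulation["depth"] >= settings["DEPTH"])
```

A moderate circulation (say, 30 m/s base delta velocity, 40 m/s maximum delta velocity, 2.0 km deep) would be flagged under the default set but rejected under the minimized set, which is the behavior the verification statistics below reflect.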
3. PRELIMINARY RESULTS AND CONCLUSIONS
Individual case statistics are summarized in Table 1.
TABLE 1. Case-by-case statistics for events examined in this study. ERR indicates a division by zero error. POD is Probability of Detection, FAR is False Alarm Ratio, CSI is Critical Success Index, and HSS is the Heidke Skill Score.
| DEFAULT CASE | DATE | EVALUATION PERIOD | Hit | FA | Miss | Null | POD | FAR | CSI | HSS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Carroll Co. Wind | 08/18/95 | 2055-2110Z | 0 | 0 | 0 | 80 | ERR | ERR | ERR | ERR |
| Franklin Co. Wind | 07/02/96 | 1753-2012Z | 0 | 2 | 0 | 578 | ERR | 1 | 0 | 0 |
| July 4th 1997 Wind | 07/04/97 | 2049-0217Z | 0 | 4 | 0 | 891 | ERR | 1 | 0 | 0 |

| MINIMIZED CASE | DATE | EVALUATION PERIOD | Hit | FA | Miss | Null | POD | FAR | CSI | HSS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Carroll Co. Wind | 08/18/95 | 2055-2110Z | 0 | 0 | 0 | 80 | ERR | ERR | ERR | ERR |
| Franklin Co. Wind | 07/02/96 | 1753-2012Z | 0 | 0 | 0 | 580 | ERR | ERR | ERR | ERR |
| July 4th 1997 Wind | 07/04/97 | 2049-0217Z | 0 | 0 | 0 | 895 | ERR | ERR | ERR | ERR |
One of our initial expectations going into this study was that too many false alarms would occur with the default settings, leading to forecaster desensitization to alarms and little faith in the radar's detection of true tornadic signatures. Another expectation was that there would be relatively large differences between the statistics generated by NSSL, perhaps even for the non-plains data set, and those of the Blacksburg local study. While the expectation of a high number of false alarms appears to be borne out, there are no large differences between the overall Blacksburg statistics and the NSSL independent data set scores for non-plains events. In their Table 5, Mitchell et al. (1998) reported the following statistics for the NSSL TDA: POD = 0.25, FAR = 0.52, CSI = 0.20, and HSS = 0.32. This is very close to the summed Blacksburg statistics for the default case (our CSI of 0.197 is essentially the same). With the minimized settings, the Blacksburg POD and CSI drop to only 0.019 (leaving the forecaster with very little algorithm guidance), nearly the same as the values listed in Mitchell et al. (1998) for the Build 9 algorithm with non-plains data. Figure 1 compares the Blacksburg "default" settings with the NSSL TDA non-plains data, while Figure 2 compares the Blacksburg "minimized" settings with the NSSL independent data set scores for the WSR-88D Build 9 tornado algorithm.
Figure 1. Comparison of the TDA default parameter performance statistics using 10 test cases in the Blacksburg Weather Service warning area versus NSSL statistics for non-plains environments. The Blacksburg statistics are placed behind the NSSL scores.
Figure 2. Comparison of the TDA minimized parameter performance statistics using 10 test cases in the Blacksburg Weather Service warning area versus NSSL independent data set scores for the WSR-88D Build 9 tornado algorithm. The Blacksburg statistics are placed behind the NSSL scores.
Considering the overall statistics for all ten events examined so far, it appears that the default settings for the Build 10 TDA will produce a larger number of false alarms than detections, and a local forecaster at NWSO Blacksburg can expect that the probability of detecting tornadoes generally remains low. Examining individual events, however, the Stoneville tornadic event illustrates the potential of the TDA. The Stoneville F3 tornado was associated with an isolated supercell whose horizontal and vertical scale was more similar to typical Plains-type tornadic storms, a type representing only a small minority of tornadic storms in the Blacksburg CWA. The default TDA settings for this event resulted in a very high POD (0.82), nearly double the FAR (0.44) in this case. If the Stoneville statistics are removed from the rest of the sample, the POD falls to 0.03 and the FAR rises to 0.92, considerably worse than the NSSL non-plains data set. In fact, for the remaining nine events, which include six tornadoes and represent more typical severe and tornadic storms for our region in terms of size and life cycle, the default TDA settings produced guidance considerably worse than the minimized settings (due to the very high FAR). It is also worth noting that only one of the 15 correct tornado detections (hits) was independent of a mesocyclone algorithm detection, while eight of the 22 false alarms were independent of algorithm mesocyclones. It appears, at least for our region, that separating the TDA from the mesocyclone algorithm tends to hurt more than it helps.
These preliminary results are discouraging, especially considering the algorithm's performance for the climatologically favored tornadic storms in this region. However, when forecasters expect the possibility of unusually large, isolated, and long-lived supercells, there is promise that the Build 10 TDA default settings will perform quite well. As this study continues, we hope to determine whether a single parameter set can provide reasonable guidance for all scenarios, or whether we will want to switch between two or more sets depending on the environment. We plan to add several more cases and evaluate at least two other parameter sets to help make this determination, and will include the results in the poster presentation.
REFERENCES
Matson, D., 1998: WSR-88D Doppler radar adaptable parameter optimization of the Meso/TVS algorithm. Nat. Wea. Dig., 22, 31-38.
Mitchell, E. D., S. V. Vasiloff, G. J. Stumpf, A. Witt, J. T. Johnson, and K. W. Thomas, 1998: The National Severe Storms Laboratory tornado detection algorithm. Wea. Forecasting, 13, 352-366.
Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences. Academic Press, 467 pp.
Witt, A., M. D. Eilts, G. J. Stumpf, E. D. Mitchell, J. T. Johnson, and K. W. Thomas, 1998: Evaluating the performance of WSR-88D severe storm detection algorithms. Wea. Forecasting, 13, 513-518.