Evaluating the Relationship between Operating Speed and Collision Frequency of Rural Multilane Highways Based on Geometric and Roadside Features

Speed is one of the main functional factors that affect road safety in terms of both collision occurrence and collision severity. Previous studies have shown that several roadside and geometric features affect road safety and operating speed. This paper aims to evaluate the effects of roadside and geometric features on operating speed and collision frequency, simultaneously. For this purpose, the operating speed data of 103 segments along with their accident data and roadside and geometric characteristics were collected. Structural equation modelling (SEM) with latent variables was employed to model operating speed and collision frequency, simultaneously. Two latent variables including “geometric effect” and “roadside effect” were defined in SEM. The first latent variable is the combination of the natural logarithm of the segment length, longitudinal slope, the presence of a 2-meter paved shoulder, and curvature of the segment. The indicators of the second latent variable are the number of accesses and the presence of residential land use. The results show that the latent variable “roadside effect” increases collision frequency by a standard regression weight of 3.455; however, it reduces operating speed by a standard regression weight of –0.385. Also, the latent variable “geometric effect” causes an opposite effect on collision frequency and operating speed by the standard regression weight of –5.313 and 0.730, respectively. Besides, lower operating speed causes a reduction in the collision frequency by the standard regression weight of 7.734. The results of this study can be useful for designers and road safety agencies to improve road safety.


Introduction
Speed is one of the main functional factors that affect road safety, influencing both collision occurrence and collision severity [1]. According to statistics from the World Health Organization [2], one-third of road accidents occur due to speed. Therefore, the relationship between speed and safety is an active research topic. Many researchers have reported that increasing vehicle speed is accompanied by increasing crash severity or probability of crash occurrence [1,3-6]. However, a few studies have shown a negative relationship between speed and crash occurrence [7].
Vehicle speed is among the most important and complicated factors during driving that may confuse drivers. As a result, road safety prediction depends on the operating speed more than the posted speed limit. In many cases, the operating speed is higher than the posted speed limit. Therefore, operating speed is an effective representation of drivers' behavior on a given road.
Concepts related to operating speed are defined in different ways, with the best proposed by the AASHTO Green Book [8]: "Operating speed is the speed at which drivers operate their vehicles during free-flow conditions. The 85th percentile of the distribution of observed speeds is the most frequently used measure of the operating speed associated with a particular location or geometric feature".
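The 85th-percentile definition quoted above can be illustrated with a short calculation. The following is a minimal sketch, assuming the spot speeds of one segment are available as a list of values in km/h; the function name and the sample speeds are hypothetical and are not data from this study.

```python
import numpy as np


def operating_speed_v85(spot_speeds_kmh):
    """Return the 85th percentile of observed free-flow spot speeds."""
    speeds = np.asarray(spot_speeds_kmh, dtype=float)
    if speeds.size == 0:
        raise ValueError("at least one spot speed is required")
    # NumPy's default percentile uses linear interpolation between order statistics.
    return float(np.percentile(speeds, 85))


# Hypothetical spot speeds (km/h) for a single segment; illustrative only.
example_speeds = [78, 82, 85, 88, 90, 91, 93, 95, 97, 99,
                  101, 104, 108, 110, 115, 118, 120, 122, 125, 130]
v85 = operating_speed_v85(example_speeds)
print(f"V85 = {v85:.1f} km/h")
```

Other percentile conventions (for example, taking the nearest observed value) may give slightly different results for small samples.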
Collision frequency is another criterion applied to investigating road safety and crash severity. According to the above-mentioned definition, operating speed is an effective index to assess speed and safety, which was also used in the present study. Previous studies have shown that speed is only one of many factors that influence accident occurrence or injury severity [1]. In this regard, Elvik et al. (2004) presented a classification to represent the factors affecting road safety (Figure 1).

Figure 1. Variables influencing road accident [1]
In Figure 1, dependent variables are defined as the number of accidents or the number of accident casualties; independent variables are variables that affect the dependent variables indirectly; mediating variables are risk factors through which other variables can affect the dependent variable indirectly; moderating variables are other risk factors, which may include a safety measure that is not intended to influence speed; and confounding variables are other variables (in addition to speed) that influence the number of accidents or injuries [1].
This classification shows that to investigate the relationship between speed and safety, researchers must model the effect of several variables on operating speed and the effect of operating speed on safety, simultaneously. However, the previous studies mainly considered the direct effect of geometry, traffic (including the speed), environment, and roadside features on safety [4,5,9-12], with few of them focusing on the effect of different factors on speed and the effect of speed on collision frequency [3]. It is worth mentioning that Gargoum and El-Basyouny (2016) neglected the effect of roadside features, such as land use and the number of accesses, on speed and safety. Thus, the present study is conducted to investigate the relationship between operating speed and collision frequency based on the simultaneous direct and indirect effects of all geometric and roadside features on operating speed and collision frequency using structural equation models (SEMs) [3].
The remainder of this paper is organized as follows: Section 2 presents a literature review. Section 3 describes the data collection process and the principles of SEM. Section 4 presents the modeling results and discussion. Finally, Section 5 provides concluding remarks.

Literature Review
Many studies have been conducted on the relationships among speed, geometric, traffic, environmental, and roadside features with road safety. In what follows, some of these studies are summarized.
A group of studies has focused on the effect of geometry, traffic (including speed), and roadside factors on collision frequency and severity. One of the first studies in this field is the one conducted by Aljanahi (1999), who investigated two groups of sites in the United Kingdom and Bahrain and provided some models for collision frequency. The results of this study for one group of sites revealed a statistically significant relationship between speed and collision frequency, whereas the results for the other group showed a relationship between accidents and variability in traffic speed [9]. Garber and Ehrhart (2000) investigated the effect of speed, traffic flow, and geometric features on collision frequency. They used mean speed, the standard deviation of speed, flow per lane, lane width, and shoulder width as the independent variables and crash rate as the dependent variable. These researchers applied multiple linear regression and multivariate ratio of polynomials as the modeling approaches. The results showed that among the studied variables, the standard deviation of speed has the highest effect on the crash rate [10]. Pei et al. (2012) evaluated the relationship between speed and crash risk with respect to distance and time exposure. Among the noteworthy points of this study is the simultaneous modeling of crash probability and crash severity. Moreover, this study is among the few efforts made to evaluate the effect of merging and diverging ramps on crash occurrence probability. Their results showed a positive relation between speed and crash occurrence [11]. Imprialou et al. (2016) used two different data aggregation approaches (i.e., condition-based and link-based approaches). Based on the results of the condition-based approach, they suggested that high speeds increase crash frequency [12].
There are several studies for collision frequency modeling based on the Poisson model [5,9,12-14] and the negative binomial regression model [4,14-16]. In this regard, Tanishita and Wee (2016) showed that both mean speed and changes in mean speed affect collision frequency. These researchers also considered the effect of weather (sunny and cloudy days) on collision frequency in their study [5].
One of the latest studies on the relationship between speed and safety is the one conducted by Gitelman et al. (2017), who used free-flow speed collected by GPS devices and presented two models for day and night hours. In this study, the effect of speed along with some geometric factors (including lane width, shoulder width, horizontal radius, vertical radius, and vertical grade) and some roadside factors (including roadside condition and junction density) were investigated for single-carriageway roads. The results showed that the number of crashes increases with increasing segment length and higher traffic volumes. In addition, road segments with better road design standards are associated with lower crash rates compared with those with lower road design standards [4].
Another group of studies presents a power model for extracting the relationship between the change in average speed and the change in accident numbers. The power model was validated by Elvik et al. (2004) by performing a meta-analysis on the findings of 98 studies [1]. According to Gitelman et al. (2017), in the power model, the change in accident numbers following a change in average speed is proportional to the corresponding speed change raised to a certain exponent, where the value of the exponent is higher for higher severity [4]. Elsewhere, Elvik (2013) indicated that the relationship between speed and road safety depends not only on the relative change in speed but also on the initial speed [17]. Some researchers have investigated the influence of different geometric and traffic variables [14,18] and roadside variables [13,19,20] but did not present speed as an effective variable on safety. They applied several variables, including number of lanes, lane width, shoulder width, median width, median type, radius, sight distance, pavement conditions, roadside hazards, road markings, pedestrian distribution, segment length, traffic flow, posted speed limit, clear zone width, side slope, pedestrian volume, guardrail length, fence length, and land use.
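The power model described in this paragraph can be written compactly. The following is a sketch of its usual form; the symbols $A_0$, $A_1$, $v_0$, $v_1$, and $e$ are notation assumed here for illustration and do not appear in the original text.

```latex
% A_0, A_1: accident numbers before and after the speed change (assumed notation)
% v_0, v_1: average speeds before and after the change (assumed notation)
% e: exponent, which takes higher values for higher accident severity
\frac{A_1}{A_0} = \left( \frac{v_1}{v_0} \right)^{e}
```

Under this form, the relative change in accident numbers depends only on the ratio of the two average speeds, which is the property that Elvik (2013) qualified by showing that the initial speed also matters [17].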
Gargoum and El-Basyouny (2016) considered the simultaneous effects of geometric features on speed and collision frequency by using SEM. They explored the relationship between speed and safety and found that, among other variables, average speed, traffic flow, segment length, medians, and horizontal curves all have statistically significant effects on collision frequency. In addition, shoulders, speed limits, and vehicle lengths significantly affect the average speed [3]. However, this study did not incorporate environmental and roadside variables, even though they were reported as significant variables for predicting crash frequency at intersections by Castro et al. (2012) [21].
A summary of previous studies along with the variables and modeling approaches is presented in Table 1. Among the rare studies conducted to evaluate the simultaneous effect of different factors on collision frequency, the study by Shankar et al. (1995) can be mentioned; however, these researchers did not investigate the effect of speed on accident frequency [15]. In a similar study, Pei et al. (2012) evaluated the effect of geometric and environmental factors on crash occurrence and crash severity; however, they neglected the effect of roadside factors such as land use [11].
As can be seen, several geometric, roadside, environmental, and traffic variables, along with speed, affect collision occurrence or the number of collisions; however, no study has evaluated the effects of all the above-mentioned factors on collision frequency. In addition, most of these studies did not consider the simultaneous effect of different geometric and roadside factors on speed and their effect on collision frequency. Thus, the objective of this study is to propose a model in which the simultaneous effect of different geometric and roadside features, along with the mediator effect of speed on collision frequency, is investigated. It should be mentioned that SEM, which is used in the present study, is a powerful tool that can estimate the effect of different variables on each other and can estimate latent variables that are combinations of observed variables.

Data Collection
In this study, the data were gathered from the Boroujerd-Khoramabad multilane highway, located in Lorestan Province of Iran. The data include vehicle speed; collision frequency; and geometric, roadside, and environmental features of the highway. For data collection, the highway was divided into 103 homogeneous segments. The segmentation criteria used for this purpose are shown in Table 2 and are derived from the site selection criteria in NCHRP Report 504 [24]. A laser gun was used to collect spot speed data in each segment. In each segment, the speeds of 100 vehicles were collected during day hours and fine weather conditions (10,300 spot speeds in total). After collecting the spot speed data, a well-equipped, trained team surveyed the entire route. The team took photos using a digital camera and recorded coordinates of the sites using a GPS device. Accident data for a one-year period (March 2014-March 2015) were collected from the Ministry of Roads and Urban Development of Lorestan Province. These data, which are based on the number of accidents in each kilometer, incorporate the number of deaths or injuries and the causes of the accidents. Table 3 presents descriptive statistics of all variables used in this study.

Methodology
To investigate the simultaneous effect of variables on operating speed and collision frequency, the SEM method was used. In what follows, a brief introduction of this method is provided. SEM is a multivariate regression model that allows the researcher to simultaneously measure a set of relations between the measured and latent variables. In other words, this method is a combination of principal factor analysis (PFA) and multivariate regression [25]. This hybrid model is a mix of a measurement model and a structural model. In the measurement model, it is determined which latent variable is measured by which observed variables, whereas in the structural model, it is identified which independent variable affects which dependent variables, that is, which variables are correlated.
Accordingly, using this model it is possible to investigate the simultaneous effects of several variables on each other. Since it is possible to reduce the number of the dependent variables, SEM was applied in the present work. The basic equation of the structural model is defined as [26]:
$$\eta = B\eta + \Gamma\xi + \zeta$$
where $\eta$ is an $m \times 1$ vector of latent endogenous variables; $\xi$ is an $n \times 1$ vector of the latent exogenous variables; $B$ is an $m \times m$ matrix of the coefficients associated with the latent endogenous variables; $\Gamma$ is an $m \times n$ matrix of the coefficients associated with the latent exogenous variables; and $\zeta$ is an $m \times 1$ vector of error terms associated with the endogenous variables. The basic equations of the measurement model are the following [27]:
$$x = \Lambda_x \xi + \delta$$
$$y = \Lambda_y \eta + \varepsilon$$
where $x$ and $\delta$ are column $q$-vectors related to the observed exogenous variables and errors, respectively; $\Lambda_x$ is a $q \times n$ structural coefficient matrix for the effects of the latent exogenous variables on the observed variables; $y$ and $\varepsilon$ are column $p$-vectors related to the observed endogenous variables and errors, respectively; and $\Lambda_y$ is a $p \times m$ structural coefficient matrix for the effects of the latent endogenous variables on the observed ones.
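For the model considered in this study, the structural part can be written out path by path. The following is a sketch, assuming that operating speed ($V$) and collision frequency ($C$) act as the endogenous variables and that the two latent variables "roadside effect" ($\xi_R$) and "geometric effect" ($\xi_G$) are exogenous; the symbols and subscripts are illustrative notation and are not taken from the original.

```latex
% V: operating speed, C: collision frequency (endogenous variables)
% \xi_R: "roadside effect", \xi_G: "geometric effect" (latent exogenous variables)
% \gamma: paths from latent variables, \beta_{CV}: path from speed to collisions
\begin{aligned}
V &= \gamma_{VR}\,\xi_R + \gamma_{VG}\,\xi_G + \zeta_V \\
C &= \beta_{CV}\,V + \gamma_{CR}\,\xi_R + \gamma_{CG}\,\xi_G + \zeta_C
\end{aligned}
```

In this form, $\beta_{CV}$ carries the mediator role of operating speed: each latent variable can influence collision frequency both directly (through $\gamma_{CR}$ or $\gamma_{CG}$) and indirectly through $V$.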
There are three categories of goodness-of-fit indices for SEM: absolute-fit indices, incremental-fit indices, and parsimony-fit indices. According to Brown (2014), it is recommended to investigate and report at least one indicator from each category [28]. Several indices can be used to assess SEM. Based on Xie et al. (2017) [29], chi-square, the root mean square error of approximation (RMSEA), the comparative fit index (CFI), and the Tucker-Lewis index (TLI) are the most widely used measures; they are presented with their cutoff values in Table 4. As can be seen, SEM can estimate several equations simultaneously; in addition, it can estimate latent variables and evaluate the effect of mediator variables on the dependent variable. Therefore, given this study's objective (which is to propose a model in which the simultaneous effect of different geometric and roadside features, along with the mediator effect of speed on crash frequency, is investigated), SEM was selected as the most suitable method for modeling.
As mentioned in the previous sections, the research includes field work, data collection, surveying, and modeling. The stages of this research are shown in Figure 2.

Results
The SEM method and latent variables have been used to simultaneously examine the factors affecting speed and safety. AMOS is a useful tool that has been used for speed modeling studies [33]; thus, to perform the modeling, AMOS 24 software was used. The general form of SEM and fitting model results are presented in Figure 3 and Table 5, respectively.
It should be mentioned that in the following tables, regression weight means regression coefficient, and standard error is the standard deviation of the sampling distribution of the statistic. The p-value is the result of the statistical significance test of the null hypothesis that each unstandardized regression coefficient equals zero; a p-value less than 0.05 indicates a statistically significant variable. The standard regression weights represent the amount of change in the dependent variable that is attributable to a single standard deviation unit's worth of change in the predictor variable [34]. The chi-square divided by degrees of freedom of the SEM is 1.480, less than the acceptable threshold. The model has RMSEA less than 1, which is acceptable. Additionally, the TLI and CFI of the model are near the acceptable criteria.
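The link between the chi-square ratio and RMSEA can be made concrete with a small calculation. The following is a minimal sketch, assuming the common single-group point estimate RMSEA = sqrt(max(chi-square - df, 0) / (df (N - 1))) and taking N as the 103 segments; the function name is hypothetical, and the value it prints is an illustration rather than the RMSEA reported by the software for this model.

```python
import math


def rmsea_from_ratio(chi2_over_df, n_obs):
    """Point estimate of RMSEA from the chi-square/df ratio and the sample size.

    sqrt(max(chi2 - df, 0) / (df * (N - 1))) simplifies to
    sqrt(max(chi2/df - 1, 0) / (N - 1)), so df itself is not needed.
    Some software divides by N instead of N - 1 (assumption noted here).
    """
    if n_obs < 2:
        raise ValueError("sample size must be at least 2")
    return math.sqrt(max(chi2_over_df - 1.0, 0.0) / (n_obs - 1))


chi2_ratio = 1.480   # chi-square divided by degrees of freedom, as stated in the text
n_segments = 103     # number of highway segments, taken here as the sample size
print(round(rmsea_from_ratio(chi2_ratio, n_segments), 3))
```

Because the ratio form does not require the degrees of freedom, the sketch can be applied even when only the chi-square ratio and the sample size are reported.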
To achieve the best results, several variables were evaluated, and eventually, the statistically significant ones were used in modelling. Moreover, several paths between variables were tested for modelling, and only the statistically significant ones were kept. Also, to determine the mediator effect of variables, latent variables were defined. Latent variables are combinations of observed variables, which serve as their indicators. Therefore, many variables were combined with each other to make latent variables, and finally, two latent variables, "roadside effect" and "geometric effect," each one representing different measured variables, were defined for this purpose. All of the indicators of these latent variables are statistically significant and are defined as follows.
The latent variable "roadside effect" is the combination of two observed variables: land use type index (1 if land use type is residential, 0 otherwise) and the number of accesses. The effect of "roadside effect" on operating speed shows that this latent variable decreases the operating speed by a standard regression weight of -0.385, while it raises collision frequency by a standard regression weight of 3.455. The coefficients of this latent variable's indicators suggest that with an increasing number of accesses in a segment and the presence of residential land use, the operating speed decreases while the collision frequency increases. It is worth mentioning that the signs of the effects of these variables on operating speed and collision frequency are consistent with previous studies [13,14,35].
The latent variable "geometric effect" is the combination of four observed variables: logarithm of segment length, slope, curvature, and the presence of a paved shoulder greater than 2 meters. This latent variable raises operating speed by a standard regression weight of 0.730, but it causes a reduction in collision frequency by a standard regression weight of -5.313. Based on the signs of the indicators of the "geometric effect," it can be concluded that segment length and a shoulder width greater than 2 meters lead to a higher operating speed and lower collision frequency. It is noteworthy that in the previous studies, the relationship between the length of segments and the number of accidents was positive [4,16], whereas we found this relationship to be negative. As mentioned in previous sections, segmentation criteria in the present study include variations in geometric and roadside features. Therefore, in cases with large segment length, these variations were not effective, leading to reduced collision frequency in such a segment. Moreover, lower values of slope and curvature contribute to a lower value of operating speed and a higher value of collision frequency. Also, the signs of these effects are consistent with previous studies [12,14-16,18]. Above all, operating speed was found to have a statistically significant positive effect on collision frequency by the standard regression weight of 7.734, suggesting that a higher collision frequency is associated with a higher operating speed in road segments. This result is consistent with earlier research [1,3-6].
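The reported standard regression weights can be combined to illustrate how direct and indirect effects are usually separated in a path model. The following is a minimal sketch, assuming the standard decomposition in which total effects equal (I - B)^{-1} Γ and the indirect effect is the product of the weights along the mediated path; the matrix layout and the function name are assumptions, and the resulting totals are an illustrative calculation, not results reported in this study.

```python
import numpy as np

# Row order (endogenous): [operating speed, collision frequency]
# Column order of Gamma (latent exogenous): [roadside effect, geometric effect]
B = np.array([[0.0, 0.0],
              [7.734, 0.0]])          # operating speed -> collision frequency
Gamma = np.array([[-0.385, 0.730],    # roadside, geometric -> operating speed
                  [3.455, -5.313]])   # roadside, geometric -> collision frequency


def effect_decomposition(b_matrix, gamma_matrix):
    """Split total effects of exogenous variables into direct and indirect parts."""
    identity = np.eye(b_matrix.shape[0])
    # Total effects: (I - B)^{-1} Gamma, computed without an explicit inverse.
    total = np.linalg.solve(identity - b_matrix, gamma_matrix)
    direct = gamma_matrix
    indirect = total - direct
    return direct, indirect, total


direct, indirect, total = effect_decomposition(B, Gamma)
labels = ["roadside effect", "geometric effect"]
for j, name in enumerate(labels):
    print(f"{name}: direct={direct[1, j]:.3f}, "
          f"indirect via speed={indirect[1, j]:.3f}, total={total[1, j]:.3f}")
```

In this layout, the indirect effect of each latent variable on collision frequency is its weight on operating speed multiplied by 7.734, which is why the mediated path works against the direct path for both latent variables.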
It should be noted that using the latent variables and SEM in the present study enabled us to incorporate not only geometric features (segment length, slope, curvature, and paved shoulder width), which were considered in previous studies, but also roadside features (land use type and number of accesses) simultaneously to investigate their direct and indirect effects on operating speed and collision frequency. The analysis also revealed that the operating speed has a mediator effect on collision frequency in that the effect of roadside and geometric features on speed and the effect of speed on collision frequency were found to be statistically significant. This modeling approach, which is the first one to the best of our knowledge, can provide a suitable tool for road designers and managers to enhance road safety.

Conclusion
Vehicle speed as an effective factor on road safety has been the subject of many studies. The present study was conducted to evaluate the effect of operating speed, as a proper representative of drivers' behavior, in multilane highways, on collision frequency.
Using a laser gun, 10,300 spot speed observations were gathered on the Boroujerd-Khoramabad highway, and the accident data were obtained from the Ministry of Roads and Urban Development of Lorestan Province, Iran. To investigate the simultaneous effect of all geometric and roadside features on operating speed and collision frequency, several observed variables were combined into latent variables, and finally two latent variables were defined: "roadside effect" (the indicators are residential land use and the number of accesses) and "geometric effect" (the indicators are segment length logarithm, slope, curvature, and presence of paved shoulder above 2 meters).
Investigating the direct and indirect effects of latent variables on collision frequency revealed that the combination of the number of accesses in each segment and residential land use ("roadside effect") increases collision frequency by a standard regression weight of 3.455; however, it reduces operating speed by a standard regression weight of -0.385. Also, the latent variable "geometric effect" causes an opposite effect on collision frequency and operating speed by the standard regression weight of -5.313 and 0.730, respectively.
In addition, evaluating the mediator effect of operating speed on collision frequency shows that lower operating speed causes a reduction in collision frequency by the standard regression weight of 7.734. Thus, in addition to the effective geometric features, environmental and roadside features were also used to model operating speed and collision frequency. Furthermore, the factors affecting operating speed, the factors affecting collision frequency, and the impact of operating speed on collision frequency were modeled simultaneously.
The results of this study can be of interest to road designers for the geometric design of highways. In addition, road safety agencies can use the findings of this study to improve road safety. One of the limitations of this study is that other environmental conditions, such as weather conditions, were not considered. For future studies, it is recommended to evaluate the effect of weather conditions and traffic flow characteristics as well as other latent variables on operating speed and accident frequency. Besides, incorporating a larger accident data set may enhance the accuracy of the results.

Table 4. Fit indices and their thresholds [25
Chi-square relative to degrees of freedom with an insignificant p-value (p > 0.05)

Figure 2. Flowchart of research process

Figure 3. Structure of structural equation model (SEM) for speed and safety