Exposure Assessment Models
Virtual Beach (VB)
Release Notes
| Version | Release Date |
|---|---|
| 2.2 | March 2012 |
| 2.0 | September 2010 |
- Introduction
- Audience
- Abstract
- Applications and Possible Uses
- Software History
- Technical Support and Training
- Quality Assurance and Quality Control
- Related Sites
- References
Introduction
Virtual Beach is a software package designed for developing site-specific Multiple Linear Regression (MLR) models for the prediction of pathogen indicator levels at recreational beaches.
Audience
VB is primarily designed for beach managers responsible for making decisions regarding beach closures due to pathogen contamination. However, researchers, scientists, engineers, and students interested in studying relationships between water quality indicators and ambient environmental conditions will find VB useful.
Abstract
VB reads input data from a text file or Excel document, assists the user in preparing the data for a multiple linear regression (MLR) analysis, enables automated model selection using a wide array of model evaluation criteria, and provides predictions using a chosen model and new observational data. With an integrated mapping component to determine the geographic orientation of the beach, the software can automatically decompose wind, current, and wave direction and magnitude information into along-shore and onshore/offshore components for use in subsequent regression analyses. Data can be examined visually using simple scatterplots to gauge relationships between the response and the independent variables (IVs). VB can quickly produce interaction terms between the primary IVs, and it can also test an array of transformations on the IVs to maximize the linearity of the relationship between the response and the IVs. The software includes exhaustive and genetic algorithm (GA) search routines for finding the "best" models from a large array of possible choices. Models with a high degree of multicollinearity (at least one large Variance Inflation Factor) are automatically censored during the selection process. Models can be constructed using either previously collected data or forecasted environmental information. VB includes residual diagnostics for models, and automated outlier identification and removal using DFFITS or Cook's Distances.
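The decomposition of directional data into shore-relative components can be illustrated with a minimal sketch. This is not VB's own code: the function names, the sign conventions, and the way beach orientation is expressed (as the compass bearing pointing from land out to the water) are all assumptions made for this example.

```python
import math


def decompose(speed, toward_deg, shore_normal_deg):
    """Split a speed/direction pair into shore-relative components.

    Assumed conventions (illustrative, not taken from VB itself):
      toward_deg       -- compass bearing the flow is moving toward
      shore_normal_deg -- compass bearing pointing from land out to the water
    Returns (alongshore, offshore): alongshore > 0 is flow to the right of an
    observer facing the water; offshore > 0 is flow away from land.
    """
    angle = math.radians(toward_deg - shore_normal_deg)
    return speed * math.sin(angle), speed * math.cos(angle)


def wind_from_to_toward(from_deg):
    """Convert a meteorological 'blowing from' bearing to a 'toward' bearing."""
    return (from_deg + 180.0) % 360.0


# A 5 m/s wind blowing from the west, at a beach whose water lies to the north.
along, off = decompose(5.0, wind_from_to_toward(270.0), 0.0)
print(round(along, 3), round(off, 3))
```

In this example the wind runs parallel to the shoreline, so nearly all of its magnitude ends up in the along-shore component. The two components can then be used as separate candidate IVs in a regression.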
Applications and Possible Uses
Generation of regression models to predict pathogen indicator levels for any freshwater or saltwater beach site. Analyses have been performed at these locations:
- Predicting E. coli levels at Huntington Beach, OH (2000-2010).
- Predicting enterococci levels (culturable and qPCR) at various Great Lakes beaches: West Beach, Porter, IN; Washington Park, Michigan City, IN; Silver Beach, St. Joseph, MI; Huntington Beach, Bay Village, OH; South Shore, Milwaukee, WI.
- Predicting enterococci levels (culturable and qPCR) at various marine beaches: Goddard Beach, West Warwick, RI; Edgewater Beach, Biloxi, MS; Fairhope Beach, Mobile, AL; Hobie Beach, Miami, FL; La Monserratte, Puerto Rico; Boqueron Beach, Puerto Rico; Surfside Beach, Myrtle Beach, SC.
Software History
Virtual Beach 2.2 grew out of the Virtual Beach Model Builder application (VB 1.0) developed by Walter Frick and Zhongfu Ge. Like VB 2.2, VB 1.0 is an MLR model-building tool that supports data analysis through visual inspection of plots and manipulation of variables (e.g., transformations, creating interaction terms). Unlike VB 2.2, initial data processing in VB 1.0 is followed by a manual, iterative process of testing, comparing, and evaluating models. During the development stage, model fitness (as measured by the Mallows' Cp statistic) is computed and tracked, allowing for comparison and eventual selection of a “best” model for the dataset under consideration. This model can then be used to produce estimates of pathogen levels with current or forecasted environmental data from the site. VB 2.2 enhances the functionality of its predecessor in a number of ways:
| Feature | Virtual Beach 1.0 | Virtual Beach 2.2 |
|---|---|---|
| Statistical Modeling Technique | Multiple Linear Regression | Multiple Linear Regression |
| Mapping/GIS | No | Yes - Interface for Determining Beach Location and Orientation |
| Automated Data Source Identification | No | Yes - Using D4EM |
| Variable Transformations | Yes - Manual | Yes - Automated to Maximize Linearity Between Y and X's |
| Processing Directional Data | Yes - Calculation of U and V Components for Generic Directional Data | Yes - Calculation of U and V Components for Wind, Current, and Wave Data |
| Forming Interaction Terms | Yes - Manual | Yes - Automated, and Includes Additional Algebraic Manipulations |
| Model Selection Using Goodness of Fit Criteria | Manual - Based on Mallows' Cp (with BIC Reported) | Automated - AIC, Corrected AIC, BIC, PRESS, RMSE, Sensitivity, Specificity, Accuracy, R-Squared, Adjusted R-Squared |
| Handling Large Numbers of Independent Variables | Yes - Manual Stepwise Regression | Yes - Automated Genetic Algorithm |
| Exhaustive Calculation of All Model Permutations | No | Yes |
| Model Sensitivity and Specificity Calculations | No | Yes |
| Independent Variable Collinearity Avoidance | Yes - Manual Pre-Screening | Yes - Automated Filter Using VIF Values |
| Generation of Model Output Text Reports | No | Yes |
| Residual Analyses | Yes - Residual Plots | Yes - Standardized Residual Plots, DFFITS and Cook's Distances, Outlier Removal with Automated Model Re-Fitting |
| Univariate Scatterplots of Y and Individual X's | Yes | Yes |
| Generating Predictions Based on Developed Models | Yes - Manual | Yes - Automated |
| Coded Language | Delphi | C# |
| Input File Formats | Text, Excel 4 | Text, Excel 2003/2007/2010 |
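The sensitivity, specificity, and accuracy criteria listed in the table can be illustrated with a short sketch. It assumes these are computed by treating each observation and each prediction as an exceedance or non-exceedance of a decision threshold; the function name, the sample data, and the threshold value are hypothetical and are not taken from VB.

```python
def classification_metrics(observed, predicted, threshold):
    """Score predictions as exceedance / non-exceedance calls.

    An 'exceedance' is a value strictly above `threshold`.
    Returns (sensitivity, specificity, accuracy); a rate whose denominator
    is zero is returned as None.
    """
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        actual, called = obs > threshold, pred > threshold
        if actual and called:
            tp += 1  # exceedance correctly predicted
        elif actual:
            fn += 1  # exceedance missed by the model
        elif called:
            fp += 1  # false alarm
        else:
            tn += 1  # non-exceedance correctly predicted
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if tp + fn else None
    specificity = tn / (tn + fp) if tn + fp else None
    accuracy = (tp + tn) / total if total else None
    return sensitivity, specificity, accuracy


# Made-up observations and model predictions, with an illustrative threshold.
observed = [120, 300, 80, 500, 240, 60]
predicted = [150, 280, 260, 200, 250, 40]
print(classification_metrics(observed, predicted, threshold=235))
```

Under this reading, sensitivity is the share of actual exceedances the model flags, and specificity is the share of non-exceedances it correctly leaves unflagged. For a beach manager, these correspond to missed closures and unnecessary closures, respectively.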
Technical Support and Training
Questions regarding the Virtual Beach application and its supporting software and documents should be submitted to the Center for Exposure Assessment Modeling (CEAM) at the Ecosystems Research Division of EPA’s National Exposure Research Laboratory in Athens, GA.
Quality Assurance and Quality Control
VB 2.2 has undergone quality assurance testing and the user guide has been externally reviewed.
Related Sites
- The Virtual Beach integrated environmental modeling web site on iemHUB
References of Published VB2 Applications and Uses
- Ge, Z. and W.E. Frick. 2009. Time-Frequency Analysis of Beach Bacteria Variations and its Implication for Recreational Water Quality Modeling. Environmental Science & Technology. 43(4):1128-1133.
- Frick, W.E., Z. Ge, and R.G. Zepp. 2008. Nowcasting and Forecasting Concentrations of Biological Contaminants at Beaches: A Feasibility and Case Study. Environmental Science & Technology. 42(13):4818-4824.
- Ge, Z. and W.E. Frick. 2007. Some Statistical Issues Related to Multiple Linear Regression Modeling of Beach Bacteria Concentrations. Environmental Research. 103(3):358-364.