Predicting Water Quality: When Data Science Meets Marine Biology
By Kevin Stam
Residents of the Pacific Northwest, myself included, are incredibly fortunate to live by the ocean. The region is home to Washington State’s largest estuary, Puget Sound, which spans over 1,300 miles of coastline from Victoria (British Columbia) all the way down to Olympia.
Puget Sound adds to our state’s livelihood, and many residents enjoy leisure activities in the Sound, including diving, sailing, paddleboarding, whale watching, and kayaking.
But sometimes the water is simply too dangerous for people and wildlife alike.
While the risks of going for a swim during a lightning storm or in high winds might seem obvious to the average beachgoer, many people are less informed about the dangers of poor water quality.
In fact, a number of contaminants can affect marine organisms and, in turn, us. When these contaminants make their way into marine water bodies, they can cause irreparable damage to organisms and prompt local governments to issue health advisories or close beaches entirely.
One of the most common sources of contamination is runoff. Runoff occurs during periods of heavy rain, when pesticides, industrial chemicals (e.g., motor oil), trash, and pathogens are carried by rainwater directly into marine ecosystems.
Another common danger to water quality is harmful algal blooms (HABs).
Some areas, like the Gulf of Mexico, experience “red tides,” which are blooms of toxic algae such as Karenia brevis. Breaking waves during red tides cause K. brevis to release its toxins into the air, which can irritate the eyes, throat, and nose. This can be particularly problematic for people with respiratory conditions such as asthma. Algal blooms also explain why health authorities regulate shellfish collection in certain areas: toxins accumulate in shellfish, which humans then consume.
Water quality monitoring and public health management have challenged marine scientists for decades, for two main reasons.
First, the data are not always of adequate quality. Scientists from NGOs or local public health agencies often check water quality at the same locations year after year and over brief periods of time, such as a single season. In doing so, they leave many other accessible areas unchecked and capture only a limited picture of how water quality changes over time.
Second, and more alarmingly, most state-issued beach advisories lag behind the lab work: there is a delay between the time the water is analyzed and the time an advisory is actually posted at the beach.
This means that unsuspecting tourists or beachgoers may be notified of unsafe conditions only when it is already too late.
Luckily, marine scientists are developing tools to improve marine water quality measurement and even predict it over time! These tools combine data science, microbiology, and coastal weather processes to forecast water quality.
One of the most recent studies came from Ryan Searcy, a Ph.D. researcher at Stanford University, whose team ran an experiment to test whether they could accurately predict future water quality from measurements taken over a short time span.
Their experiment used water samples from three different California beaches and measured concentrations of fecal indicator bacteria (FIB), such as Enterococcus and E. coli.
Even though FIB are only one of many possible contaminants in aquatic ecosystems, the team concluded that their predictive approach could apply to other contaminants as well.
To develop their predictive model, the researchers combined their own FIB measurements with historical data from local governments at the same locations. Both of these measurements included high-frequency data, meaning many measurements over a short period of time (every 10 to 30 minutes for a period of one to two days). Once they had acquired the new microbiological data, Searcy’s team added environmental parameters to their model. These parameters included wave height, tide times, sunrise and sunset times, and water temperature, all drawn from public online databases. With both historical and current bacterial measurements as well as the environmental parameters, they were able to develop and train a machine-learning algorithm to predict how the bacterial concentrations would vary through time.
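For readers curious what this looks like in code, here is a minimal sketch of the general idea: pair each high-frequency bacterial measurement with environmental parameters, fit a model on the earlier samples, and predict the later ones. Everything in it is an assumption made for illustration. The data are synthetic, the relationship between the features and the bacteria is invented, and a simple least-squares regression stands in for whatever machine-learning models the Stanford team actually used.

```python
import math
import random

random.seed(0)  # reproducible synthetic data

def make_sample(minute):
    """Return ([tide, daylight, wave_height], log10 FIB) for one synthetic sample."""
    hour = (minute / 60.0) % 24
    tide = math.sin(2 * math.pi * minute / (12.4 * 60))   # tide proxy in [-1, 1]
    daylight = 1.0 if 6 <= hour < 18 else 0.0             # crude sunrise/sunset flag
    wave_height = 1.0 + 0.5 * math.sin(2 * math.pi * minute / (24 * 60) + 1.0)
    # Assumed relationship, invented purely for this illustration
    log_fib = (1.5 - 0.6 * daylight - 0.3 * tide + 0.4 * wave_height
               + random.gauss(0, 0.1))
    return [tide, daylight, wave_height], log_fib

# One sample every 15 minutes for 48 hours; train on the first 36 hours
samples = [make_sample(m) for m in range(0, 48 * 60, 15)]
split = int(len(samples) * 0.75)
train, test = samples[:split], samples[split:]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_least_squares(data):
    """Fit intercept + one coefficient per feature via the normal equations."""
    rows = [[1.0] + x for x, _ in data]
    ys = [y for _, y in data]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)

def predict(coef, x):
    return coef[0] + sum(c * xi for c, xi in zip(coef[1:], x))

coef = fit_least_squares(train)
errors = [predict(coef, x) - y for x, y in test]
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print("intercept, tide, daylight, wave:", [round(c, 2) for c in coef])
print("held-out RMSE (log10 units):", round(rmse, 3))
```

Splitting the data by time, rather than at random, mimics the real task: the model only ever sees the past and is asked about the future.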
The team then compared their predicted bacterial concentrations to the actual data later collected by local government officials. When they compared the two using statistical methods, they found a good match!
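A comparison like this can be sketched in a few lines. The example below is hypothetical: the numbers are made up, the threshold of 104 per 100 mL is an assumed advisory level for Enterococcus, and the three scores shown (error on a log scale, sensitivity, and specificity) are common choices for this kind of check rather than the study's confirmed metrics.

```python
import math

THRESHOLD = 104.0  # assumed advisory threshold (per 100 mL), for illustration

# Made-up illustrative values, not data from the study
observed = [12, 35, 150, 480, 90, 110, 20, 300]
predicted = [15, 40, 120, 390, 130, 95, 18, 260]

def rmse_log10(obs, pred):
    """Root-mean-square error between log10 concentrations."""
    sq = [(math.log10(o) - math.log10(p)) ** 2 for o, p in zip(obs, pred)]
    return math.sqrt(sum(sq) / len(sq))

def exceedance_scores(obs, pred, threshold):
    """Sensitivity and specificity for 'water exceeds the threshold'."""
    pairs = list(zip(obs, pred))
    tp = sum(1 for o, p in pairs if o > threshold and p > threshold)
    fn = sum(1 for o, p in pairs if o > threshold and p <= threshold)
    tn = sum(1 for o, p in pairs if o <= threshold and p <= threshold)
    fp = sum(1 for o, p in pairs if o <= threshold and p > threshold)
    # Assumes both classes are present; otherwise these divisions fail
    return tp / (tp + fn), tn / (tn + fp)

sensitivity, specificity = exceedance_scores(observed, predicted, THRESHOLD)
print("RMSE (log10 units):", round(rmse_log10(observed, predicted), 3))
print("sensitivity:", sensitivity)  # share of unsafe samples flagged as unsafe
print("specificity:", specificity)  # share of safe samples flagged as safe
```

Sensitivity matters most for public health, since a missed exceedance means swimmers are not warned, while specificity reflects how often a beach would be posted unnecessarily.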
While these methods are complex and depend on advanced statistics, they show that data science combined with marine science can be incredibly useful for predicting water quality in marine areas. They could help local governments reduce the estimated 90 million illnesses related to poor water quality, as well as the more than $2.2 billion in associated costs.
The Stanford team hopes that other local governments, both in the US and internationally, will adopt their own predictive models, and it has publicly shared the programming code used to develop its model.
References:
“Ocean Pollution.” National Oceanic and Atmospheric Administration, www.noaa.gov/education/resource-collections/ocean-coasts/ocean-pollution.
O'Neill, Mike. “Stanford Researchers Develop an Innovative New Way to Predict Beach Water Quality.” SciTechDaily, 22 Jan. 2021, scitechdaily.com/stanford-researchers-develop-an-innovative-new-way-to-predict-beach-water-quality/.
“ORCA FACTS.” Puget Sound Starts Here | Facts, pugetsoundstartshere.org/Facts.aspx.
Searcy, Ryan T., and Alexandria B. Boehm. Environmental Science & Technology, vol. 55, no. 3, 2021, pp. 1908-1918, doi:10.1021/acs.est.0c06742.
Thyng, Kristen M., et al. “Origins of Karenia Brevis Harmful Algal Blooms along the Texas Coast.” ASLO, John Wiley & Sons, Ltd, 16 Dec. 2013, aslopubs.onlinelibrary.wiley.com/doi/full/10.1215/21573689-2417719.
US Department of Commerce, National Oceanic and Atmospheric Administration. “Harmful Algal Blooms (Red Tide).” 10 Apr. 2019, oceanservice.noaa.gov/hazards/hab/.