Machine Learning Will Take Prioritization and Capital Improvement Project Selection to the Next Level

May 21, 2020

Machine-learning algorithms are woven into our daily lives. For instance, they use data to predict or recommend our Netflix and YouTube viewing choices. They anticipate our Google searches and even predict our shopping patterns.

There is great potential for machine learning and predictive modeling to become a force in the A/E industry as well, helping municipalities with prioritization and asset management activities. Municipalities manage a number of systems (water/wastewater, stormwater and streets, to name a few) upon which their residents rely for basic infrastructure services.

And those systems (or parts of them) will fail at some point.

A better understanding of the likelihood of failure, and how it varies throughout each utility system, is key to empowering cities to make more proactive, data-driven decisions about maintenance, repairs and replacements, rather than simply reacting to costly, inconvenient and even catastrophic infrastructure failures.

Condition assessments, such as looking for spiral and longitudinal fractures and cracks in pipes, are an important step in the data-gathering process.

Data-Driven Decisions

Which assets are likely to fail?

Until recently, the best available tool to estimate the likelihood of failure has been a risk-based approach, which utilizes limited historical data such as the number of breaks to a water main over time. However, failure history alone only describes the end result and doesn’t investigate or describe the root cause of asset failures in the system.
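
As a rough illustration of this kind of risk-based indicator, the sketch below computes breaks per mile per year from break history and ranks pipes by it. The pipe IDs, lengths and break counts are hypothetical, and a real risk-based score may weigh other factors; note that the result only says where breaks happened, not why.

```python
# Hypothetical break history: (pipe_id, length_miles, breaks_recorded, years_observed)
break_history = [
    ("P-101", 0.50, 4, 10),
    ("P-102", 1.20, 3, 10),
    ("P-103", 0.25, 0, 10),
]


def break_rate(length_miles, breaks, years):
    """Return breaks per mile per year for one pipe segment."""
    return breaks / (length_miles * years)


# Compute the historical break rate for every pipe.
rates = {
    pipe_id: break_rate(length, breaks, years)
    for pipe_id, length, breaks, years in break_history
}

# Rank pipes from highest to lowest historical break rate.
ranked = sorted(rates, key=rates.get, reverse=True)
print(ranked)
```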

Educated guesses and non-statistical assumptions can be made about why failures occur in certain parts of town or along roads that are more heavily travelled, but each part of the infrastructure system can present unique circumstances. Establishing patterns about the root cause of failures is generally challenging and often not feasible using the results of a risk-based approach. Although risk-based likelihood-of-failure findings are supported by basic asset data, they lack the complex foundation of unique spatial and temporal characteristics that precipitate asset deterioration and ultimately cause asset failure.

The renewal decisions based on simplified, risk-based findings therefore often exclude key components of the failure cycle that can enable more comprehensive, informed decisions.

Machine learning is emerging as a valuable approach for making important decisions about infrastructure and for communicating to key stakeholders and city council members why those decisions were made. Machine learning is driven by a comprehensive suite of spatial and temporal data, and its results carry statistical backing and, in most cases, identify failure patterns.

Using a machine learning model to establish the likelihood of asset failure based on learned patterns is a low-risk, high-value entry point into predictive modeling. A subsequent goal is to make accurate predictions about which assets will fail over a future time period, such as the next three months, 12 months or even five years. That is the direction predictive modeling is headed.

A common predictive modeling approach would include the steps described below.

Predictive Modeling Approach

What might a workflow for a machine learning project look like?

The first step is having a clear understanding of the business insight, or the problem. What are the business reasons for this task?

The question design phase includes defining goals and then challenging those goals. Are the right questions being asked for the purpose of generating the model?

Data cleaning and exploratory data analysis (EDA) are the next two important processing steps. Gathering and establishing quality data, and lots of it, is critical, because a predictive model is only effective if sound data is used as input. Understand where information may be missing and how to fill in those gaps. Are assumptions being made, for example, about a pipe’s age or material because as-built information wasn’t properly recorded? Speaking with utility staff and conducting exploratory data analysis can help build higher confidence in the data.
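
A minimal sketch of this step, assuming hypothetical pipe records in which None marks a value missing from the as-built information. It counts the gaps per field and fills missing install years with the median of the known years, while flagging each filled value so the assumption stays visible. Median fill is only one possible choice; utility staff may know a better default.

```python
from statistics import median

# Hypothetical pipe records; None marks a value missing from as-built records.
pipes = [
    {"id": "P-101", "install_year": 1965, "material": "cast iron"},
    {"id": "P-102", "install_year": None, "material": "PVC"},
    {"id": "P-103", "install_year": 1988, "material": None},
    {"id": "P-104", "install_year": 2001, "material": "PVC"},
]


def missing_counts(records, fields):
    """Count how many records are missing each field."""
    return {f: sum(1 for r in records if r[f] is None) for f in fields}


def impute_install_year(records):
    """Fill missing install years with the median known year and flag them."""
    known = [r["install_year"] for r in records if r["install_year"] is not None]
    fill_value = median(known)
    cleaned = []
    for r in records:
        row = dict(r)  # copy so the raw records stay untouched
        row["install_year_assumed"] = r["install_year"] is None
        if row["install_year_assumed"]:
            row["install_year"] = fill_value
        cleaned.append(row)
    return cleaned


counts = missing_counts(pipes, ["install_year", "material"])
cleaned = impute_install_year(pipes)
print(counts)
print(cleaned[1])
```

Keeping the install_year_assumed flag lets the assumption be reviewed later, or even used as a model input in its own right.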

Feature engineering is the term for choosing which attributes within the available data will be used to build the model. Examples include geographic features such as land use, demographics, type of soil, the nearest type of road and the location of the nearest manhole or inlet (other pieces of the system). The model groups these inputs and analyzes the relationship between the inputs and the outcome, for both failures and non-failures.
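
To make the idea concrete, here is a small sketch that turns one raw pipe record into numeric model inputs: an age, a one-hot encoding of soil type, and the distance to the nearest manhole. The field names, soil categories, coordinates and analysis year are all assumptions made for illustration.

```python
import math

ANALYSIS_YEAR = 2020  # assumed reference year for computing pipe age
SOIL_TYPES = ["clay", "sand", "loam"]  # hypothetical soil categories


def nearest_distance(point, others):
    """Straight-line distance from a point to the closest of several points."""
    return min(math.dist(point, other) for other in others)


def build_features(pipe, manholes):
    """Convert one raw pipe record into a dictionary of numeric features."""
    features = {"age_years": ANALYSIS_YEAR - pipe["install_year"]}
    # One-hot encode soil type so a model can use it as numeric input.
    for soil in SOIL_TYPES:
        features[f"soil_{soil}"] = 1 if pipe["soil"] == soil else 0
    # Spatial feature: distance to the nearest manhole.
    features["dist_to_manhole"] = nearest_distance(pipe["midpoint"], manholes)
    return features


pipe = {"install_year": 1970, "soil": "clay", "midpoint": (0.0, 0.0)}
manholes = [(3.0, 4.0), (10.0, 0.0)]
features = build_features(pipe, manholes)
print(features)
```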

The next important step is model selection, and many types of algorithms can be used. Some are linear; others are non-linear. Selection requires picking the model that best addresses the problem being solved. There are times when multiple models are selected, allowing for a comparison of results.
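
The comparison of multiple models might be sketched as follows, using two deliberately simple stand-ins: a linear decision rule on pipe age and a non-linear nearest-neighbor rule. The records are hypothetical and far too few for a real study, and the two rules are illustrative choices rather than recommended algorithms; the point is only that each candidate is scored on the same held-out data.

```python
# Hypothetical records: (age_years, prior_breaks, failed)
train = [
    (10, 0, 0), (15, 0, 0), (20, 1, 0), (35, 0, 0),
    (40, 2, 1), (55, 1, 1), (60, 3, 1), (70, 2, 1),
]
test = [(12, 0, 0), (42, 0, 0), (50, 2, 1), (65, 1, 1)]


def age_threshold_model(train_rows):
    """Linear rule: predict failure when age exceeds the midpoint between
    the mean ages of failed and non-failed training pipes."""
    failed = [r[0] for r in train_rows if r[2] == 1]
    intact = [r[0] for r in train_rows if r[2] == 0]
    cutoff = (sum(failed) / len(failed) + sum(intact) / len(intact)) / 2
    return lambda age, breaks: 1 if age > cutoff else 0


def nearest_neighbor_model(train_rows):
    """Non-linear rule: copy the label of the most similar training pipe.
    Prior breaks are scaled by 10 so they are comparable to age in years."""
    def predict(age, breaks):
        closest = min(
            train_rows,
            key=lambda r: (r[0] - age) ** 2 + (10 * (r[1] - breaks)) ** 2,
        )
        return closest[2]
    return predict


def accuracy(model, rows):
    """Share of rows where the prediction matches the recorded outcome."""
    return sum(model(age, breaks) == y for age, breaks, y in rows) / len(rows)


candidates = {
    "age_threshold": age_threshold_model(train),
    "nearest_neighbor": nearest_neighbor_model(train),
}
scores = {name: accuracy(model, test) for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```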

Hyperparameter tuning consists of adjusting settings or parameters within the model to dial it in and produce better agreement between predicted and actual outcomes. It’s a great opportunity to test any assumptions.
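
One common way to tune a setting is a simple grid search: try each candidate value, score it on validation data held back from training, and keep the best. The sketch below does this for k, the number of neighbors in a nearest-neighbor classifier. The data, the choice of classifier and the grid values are all assumptions for illustration.

```python
from collections import Counter

# Hypothetical (age_years, failed) pairs.
train = [(10, 0), (15, 0), (20, 0), (30, 0), (38, 1),
         (45, 0), (50, 1), (60, 1), (70, 1)]
valid = [(12, 0), (25, 0), (40, 0), (55, 1), (65, 1)]


def knn_predict(age, rows, k):
    """Majority vote among the k training pipes closest in age."""
    nearest = sorted(rows, key=lambda r: abs(r[0] - age))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(rows, reference, k):
    """Share of validation rows predicted correctly for a given k."""
    hits = sum(knn_predict(age, reference, k) == y for age, y in rows)
    return hits / len(rows)


# Grid search: evaluate each candidate value of the hyperparameter k.
grid = [1, 3, 5]  # odd values avoid tied votes with two classes
results = {k: accuracy(valid, train, k) for k in grid}
best_k = max(results, key=results.get)
print(results, best_k)
```

With so little data the scores are only indicative; in practice the search is usually repeated across several train/validation splits.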

The final steps are selecting the most appropriate model and implementing it, which means making the model part of a regular process or workflow for prioritization and capital project selection.
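
As one possible shape for that workflow, the sketch below takes model output and selects replacement projects in order of predicted failure probability until a budget is used up. The probabilities, costs and budget are hypothetical, and the greedy rule is an assumption for illustration; a real capital plan would likely also weigh consequence of failure and project coordination.

```python
# Hypothetical model output and replacement costs.
assets = [
    {"id": "P-101", "failure_probability": 0.62, "replacement_cost": 120_000},
    {"id": "P-102", "failure_probability": 0.15, "replacement_cost": 80_000},
    {"id": "P-103", "failure_probability": 0.48, "replacement_cost": 200_000},
    {"id": "P-104", "failure_probability": 0.33, "replacement_cost": 60_000},
]


def select_projects(assets, budget):
    """Greedy selection: fund the highest-probability assets that still fit."""
    selected = []
    remaining = budget
    by_risk = sorted(assets, key=lambda a: a["failure_probability"], reverse=True)
    for asset in by_risk:
        if asset["replacement_cost"] <= remaining:
            selected.append(asset["id"])
            remaining -= asset["replacement_cost"]
    return selected, remaining


selected, remaining = select_projects(assets, budget=300_000)
print(selected, remaining)
```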

Machine learning models will serve as valuable tools in the future, informing municipalities about the likelihood of a failure before it happens.

Efficiencies and Time Savings

Predictive modeling is similar to a regression equation in statistics, which is used to find which relationships, if any, exist between data sets. Models learn patterns and establish those relationships, and they do so at a scale most people could not manage by hand.

It might also be thought of as an optimization exercise. A data-intensive model is, as much as possible, trying to reduce the error between the actual outcome and the predicted outcome.
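
The optimization view can be sketched with a one-feature logistic regression fitted by gradient descent, which repeatedly nudges two parameters to shrink the average error (log loss) between predicted failure probabilities and recorded outcomes. The observations, learning rate and step count are assumptions chosen for illustration, not values from any real utility.

```python
import math

# Hypothetical (pipe age in decades, failed) observations.
data = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1), (5.0, 0), (6.0, 1), (7.0, 1)]


def predict(w, b, x):
    """Predicted failure probability from a logistic model."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def log_loss(w, b):
    """Average error between predicted probabilities and actual outcomes."""
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for x, y in data:
        p = min(max(predict(w, b, x), eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)


def fit(steps=2000, learning_rate=0.1):
    """Gradient descent: step the parameters downhill on the log loss."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = sum((predict(w, b, x) - y) * x for x, y in data) / n
        grad_b = sum(predict(w, b, x) - y for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


initial_loss = log_loss(0.0, 0.0)
w, b = fit()
final_loss = log_loss(w, b)
print(initial_loss, final_loss)
```

The loss after fitting is lower than the starting loss, which is the error reduction the paragraph above describes.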

Machine learning models can be valuable tools—tools that promote a more efficient method for prioritizing capital improvement projects. Municipalities can leverage model results to provide better, safer service to customers by planning for and preventing failure.


Would you like to gain data-driven insight into your asset failures, or do you wonder if you have enough data to compute the likelihood of failure of your assets using machine learning techniques? If so, please contact us to schedule a brief phone consultation! Write to info@halff.com.