Identifying major drivers of daily streamflow from large-scale atmospheric circulation with machine learning
Hagen, Jenny Sjåstad; Leblois, Etienne; Lawrence, Deborah; Solomatine, Dimitri; Sorteberg, Asgeir.
Journal of Hydrology

Previous studies linking large-scale atmospheric circulation and river flow with traditional machine learning techniques have predominantly explored monthly, seasonal or annual streamflow modelling for applications in direct downscaling or hydrological climate-impact studies. This paper identifies major drivers of daily streamflow from large-scale atmospheric circulation using two reanalysis datasets for six catchments in Norway representing various Köppen-Geiger climate types and flood-generating processes. A nested loop of roughly pruned random forests is used for feature extraction, demonstrating the potential for automated retrieval of physically consistent and interpretable input variables. Random forest (RF), support vector machine (SVM) for regression and multilayer perceptron (MLP) neural networks are compared to multiple linear regression to assess the role of model complexity in utilizing the identified major drivers to reconstruct streamflow. The machine learning models were trained on 31 years of aggregated atmospheric data with distinct moving windows for each catchment, reflecting catchment-specific forcing-response relationships between the atmosphere and the rivers. The results show that accuracy improves to some extent with model complexity. In all but the smallest, rainfall-driven catchment, the most complex model, MLP, gives a Nash-Sutcliffe Efficiency (NSE) ranging from 0.71 to 0.81 on testing data spanning five years. The poorer performance by all models in the smallest catchment is discussed in relation to catchment characteristics, sub-grid topography and local variability. The intra-model differences are also viewed in relation to the consistency between the automatically retrieved feature selections from the two reanalysis datasets. This study provides a benchmark for future development of deep learning models for direct downscaling from large-scale atmospheric variables to daily streamflow in Norway.
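
For readers unfamiliar with the skill score quoted above, the following is a minimal sketch of the standard Nash-Sutcliffe Efficiency, NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2). It is not taken from the paper: the function name `nash_sutcliffe_efficiency` and the sample values are hypothetical and serve only to illustrate the formula.

```python
from typing import Sequence


def nash_sutcliffe_efficiency(observed: Sequence[float],
                              simulated: Sequence[float]) -> float:
    """Return the Nash-Sutcliffe Efficiency of simulated vs. observed flow.

    NSE = 1 means a perfect match; NSE = 0 means the simulation is no
    better than always predicting the mean of the observations.
    (Hypothetical helper for illustration, not the authors' code.)
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    if len(observed) == 0:
        raise ValueError("at least one observation is required")

    mean_obs = sum(observed) / len(observed)
    # Sum of squared errors between simulation and observation.
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Variance of the observations around their mean (unnormalised).
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    if var_obs == 0:
        raise ValueError("NSE is undefined when observations are constant")
    return 1.0 - sse / var_obs


# Made-up daily streamflow values, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 2.0, 3.0, 3.0]
nse = nash_sutcliffe_efficiency(obs, sim)
```

On this scale, the reported MLP range of 0.71 to 0.81 means the squared simulation error was roughly 19 to 29 percent of the variance of the observed daily streamflow over the five-year test period.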