AquiferPulse: groundwater early warning (in preparation)
I’m building a clean monthly view of groundwater conditions for Senegal. The map flags each basin as alert, watch, or normal, alongside a short top-10 watchlist of basins and a quick time series for each basin. It blends public datasets: satellite-derived storage estimates, rainfall, and land data.
Supervised machine-learning drought detection at station scale (1994–2024)
Droughts pose a significant threat to ecosystems, agriculture, and water security, and timely detection is critical for implementing effective mitigation strategies. This project develops a binary classification model that identifies drought conditions on a monthly basis at individual weather stations in Southeastern Australia. By leveraging both climate and vegetation indices, we aim to create a tool that is more responsive and data-driven than traditional drought-monitoring methods.
Project Overview
This project presents a machine learning framework for predicting drought events across nine stations in Southeastern Australia from 1994 to 2024. The workflow is divided into two distinct phases: (1) comprehensive data preparation and feature engineering in R, and (2) model development, validation, and testing in Python. A key aspect of the investigation is evaluating how well the model transfers to a previously unseen station.
Methodology
Phase 1: Data Preparation & Feature Engineering (R)
Data Ingestion: Monthly climate and vegetation records were assembled for the nine stations.
Feature Engineering:
Drought Indices: The Standardized Precipitation Index (SPI) was calculated for 3-month (SPI-3) and 12-month (SPI-12) timescales using the SPEI package to capture short-term and long-term precipitation anomalies.
Vegetation Indices: Normalized Difference Vegetation Index (NDVI) data were forward-filled using only past data to avoid look-ahead bias.
Label Creation (Drought Definition): A binary drought label was derived from root-zone soil moisture. A month is classified as drought (1) if its soil moisture falls below the 20th percentile of all preceding months, computed as an expanding threshold that is updated month by month using past data only. Because the threshold never sees future values, the label is free of look-ahead leakage and suitable for a real-world predictive model (a sketch of this logic, together with the NDVI forward fill, follows this list).
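The feature and label steps above were implemented in R; the Python sketch below illustrates the same leakage-free logic for a single station's monthly table. The column names (date, ndvi, soil_moisture) and the 12-month minimum history are illustrative assumptions, not the project's actual code.

```python
import pandas as pd

def engineer_station(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative Phase 1 logic for one station's monthly records."""
    df = df.sort_values("date").copy()

    # Forward-fill NDVI gaps using earlier observations only, so no
    # future information leaks into the current month.
    df["ndvi"] = df["ndvi"].ffill()

    # Expanding 20th-percentile threshold from *preceding* months only:
    # shift(1) drops the current month before the expanding quantile is taken.
    threshold = (
        df["soil_moisture"].shift(1).expanding(min_periods=12).quantile(0.20)
    )

    # Binary drought label: 1 when root-zone soil moisture falls below
    # the leakage-free expanding threshold.
    df["drought"] = (df["soil_moisture"] < threshold).astype(int)
    return df
```

Each station would be processed independently, so the expanding percentile reflects only that station's own history.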
Phase 2: Model Development & Evaluation (Python)
Data Splitting: The data was split chronologically, with the first 80% for training and the most recent 20% held out for testing.
Baseline Model: A class-weighted Random Forest classifier was established as a robust baseline to handle potential class imbalance.
Model Calibration: A validation set taken from the latter part of the training period was used to calibrate the prediction threshold for the best operating point (a sketch of the split, baseline, and calibration setup follows this list).
Advanced Models: The baseline was compared against a small neural network (NN) and other ensemble methods.
Performance Metrics: Models were evaluated using Accuracy, F1-score, and ROC-AUC to provide a comprehensive view of predictive performance, with a focus on the F1-score due to the likely class imbalance.
Transfer Learning Experiment: The final model was tested on data from a new, unseen station to assess its generalizability.
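A condensed Python sketch of the splitting, baseline, and calibration setup described above is shown below. The file name, feature columns, hyperparameters, and the 80/20 split inside the training period are assumptions for illustration; the project's actual configuration may differ.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical station-month table produced by the Phase 1 pipeline.
df = pd.read_csv("station_months.csv", parse_dates=["date"]).sort_values("date")
features = ["spi3", "spi12", "ndvi"]  # illustrative feature set
X, y = df[features], df["drought"]

# Chronological split: first 80% for training, most recent 20% held out for testing.
split = int(len(df) * 0.8)
X_train, X_test = X.iloc[:split], X.iloc[split:]
y_train, y_test = y.iloc[:split], y.iloc[split:]

# Carve a validation slice from the latter part of the training period
# for threshold calibration.
val = int(len(X_train) * 0.8)
X_fit, X_val = X_train.iloc[:val], X_train.iloc[val:]
y_fit, y_val = y_train.iloc[:val], y_train.iloc[val:]

# Class-weighted random forest baseline to compensate for rare drought months.
rf = RandomForestClassifier(
    n_estimators=500, class_weight="balanced", random_state=42
)
rf.fit(X_fit, y_fit)
```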
Phase 2: Models
Models were trained on the first 80% of the data and tested on the remaining 20%. Two model families were trained: a class-weighted random forest and a compact neural network. Because drought months are rare (~17% of the test tail), we calibrated the operating point using the precision–recall curve rather than the default 0.5 threshold.
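The sketch below shows one common way to choose that operating point from the precision–recall curve. It assumes the rf, X_val, and y_val objects from the setup sketch above; the 0.8 recall target and the max-F1 alternative are illustrative criteria, while the project's reported threshold of 0.30 came from its own calibration step.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Validation-set probabilities from the class-weighted random forest.
probs = rf.predict_proba(X_val)[:, 1]
precision, recall, thresholds = precision_recall_curve(y_val, probs)

# One criterion: the highest threshold that still reaches a target recall
# for drought months (0.8 here is an assumption).
target_recall = 0.8
ok = recall[:-1] >= target_recall          # last P/R pair has no threshold
chosen = thresholds[ok].max() if ok.any() else 0.5

# Alternative criterion: maximise F1 over the curve.
f1 = 2 * precision[:-1] * recall[:-1] / np.clip(
    precision[:-1] + recall[:-1], 1e-9, None
)
best_f1_threshold = thresholds[np.argmax(f1)]
```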
The Random Forest classifier exhibited strong discriminative performance (ROC-AUC: 0.90) on the test set. When optimized for recall with a decision threshold of 0.30, the model achieved a recall of 0.83 for drought months with an F1-score of 0.615. The neural network attained slightly lower performance (ROC-AUC: 0.865, F1: 0.588) after similar calibration. Time-series cross-validation results (mean F1: 0.588) confirmed the stability of these estimates.
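The stability check can be reproduced with scikit-learn's expanding-window TimeSeriesSplit, so every fold trains strictly on the past and validates on the block that follows; the five-fold setting and hyperparameters below are assumptions rather than the project's exact configuration.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import TimeSeriesSplit

# Expanding-window CV over the training period (X_train, y_train from the
# setup sketch above); each fold validates on the months that follow it.
scores = []
for fit_idx, val_idx in TimeSeriesSplit(n_splits=5).split(X_train):
    model = RandomForestClassifier(
        n_estimators=500, class_weight="balanced", random_state=42
    )
    model.fit(X_train.iloc[fit_idx], y_train.iloc[fit_idx])
    preds = model.predict(X_train.iloc[val_idx])
    scores.append(f1_score(y_train.iloc[val_idx], preds))

print(f"Mean CV F1: {sum(scores) / len(scores):.3f}")
```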
The optimized Random Forest (ROC-AUC: 0.901, F1: 0.638) outperformed both the neural network and the other ensemble methods, making it the most effective classifier for this task.
Phase 3: Spatial Generalization
This phase evaluated spatial generalization by applying the source-trained model to the unseen station. The model ranked drought months well (ROC-AUC: 0.822) but was poorly calibrated for the new domain (F1: 0.448), notably underestimating drought prevalence. Light fine-tuning and retraining on the target station substantially improved performance (accuracy: 0.856, F1: 0.662).
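A hedged sketch of the transfer experiment follows. Random forests have no gradient-style fine-tuning, so the "light fine-tuning / retraining" step is approximated here by refitting on the source data plus an early slice of the target station's record; X_target, y_target, the 50% slice, and the reuse of the calibrated threshold are hypothetical choices, and the project's actual recipe may differ.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score

# 1) Zero-shot transfer: score the source-trained model (rf and the calibrated
#    threshold `chosen` from the earlier sketches) on the unseen station.
target_probs = rf.predict_proba(X_target)[:, 1]
print("transfer ROC-AUC:", roc_auc_score(y_target, target_probs))
print("transfer F1:", f1_score(y_target, (target_probs >= chosen).astype(int)))

# 2) Retrain with the first half of the target station's record added to the
#    source training data, then evaluate on the remaining target months.
cut = int(len(X_target) * 0.5)
X_mix = pd.concat([X_train, X_target.iloc[:cut]])
y_mix = pd.concat([y_train, y_target.iloc[:cut]])
rf_ft = RandomForestClassifier(
    n_estimators=500, class_weight="balanced", random_state=42
)
rf_ft.fit(X_mix, y_mix)
preds = rf_ft.predict(X_target.iloc[cut:])
print("fine-tuned F1:", f1_score(y_target.iloc[cut:], preds))
```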