NØMAD

NØde Monitoring And Diagnostics — Lightweight HPC monitoring, visualization, and predictive analytics.

"Travels light, adapts to its environment, and doesn't need permanent infrastructure."

What is NØMAD?

NØMAD is a self-contained monitoring and prediction system for HPC clusters. Unlike heavyweight solutions requiring complex infrastructure, NØMAD deploys quickly, runs with minimal resources, and provides actionable insights through:

  • Real-time monitoring of disk, CPU, memory, GPU, and SLURM jobs
  • Predictive analytics using machine learning and similarity networks
  • Educational analytics tracking computational proficiency development
  • Multi-cluster dashboards with partition-level views
  • Derivative analysis that detects accelerating trends before thresholds are crossed
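
To illustrate the idea behind derivative analysis, here is a minimal sketch, not NØMAD's actual implementation. It assumes evenly spaced samples of a single metric (for example, disk usage in percent) and uses simple finite differences. The function names `finite_differences` and `is_accelerating` and the `min_accel` parameter are hypothetical and chosen only for this example.

```python
from typing import List, Sequence


def finite_differences(values: Sequence[float], dt: float = 1.0) -> List[float]:
    """Approximate the derivative of evenly spaced samples."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def is_accelerating(samples: Sequence[float], dt: float = 1.0,
                    min_accel: float = 0.0) -> bool:
    """Return True when the metric is rising and its rate of rise is increasing.

    Hypothetical helper: flags a trend when the most recent first derivative
    is positive and the most recent second derivative exceeds min_accel.
    """
    if len(samples) < 3:
        return False  # need at least three points for a second derivative
    velocity = finite_differences(samples, dt)
    acceleration = finite_differences(velocity, dt)
    return velocity[-1] > 0 and acceleration[-1] > min_accel


# Synthetic data: both series are still far below a 90% threshold.
accelerating_disk = [50.0, 51.0, 53.0, 56.0, 60.0]  # growth speeds up
steady_disk = [50.0, 52.0, 54.0, 56.0, 58.0]        # constant growth

print(is_accelerating(accelerating_disk))
print(is_accelerating(steady_disk))
```

In this sketch the first series is flagged even though it is well under any typical threshold, because its growth rate keeps increasing; the second grows at a constant rate and is not flagged. A real deployment would likely smooth noisy samples before differencing.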

Philosophy

Inspired by nomadic principles:

Principle              Implementation
Travels light          SQLite database, minimal dependencies, no external services
Adapts to environment  Configurable collectors, flexible alerts, cluster-agnostic
Leaves no trace        Clean uninstall, no system modifications required

Quick Start

pip install nomad-hpc
nomad demo                    # Try with synthetic data
nomad dashboard               # Open http://localhost:8050

For production deployment, see Installation.

Features at a Glance

Feature                Description                          Learn More
Dashboard              Multi-cluster real-time monitoring   Dashboard
Educational Analytics  Track proficiency development        Edu Module
ML Prediction          Job failure prediction               ML Framework
Network Analysis       Similarity-based clustering          Network Methodology
Alerts                 Threshold + predictive alerts        Alerts
Community Export       Anonymized cross-institutional data  CLI Reference
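
As a rough illustration of what a similarity network can look like, the sketch below links jobs whose resource profiles are close to each other. It is an assumption-laden example, not NØMAD's own method: the choice of cosine similarity, the feature layout (normalized CPU, memory, and I/O usage), the 0.95 threshold, and the names `cosine_similarity` and `build_similarity_edges` are all hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def build_similarity_edges(
    features: Dict[str, List[float]], threshold: float = 0.95
) -> List[Tuple[str, str, float]]:
    """Connect every pair of jobs whose similarity meets the threshold."""
    edges = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(features.items(), 2):
        similarity = cosine_similarity(vec_a, vec_b)
        if similarity >= threshold:
            edges.append((id_a, id_b, similarity))
    return edges


# Synthetic job profiles: [cpu, memory, io], each normalized to 0..1.
jobs = {
    "job_a": [0.9, 0.1, 0.2],  # CPU-bound
    "job_b": [0.8, 0.1, 0.2],  # CPU-bound, close to job_a
    "job_c": [0.1, 0.9, 0.7],  # memory- and I/O-heavy
}

edges = build_similarity_edges(jobs)
for source, target, similarity in edges:
    print(f"{source} -- {target} (similarity {similarity:.3f})")
```

With this synthetic data only the two CPU-bound jobs are connected. Groups of connected jobs in such a graph can then be treated as clusters of similar workloads.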