Data Readiness
The nomad readiness command helps administrators determine when sufficient data has been collected for reliable ML predictions.
Quick Start
nomad readiness # Basic readiness report
nomad readiness -v # Verbose with feature details
nomad readiness --db mydata.db # Specify database
What It Assesses
The readiness estimator evaluates four dimensions:
1. Sample Size
| Level | Jobs | Score |
|---|---|---|
| Minimum | 100+ | 40% |
| Recommended | 500+ | 70% |
| Optimal | 1000+ | 100% |
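The thresholds in the table can be read as a simple lookup. The sketch below is illustrative only: the function name `sample_size_level` and the "Insufficient" label for fewer than 100 jobs are assumptions, and it does not attempt to reproduce how the tool scores job counts that fall between thresholds.

```python
# Job-count thresholds copied from the table above, highest first.
SAMPLE_SIZE_LEVELS = [
    (1000, "Optimal"),
    (500, "Recommended"),
    (100, "Minimum"),
]


def sample_size_level(job_count: int) -> str:
    """Return the highest level whose job threshold is met (hypothetical helper)."""
    for threshold, level in SAMPLE_SIZE_LEVELS:
        if job_count >= threshold:
            return level
    return "Insufficient"  # assumed label; the table does not name this case


print(sample_size_level(847))  # prints Recommended
```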
2. Class Balance
Compares the number of successful jobs with the number of failed jobs. The ideal success-to-failure ratio is between 60:40 and 90:10. Extremely imbalanced data (more than 95% of jobs in one class) reduces prediction reliability.
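As a rough illustration of the check described above, the following sketch classifies a success/failure split. The function name `class_balance` and the returned keys are hypothetical; only the 60:40, 90:10, and 95% limits come from this page.

```python
def class_balance(successes: int, failures: int) -> dict:
    """Summarize the success:failure split (illustrative, not the tool's code)."""
    total = successes + failures
    if total == 0:
        raise ValueError("no jobs to assess")
    success_share = successes / total
    majority_share = max(successes, failures) / total
    success_pct = round(100 * success_share)
    return {
        "ratio": f"{success_pct}:{100 - success_pct}",
        # Ideal band: between 60:40 and 90:10 (successful:failed).
        "ideal": 0.60 <= success_share <= 0.90,
        # Extreme imbalance: more than 95% of jobs in one class.
        "extreme": majority_share > 0.95,
    }


print(class_balance(820, 180))
```

A split of 820 successful and 180 failed jobs falls inside the ideal band, while 990 to 10 would be flagged as extreme.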
3. Feature Coverage
Checks that all 17 feature dimensions have sufficient variance:
- CPU efficiency
- Memory utilization
- I/O wait percentage
- NFS write ratio
- Runtime characteristics
- Exit signals
- And more...
Features with zero variance (all same value) don't contribute to predictions.
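A zero-variance check can be sketched as follows. The feature keys (`cpu_efficiency`, `exit_signal`) and the function name are placeholders chosen for illustration, not the tool's actual schema.

```python
def zero_variance_features(rows: list[dict[str, float]]) -> list[str]:
    """Return names of features that hold a single value across all rows."""
    if not rows:
        return []
    # A feature has zero variance exactly when it takes only one distinct value.
    return [
        name
        for name in rows[0]
        if len({row[name] for row in rows}) <= 1
    ]


jobs = [
    {"cpu_efficiency": 0.50, "exit_signal": 0},
    {"cpu_efficiency": 0.90, "exit_signal": 0},
]
print(zero_variance_features(jobs))  # prints ['exit_signal']
```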
4. Data Recency
Recent data is weighted more heavily. The estimator checks:
- Jobs from last 24 hours
- Jobs from last 7 days
- Jobs from last 30 days
Stale data (no jobs newer than 30 days) triggers a warning.
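The three windows and the stale-data rule could be computed along these lines. This is a minimal sketch under assumptions: the function name `recency_counts`, the use of job end times as the timestamp, and the output keys are all hypothetical.

```python
from datetime import datetime, timedelta

# Windows listed above; labels are illustrative.
WINDOWS = {
    "24h": timedelta(hours=24),
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
}


def recency_counts(end_times: list[datetime], now: datetime) -> dict:
    """Count jobs per window and flag data with nothing newer than 30 days."""
    counts = {
        label: sum(1 for t in end_times if now - t <= span)
        for label, span in WINDOWS.items()
    }
    # Stale: there is data, but none of it falls inside the 30-day window.
    counts["stale"] = bool(end_times) and counts["30d"] == 0
    return counts


now = datetime(2024, 1, 31, 12, 0)
times = [now - timedelta(hours=2), now - timedelta(days=3), now - timedelta(days=45)]
print(recency_counts(times, now))
```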
Output Example
======================================================================
NOMAD Data Readiness
======================================================================
Overall Score: 72% (Recommended)
======================================================================
Sample Size ================.... 847 jobs (85%)
Class Balance ==============...... 82:18 ratio (70%)
Feature Coverage ==================== 17/17 features
Data Recency ==========.......... 3 days old (50%)
======================================================================
Forecast: At 125 jobs/day, optimal threshold (1000)
will be reached in approximately 2 days.
======================================================================
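The forecast in the example is consistent with simple arithmetic: 1000 − 847 = 153 jobs remaining, and 153 / 125 ≈ 1.2 days, which rounds up to 2. The sketch below reproduces that calculation; rounding up to whole days is an assumption inferred from the sample output, and `days_to_threshold` is a hypothetical name.

```python
import math


def days_to_threshold(current_jobs: int, jobs_per_day: float, threshold: int = 1000) -> int:
    """Estimate whole days until the job count reaches the threshold."""
    if current_jobs >= threshold:
        return 0
    if jobs_per_day <= 0:
        raise ValueError("jobs_per_day must be positive to forecast")
    # Assumed rounding: partial days count as a full day.
    return math.ceil((threshold - current_jobs) / jobs_per_day)


print(days_to_threshold(847, 125))  # prints 2
```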
Verbose Mode
With -v or --verbose, the report also includes per-feature statistics.
It shows the coefficient of variation (CV%) for each feature, which helps identify the metrics that contribute most to predictions.
Recommendations
The command provides actionable recommendations:
- Low sample size: "Continue collecting data. Run nomad collect to gather more jobs."
- Class imbalance: "Dataset is heavily skewed toward successful jobs. Consider longer collection period to capture more failure modes."
- Missing features: "GPU metrics unavailable. Install nvidia-smi for GPU-enabled nodes."
- Stale data: "Most recent job is 15 days old. Ensure collectors are running."