Software Reliability Engineering

🔹 SRE Process

Requirement Analysis – Define reliability goals (e.g., “99.9% uptime”).
Modeling – Use reliability models (like exponential, Weibull, Musa-Okumoto) to predict failures.
Testing – Reliability growth testing, stress testing, and fault injection.
Measurement – Collect metrics (failure rates, defect density, uptime).
Improvement – Refine design, testing, and processes to meet reliability goals.

MTTF (Mean Time To Failure) → Average time the software runs before a failure occurs.
MTTR (Mean Time To Repair) → Average time taken to fix a failure and restore service.
MTBF (Mean Time Between Failures) = MTTF + MTTR → Average time from one failure to the next, including repair time.
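
To illustrate how the three metrics fit together, here is a minimal Python sketch. The function names and the sample figures (500 h, 2 h) are hypothetical. The availability formula MTTF / (MTTF + MTTR) is the usual steady-state definition and is assumed here; it is not taken from these notes.

```python
def mtbf(mttf_hours: float, mttr_hours: float) -> float:
    """MTBF = MTTF + MTTR, the relation given above."""
    return mttf_hours + mttr_hours


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Assumed steady-state availability: fraction of time the system is up."""
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical figures: 500 h of running time per failure, 2 h per repair.
print(mtbf(500.0, 2.0))
print(f"{availability(500.0, 2.0):.4%}")
```

Read this way, availability can be raised either by failing less often (higher MTTF) or by recovering faster (lower MTTR).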

Define Reliability Requirements
- Example: “System uptime should be 99.95% per month.”
- Example: “Web app must handle 1M transactions with a failure rate below 0.1%.”
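
A requirement like these is easier to act on once it is turned into a budget. The sketch below is an illustration only: the function names are hypothetical, and a 30-day month is assumed.

```python
def downtime_budget_minutes(uptime_target_pct: float, period_days: int = 30) -> float:
    """Minutes of downtime allowed in the period for a given uptime target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - uptime_target_pct / 100.0)


def failure_budget(total_transactions: int, max_failure_pct: float) -> float:
    """Failed-transaction count that the failure percentage corresponds to."""
    return total_transactions * max_failure_pct / 100.0


# 99.95% uptime over an assumed 30-day month
print(round(downtime_budget_minutes(99.95), 2))
# 1M transactions with a 0.1% failure limit
print(round(failure_budget(1_000_000, 0.1)))
```

Under these assumptions, 99.95% monthly uptime leaves roughly 21.6 minutes of downtime, and 0.1% of 1M transactions is 1,000 failures, so "<0.1%" means staying below that count.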
Develop Operational Profile
- Identify how users interact with software.
- Example: Login (40%), Search (30%), Payment (20%), Others (10%).
- Helps in testing the most-used functions more thoroughly.
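
One way to use an operational profile is to split a test budget in proportion to usage. The sketch below does that with the example percentages above. The function name and the rule for handing out rounding leftovers (most-used operations first) are assumptions made for illustration.

```python
def allocate_tests(profile: dict[str, float], total_tests: int) -> dict[str, int]:
    """Split total_tests across operations in proportion to their usage share."""
    if abs(sum(profile.values()) - 1.0) > 1e-9:
        raise ValueError("operational profile probabilities must sum to 1")
    # Round each share down first.
    allocation = {op: int(total_tests * share) for op, share in profile.items()}
    # Give any tests lost to rounding to the most-used operations.
    leftover = total_tests - sum(allocation.values())
    for op in sorted(profile, key=profile.get, reverse=True)[:leftover]:
        allocation[op] += 1
    return allocation


profile = {"Login": 0.40, "Search": 0.30, "Payment": 0.20, "Others": 0.10}
print(allocate_tests(profile, 1000))
```

With a budget of 1000 test cases, Login gets 400, Search 300, Payment 200, and Others 100.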
Reliability Modeling
- Apply statistical reliability models to predict failures.
- Popular models:
  - Musa-Okumoto Model (logarithmic Poisson; failure intensity decays as failures are found)
  - Jelinski-Moranda Model
  - Goel-Okumoto Model
  - Weibull Distribution
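
As a sketch of what two of these models predict, the code below evaluates the standard mean value functions of the Goel-Okumoto and Musa-Okumoto models. All parameter values are hypothetical; in practice a, b, lambda0, and theta are estimated from observed failure data, which this sketch does not attempt.

```python
import math


def goel_okumoto_expected_failures(t: float, a: float, b: float) -> float:
    """Goel-Okumoto: m(t) = a * (1 - exp(-b*t)).

    a = total failures expected eventually, b = per-fault detection rate.
    """
    return a * (1.0 - math.exp(-b * t))


def goel_okumoto_intensity(t: float, a: float, b: float) -> float:
    """Failure intensity, the derivative of m(t): a * b * exp(-b*t)."""
    return a * b * math.exp(-b * t)


def musa_okumoto_expected_failures(t: float, lambda0: float, theta: float) -> float:
    """Musa-Okumoto: mu(t) = ln(1 + lambda0*theta*t) / theta.

    lambda0 = initial failure intensity, theta = intensity decay parameter.
    """
    return math.log(1.0 + lambda0 * theta * t) / theta


# Hypothetical parameters, for illustration only.
for hours in (0, 10, 100, 1000):
    go = goel_okumoto_expected_failures(hours, a=120.0, b=0.01)
    mo = musa_okumoto_expected_failures(hours, lambda0=2.0, theta=0.02)
    print(hours, round(go, 1), round(mo, 1))
```

The Goel-Okumoto curve levels off at a, which implies a finite number of faults. The Musa-Okumoto curve keeps growing, only more and more slowly.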
Test Planning & Execution
- Reliability Growth Testing (RGT).
- Stress testing (under heavy load).
- Fault injection (deliberately causing errors).
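
Fault injection can be as simple as wrapping a call so that it sometimes fails on purpose. The sketch below is a minimal illustration: InjectedFault, with_fault_injection, and process_payment are hypothetical names, not part of any specific tool.

```python
import random


class InjectedFault(Exception):
    """Raised when the wrapper deliberately fails a call."""


def with_fault_injection(func, failure_probability: float, rng: random.Random):
    """Return a wrapper around func that raises InjectedFault at the given rate."""
    def wrapper(*args, **kwargs):
        if rng.random() < failure_probability:
            raise InjectedFault(f"injected fault in {func.__name__}")
        return func(*args, **kwargs)
    return wrapper


def process_payment(amount: float) -> str:
    # Hypothetical operation under test.
    return f"charged {amount:.2f}"


# A seeded generator makes the test run repeatable.
flaky_payment = with_fault_injection(process_payment, 0.2, random.Random(42))
failures = 0
for _ in range(1000):
    try:
        flaky_payment(10.0)
    except InjectedFault:
        failures += 1
print(failures, "of 1000 calls failed by injection")
```

The point of such a test is to check how the caller copes (retries, fallbacks, error messages), not the injected failure itself.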
Measure & Monitor
- Collect real-time failure data.
- Calculate reliability metrics (MTTF, MTBF, Failure Rate).
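
The sketch below shows one way to calculate these metrics from a failure log. It assumes each record is a pair (hours of operation before the failure, hours spent on repair); both the record layout and the sample data are hypothetical.

```python
def reliability_metrics(incidents: list[tuple[float, float]]) -> dict[str, float]:
    """Compute MTTF, MTTR, MTBF and failure rate from (uptime, repair) records."""
    if not incidents:
        raise ValueError("need at least one failure record")
    uptimes = [uptime for uptime, _ in incidents]
    repairs = [repair for _, repair in incidents]
    mttf = sum(uptimes) / len(uptimes)
    mttr = sum(repairs) / len(repairs)
    return {
        "MTTF": mttf,
        "MTTR": mttr,
        "MTBF": mttf + mttr,
        # Failures per hour of operation.
        "failure_rate": len(incidents) / sum(uptimes),
    }


# Hypothetical log: three failures with their preceding uptime and repair time.
log = [(100.0, 2.0), (200.0, 4.0), (300.0, 6.0)]
print(reliability_metrics(log))
```

For this sample log the result is MTTF = 200 h, MTTR = 4 h, MTBF = 204 h, and a failure rate of 0.005 failures per operating hour.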
Improve & Maintain