Overview
The Steatosis-Associated Fibrosis Estimator (SAFE) score is a multivariable index derived to help clinicians stratify liver fibrosis risk in adults with a diagnosis of non-alcoholic fatty liver disease (NAFLD). Unlike several older non-invasive scores that were optimized primarily for advanced fibrosis or cirrhosis, SAFE targets discrimination between minimal fibrosis (histologic stages F0–F1) and clinically significant fibrosis (F2 and above). That distinction matters because stage F2 and higher has been linked to materially worse long-term outcomes, while patients with milder disease can often be monitored in primary care when follow-up is structured appropriately.
The score was developed using biopsy-classified cohorts and evaluated in independent testing sets, then applied in population-based data to explore relationships with survival among people with ultrasonographic steatosis. The published implementation uses routine demographics and laboratory values that are widely available in electronic health records, which supports deployment outside tertiary hepatology clinics—provided results are interpreted as probabilistic aids, not stand-alone diagnoses.
Clinical problem: why primary care needs a fibrosis-oriented tool
NAFLD is highly prevalent. Many patients first encounter the diagnosis through abnormal liver tests, imaging suggestive of steatosis, or metabolic risk profiles. Simple aminotransferase levels correlate poorly with fibrosis stage, so “normal ALT” does not reliably exclude important fibrosis. Liver biopsy remains informative but is invasive, costly, and impractical at population scale. Device-based elastography improves access to stiffness-based estimation but is not uniformly available in community practice.
Existing non-invasive scores such as FIB-4 and the NAFLD fibrosis score (NFS) are familiar and useful in many contexts, yet their original framing and validation often emphasize different endpoints (for example, advanced fibrosis). SAFE was designed to address an operational question common in generalist settings: among patients already labeled with NAFLD, who is more likely to harbor ≥F2 fibrosis and merits expedited assessment, versus who may be managed with structured surveillance when overall probability of significant fibrosis is lower?
How the model was built and tested
Development combined statistical modeling with clinical plausibility. Candidate predictors reflected stable, routinely measured characteristics rather than short-term metabolic fluctuations (for example, point-of-care glucose). In the primary development cohort, fibrosis was staged on liver biopsy using standard NASH trial histology. Independent testing included a rigorously enrolled trial population with biopsy and a real-world cohort in which fibrosis risk was approximated using magnetic resonance elastography thresholds validated against histology in that workflow.
Multiple modeling approaches were compared, including logistic regression and several machine-learning algorithms. While flexible methods can achieve very high apparent accuracy within a training sample, the logistic model offered a favorable balance of external performance, interpretability, and ease of implementation across settings. The final SAFE logistic model therefore became the published clinical tool.
Predictors in the final model
The retained variables align with the pathophysiology and epidemiology of progressive NAFLD:
- Age: fibrosis prevalence increases with age in many NAFLD cohorts.
- Body mass index (BMI): higher BMI associates with more severe disease on average; the published model caps BMI at 40 kg/m² because the estimated effect plateaued at very high BMI values in the development analysis.
- Type 2 diabetes mellitus: diabetes marks an insulin-resistant metabolic milieu and a higher risk of progressive liver disease in NAFLD populations.
- Aspartate aminotransferase (AST) and alanine aminotransferase (ALT): both contribute, with AST entering positively and ALT negatively in the log-transformed formulation—reflecting complex relationships between necroinflammatory activity, body habitus, and fibrosis probability rather than a simple “higher ALT equals worse fibrosis” rule.
- Globulin concentration: computed as total serum protein minus albumin (g/dL), capturing aspects of immune–inflammatory protein fractions that differ between mild and more advanced disease in the derivation cohorts.
- Platelet count: lower platelet counts associate with portal hypertension and advanced fibrosis in many chronic liver diseases; the model uses the natural logarithm of platelet count expressed in conventional ×10⁹/L units.
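The two derived inputs described in the list above (capped BMI and calculated globulin) can be sketched as small helpers. This is a minimal illustration, not the published implementation; the function names `capped_bmi` and `globulin_g_dl` are hypothetical, and the validation rules are assumptions chosen so that bad inputs fail loudly.

```python
def capped_bmi(bmi_kg_m2: float, cap: float = 40.0) -> float:
    """Return BMI limited to the cap described for the model (40 kg/m^2)."""
    if bmi_kg_m2 <= 0:
        raise ValueError("BMI must be positive")
    return min(bmi_kg_m2, cap)


def globulin_g_dl(total_protein_g_dl: float, albumin_g_dl: float) -> float:
    """Globulin as total serum protein minus albumin, both in g/dL."""
    globulin = total_protein_g_dl - albumin_g_dl
    if globulin <= 0:
        # A non-positive result usually signals a unit or transcription error.
        raise ValueError("total protein must exceed albumin")
    return globulin
```

Both values should ideally come from the same blood draw, since total protein and albumin measured on different days can yield a misleading difference.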
Underlying logistic structure (log-odds coefficients)
For transparency, the multivariable logistic regression used log-transformed laboratory values for AST, ALT, globulin, and platelets. The table below summarizes the direction of each association (the sign of the log-odds coefficient) rather than its magnitude; these are the statistical building blocks that were later rescaled into the user-facing SAFE score.
| Predictor (as modeled) | Direction of association with ≥F2 | Clinical intuition |
|---|---|---|
| Age (per year) | Higher age → higher odds of ≥F2 | Chronicity and cumulative injury |
| BMI (capped at 40 kg/m²) | Higher BMI → higher odds | Metabolic stress; capped to reflect plateau |
| Type 2 diabetes (yes vs no) | Diabetes → higher odds | Metabolic driver of disease severity |
| ln(AST) | Higher ln(AST) → higher odds | Hepatocyte injury axis; not interchangeable with ALT alone |
| ln(ALT) | Higher ln(ALT) → lower odds in the adjusted model | Reflects multicollinearity and phenotype; interpret only within the full equation |
| ln(globulin) | Higher globulin → higher odds | Inflammatory protein fraction signal |
| ln(platelets) | Higher platelets → lower odds | Inverse association with portal hypertension severity in many settings |
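The structure in the table can be written as a single linear predictor on the log-odds scale. The symbols β₀ to β₇ are placeholders introduced here for illustration; their numeric values are not reproduced in this section, and only the signs follow from the table.

```latex
\operatorname{logit} P(\geq \mathrm{F2})
  = \beta_0
  + \beta_1\,\mathrm{age}
  + \beta_2\,\min(\mathrm{BMI},\,40)
  + \beta_3\,\mathrm{T2DM}
  + \beta_4 \ln(\mathrm{AST})
  + \beta_5 \ln(\mathrm{ALT})
  + \beta_6 \ln(\mathrm{globulin})
  + \beta_7 \ln(\mathrm{platelets}),
\qquad
\beta_1, \beta_2, \beta_3, \beta_4, \beta_6 > 0,\quad
\beta_5, \beta_7 < 0
```

Here T2DM is an indicator equal to 1 when type 2 diabetes is present and 0 otherwise.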
Rescaled SAFE formula (user-facing score)
To simplify bedside use, the linear predictor was linearly transformed so that two operational anchors map to 0 and 100: a lower threshold chosen to meet a prespecified high sensitivity for excluding ≥F2, and a higher threshold chosen to support rule-in behavior with useful specificity and positive predictive value in the development set. The published rescaled equation is:
SAFE = 2.97×age + 5.99×min(BMI, 40) + 62.85×T2DM
+ 154.85×ln(AST) − 58.23×ln(ALT) + 195.48×ln(globulin)
− 141.61×ln(platelets) − 75
Units: age in years; BMI in kg/m²; AST and ALT in U/L; globulin in g/dL; platelet count in ×10⁹/L (the same numeric value commonly reported on a complete blood count). Natural logarithms are used throughout. Any implementation must reject non-positive inputs inside logarithms and should surface clear validation errors rather than silent extrapolation.
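A minimal sketch of the rescaled equation in Python follows, using the coefficients and units stated above. The function name `safe_score`, its parameter names, and the exact validation rules are assumptions made for illustration; it is not an official calculator.

```python
import math


def safe_score(age_years, bmi_kg_m2, has_t2dm, ast_u_l, alt_u_l,
               globulin_g_dl, platelets_1e9_l):
    """Compute the rescaled SAFE score from the equation given in the text."""
    continuous = {
        "age_years": age_years,
        "bmi_kg_m2": bmi_kg_m2,
        "ast_u_l": ast_u_l,
        "alt_u_l": alt_u_l,
        "globulin_g_dl": globulin_g_dl,
        "platelets_1e9_l": platelets_1e9_l,
    }
    # Reject missing-like or non-positive values instead of extrapolating.
    for name, value in continuous.items():
        if not math.isfinite(value) or value <= 0:
            raise ValueError(f"{name} must be a positive, finite number")

    return (
        2.97 * age_years
        + 5.99 * min(bmi_kg_m2, 40)            # BMI capped at 40 kg/m^2
        + 62.85 * (1 if has_t2dm else 0)       # diabetes indicator
        + 154.85 * math.log(ast_u_l)           # natural logarithms throughout
        - 58.23 * math.log(alt_u_l)
        + 195.48 * math.log(globulin_g_dl)
        - 141.61 * math.log(platelets_1e9_l)
        - 75
    )
```

A call such as `safe_score(50, 30, False, 30, 40, 3.0, 250)` uses made-up inputs purely to exercise the function; the result is a continuous number that still needs clinical interpretation.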
How to interpret rescaled thresholds
The score is continuous, but clinical communication benefits from bands that mirror the paper’s risk strata:
- SAFE < 0 (low-probability band): corresponds to the rescaled region aligned with the sensitivity-oriented “rule-out” anchor. In the original reports, negative predictive values for excluding ≥F2 at the low threshold were strong in testing sets, but performance is prevalence-dependent and must be interpreted locally.
- 0 ≤ SAFE < 100 (intermediate band): patients in this range should be neither reassured nor referred on the basis of the number alone; this is the zone where additional data (repeat labs, elastography where available, specialist input, or structured risk-reduction plans) are most often appropriate.
- SAFE ≥ 100 (high-probability band): aligns with the higher anchor associated with greater specificity for ≥F2 in development. These patients merit prioritized evaluation for advanced fibrosis and comorbidity management, still recognizing that no blood-based score replaces histology or high-quality non-invasive testing when the clinical stakes are high.
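The three bands above translate into a simple mapping from a continuous score to a label. The sketch below is illustrative only; the function name `safe_band` and the label strings are hypothetical, while the cut-offs of 0 and 100 are those listed above.

```python
import math


def safe_band(score: float) -> str:
    """Map a SAFE score to the low / intermediate / high band."""
    if math.isnan(score):
        raise ValueError("score is not a number")
    if score < 0:
        return "low"            # SAFE < 0
    if score < 100:
        return "intermediate"   # 0 <= SAFE < 100
    return "high"               # SAFE >= 100
```

Note that the boundaries are asymmetric by design: a score of exactly 0 falls in the intermediate band and a score of exactly 100 falls in the high band.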
Important: SAFE is intended for patients with an established NAFLD diagnosis after reasonable exclusion of competing chronic liver diseases (for example, substantial alcohol use, chronic viral hepatitis, and other disorders), consistent with the derivation cohorts.
Performance relative to FIB-4 and NFS (discrimination of F0/1 vs ≥F2)
In the published head-to-head comparisons for the specific task of separating F0/1 from ≥F2, SAFE achieved higher area under the ROC curve than FIB-4 and NFS across the primary training and testing datasets, with statistically significant differences by conventional tests in most comparisons. That advantage should be understood narrowly: it reflects this particular discrimination task and spectrum of patients, not a universal statement that SAFE replaces other scores in every clinical question (for instance, screening for cirrhosis or varices may still lean on different tools and imaging).
Population data: SAFE strata and long-term survival (NAFLD with steatosis)
When applied to population-based participants with ultrasonographic steatosis and NAFLD definitions excluding alternative etiologies, SAFE partitioned the cohort into groups with markedly different long-term survival curves. Low-score strata showed survival patterns that were not materially worse than comparator groups without steatosis in adjusted models reported in the primary analysis, whereas higher SAFE strata tracked progressively worse survival. These observations support the idea that the score captures biologic risk beyond a binary steatosis label, although residual confounding always remains possible in observational national survey data.
Practical workflow suggestions (not prescriptive)
A reasonable educational workflow is to confirm NAFLD context, ensure laboratory inputs are contemporaneous and not distorted by acute illness or recent hemolysis, compute globulin from paired total protein and albumin on the same sample when possible, and then place the SAFE result alongside blood pressure, glycemic control, weight management plans, and alcohol use review. Low scores may support continued primary care management with guideline-concordant interval reassessment; intermediate scores suggest individualized next steps; high scores should lower the threshold for specialist referral and advanced non-invasive fibrosis assessment where available.
Limitations and precautions
- Spectrum and prevalence: positive and negative predictive values change with underlying disease prevalence; thresholds validated in referral cohorts behave differently in unselected populations.
- Not a steatosis detector: SAFE assumes NAFLD is already diagnosed; it does not establish steatosis or replace imaging indications.
- Biology vs artifact: acute hepatitis, hemolysis, congestive splenomegaly, medications, pregnancy, and post-bariatric physiology can perturb AST, ALT, platelets, and protein fractions.
- Diabetes coding: the original variable reflected documented type 2 diabetes in source study coding; undiagnosed dysglycemia may be misclassified.
- Equity and generalizability: performance may vary by ancestry, body composition, comorbidity mix, and laboratory assay methods; local validation is ideal where feasible.
- Legal and ethical use: this content is educational; calculators must not be used as autonomous clinical decision systems without clinician oversight.
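The first limitation above, the dependence of predictive values on prevalence, follows directly from Bayes' rule and can be sketched in a few lines. The function name `predictive_values` is hypothetical, and the sensitivity and specificity in the usage lines are arbitrary illustrative numbers, not figures from the SAFE publications.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a threshold with fixed sensitivity and specificity."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0 < value < 1:
            raise ValueError(f"{name} must lie strictly between 0 and 1")

    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same hypothetical threshold, two different settings (illustrative values).
community_ppv, community_npv = predictive_values(0.90, 0.50, 0.10)
referral_ppv, referral_npv = predictive_values(0.90, 0.50, 0.50)
```

With sensitivity and specificity held fixed, the lower-prevalence setting yields a higher NPV and a lower PPV than the higher-prevalence setting, which is why thresholds should be checked against local case mix.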