Distinguish disabled AI classifier from exact-match auto-resolve when debugging triage
Debugging why an inbox/triage UI showed unexpected auto-classification while spam still leaked through.
Two unrelated engines can both look like AI in a triage UI: a shadow-mode LLM spam classifier that only writes a verdict label and feeds a collapsed UI bucket (never routes at ingest), and a non-LLM person-matching engine that auto-resolves a sender when its email exactly matches a known contact identifier (score 1.0 >= threshold). A tier-null field in the auto-resolve log is the tell that the LLM never touched that row. A feature-flag clobber bug silently froze the classifier, so the newest auto-labeled row dating to a past date is the smoking-gun for when scoring stopped, not evidence the AI is wrong.
Trace one concrete row through the actual DB and the auto-resolve log first; the tier1/tier-null marker and the max scored-at timestamp instantly separate -classifier off- from -classifier wrong-.