Measure What Matters: How We Evaluate Impact and Strengthen Detection Across Safety Workflows

Fiona Thornton
VP, Partnerships, Resolver Trust & Safety
Stephen Nugent
Head of Data Services, Resolver Trust & Safety

This is the eighth installment in our series: “20 Years in Online Safety: Reflecting, Evolving, and Adapting,” leading to the 20th anniversary of Resolver’s presence in the Trust & Safety community on Nov. 25, 2025. Originally launched as Crisp Thinking in 2005, we've had multiple generations of online safety professionals carry the mission forward. For this series, we asked our team, managers, and leadership to reflect on the journey we’ve taken so far and the road ahead, as we continue in our core mission to protect children online.


In online safety, proving what works isn’t just about numbers. It’s about understanding what those numbers mean for users, for partners, and for the ecosystems we protect.

For Resolver, measuring effectiveness isn’t a static process. It’s a cycle of learning, testing, refining, and reapplying across technologies, sectors, and use cases. This work helps us understand where our services perform well and where our expertise can support new challenges.

From tactical fixes to deep partnerships: The power of earning trust

Many partnerships begin in moments of urgency — a compliance deadline, a public incident, or the discovery of a gap in detection. These engagements are often tactical at first, focused on immediate results: fast takedowns, high enforcement rates, or visible outcomes.

But genuine progress in Trust & Safety doesn’t stop there. True partnership begins when the conversation shifts from reaction to reflection, when both sides move beyond quick fixes and start working together to build consistency, alignment, and shared understanding.

“Trust isn’t something we ask for. It’s something we earn every day. It’s built in the follow-up calls, the data reviews, and the willingness to listen and adjust. That’s when a tactical solution becomes a trusted alliance.”
— Fiona Thornton, VP Partnerships, Resolver

Over the years, we’ve seen that the partnerships with the strongest outcomes are the ones where trust becomes mutual. One partner once shared that trust isn’t one-way — it’s something built through consistent action, empathy, and transparency. That perspective stayed with us because it reflects exactly how meaningful collaboration takes shape in this work.

Defining what “working” means

Success in online safety can’t be captured by a single metric. Enforcement rates and takedown speeds are vital, but they’re only one part of the story.

Resolver evaluates effectiveness through three lenses:

Quantitative impact: We review metrics such as policy action rates, precision and recall, the quality of analyst escalations (“subs”), and reduction in repeat harm. These indicators show whether the signals we surface are accurate, timely, and aligned with expected outcomes.

Qualitative insight: We look at partner feedback, policy alignment discussions, and collaborative reviews that clarify why certain content did or didn’t meet a threshold. These insights help refine detection and highlight emerging patterns that may need policy or workflow updates.

Strategic value: We assess how intelligence gathered in one area strengthens protection across others. This includes identifying new evasion tactics, spotting transferable behavioral patterns, and expanding the “blast radius” of impact when partners combine our signals with their internal intelligence.

Alongside these lenses, we track indicators that show how signals perform in real environments. These include weekly internal QA checks, daily action-rate data from partner platforms, and feedback loops that highlight both progress and areas for refinement. Together, these multi-dimensional metrics help benchmark improvement over time and show how quickly partners act on the intelligence we provide.
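To make the quantitative indicators concrete, here is a minimal Python sketch of how precision, recall, and a partner action rate could be computed from review counts. The `SignalCounts` class, its field names, and the sample figures are hypothetical and chosen only for illustration; they are not Resolver's actual data model or results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalCounts:
    """Hypothetical review counts for one detection signal over one period."""
    true_positives: int   # flagged items that reviewers confirmed as violating
    false_positives: int  # flagged items that reviewers judged non-violating
    false_negatives: int  # violating items the signal missed
    actioned: int         # flagged items the partner platform acted on


def safe_ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def summarize(counts: SignalCounts) -> dict:
    """Compute precision, recall, and action rate for one period."""
    flagged = counts.true_positives + counts.false_positives
    violating = counts.true_positives + counts.false_negatives
    return {
        "precision": safe_ratio(counts.true_positives, flagged),
        "recall": safe_ratio(counts.true_positives, violating),
        "action_rate": safe_ratio(counts.actioned, flagged),
    }


# Illustrative numbers only, not real partner data.
week = SignalCounts(true_positives=80, false_positives=20,
                    false_negatives=20, actioned=75)
metrics = summarize(week)
```

Tracking the same three ratios per day or per week is one simple way to benchmark a signal over time. On their own they say nothing about the qualitative and strategic lenses described above.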

“We’re always asking what the data tells us about behavior and risk. Both evolve constantly. Measurement isn’t just validation — it’s discovery.”
— Stephen Nugent, Head of Data Services, Resolver Trust & Safety

Measurement helps us uncover new patterns and questions that guide how we refine signals and strengthen protection across partner environments.

Turning data into direction

Every improvement at Resolver begins with an observation. Our subject-matter experts identify emerging harms, from cultural nuance to linguistic shifts, and from real-world behavior to online manifestation. That intelligence feeds into our technical teams, where data scientists and risk detection specialists look for patterns that surface the most egregious content. Throughout this process, SMEs and technical teams work side by side to ensure we deliver accurate, actionable value for partners.

Here’s how we turn those observations into reliable, defensible detection:

Internal validation loops: New rules go through multiple rounds of testing between data scientists and analysts until the signal delivers consistent, high-value results. There are no fixed industry benchmarks; value depends on the quality of insight, not a single number.

Measuring improvement: We assess whether a workflow performs “better” by looking at uplift in value. This might include more accurate analyst escalations (“subs”), higher policy alignment, reduced noise, and clearer downstream action from partner teams.

These improvements often surface across frontline workflows — from detection triage and reviewer guidance to classifier updates and enforcement queues — where clearer signals support quicker, more accurate action. Once validated, these learnings are operationalized by our engineers and integrated into our systems. And just when something seems solved, the language shifts, evasion techniques evolve, or new information sources appear — and the cycle begins again.
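The idea of measuring “uplift in value” described above can be sketched in a few lines of Python. This is an illustrative assumption about how a before/after comparison might be expressed, not a description of Resolver's internal tooling. The metric names (`escalation_accuracy`, `noise_rate`) and all figures are hypothetical.

```python
def relative_uplift(baseline: float, candidate: float) -> float:
    """Relative change from baseline to candidate (0.25 means +25%)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero to compute relative uplift")
    return (candidate - baseline) / baseline


# Hypothetical before/after figures for a single workflow.
baseline = {"escalation_accuracy": 0.60, "noise_rate": 0.40}
candidate = {"escalation_accuracy": 0.75, "noise_rate": 0.30}

# Positive uplift is good for accuracy; negative uplift is good for noise.
uplift = {name: relative_uplift(baseline[name], candidate[name])
          for name in baseline}
```

With these sample numbers, escalation accuracy rises by roughly 25% and noise falls by roughly 25%. Because there are no fixed industry benchmarks, a comparison like this only has meaning relative to the workflow's own baseline.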

These outcomes often involve multiple partner stakeholders — from Policy Enforcement and Policy Development to Product — who join the collaboration to understand how bad actors adapt to avoid detection. Their involvement signals that a partnership has moved beyond initial problem-solving into true strategic alignment.

Ultimately, trust is what transforms a reactive engagement into a long-term alliance. It creates the shared foundation needed to move from metrics to meaning and from fixes to foresight.

In one long-term engagement, our analysts identified a new evasion pattern that repeatedly slipped past a partner’s internal filters. After sharing early examples and validating the trend with their policy team, the partner updated enforcement guidance and adjusted thresholds across several workflows. In the following review cycle, action accuracy improved and repeated attempts by bad actors decreased.

We’ve seen similar patterns across other partnerships, where early intelligence from our teams has supported policy refinement, improved detection precision, and clearer reviewer guidelines — all of which contribute to reduced downstream risk.

Applying insights across sectors

The behavioral, linguistic, and contextual signals that indicate risk are often transferable across domains. We see them shift from child safety to anti-fraud, from grooming to harassment, and from dark-web intelligence into mainstream platform moderation.

These transferable signals often strengthen our partners’ detection stacks. They help refine reviewer guidance, improve classifier confidence, and adjust enforcement thresholds to reflect emerging harm patterns.

By working with partners across industries, Resolver helps translate learnings from one environment to another, adapting detection and policy logic to new contexts without losing nuance.

“With the regulatory landscape placing greater emphasis on platforms’ detection tech stacks — particularly around unknown CSAM detection — our engineers and data scientists are working closely with partners to support API integrations.”
— Stephen Nugent, Head of Data Services, Resolver

But it doesn’t stop there. We continue to calibrate and refine thresholds to ensure our partners receive the performance metrics they require. This is how our systems evolve: by sharing insight, scaling knowledge, and ensuring each advancement contributes to safer online spaces more broadly.
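One common way to calibrate a threshold against a required performance level is to sweep candidate thresholds over a labeled validation set and keep the lowest one that still meets a target precision. The Python sketch below illustrates that general technique under stated assumptions. The function name, the `(score, is_violating)` data layout, and the sample values are hypothetical and do not describe Resolver's systems.

```python
from typing import Optional, Sequence, Tuple


def lowest_threshold_for_precision(
    scored: Sequence[Tuple[float, bool]],
    target_precision: float,
) -> Optional[float]:
    """Return the lowest score threshold whose precision meets the target.

    `scored` holds (classifier_score, is_violating) pairs from a labeled
    validation set. Items with score >= threshold count as flagged.
    Returns None when no threshold reaches the target.
    """
    best = None
    # Walk thresholds from strictest to most permissive.
    for threshold in sorted({score for score, _ in scored}, reverse=True):
        flagged = [label for score, label in scored if score >= threshold]
        precision = sum(flagged) / len(flagged)
        if precision >= target_precision:
            best = threshold  # remember the lowest qualifying threshold so far
    return best


# Illustrative validation sample, not real data.
samples = [(0.95, True), (0.90, True), (0.80, False), (0.70, True),
           (0.60, True), (0.40, False), (0.30, False)]
chosen = lowest_threshold_for_precision(samples, target_precision=0.8)
```

Choosing the lowest qualifying threshold flags as much as possible while holding precision at the agreed level, which generally favors recall. In practice the target would be agreed with each partner, and the sweep repeated as content and evasion tactics change.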

From metrics to meaning

At Resolver, we still care about metrics, and we care even more about what they represent. High action rates, faster removals, and increased detection accuracy matter only when they lead to safer outcomes for users. Measuring what matters means tracking not only what we help take down but also what we help prevent. This includes the harm that never happens because a detection worked or a policy was strengthened at the right moment.

For many partners, this prevention shows up in quieter ways. They see fewer repeated attempts by bad actors, clearer reviewer decisions, and earlier visibility into emerging risks. These signals show progress behind the scenes long before they appear in platform-wide reporting.

Ultimately, trust is what turns reactive engagements into long-term alliances. It supports the shift from metrics to meaning, from fixes to foresight, and from vendor to partner. Trust is what turns numbers into knowledge and partnerships into progress.

Looking ahead, we expect more pressure on platforms to detect harm earlier and explain how their systems work. Our focus is on supporting partners with the intelligence, collaboration, and clarity they need to stay ahead of risk as it evolves.

A new standard in proactive CSAM elimination

As we reflect on two decades of protecting children and safeguarding online spaces, we’re also looking ahead to the next frontier of Trust & Safety: the proactive, intelligent elimination of CSAM. Resolver’s new Unknown CSAM Detection Service represents the culmination of 20 years of learning, evolving, and purpose. It’s built to identify, prevent, and remove child sexual abuse material at speed and scale, while protecting the humans behind the screen.

Learn more about how we’re redefining child safety for the next generation.


More from Trust & Safety’s 20th Anniversary series:

  1. Two Decades of Protection: Resolver’s Constant Evolution in Online Child Safety
  2. What We Call Threats: Evolving Taxonomies and the Role of Regulation
  3. From "Chicken Soup" to Catastrophe: The Dangers of an English-Only Trust & Safety Model
  4. The Human at the Heart of the Machine: A 20-Year Lesson in Online Safety
  5. From Reactive to Predictive: Why It’s No Longer Enough to Spot What’s Already Happened
  6. Wearing Many Hats: The Power of Generalists and Specialists in Online Safety
  7. Friction is Key: How Behavioral Design Enhances Online Safety