A 50,000-document collection lands on a litigation support firm's desk. Outside counsel wants first-pass review complete in three weeks. The math is brutal: at a comfortable pace, a single attorney reviews 40-50 documents per hour. Working that collection down to a manageable privilege and responsiveness call takes somewhere between 1,000 and 1,250 attorney hours. That's before any second-level review.
This is the problem technology-assisted review (TAR) with active learning was built to solve. Not just to speed things up, but to fundamentally change which documents a human ever needs to look at.
How the Active Learning Loop Actually Works
TAR has been around long enough that most litigation support teams have heard the pitch. What's worth understanding is the specific mechanics of the active learning variant, because the implementation details matter for defensibility.
The process starts with a seed set. In our experience, 500 documents is a workable starting point. Reviewers code them for relevance, and the model uses those judgments to build an initial relevance ranking across the full collection. Then the loop begins: the system surfaces the documents it's most uncertain about, reviewers code them, and the model updates its ranking. Rinse, repeat.
What makes this different from a static keyword filter or a one-shot predictive model is the feedback mechanism. Each coding decision doesn't just classify a document. It teaches the model something about the conceptual territory of the case. A keyword search front-loads all the human judgment into query construction. Active learning distributes that judgment across iterations, continuously refining as the model encounters edge cases and ambiguous documents.
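For readers who want the mechanics in code, here is a minimal sketch of that loop. It uses scikit-learn's TF-IDF features and logistic regression as stand-ins for whatever model a review platform actually runs, and `code_batch` is a placeholder for live reviewer coding; the batch size, round count, and classifier are assumptions for illustration, not a description of any particular product's engine.

```python
# Minimal sketch of an uncertainty-sampling loop, not any vendor's engine.
# `code_batch(indices) -> list[int]` stands in for live reviewer coding
# (1 = responsive, 0 = not responsive).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def run_active_learning(texts, seed_idx, seed_labels, code_batch,
                        batch_size=200, rounds=10):
    X = TfidfVectorizer(min_df=2).fit_transform(texts)   # featurize the collection once
    labeled, labels = list(seed_idx), list(seed_labels)

    for _ in range(rounds):
        model = LogisticRegression(max_iter=1000).fit(X[labeled], labels)
        probs = model.predict_proba(X)[:, 1]              # P(responsive) per document
        unlabeled = np.setdiff1d(np.arange(X.shape[0]), labeled)
        # Uncertainty sampling: surface the documents closest to the decision boundary.
        batch = unlabeled[np.argsort(np.abs(probs[unlabeled] - 0.5))[:batch_size]]
        labels.extend(code_batch(batch))                  # reviewers code this batch
        labeled.extend(batch)

    # In practice the loop stops on a stabilization or recall criterion,
    # not a fixed round count.
    return model, probs
```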
The result: by the time a firm has reviewed 30-35% of the collection, the model has typically located 85-90% of all responsive documents. The remaining 65-70% of unreviewed documents contain, on average, only 10-15% of what's actually responsive. That's not a guess. That's the recall curve that emerges consistently across case types when the process is run correctly.
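To make "recall curve" concrete: order the collection by the model's ranking and track what share of all responsive documents has been found at each review depth. A minimal sketch, assuming hypothetical ground-truth labels that in a live matter you would only ever estimate through review and validation sampling:

```python
import numpy as np

def recall_curve(responsive_flags_in_rank_order):
    """Cumulative recall vs. fraction of the collection reviewed.

    `responsive_flags_in_rank_order` is a 0/1 array ordered by the model's
    ranking; treating it as known ground truth is for illustration only.
    """
    y = np.asarray(responsive_flags_in_rank_order, dtype=float)
    recall = np.cumsum(y) / y.sum()                        # share of responsive docs found
    fraction_reviewed = np.arange(1, len(y) + 1) / len(y)  # review depth
    return fraction_reviewed, recall
```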
What This Means for First-Pass Review Hours
Let's be concrete. On a 50,000-document collection, reviewing 30-35% means a human reviews 15,000-17,500 documents rather than all 50,000. At 40-50 documents per hour, that's roughly 300-440 hours instead of 1,000-1,250. A reduction of roughly two-thirds. Without cutting recall.
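The arithmetic behind those ranges, spelled out; the review rate and review fraction are just the assumptions stated above:

```python
docs = 50_000
rate_low, rate_high = 40, 50                                    # documents per attorney hour
linear_hours = (docs / rate_high, docs / rate_low)              # (1000.0, 1250.0) hours
tar_docs = (docs * 0.30, docs * 0.35)                           # 15,000-17,500 reviewed
tar_hours = (tar_docs[0] / rate_high, tar_docs[1] / rate_low)   # (300.0, 437.5) hours
```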
The reduction is real. So is the caveat: getting to those numbers requires discipline in the review process itself. Reviewers who code inconsistently, or who rush through the seed set because it feels like housekeeping before the real review, undermine the model's ability to learn. The quality of the seed set is everything.
We've seen firms try to shortcut this by throwing generic subject-matter-adjacent documents into the seed set rather than actually representative ones. The model trains on what it sees. If the seed set is biased toward one document type or one custodian's communication style, the early iterations reflect that bias. By round three or four, you're chasing ghost patterns. The QC process catches it eventually, but it's an expensive detour.
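One straightforward guard against that bias is to draw the seed set proportionally across custodians or document types rather than grabbing whatever is convenient. A sketch, assuming the collection metadata lives in a pandas DataFrame with a `custodian` column; the 500-document target is the starting point mentioned earlier:

```python
import pandas as pd

def stratified_seed(df: pd.DataFrame, strata_col: str = "custodian",
                    target: int = 500, random_state: int = 42) -> pd.DataFrame:
    """Sample the seed set proportionally from each stratum so no single
    custodian or document type dominates early model training."""
    frac = min(target / len(df), 1.0)
    return df.groupby(strata_col, group_keys=False).sample(frac=frac,
                                                           random_state=random_state)
```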
Done right, the efficiency gain is durable. It doesn't compress the timeline by cutting corners on recall. It compresses it by eliminating the long tail of clearly non-responsive documents that would otherwise eat attorney time.
QC Sampling and the Defensibility Question
Here's the thing about TAR defensibility in litigation: opposing counsel and courts have become more sophisticated about it. "We used TAR" is no longer a sufficient answer. The question they're asking now is how you validated recall, and what your sampling methodology actually was.
Statistical sampling at 95% confidence with a margin of error of plus or minus 2% is the standard we hold the process to. That means pulling a random sample from the documents the model ranked as non-responsive and coding them to verify that responsive documents aren't hiding in the set you didn't review. At 95/2, you need roughly 2,400 documents in the validation sample. That's not optional.
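The roughly 2,400 figure falls out of the standard sample-size formula for estimating a proportion, with p = 0.5 as the most conservative assumption. A quick sketch:

```python
def validation_sample_size(z: float = 1.96, margin: float = 0.02, p: float = 0.5) -> float:
    """n = z^2 * p * (1 - p) / margin^2; z = 1.96 for 95% confidence,
    and p = 0.5 maximizes n (the most conservative choice)."""
    return z ** 2 * p * (1 - p) / margin ** 2

# validation_sample_size() -> ~2401 documents at 95% confidence, +/-2% margin
```

Strictly speaking, a finite population correction shrinks that number somewhat for smaller collections; the uncorrected figure is the conservative one.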
The documentation trail matters as much as the result. Courts want to see the seed set selection criteria, the iteration log, the validation sample size, and the recall rate at cutoff. Discovarc logs all of this automatically during the review process, which means the defensibility record is a byproduct of doing the work, not a separate documentation effort after the fact.
Audit trails, not heroics. The firms that run into trouble with TAR defensibility challenges aren't usually the ones who used bad methodology. They're the ones who used reasonable methodology but couldn't reconstruct it.
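What that reconstruction problem looks like in data terms: each training round can be captured as a small structured record. The schema below is illustrative only, not Discovarc's actual log format:

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json

@dataclass
class ReviewRoundRecord:
    """One entry in a hypothetical TAR audit log; field names are illustrative."""
    round_number: int
    docs_coded_this_round: int
    cumulative_docs_coded: int
    estimated_recall: float | None   # from validation sampling, when run this round
    model_cutoff_score: float        # ranking score below which documents are set aside
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

record = ReviewRoundRecord(round_number=4, docs_coded_this_round=200,
                           cumulative_docs_coded=1300, estimated_recall=0.87,
                           model_cutoff_score=0.42)
print(json.dumps(asdict(record), indent=2))
```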
Platform Integration and Workflow Fit
One of the friction points we hear about consistently is integration with existing review platforms. Litigation support firms aren't going to rip out Relativity or DISCO or Everlaw because a new tool has a better active learning engine. The new tool has to fit the existing workflow, not replace it.
Discovarc's active learning module integrates directly with Relativity, DISCO, Everlaw, and Reveal. The integration isn't cosmetic. Active learning rankings surface as a native field within the review platform, and coding decisions made by reviewers in the platform feed back into the model without requiring an export-import cycle. The model trains on live review data.
This matters more than it might seem. Every export-import cycle introduces lag between reviewer decisions and model updates. In an active learning workflow, lag means the model is training on stale data. Tight integration keeps the feedback loop tight, which keeps the recall curve improving at the rate the theory predicts.
Honest note on implementation: the first time a firm runs an active learning workflow, there's a calibration period. Reviewers need to trust the model's ranking enough to let it guide the sequencing of their work, which is a behavioral shift from traditional linear review. In our tracking, teams typically take two or three cases to fully internalize the workflow. After that, it becomes second nature. Fast, even.
Where TAR Fits in the Broader Review Strategy
Active learning TAR isn't a substitute for every stage of document review. It's a first-pass tool. It answers one question well: which documents in this collection are probably responsive and which probably aren't?
Privilege review still requires human judgment on each document. Near-deduplication and email threading happen before the active learning loop begins. And in complex cases with narrow legal theories, a well-constructed Boolean search still has a role to play as a pre-filter.
The firms getting the most value from active learning are the ones who treat it as a precision instrument with a defined scope, not a black box that handles everything. Use it to collapse the review universe. Then apply the appropriate tools to what's left.
For litigation support firms competing on turnaround and cost, that's the edge. Not a vague promise of AI efficiency, but a concrete, auditable method that consistently surfaces 85-90% of responsive documents by reviewing 30-35% of a collection. The numbers hold. The defensibility record holds. That's what matters.