How AI Document Processing Actually Works in a Medical Practice

SimpleRef Team · 7 min read

A GP sends you a three-page referral letter as a scanned PDF. It’s slightly crooked, the handwriting in the margins is illegible, and the patient’s date of birth is buried in paragraph four. A human takes 5 minutes to extract the key details. AI takes 5 seconds — but is it right?

That’s the question every practice manager asks when they first hear about AI document processing. The honest answer: it depends on the document. But the technology has reached a point where it’s genuinely practical for specialist practices — not as a replacement for your team, but as a tool that handles the repetitive work so they can focus on the exceptions.

Here’s how it actually works, step by step.

Step 1: OCR — turning an image into text

When a referral arrives as a scanned PDF, a photograph, or a faxed document, the first challenge is basic: the computer needs to read it. That’s the job of Optical Character Recognition (OCR).

Modern OCR engines are remarkably good at reading typed text, even from mediocre scans. A referral letter printed from a GP’s practice management system — Best Practice, Medical Director, Genie — will be read accurately well over 95% of the time. The text comes out clean and structured.

Where OCR struggles: handwritten notes, stamps, annotations in margins, and documents scanned at an angle or with parts cut off. A GP who handwrites a referral letter (increasingly rare, but it happens) will produce text that OCR can partially read but will contain gaps and errors. The system needs to know that it doesn’t know — and that’s where confidence scoring comes in later.
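
To make this concrete, here's a minimal sketch of the OCR step using the open-source Tesseract engine via the pytesseract Python library. This isn't SimpleRef's actual pipeline (production systems typically use commercial OCR engines), but the shape is the same: an image goes in, and words come out with a per-word confidence attached.

```python
# A minimal OCR sketch using Tesseract via pytesseract (illustrative only).
# The engine returns each recognised word with a 0-100 confidence estimate.
from PIL import Image
import pytesseract

def ocr_page(path: str) -> list[tuple[str, float]]:
    """Read one scanned page and return (word, confidence) pairs."""
    image = Image.open(path)
    data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        if text.strip():  # skip empty layout tokens
            words.append((text, float(conf)))  # conf of -1 means "no estimate"
    return words
```

Even at this stage, the confidence numbers matter: a crisp printed word might score in the high 90s, while a smudged handwritten one scores far lower. Those low scores propagate forward and feed the confidence scoring described in step 4.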

Step 2: Text extraction — finding the signal in the noise

Once the document is converted to text, the AI has a wall of words. A typical referral letter contains the referring GP’s name and practice details, the patient’s demographics, a clinical history, the reason for referral, any relevant test results, and sometimes a preferred urgency level.

The AI’s job is to identify which parts of the text correspond to which fields. Patient name, date of birth, Medicare number, referring doctor, clinical summary, urgency — each field needs to be found and extracted.

This works through a combination of pattern recognition and language understanding. Medicare numbers follow a specific format (a 10-digit card number, often followed by a single-digit individual reference number). Dates of birth appear near labels like “DOB” or “Date of Birth.” The referring doctor’s name is usually in the letterhead or sign-off. The AI has been trained on thousands of referral letter formats and knows where to look.
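
As an illustration, here's roughly what the pattern-recognition half looks like. The regexes below are simplified examples, not an exhaustive implementation, and real systems layer language-model understanding on top of them.

```python
# Simplified field extraction via pattern matching (illustrative regexes).
import re

# Medicare card numbers are commonly printed as 4-5-1 digit groups,
# sometimes followed by a single-digit individual reference number.
MEDICARE_RE = re.compile(r"\b(\d{4}[ ]?\d{5}[ ]?\d{1,2})\b")
DOB_RE = re.compile(
    r"(?:DOB|D\.O\.B\.|Date of Birth)[:\s]*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})",
    re.IGNORECASE,
)

def extract_fields(text: str) -> dict[str, str | None]:
    """Pull candidate demographic fields out of raw OCR text."""
    medicare = MEDICARE_RE.search(text)
    dob = DOB_RE.search(text)
    return {
        "medicare_number": medicare.group(1) if medicare else None,
        "date_of_birth": dob.group(1) if dob else None,
    }
```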

Step 3: AI categorisation — understanding context

This is where modern AI goes beyond simple pattern matching. The system doesn’t just find text — it understands what it means in context.

For example, a referral letter might say “I would appreciate your opinion on this 67-year-old gentleman with progressive right knee pain, ?meniscal tear.” The AI needs to identify that “67-year-old” relates to age, “right knee pain” is the presenting complaint, “?meniscal tear” is a provisional diagnosis (the question mark indicating uncertainty), and that this is an orthopaedic referral.

It also categorises urgency. A letter that says “routine review at your convenience” is different from “urgent — patient has red flag symptoms, please see within 2 weeks.” The AI parses this language and assigns a priority level. The Australian Digital Health Agency has been working on standardised clinical document formats, but in practice, referral letters come in every format imaginable, so the AI needs to be flexible.
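
Production systems make this call with a language model reading the whole letter. As a deliberately crude illustration of the kind of signal being looked for, a keyword pass might look like this, with anything ambiguous left for a human:

```python
# A deliberately crude urgency heuristic (illustrative only).
# Real categorisation uses language understanding over the whole letter,
# not keyword matching; the cue lists here are assumptions.
URGENT_CUES = ("urgent", "red flag", "please see within", "same day")
ROUTINE_CUES = ("routine", "at your convenience", "non-urgent")

def categorise_urgency(letter_text: str) -> str:
    text = letter_text.lower()
    if any(cue in text for cue in URGENT_CUES):
        return "urgent"
    if any(cue in text for cue in ROUTINE_CUES):
        return "routine"
    return "unclassified"  # ambiguous letters go to human review
```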

Step 4: Confidence scoring — knowing what it doesn’t know

This is the step that separates useful AI from dangerous AI. Every extracted field gets a confidence score — a percentage indicating how certain the system is about the extraction.

A clearly printed Medicare number in a standard location might get 98% confidence. A date of birth that appears twice in the letter with the same value: 99%. A patient’s surname that’s partially obscured by a fold in the scanned page: 62%.

High-confidence fields (typically above 90%) are auto-populated. They go straight into the patient record, pre-filled and ready for a quick visual check. Low-confidence fields are flagged for human review — the system highlights them, shows the original source text, and asks a staff member to verify or correct.
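
In code, the routing decision itself is simple; the hard part is producing honest confidence numbers in the first place. A minimal sketch, assuming a 90% threshold and an illustrative ExtractedField type (not a real API):

```python
# Confidence-based routing (minimal sketch; names are illustrative).
from dataclasses import dataclass

AUTO_FILL_THRESHOLD = 0.90  # fields at or above this go straight into the record

@dataclass
class ExtractedField:
    name: str          # e.g. "medicare_number"
    value: str         # what the model extracted
    confidence: float  # 0.0-1.0, from the extraction model
    source_text: str   # original snippet, shown to the reviewer

def route_field(field: ExtractedField) -> str:
    """Decide whether a field is auto-populated or flagged for review."""
    if field.confidence >= AUTO_FILL_THRESHOLD:
        return "auto_populate"
    return "flag_for_review"  # reviewer sees value and source_text side by side
```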

This is how SimpleRef’s DocBot works. It doesn’t pretend to be infallible. It processes what it can with high certainty, and explicitly asks for help on the rest. The goal is to handle 70-80% of the data entry automatically and make the remaining 20-30% faster by showing exactly where to look.

What AI handles well

Typed referral letters from GP software are the sweet spot. These follow predictable formats, use standard fonts, and include structured fields. The AI reads them accurately and quickly.

Pathology reports are similarly well-suited. They’re typically generated by laboratory information systems with consistent formatting. Patient identifiers, test names, result values, and reference ranges are all in predictable positions.

Standard referral forms — the structured templates that some hospitals and GP practices use — are easiest of all. Fields are labelled, positions are fixed, and the AI barely needs to interpret anything.

What AI struggles with

Handwritten notes remain the hardest challenge. A GP who scrawls a referral on a prescription pad is producing text that even another human struggles to read. AI can attempt it, but confidence scores will be low across the board.

Poor scan quality degrades everything downstream. If the OCR can’t read the text, the AI has nothing to work with. Documents that are scanned at 75 DPI, photographed with a phone camera in poor lighting, or faxed through three machines will produce inferior results.

Clinical nuance is genuinely difficult. “Query lymphoma” means something different from “confirmed lymphoma.” “Previous history of diabetes” is background context, not a current referral reason. AI is getting better at these distinctions, but it’s not at specialist-clinician level — and it shouldn’t need to be. The clinician reviews the clinical content. The AI handles the demographic data entry.

Unusual letter formats — referrals from overseas, letters from allied health practitioners, or correspondence that combines multiple patients — can confuse the extraction pipeline. These are edge cases, but they exist in every practice.

The human review queue

A well-designed system doesn’t just extract data — it creates a workflow for reviewing what was extracted. Staff see a queue of incoming referrals with their extracted data. High-confidence items are pre-filled. Low-confidence items are highlighted in yellow or orange. Missing fields are marked in red.

The reviewer’s job shifts from “type everything from scratch” to “check what the AI found and fix what it got wrong.” In practice, this cuts referral intake time by 60-70% for most documents. The worst-case scenario — a handwritten, poorly scanned letter — still requires manual entry, but that’s now the exception rather than the rule.
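
The triage logic behind those highlights is straightforward. A sketch, with the thresholds and the yellow/orange split as assumptions:

```python
# Review-queue highlighting (sketch; thresholds and colour bands assumed).
def triage_colour(confidence: float | None) -> str:
    """Map a field's extraction confidence to the highlight a reviewer sees."""
    if confidence is None:
        return "red"     # field missing entirely, needs manual entry
    if confidence >= 0.90:
        return "none"    # pre-filled, quick visual check only
    if confidence >= 0.70:
        return "yellow"  # probably right, worth a glance
    return "orange"      # likely wrong, check against the source text
```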

Is it safe for medical data?

A reasonable question. AI document processing in a medical context needs to meet privacy obligations under the Privacy Act 1988 and the Australian Privacy Principles. Key requirements: patient data must be processed in a secure environment, not stored or used for training by third-party AI providers, and accessible only to authorised staff.

The processing happens server-side, not on a third-party consumer AI platform. Documents are processed, data is extracted, and the original document is stored securely within the practice’s system. No patient data is sent to ChatGPT or any public AI service.

The practical bottom line

AI document processing doesn’t replace your team — it gives them back the hours they were spending on data entry. A practice processing 40 referrals a week at 5 minutes each is spending over three hours on pure data transcription. Cut that by 70% and you’ve recovered more than two hours a week — time that goes back into patient contact, follow-ups, and the work that actually matters.
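
The back-of-envelope arithmetic, if you want to run your own numbers:

```python
# Weekly time saved from automating referral data entry (worked example).
referrals_per_week = 40
minutes_per_referral = 5
total = referrals_per_week * minutes_per_referral  # 200 minutes, about 3.3 hours
saved = total * 0.70                               # 140 minutes, about 2.3 hours
print(f"{total / 60:.1f} h spent on data entry, {saved / 60:.1f} h recovered")
```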

If you want to see how this works in practice, SimpleRef’s DocBot feature processes referral letters, pathology reports, and clinical correspondence with the confidence-scoring approach described above. You can check our pricing to see what’s included, or reach out to discuss whether it’s a fit for your practice.

Start with a stack of 20 typical referral letters. Run them through. See what the AI catches and what it misses. That’s the only evaluation that matters.

Stop losing referrals. Start tracking them.

SimpleRef helps Australian specialist practices track every referral from GP letter to patient appointment.

14-day free trial. No credit card required.