Project Showcase: Fine-Tuning Llama 3.2

95.0% Parse Success Rate

Specialized Fine-Tuning (LoRA) of Llama 3.2-3B for high-precision, machine-parseable JSON extraction from unstructured documents.
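For readers unfamiliar with the setup, a LoRA run like this is typically configured in a few lines. The sketch below uses Hugging Face `peft`; the rank, alpha, and target modules are illustrative assumptions, not the project's actual hyperparameters.

```python
# Illustrative LoRA configuration with Hugging Face peft.
# ASSUMPTION: r/alpha/target_modules are representative values,
# not the values used in this project.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
lora = LoraConfig(
    r=16,                 # low adapter rank: well under 1% of weights trained
    lora_alpha=32,        # scaling factor applied to the adapter update
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # reports the trainable-parameter fraction
```

Training then proceeds on (document, JSON) pairs with a standard causal-LM objective; only the small adapter matrices are updated.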

Before (Baseline)
15.0%
After (Fine-Tuned)
95.0%
Key Efficiency
5.3x

Relative improvement in machine-parseability (from 15.0% to 95.0%)

Before vs. After Comparison

Baseline Model Output
"Here is the extraction for the invoice.
Note that tax was not listed.
```json
{
  "vendor": "Cloud Net",
  "inv_no": "CN-1",
  "total": 500.0,
  "date": "Jan 10 2024"
}
```
I hope this helps!"
Errors: markdown fences, prose prefix, non-ISO date, key-name mismatch (inv_no vs. invoice_number).
Fine-Tuned Model Output
{
  "vendor": "Cloud Net",
  "invoice_number": "CN-1",
  "date": "2024-01-10",
  "due_date": null,
  "currency": "USD",
  "subtotal": 500.0,
  "tax": null,
  "total": 500.0,
  "line_items": [
    {"description": "Service charge", "quantity": 1.0, "unit_price": 500.0}
  ]
}
Success: clean JSON, all mandatory keys present, ISO 8601 date format.
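The success criteria above can be checked programmatically. A minimal validator sketch follows; the mandatory-key set is an assumption inferred from the example output, not the project's published schema.

```python
import json
import re

# ASSUMPTION: mandatory keys inferred from the example output above.
MANDATORY_KEYS = {"vendor", "invoice_number", "date", "currency", "total"}
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def is_parse_success(output: str) -> bool:
    """True only for bare, schema-conforming JSON: no prose or
    markdown fences, all mandatory keys present, ISO-8601 date."""
    try:
        doc = json.loads(output)  # prose prefixes and ``` fences fail here
    except json.JSONDecodeError:
        return False
    if not isinstance(doc, dict) or not MANDATORY_KEYS <= doc.keys():
        return False  # catches key-name mismatches like "inv_no"
    date = doc["date"]
    return isinstance(date, str) and bool(ISO_DATE.match(date))

def parse_success_rate(outputs: list[str]) -> float:
    """Fraction of model outputs that pass the strict check."""
    return sum(map(is_parse_success, outputs)) / len(outputs)
```

Running every held-out model output through a check like this is what yields a single parse-success number such as the 15.0% vs. 95.0% reported above.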

Deep Failure Analysis

We didn't stop at success. We analyzed 5 specific edge cases where the fine-tuned model still failed (e.g., European number formats, nested tables). This quantitative rigor is what drives production-grade AI.

01
Thousands Separator Confusion
Misread '.' as a decimal point in European-formatted amounts (e.g., 1.500,00), where it is actually the thousands separator.
02
Multi-Page Continuity
Loss of context across page breaks in long purchase orders.
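The first failure mode comes down to locale ambiguity in number strings. A heuristic normalizer sketch like the one below illustrates the disambiguation logic; it is not the project's actual fix.

```python
def normalize_amount(raw: str) -> float:
    """Heuristically convert a localized amount string to a float.
    Handles European ('1.500,00') and US ('1,500.00') conventions."""
    s = raw.strip().replace(" ", "")
    if "," in s and "." in s:
        # Whichever separator appears last is the decimal point.
        if s.rfind(",") > s.rfind("."):
            s = s.replace(".", "").replace(",", ".")  # European format
        else:
            s = s.replace(",", "")                    # US format
    elif "," in s:
        # Lone comma: treat as decimal point only if exactly 2 digits follow.
        head, _, tail = s.rpartition(",")
        s = head.replace(".", "") + "." + tail if len(tail) == 2 else s.replace(",", "")
    return float(s)
```

A rule like this can serve as a post-processing guard, but ultimately ambiguous inputs (e.g., a lone '1.500') still require locale metadata to resolve.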

Prompting vs. Fine-Tuning

"While few-shot prompting can achieve high accuracy on simple tasks, fine-tuning remains the superior choice for production systems requiring extreme reliability, low latency, and consistent structural adherence across diverse layouts."

Read Full Analysis →