Trustworthy Document AI: From Good Enough to Mission-Critical
Speaker: Haixun Wang – ISSAQUAH, WA, United States
Topic(s): Artificial Intelligence, Machine Learning, Computer Vision, Natural language processing, Information Systems, Search, Information Retrieval, Database Systems, Data Mining, Data Science, Applied Computing
Abstract
Generative AI can now tackle what once seemed impossible: parsing dense contracts, deciphering physician notes, and interpreting photographed invoices with surprising fluency. Yet in domains like law, medicine, and finance, “pretty good” still isn’t good enough. This talk reframes Document AI (the use of generative systems for document understanding) as an end-to-end product challenge, not just a modeling exercise. We’ll show that dependable performance comes less from scaling LLMs and more from sophisticated fine-tuning methods such as reinforcement learning with automated reward signals, combined with principled evaluation, structured workflows, and real-world accountability. By grounding outputs in QA-driven metrics, Document AI can self-improve at scale while replacing vague benchmarks with measurable guarantees. We’ll also explore how agentic systems, in which schema-driven tasks are decomposed into orchestrated micro-agents, offer a more transparent, interpretable, and cost-effective path than monolithic black-box models. Across use cases and modalities, from legal extraction to conversational voice agents, we’ll argue that Document AI must evolve into true infrastructure: modular, measurable, and mission-ready.
About this Lecture
Number of Slides: 50
Duration: 30 or 60 minutes
Languages Available: English
Last Updated:
Request this Lecture
To request this particular lecture, please complete this online form.
Request a Tour
To request a tour with this speaker, please complete this online form.
All requests will be sent to ACM headquarters for review.