Building a Resilient AI Credentialing Assistant

How I navigated model deprecations, encoding hurdles, and hybrid RAG architectures to build a production-ready healthcare prototype.

In the complex world of healthcare, verifying provider credentials shouldn't be a manual bottleneck. I set out to build an AI assistant that could:

  1. Extract data from medical licenses using multimodal LLMs.
  2. Verify NPI data against federal registries in real time.
  3. Analyze risk using a custom RAG (Retrieval-Augmented Generation) pipeline fueled by state medical board disciplinary records.

The War Room Lessons

The path to a functional demo on Hugging Face was filled with real-world engineering challenges that separate a "local script" from a "deployed product."

1. The Moving Target of Model IDs

I started with Gemini 2.0, but as the API evolved and models were deprecated, calls that had worked began returning errors.

  • The Learning: Model aliases like gemini-flash-latest are convenient but can be unstable in production, because the model behind an alias can change underneath you. The safer choice is an explicit model ID. I moved to Gemini 3.1 Flash-Lite, a sub-second latency model optimized for high-volume data extraction.
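One way to reduce exposure to retired or renamed model IDs is to keep an explicit, ordered list of IDs and fall through to the next one when a call fails. The sketch below is an illustration under assumptions, not the code from the prototype: generate_with_fallback and call_gemini are hypothetical names, and the model IDs you pass in should be whatever your account actually lists. The call_gemini helper assumes the google-genai package is installed and an API key is set in the environment.

```python
from typing import Callable, Sequence


def generate_with_fallback(
    prompt: str,
    model_ids: Sequence[str],
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model ID in order and return (model_id, response_text)."""
    errors = []
    for model_id in model_ids:
        try:
            return model_id, call_model(model_id, prompt)
        except Exception as exc:  # broad on purpose: any failure moves to the next ID
            errors.append(f"{model_id}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def call_gemini(model_id: str, prompt: str) -> str:
    """Hypothetical adapter for the Google GenAI SDK (imported lazily)."""
    from google import genai

    client = genai.Client()  # picks up the API key from the environment
    response = client.models.generate_content(model=model_id, contents=prompt)
    return response.text
```

Because the model call is injected, the fallback logic can be exercised with a stub in tests, and the list of IDs can live in configuration rather than in code.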

2. Data Ingestion & The "Invisible" UTF-8 Bug

A CSV isn't always just a CSV. My RAG pipeline initially crashed because of encoding mismatches from state board exports.

  • The Learning: Always implement defensive ingestion. Passing encoding='utf-8-sig' to pandas' read_csv strips the UTF-8 byte order mark (BOM) that Excel prepends to CSV exports, so the file loads cleanly instead of crashing the pipeline.
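To make the BOM problem concrete, here is a minimal sketch using only the standard library; the same encoding name can be passed to pandas.read_csv. The function name read_board_export and the cp1252 fallback are assumptions for illustration, not details from the prototype.

```python
import csv
import io


def read_board_export(raw: bytes, encodings=("utf-8-sig", "cp1252")):
    """Decode raw CSV bytes, trying each encoding in turn, and return rows as dicts.

    "utf-8-sig" reads plain UTF-8 and also strips a leading BOM if one is present.
    The cp1252 fallback is an assumption about what older exports might use.
    """
    last_error = None
    for enc in encodings:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError as exc:
            last_error = exc
            continue
        return list(csv.DictReader(io.StringIO(text, newline="")))
    raise ValueError(f"could not decode export with any of {encodings}") from last_error


# A BOM-prefixed export, as Excel writes for "CSV UTF-8".
sample = "\ufeffname,license\nDr. A,123\n".encode("utf-8")
rows = read_board_export(sample)
print(list(rows[0].keys()))
```

Decoding the same bytes as plain "utf-8" leaves the BOM glued to the first header, so a lookup by the column name "name" fails even though the file looks fine in a text editor.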

3. Hybrid RAG: Local Embeddings vs. Cloud APIs

When the cloud embedding API hit regional restrictions, I pivoted to a hybrid architecture.

  • The Learning: I integrated langchain-huggingface to run Sentence-Transformers embeddings locally on the server. This decoupled the semantic search from the Google API, so the "Risk Intelligence" layer stays operational regardless of external API stability.
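The decoupling can be sketched by having retrieval depend only on an embedding interface, so the provider can be swapped without touching the search code. This is an illustration under assumptions: Embedder, load_local_embedder, and top_k are hypothetical names, and a plain in-memory cosine ranking stands in for ChromaDB to keep the example small. The local loader assumes langchain-huggingface and sentence-transformers are installed.

```python
import math
from typing import Protocol


class Embedder(Protocol):
    """The two methods LangChain embedding classes expose."""

    def embed_documents(self, texts: list[str]) -> list[list[float]]: ...
    def embed_query(self, text: str) -> list[float]: ...


def load_local_embedder() -> Embedder:
    """Load all-MiniLM-L6-v2 locally (lazy import; no cloud embedding API involved)."""
    from langchain_huggingface import HuggingFaceEmbeddings

    return HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # zero vectors score 0 instead of dividing by zero


def top_k(query: str, docs: list[str], embedder: Embedder, k: int = 3):
    """Return the k (document, score) pairs most similar to the query."""
    doc_vecs = embedder.embed_documents(docs)
    query_vec = embedder.embed_query(query)
    scored = [(doc, cosine(query_vec, vec)) for doc, vec in zip(docs, doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

In a deployment like the one described here, the same embedder object would be handed to the vector store rather than to top_k; the point of the sketch is only that nothing in the retrieval path needs to know where the vectors come from.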

Technical Stack

  • Frontend: Streamlit
  • LLM: Gemini 3.1 Flash-Lite (via Google GenAI SDK)
  • Vector DB: ChromaDB
  • Embeddings: all-MiniLM-L6-v2 (run locally via Sentence-Transformers)
  • Deployment: Hugging Face Spaces

Live demo: Credentialing Prototype AI, a Hugging Face Space by pilayar
https://huggingface.co/spaces/pilayar/CredentialingPrototype
