Building a Resilient AI Credentialing Assistant
How I navigated model deprecations, encoding hurdles, and hybrid RAG architectures to build a production-ready healthcare prototype.
In the complex world of healthcare, verifying provider credentials shouldn't be a manual bottleneck. I set out to build an AI assistant that could:
- Extract data from medical licenses using multimodal LLMs.
- Verify NPI data against federal registries in real time.
- Analyze risk using a custom RAG (Retrieval-Augmented Generation) pipeline fueled by state medical board disciplinary records.
The War Room Lessons
The path to a functional demo on Hugging Face was filled with real-world engineering challenges that separate a "local script" from a "deployed product."
1. The Moving Target of Model IDs
I started with Gemini 2.0, but as the API evolved, I encountered errors.
- The Learning: Model aliases like gemini-flash-latest are convenient but can be unstable in production. By pivoting to Gemini 3.1 Flash-Lite, I leveraged the latest sub-second latency models optimized for high-volume data extraction.
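One way to reduce exposure to alias drift is to keep an explicit, ordered list of pinned model IDs and fall through them when a call fails. The sketch below is an assumption about how that could look, not the prototype's actual code: `generate_with_fallback`, `call_model`, and the placeholder model IDs are hypothetical, and `call_model` stands in for whatever SDK call the app really makes.

```python
from typing import Callable, List, Sequence, Tuple


class AllModelsFailed(RuntimeError):
    """Raised when every candidate model ID has been tried and failed."""


def generate_with_fallback(
    call_model: Callable[[str, str], str],
    prompt: str,
    model_ids: Sequence[str],
) -> Tuple[str, str]:
    """Try each pinned model ID in order; return (model_id, response_text)."""
    errors: List[str] = []
    for model_id in model_ids:
        try:
            return model_id, call_model(model_id, prompt)
        except Exception as exc:  # broad on purpose: SDK error types vary
            errors.append(f"{model_id}: {exc}")
    raise AllModelsFailed("; ".join(errors))


# Demo with a fake client so the sketch runs offline.
# Both model IDs are placeholders, not real Gemini identifiers.
def fake_call(model_id: str, prompt: str) -> str:
    if model_id == "retired-model-id":
        raise ValueError("model not found")
    return f"[{model_id}] extracted fields for: {prompt}"


used_model, text = generate_with_fallback(
    fake_call,
    "license scan",
    ["retired-model-id", "pinned-model-id"],
)
print(used_model)
```

Returning the model ID alongside the response makes it easy to log which model actually served a request, which helps when a deprecation silently shifts traffic to the fallback.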
2. Data Ingestion & The "Invisible" UTF-8 Bug
A CSV isn't always just a CSV. My RAG pipeline initially crashed because of encoding mismatches from state board exports.
- The Learning: Always implement defensive ingestion. Switching to encoding='utf-8-sig' in Pandas allowed the system to handle Excel’s Byte Order Marks (BOM) without crashing the pipeline.
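To see why the BOM matters, here is a small standard-library illustration of the mechanism. It is a sketch with made-up column names and data, not the prototype's ingestion code: decoding a BOM-prefixed file as plain UTF-8 leaves an invisible `\ufeff` character glued to the first header, while `utf-8-sig` strips it.

```python
import csv
import io
from typing import Dict, List

# Bytes as a spreadsheet export might write them: UTF-8 with a leading BOM.
# The columns and the row are invented for illustration.
raw = "license_number,status\nA12345,Active\n".encode("utf-8-sig")


def read_rows(data: bytes, encoding: str) -> List[Dict[str, str]]:
    """Decode the bytes with the given encoding and parse them as CSV."""
    text = data.decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


plain = read_rows(raw, "utf-8")      # BOM survives as part of the first header
safe = read_rows(raw, "utf-8-sig")   # BOM is stripped during decoding

print(list(plain[0].keys()))
print(list(safe[0].keys()))
```

With plain UTF-8 a lookup such as `row["license_number"]` fails because the key is really `"\ufefflicense_number"`. Passing `encoding="utf-8-sig"` to `pandas.read_csv` applies the same BOM-stripping codec.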
3. Hybrid RAG: Local Embeddings vs. Cloud APIs
When the cloud embedding API hit regional restrictions, I pivoted to a hybrid architecture.
- The Learning: I integrated langchain-huggingface to run Sentence-Transformer embeddings locally on the server. This decoupled our semantic search from the Google API, ensuring the "Risk Intelligence" layer remained operational regardless of external API stability.
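The decoupling can be pictured as retrieval code that depends only on an "embed these texts" function, so the embedding backend can be swapped without touching the search logic. The sketch below is a simplified assumption, not the deployed pipeline: `top_matches` and `keyword_embed` are hypothetical names, the records are invented, and the keyword-count embedder is a toy stand-in so the example runs without downloading a model.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Any function mapping texts to vectors qualifies, local or cloud.
Embedder = Callable[[Sequence[str]], List[List[float]]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_matches(
    query: str, records: Sequence[str], embed: Embedder, k: int = 3
) -> List[Tuple[str, float]]:
    """Rank records by similarity to the query using the supplied embedder."""
    vectors = embed([query, *records])
    query_vec, record_vecs = vectors[0], vectors[1:]
    scored = [(rec, cosine(query_vec, vec)) for rec, vec in zip(records, record_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy stand-in embedder (keyword counts); NOT a real embedding model.
VOCAB = ["suspended", "probation", "fine"]


def keyword_embed(texts: Sequence[str]) -> List[List[float]]:
    return [[float(t.lower().count(word)) for word in VOCAB] for t in texts]


records = [
    "License suspended for 90 days",
    "Administrative fine issued",
    "Placed on probation",
]
results = top_matches("suspended license", records, keyword_embed, k=1)
print(results[0][0])
```

In a setup like the one described, the `embed` argument could plausibly be `HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2").embed_documents` from langchain-huggingface, with ChromaDB doing the storage and ranking instead of the in-memory loop shown here.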
Technical Stack
- Frontend: Streamlit
- LLM: Gemini 3.1 Flash-Lite (via Google GenAI SDK)
- Vector DB: ChromaDB
- Embeddings: all-MiniLM-L6-v2 (Local Transformers)
- Deployment: Hugging Face Spaces

Link to the Prototype:
https://huggingface.co/spaces/pilayar/CredentialingPrototype