From Messy Loan Applications to AI Voice-Powered Forms

In lending, the small details matter. I’ve seen how a single inconsistency in a loan application can throw everything off. One borrower might say they make $1,000 a week, another might write $52,000 a year, and a third might type $4,333 per month.
From a human perspective, all three mean the same thing. But for a loan processor — or for the system calculating Debt-to-Income (DTI) ratios — those variations can lead to errors, delays, and extra back-and-forth.
That’s the problem I wanted to solve: how do we take the way people naturally express themselves and translate it into clean, structured data that systems can trust?
The answer wasn’t another field on a form or a stricter validation rule. It was rethinking the experience entirely.
What if applicants didn’t have to think about the “right” way to type their income?
What if they could simply say it out loud — and the system would understand, normalize, and calculate it correctly?
That’s the vision that led me to build a Voice-to-Form demo. It goes beyond simple speech-to-text. This is about using AI flows and schemas to turn human conversation into structured, reliable data — the kind of data that makes a DTI calculation accurate on the first try.
The Spark: Rethinking Forms Through a Product Lens
As a Product Manager, I’ve learned that friction in a process isn’t always about the interface itself — it’s about the assumptions we make. Traditional forms assume users will adapt to the system’s rules: pick the right frequency, type numbers in the right format, avoid typos. But that’s not how people think or talk.
The spark for this demo came when I asked myself a simple question: why should the burden be on the user to adapt, when the system could adapt to them instead?
That’s where Voice-to-Form comes in. Instead of fighting through dropdowns, applicants can just explain their situation in their own words:
“I work at the hospital, I earn about $1,000 every week, and I also pick up shifts that bring in another $500 a month.”
The system’s job is no longer just capturing text. It needs to:
- Extract the key details (employer, job title, income streams).
- Normalize them into consistent units.
- Feed them into the right fields so calculations like DTI are accurate.
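As a sketch, the spoken example above might come back as a structure like this. The field names and types here are illustrative, not the demo’s actual schema:

```typescript
// Hypothetical shape for extracted income data (names are assumptions).
interface IncomeSource {
  description: string;
  amount: number;                            // amount as spoken
  frequency: "weekly" | "monthly" | "annual"; // normalized unit
}

interface IncomeInfo {
  employer: string;
  jobTitle?: string;
  incomeSources: IncomeSource[];
}

// "I work at the hospital, I earn about $1,000 every week,
//  and I also pick up shifts that bring in another $500 a month."
const extracted: IncomeInfo = {
  employer: "hospital",
  incomeSources: [
    { description: "base pay", amount: 1000, frequency: "weekly" },
    { description: "extra shifts", amount: 500, frequency: "monthly" },
  ],
};
```

Once income lives in a shape like this, the downstream DTI math never has to parse free text again.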
By shifting the lens from “how do we make a better form?” to “how do we let people speak naturally and still get clean data?”, I realized the real product opportunity wasn’t just in user experience — it was in data transformation.
From Speech to Structure: Where AI Makes the Difference
Voice-to-Form isn’t about replacing typing with talking. A simple speech-to-text engine could do that — but it would still leave us with the same messy data. The real challenge is translating free-form human speech into structured, validated information that downstream systems can trust.
That’s where AI flows and schemas come in. Instead of dumping raw transcription into fields, I designed the system to:
- Listen and transcribe speech in real time.
- Extract entities like employer, job title, and income amounts.
- Normalize values into consistent units — for example:
  - “Daily” income is scaled up to a weekly standard.
  - “Bi-weekly” paychecks are mapped to a monthly equivalent.
  - Percentages (like bonuses) are calculated against base salaries.
- Validate outputs against schemas to make sure the data is complete, accurate, and usable.
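The normalization step can be sketched as a single conversion table. The demo normalizes stepwise (daily to weekly, bi-weekly to monthly), but since DTI ultimately needs monthly figures, this sketch collapses everything straight to a monthly amount; the conversion factors are my assumptions:

```typescript
type Frequency = "daily" | "weekly" | "biweekly" | "monthly" | "annual";

// Assumed factors: ~5 working days/week, 52 weeks/year, 26 bi-weekly
// paychecks/year. DTI is computed on monthly income and obligations,
// so everything converts to a monthly amount.
const PERIODS_PER_MONTH: Record<Frequency, number> = {
  daily: (5 * 52) / 12,
  weekly: 52 / 12,
  biweekly: 26 / 12,
  monthly: 1,
  annual: 1 / 12,
};

function toMonthly(amount: number, frequency: Frequency): number {
  return amount * PERIODS_PER_MONTH[frequency];
}

// $52,000/year and $1,000/week land on the same figure (~$4,333.33):
toMonthly(52000, "annual");
toMonthly(1000, "weekly");
```

This is exactly why the three borrowers from the introduction stop being three different data points: after normalization they are the same number.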
The difference is subtle but powerful. Instead of asking the user to fit into the rigid structure of a form, the system flexes to meet the user where they are — in conversation.
From a product perspective, this unlocks two key outcomes:
- Clarity for users → they speak naturally and feel understood.
- Confidence for the business → the data comes back structured, ready for calculations like DTI without manual cleanup.
This is the step where AI shifts from a convenience feature to a core enabler of accuracy and scale.
Building the Demo: The Voice-to-Form Prototype
Turning this idea into reality meant balancing two worlds: the user-facing experience (which needed to feel natural and conversational) and the system-facing architecture (which had to be precise and dependable).
Where I Started — One Big AI Flow
My first instinct was to build a single, end-to-end AI flow that could handle everything: personal details, income, debts, you name it. On paper, it felt elegant. In practice, it became brittle:
- The flow often got confused between steps.
- Income fields would get overwritten by unrelated data.
- Debugging was nearly impossible — when something went wrong, I had no clear way to pinpoint where the failure started.
This was my first big lesson: a monolithic AI flow is a single point of failure.
The Shift to Modular AI Flows
I re-architected the system around specialized flows, each with one responsibility:
- `personalInfoFlow` → Name, email, phone.
- `incomeInfoFlow` → Employer, job title, income sources.
- `debtInfoFlow` → Types of debts and monthly obligations.
Each flow was much simpler, more reliable, and easier to test in isolation. And if one failed, it didn’t take the whole process down. The frontend became the “router,” deciding which flow to call based on the user’s progress.
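That “router” role can be sketched as a small function. The flow names come from above; the progress flags are hypothetical stand-ins for however the real frontend tracks completion:

```typescript
type FlowName = "personalInfoFlow" | "incomeInfoFlow" | "debtInfoFlow";

// Hypothetical completion flags the frontend might track.
interface FormProgress {
  personalComplete: boolean;
  incomeComplete: boolean;
}

// The frontend picks the next specialized flow based on which
// sections the user has already filled in.
function nextFlow(progress: FormProgress): FlowName {
  if (!progress.personalComplete) return "personalInfoFlow";
  if (!progress.incomeComplete) return "incomeInfoFlow";
  return "debtInfoFlow";
}
```

Keeping this decision in plain frontend code — rather than inside one giant AI prompt — is what made failures traceable: when income data went wrong, only `incomeInfoFlow` was a suspect.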
Schemas as Guardrails
Every flow output was validated against a Zod schema. This was critical. For example, in the income flow:
- `amount` must be a valid number.
- `frequency` must be normalized to weekly, monthly, or annual.
- `employer` must be a string.
If the AI produced something invalid, it was caught immediately before reaching the form. In product terms, schemas acted as guardrails, turning unpredictable AI output into data the system could trust.
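The demo enforces this contract with Zod; as a dependency-free sketch, the same guardrail for the income flow looks roughly like a type guard (field names assumed):

```typescript
type Frequency = "weekly" | "monthly" | "annual";

interface IncomeOutput {
  employer: string;
  amount: number;
  frequency: Frequency;
}

// Hand-rolled stand-in for the Zod schema: rejects any AI output
// that doesn't match the contract before it reaches the form.
function isIncomeOutput(value: unknown): value is IncomeOutput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.employer === "string" &&
    typeof v.amount === "number" &&
    Number.isFinite(v.amount) &&
    (v.frequency === "weekly" ||
      v.frequency === "monthly" ||
      v.frequency === "annual")
  );
}

// Malformed AI output fails the guard instead of landing in a field:
isIncomeOutput({ employer: "hospital", amount: "a thousand", frequency: "weekly" }); // false
```

Zod expresses the same contract declaratively (`z.object`, `z.number`, `z.enum`) and reports *which* field failed, which is what makes debugging a modular flow so much easier than debugging a monolith.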
State Management in the UI
The React component did more than just display results — it orchestrated them. Instead of overwriting everything, it merged new AI results with existing form state:
- Simple fields (like name or email) updated only if new valid data was found.
- Lists (like income sources or debts) appended incrementally, so users could build their profile conversationally.
This ensured users always had control, while still benefiting from automation.
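A minimal sketch of that merge policy, with assumed field names — scalars are only overwritten by new, non-empty values, while lists append:

```typescript
interface FormState {
  name?: string;
  email?: string;
  incomeSources: { amount: number; frequency: string }[];
}

// Merge a new AI result into existing form state without clobbering
// what the user has already built up.
function mergeResult(prev: FormState, next: Partial<FormState>): FormState {
  return {
    // Simple fields: keep the old value unless the AI found a new one.
    name: next.name?.trim() ? next.name : prev.name,
    email: next.email?.trim() ? next.email : prev.email,
    // Lists: append, so each utterance adds to the profile.
    incomeSources: [...prev.incomeSources, ...(next.incomeSources ?? [])],
  };
}
```

In the React component this ran inside the state updater (e.g. `setForm(prev => mergeResult(prev, aiResult))`), so two quick utterances never raced each other into a lost update.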
Why This Matters
For me, the demo wasn’t just about showcasing a cool interaction. It was about proving a principle: AI can’t just “sound smart” — it has to produce data that systems can trust.
By making conscious architecture decisions — modular flows, schema validation, state-aware UI — I built something that felt conversational on the surface but was engineered for clarity, traceability, and accuracy underneath.
That balance is exactly where Product Management shines: connecting the human experience to the technical reality.
Why This Matters for Product Leaders
At first glance, Voice-to-Form might look like a UX experiment — a way to make forms more engaging. But underneath, it highlights some of the most important principles of product management:
1. Reduce Friction to Drive Outcomes
Every unnecessary click or confusing field is an opportunity for users to abandon the process. By letting people speak naturally, we remove that friction. The outcome?
- Higher completion rates
- Faster submissions
- Better customer satisfaction
2. Improve Accuracy at the Source
Bad data compounds as it moves through a system. Incorrect income inputs don’t just cause frustration — they can break downstream calculations, like DTI, that determine whether someone qualifies for a loan. With schema-driven AI, we catch and correct errors at the source, before they become operational problems.
3. Modern Experiences Differentiate
In industries like finance, healthcare, and insurance, forms are everywhere — and they all feel the same. Voice-to-Form turns a commodity interaction into something modern and memorable. That differentiation matters in competitive markets.
4. The Bigger Picture: Beyond Lending
While my demo focused on loan applications, the same principles apply anywhere forms create friction:
- Healthcare: Patients describing symptoms instead of clicking boxes.
- Insurance: Policyholders explaining claims in plain language.
- Government services: Citizens applying for benefits without navigating complex portals.
The product lesson is clear: when we design systems that adapt to people, instead of forcing people to adapt to systems, we create both better user experiences and stronger business outcomes.
Conclusion & Call to Action
Building the Voice-to-Form demo was more than just an experiment in AI — it was a reminder of what great product management is all about:
- Spotting real-world problems (like inconsistent income entries).
- Reframing them as opportunities to improve both user experience and data quality.
- Designing with intention — modular flows, schemas, and state management — to turn messy human input into structured, reliable outcomes.
The future of digital interaction won’t be defined by more fields, stricter validations, or longer forms. It will be defined by interfaces that understand us, adapt to our natural behaviors, and make complex processes feel effortless.
Voice-to-Form is just one example of how AI can bridge that gap. But the opportunity goes far beyond lending.
If you’re thinking about how conversational AI could transform your product’s user experience, I’d love to hear your ideas. Let’s explore how we can take them from concept to reality.