Teaching NeetoCal to answer questions

Notes from building an AI assistant that reads your calendar, and why the model behind it never touches the database.

"How many yoga classes are booked today?"

Neeti answered straight away: 23. Confident, and wrong. The real number was 62.

The interesting part is the kind of wrong it was. You've probably heard that AI tools sometimes make things up. They'll state something with total confidence that simply isn't true: a fact that was never real, a quote nobody said, a number pulled from nowhere. That's the failure most people brace for, and it's the one I was watching for too. This was the opposite. Neeti hadn't made anything up. The 23 was a real count of real bookings. It had just counted part of the picture and reported that, with no idea the rest was there.

Neeti is an AI assistant I built into NeetoCal over the last few weeks, and that gap, between what it could see and what was actually true, is most of what this post is about. First, though, what Neeti is and how it works, because the bug only makes sense once you can see the machine behind it. You don't need to know anything about AI or databases to follow along. I'll explain as I go.

What Neeti is

NeetoCal is a scheduling product. You share a link, people pick a time, and it lands on your calendar, much like Calendly. If you run a busy calendar, or a team of them, you end up with questions about it all day long. How many bookings do I have today? Who on my team is free this afternoon? Did anyone get double-booked? How many of last month's calls actually happened?

You could answer all of that by clicking through screens and filters. Neeti lets you just ask.

You open a small panel from the sidebar, type your question in plain English, and it answers. The reply streams in as it's written, so you're reading along instead of staring at a spinner. Ask "how many bookings today," get a number. Ask "who's free at 3pm," get names. Ask for something big and it hands you a table you can download as a spreadsheet.

It's for admins only, by design. The people who can ask Neeti about an organisation's calendar are the same people who could already see all of it.

How it works

Here's the one idea that shapes everything else: the AI never touches our database.

That sounds backwards. The whole point is to answer questions about data that lives in a database. So let me explain what it does instead.

An AI model on its own (Neeti runs on Google's Gemini) is just very good at predicting text. It can't look anything up. It doesn't know how many bookings you have today any more than a stranger who has never seen your calendar does. To make it useful, you give it tools.

A tool is a small, pre-written function the model is allowed to call. I wrote a handful of them for Neeti:

  • count_bookings, to count bookings matching some filter
  • find_bookings, to fetch a sample of bookings
  • find_available_slots, to find open time on a calendar
  • check_booking_conflicts, to look for overlaps
  • search_hosts, to look up team members

When Neeti needs to know something, it doesn't write a database query. It calls one of these tools. The tool runs a safe query that I wrote and reviewed, and hands back the result. The model reads that result and decides what to do next.

So a single question turns into a small loop:

Question:            "how many yoga bookings today?"
 
Neeti picks a tool:  count_bookings(meeting: "yoga", when: "today")
Tool returns:        62
Neeti writes:        "You have 62 yoga bookings today."

Sometimes it takes more than one step. "Is my morning free, and if not, who's booked" might call one tool to check for conflicts, read the answer, then call another to look up the people involved. The model decides the order. My job was to give it good tools and let it choose.
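The loop above can be sketched in a few lines of Python. This is an illustration under assumptions, not the real implementation: scripted_model is a hard-coded stand-in for the actual model call, the tool returns a fixed 62, and the step format (a dict with either "tool" and "args" or "answer") is invented here. The loop also keeps a list of each call and its result as it goes.

```python
from typing import Callable


def count_bookings(meeting: str, when: str) -> int:
    """Stand-in tool. A real one would run a reviewed, read-only query."""
    return 62


TOOLS: dict[str, Callable] = {"count_bookings": count_bookings}


def scripted_model(question: str, results: list) -> dict:
    """Stand-in for the model: request one tool call, then write the answer."""
    if not results:
        return {
            "tool": "count_bookings",
            "args": {"meeting": "yoga", "when": "today"},
        }
    return {"answer": f"You have {results[-1]} yoga bookings today."}


def answer(question: str, model=scripted_model, max_steps: int = 5):
    """Let the model pick tools until it produces an answer or runs out of steps."""
    trace = []    # every tool call and what it returned
    results = []  # tool results fed back to the model
    for _ in range(max_steps):
        step = model(question, results)
        if "answer" in step:
            return step["answer"], trace
        result = TOOLS[step["tool"]](**step["args"])
        trace.append((step["tool"], step["args"], result))
        results.append(result)
    raise RuntimeError("model did not finish within max_steps")


text, trace = answer("how many yoga bookings today?")
print(text)
```

The max_steps limit is a design choice of this sketch: it stops a confused model from calling tools forever.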

Two things I added on top, both about trust.

The first is that Neeti shows its work. Under every answer there's a small trace you can expand to see exactly which tools ran and what they returned. If it tells you 62, you can open the trace and see the count that produced 62. No black box.

The second is that every tool is read-only. Not one of them can change, delete, or create anything. The worst Neeti can do is read something and tell you about it. I'll take that ceiling on the downside any day.

This is also why I didn't let the model write its own database queries, which is the more obvious way to build this. A model writing raw queries is powerful, and on a bad day that power points the wrong way: it reads something it shouldn't, or runs something heavy enough to slow the database for everyone. With a fixed set of read-only tools, the complete list of things Neeti can do is a list I can read on one screen. Fewer powers, easier to trust.

The yoga bookings

Now the bug.

To answer "how many yoga bookings today," the model reached for the wrong tool. Instead of count_bookings, it used find_bookings, the one that fetches a sample.

That tool is capped on purpose. If a customer has fifty thousand bookings, you don't want a tool that drags all fifty thousand back, both because it's slow and because it would bury the model in data. So find_bookings returns the first page and stops. The first page had 23 rows. The model counted what it could see, 23, and reported it. Confidently, because from where it stood there was nothing to suggest there was more.

The fix wasn't a cleverer prompt or a smarter model. It was a clearer set of tools.

I split two jobs that I had quietly let blur together: "show me some bookings" and "count all the bookings." Those feel similar, but they're not. A sample is allowed to be capped, because you only need a few examples to look at. A count is never allowed to be capped, because a partial count is just a wrong number wearing the right outfit.

So the listing tools kept their caps, and the counting questions got their own path: one that runs a real total in the database and comes back uncapped.

Before:  find_bookings(...)   →  first 23 of who knows how many  →  "23"
After:   count_bookings(...)  →  62                              →  "62"
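Here is a small, runnable sketch of that split, using Python's built-in sqlite3 with an in-memory table. The table layout and the page size of 23 are assumptions chosen to mirror the numbers in the story. The "truncated" flag is an extra the fix above doesn't mention: one possible way to let a capped tool say out loud that it stopped early.

```python
import sqlite3

PAGE_SIZE = 23  # assumed cap, picked to match the story

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (id INTEGER PRIMARY KEY, meeting TEXT)")
conn.executemany(
    "INSERT INTO bookings (meeting) VALUES (?)", [("yoga",)] * 62
)


def find_bookings(meeting: str) -> dict:
    """Sampling tool: returns at most one page and says whether more exist."""
    rows = conn.execute(
        "SELECT id, meeting FROM bookings WHERE meeting = ? ORDER BY id LIMIT ?",
        (meeting, PAGE_SIZE + 1),  # fetch one extra row to detect truncation
    ).fetchall()
    return {"rows": rows[:PAGE_SIZE], "truncated": len(rows) > PAGE_SIZE}


def count_bookings(meeting: str) -> int:
    """Counting tool: the database computes the total, with no cap."""
    (total,) = conn.execute(
        "SELECT COUNT(*) FROM bookings WHERE meeting = ?", (meeting,)
    ).fetchone()
    return total


sample = find_bookings("yoga")
print(len(sample["rows"]), sample["truncated"])
print(count_bookings("yoga"))
```

Counting the rows of the sample gives 23. Asking the database for the total gives 62. Same table, two different questions.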

The same fix cleaned up a row of similar mistakes. "How many team members do I have" had been answering 50 when the real number was 57, for exactly the same reason. Once counting had its own honest path, those snapped into place.

The lesson stuck with me. When an AI gets a fact wrong, the instinct is to blame the model, tweak the prompt, reach for a bigger one. But often the model is reasoning fine over bad inputs. It answered the question it was actually able to answer, "how many of these can I see," not "how many exist." The bug wasn't in the model. It was in the tools I handed it.

Making it faster

The other thing I spent real time on was speed. Early on, answers were slow. Some took the better part of a minute, which is a long time to sit and watch a "thinking" dot blink.

Many modern AI models have a dial for how hard they think before they answer. Turn it up and the model reasons more carefully, which genuinely helps for hard problems: multi-step logic, maths, tricky planning. I had it turned up, on the reasonable-sounding theory that more thinking means better answers.

But think about what Neeti actually does. Most of its work is looking things up. Count these, find those, check that calendar. That's retrieval, not deep reasoning. The hard part isn't the thinking, it's fetching the right rows, and the tools already handle that. I was paying for careful reasoning the task never asked for.

So I turned the dial down. Answers came back about a third faster, and the quality didn't move, because the extra thinking had never been making them more correct in the first place. It was just making them slower. Every one of those answers also costs real money in calls to the AI provider, so leaner and faster was cheaper too. A nice three-for-one.

One more speed lesson, less obvious. When Neeti answers a question, almost all of the elapsed time is spent waiting for the AI provider to reply over the network, not doing work in our own database. That changes how you scale it. The bottleneck isn't database muscle, it's a lot of waiting, so the right move was to let many more answers run at the same time. They're mostly idle anyway, each one waiting on a reply. Matching the setup to where the time actually goes made the whole thing handle load far more calmly.
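To see why waiting-bound work scales this way, here is a toy sketch using Python's asyncio. It says nothing about how NeetoCal is actually set up: ask_provider is a pretend network call that just sleeps, and the limit of 50 is an arbitrary assumption. The point is only that twenty answers that each wait 0.1 seconds finish in roughly the time of one, not twenty.

```python
import asyncio
import time


async def ask_provider(question: str) -> str:
    """Stand-in for a call to the AI provider: nearly all of it is waiting."""
    await asyncio.sleep(0.1)
    return f"answer to {question!r}"


async def answer_many(questions: list[str], limit: int = 50) -> list[str]:
    """Run many answers at once, capped so the provider isn't flooded."""
    gate = asyncio.Semaphore(limit)

    async def one(question: str) -> str:
        async with gate:
            return await ask_provider(question)

    return await asyncio.gather(*(one(q) for q in questions))


start = time.perf_counter()
answers = asyncio.run(answer_many([f"q{i}" for i in range(20)]))
elapsed = time.perf_counter() - start
print(f"{len(answers)} answers in {elapsed:.2f}s")
```

Run one after another, the same twenty calls would take about two seconds. The cap is there because "many at once" still needs an upper bound somewhere.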

The takeaway underneath all of it: match the effort to the job. A lookup doesn't need a philosopher.

Where it fits in Neeto

NeetoCal is one product in a larger family. Neeto is a few dozen of them under one roof: scheduling, help desk, chat, forms, knowledge bases, and so on. Different products, the same underlying shape of problem. People pour data into them and then have questions about that data, and the answers are usually a few clicks and filters away when they'd rather just ask.

The useful part of Neeti isn't really the calendar-specific bits. It's the machinery around them. The loop that lets a model pick tools and read results. The read-only discipline. The visible trace. The honest split between sampling and counting. Swap NeetoCal's tools for a help desk's tools and most of that machine still stands. Build it carefully once, and the pattern travels.

That's the quiet promise here. Not one assistant for one product, but a shape for letting people talk to any of their data, safely, in plain language.

What's next

A few directions I'm interested in.

More tools means more questions it can answer. Every tool I add widens the circle of things you can ask without learning a single new button.

Right now Neeti only reads. The obvious next step is letting it do things: "move my 3pm to tomorrow," "cancel that and tell them why." That's a bigger jump than it sounds, because the moment an assistant can change your calendar, the cost of a wrong guess goes way up. Reads you can be relaxed about. Writes need to ask first. I'd rather get reading genuinely trustworthy before handing it the pen.

The part I'm most curious about is reach: taking the same machine and pointing it at the rest of the Neeto family, so the assistant in your help desk or your forms feels like the same helpful thing you met in your calendar.

What I'd tell you if you're building one

A handful of things I keep coming back to.

  1. The model is rarely the problem. When it gets a fact wrong, look at the tools and the data you fed it before you blame the model or rewrite the prompt.
  2. "Show me some" and "count them all" are different jobs. A sample can be capped. A count can't. Don't let one tool quietly pretend to do both.
  3. Make the model think only as hard as the task needs. More reasoning isn't free, and for a plain lookup it buys you nothing but a slower answer.
  4. Give it fewer powers, not more. A small set of read-only tools you wrote beats a model improvising its own queries, both for safety and for your own peace of mind.
  5. Show your work. A trace of exactly what the assistant did turns "trust me" into "see for yourself," and people relax the moment they can check.
  6. Know where the time goes. If most of the wait is the model and not your database, scale for waiting, not for muscle.

Mostly plumbing

When people imagine building an AI feature, they picture clever prompts. Almost none of my time went there. It went into writing good tools, getting counts right, deciding what the model wasn't allowed to do, and turning a dial back down once I understood the job.

The prompt is the easy part. The plumbing is the product. A trustworthy assistant isn't the one with the cleverest wording behind it. It's the one that can only tell you true things, shows you how it got them, and doesn't make you wait.

The yoga number is 62, by the way. It always was. Neeti just had to be taught how to count.