Strategy · Mar 18, 2026 · 7 min read

What we tell first-time AI buyers when they ask where to start

The conversation we have with first-time AI buyers on a discovery call. Start with one bounded workflow — not a tool, not a budget, not an agent.

A few times a week, we get an email or a discovery-call request from someone whose company is buying AI consulting for the first time. The question is almost always some variant of: where should we start?

Most of them are expecting an answer with a brand name in it. Sometimes the answer they want is Claude or GPT-5 or Gemini. Sometimes it is Cursor or Copilot or one of the agent platforms. Sometimes it is a vendor, a category, or a budget number.

The honest answer does not have a brand name in it. It is structural, and the shape doesn’t change with the buyer or the industry.

Start with one bounded workflow that someone in the company already wishes were faster. Audit the result honestly. Decide what you have learned.

That is the whole answer. The rest of this post is why it is the answer, what each phrase actually means, and what changes if you take it seriously.

The case against starting with a tool

When you start with the tool, you spend the first six weeks evaluating which platform to buy and never confront the underlying question of what you would actually do differently. The tool evaluation has its own forward motion: demos, comparisons, RFPs, vendor calls. It feels like progress because it produces artifacts. By the end you have a recommendation memo and an enterprise license. You have not, at any point, done the harder thing, which is to identify a piece of work in your company where the AI question is concrete enough to test.

This is not a hypothetical failure mode. The pattern is recurring: a buyer arrives with a signed annual contract for a model provider, an agent platform, or both, and no shipped use case to evaluate against. The contract was the easy part. The use case is the hard part. The contract does not help.

The bounded-workflow filter

A workflow worth starting with has three properties. None of them are negotiable.

A well-defined input. Something has to come in. It has to come in often enough and in a similar enough shape that you can measure performance. “Customer support tickets” qualifies; “all incoming email” does not. “Vendor invoices” qualifies; “all PDFs” does not. The shape of the input determines the shape of the eval, which determines whether you can ever know if the system works.

A checkable output. Someone in the company has to be able to look at the output and say “yes, that is right” or “no, that is wrong” without a six-week training program. A summary the lead reviews before sending qualifies; a strategic recommendation does not. The check is what gives you data. Without the check you have no signal, and without signal you cannot decide what to do next.

Recurrence. The workflow has to happen more than a few times a month. Recurrence is what gives you the volume to evaluate. A workflow that happens four times a year cannot be evaluated; it can only be hoped for. A workflow that happens 300 times a month gives you, within thirty days, a calibrated sense of how often it works, when it fails, and what the failure modes are.
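To make the volume point concrete, here is a rough sketch that compares how tightly a success rate can be pinned down after 4 runs versus 300. It uses the standard Wilson score interval at roughly 95% confidence. The counts (3 of 4 correct, 270 of 300 correct) are made-up illustrations, not figures from any engagement.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a success rate."""
    if trials == 0:
        # No observations: the rate could be anything.
        return (0.0, 1.0)
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return (max(0.0, center - half), min(1.0, center + half))


# Hypothetical counts, for illustration only.
low_small, high_small = wilson_interval(3, 4)      # a workflow that ran 4 times
low_large, high_large = wilson_interval(270, 300)  # a workflow that ran 300 times

print(f"4 runs:   plausible success rate {low_small:.2f} to {high_small:.2f}")
print(f"300 runs: plausible success rate {low_large:.2f} to {high_large:.2f}")
```

With four runs the plausible range spans most of the scale, which is another way of saying the workflow cannot be evaluated. With 300 runs the range is narrow enough to act on.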

The intersection of those three properties is small. That is the point. Most of the workflows your team executes do not pass all three filters.
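One way to apply the three-property filter to a list of candidate workflows is sketched below. The `Workflow` record, the `MIN_RUNS_PER_MONTH` threshold of 10, and the run counts are all assumptions for illustration; the only guidance above is "more than a few times a month", so pick a threshold that fits your own volume.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    well_defined_input: bool  # arrives often, in a similar enough shape to measure
    checkable_output: bool    # a reviewer can say right or wrong without long training
    runs_per_month: float


# Assumed cutoff for "recurs often enough to evaluate"; adjust to taste.
MIN_RUNS_PER_MONTH = 10


def passes_filter(w: Workflow) -> bool:
    """A workflow qualifies only if all three properties hold."""
    return (
        w.well_defined_input
        and w.checkable_output
        and w.runs_per_month >= MIN_RUNS_PER_MONTH
    )


# Hypothetical candidates with made-up volumes.
candidates = [
    Workflow("customer-support routing", True, True, 300),
    Workflow("all incoming email", False, True, 2000),
    Workflow("strategic recommendation", True, False, 20),
    Workflow("quarterly board report", True, True, 4 / 12),
]

shortlist = [w.name for w in candidates if passes_filter(w)]
print(shortlist)
```

The two boolean fields are judgment calls made by someone who knows the work, not something the code can decide. The sketch only makes the rule explicit: failing any one property removes the candidate.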

A clarification, because this is where the conversation usually goes sideways: AI in the workflow-or-agent sense — the kind that gets scoped into a build engagement and shows up as software in your stack — does not help with workflows that fail the filter. AI in the chat-and-brainstorm sense — opening Claude or ChatGPT to draft a memo, talk through a decision, sanity-check a plan — can still help almost anyone, almost anywhere. Those are different uses of the same technology, and the bounded-workflow filter applies only to the first.

The workflows that do pass all three filters — contract review, weekly report generation, accounts-payable triage, customer-support routing, code-PR summarization, internal-document Q&A — are where the first engagement should sit.

What “audit honestly” actually means

Once the workflow is shipped, the audit is the part most teams skip. It is also the part where most of the value compounds.

The audit needs measurement. Not vibes. Specifically:

  • How much time did it actually save, end to end? Count the review time, the failure handling, and the time spent re-prompting when the output was wrong.
  • What was the error rate? Stratified by input type when input types are heterogeneous.
  • What were the failure modes? Not summary statistics — the actual cases. The specific input that produced the wrong output. Pattern-recognition on the failures is where the next iteration of work comes from.
  • Did the people doing the work want to keep using it? Adoption is a result, not an input. If the answer is no, the system has failed regardless of the metrics.
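The first three measurements can be computed from a plain log of runs, the same thing a spreadsheet would hold. The sketch below assumes one row per run with hypothetical column names (`baseline_minutes`, `review_minutes`, `rework_minutes`) and invented sample data. The fourth question, whether people want to keep using the system, has to be asked directly and is not modeled here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AuditRow:
    input_type: str
    baseline_minutes: float  # how long the task took before the system existed
    review_minutes: float    # time spent checking the output
    rework_minutes: float    # failure handling and re-prompting
    correct: bool
    note: str = ""           # what went wrong, in the reviewer's words


def net_minutes_saved(rows: list[AuditRow]) -> float:
    """End-to-end saving: baseline time minus review and rework time."""
    return sum(r.baseline_minutes - r.review_minutes - r.rework_minutes for r in rows)


def error_rate_by_type(rows: list[AuditRow]) -> dict[str, float]:
    """Error rate stratified by input type."""
    totals: dict[str, int] = defaultdict(int)
    errors: dict[str, int] = defaultdict(int)
    for r in rows:
        totals[r.input_type] += 1
        if not r.correct:
            errors[r.input_type] += 1
    return {t: errors[t] / totals[t] for t in totals}


def failure_cases(rows: list[AuditRow]) -> list[AuditRow]:
    """The actual failing runs, kept whole so patterns can be read off them."""
    return [r for r in rows if not r.correct]


# Invented sample log, for illustration only.
rows = [
    AuditRow("invoice", 12, 2, 0, True),
    AuditRow("invoice", 12, 3, 0, True),
    AuditRow("invoice", 12, 2, 9, False, "wrong vendor matched"),
    AuditRow("credit note", 10, 4, 8, False, "negative total read as positive"),
]

print("net minutes saved:", net_minutes_saved(rows))
print("error rate by type:", error_rate_by_type(rows))
for case in failure_cases(rows):
    print("failure:", case.input_type, "-", case.note)
```

Note that a failed run can save negative time once rework is counted, which is why the saving is computed per row instead of being assumed from the baseline.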

You do not need a sophisticated eval platform for the first audit. A spreadsheet will do. Once the workflow earns a second iteration, the open-source Inspect AI framework from the UK AI Security Institute (formerly the AI Safety Institute) is a clean place to formalize the eval. Commercial options like Braintrust and LangSmith earn their cost when you start running multiple workflows in parallel. None of them earn their cost on day one.

What you actually buy

Most first-time buyers arrive with a budget that is too large for the right scope and too small for the wrong scope. They have heard that AI transformations cost $200K and they have a willingness-to-spend in that range. The right first engagement is usually $5K to $25K. It is one workflow. It is two weeks of scoping, four weeks of building, six weeks of running. It produces a runbook, an eval suite, and a clear answer about whether to extend.

The remaining $175K is better held back. Spend it on the second and third workflows, after the first one tells you what kinds of work pay off. Spend it on the people who maintain the systems once they exist — the unglamorous half that decides whether the system is still running in month four. Do not spend it on a platform license you are not yet using.

Why this works

The companies that get AI right are not the ones who picked the best vendor. They are the ones who picked the right first workflow. Once you have one workflow shipped, audited, and decided, the next decision is not abstract anymore. You know what your team’s review tolerance looks like. You know which model handled your data well and which did not. You know what the 80th-percentile failure looks like. You know whether your people wanted the system to keep running.

Most of the strategic AI questions a leadership team is asking — build vs. buy, which vendor, which platform, what is the right architecture — are easier to answer with one shipped workflow behind you than with zero. The first workflow is not the AI strategy. It is the thing that lets the AI strategy be written by people who know something concrete.

That is also why we lead with this answer. The buyers who take it seriously have a different conversation in three months than the ones who do not. The buyers who skip it usually call us back from the same place they started, six months later, with a contract they have not yet used.

— Oasium AI · Applied AI consulting