You're a non-technical founder looking to hire an AI development agency.
You can't read the code. You can't review the architecture. You can't tell a good API integration from a technical mess until it breaks at 2am the night before your investor demo.
So how do you evaluate them?
Most founders rely on three things: vibes from the sales call, testimonials on the website, and portfolio screenshots. All three can be fabricated or cherry-picked. None of them tell you whether an agency will actually deliver what you're paying for.
Here are 8 questions that reveal whether an agency knows what they're doing — and whether they're the right fit for your build.
Table of Contents
- Why Standard Evaluation Criteria Fail Non-Technical Founders
- The 8 Questions
- Red Flags to Watch For
- Green Flags That Signal a Good Agency
- The Evaluation Call Structure
- What to Do After the Call
Why Standard Evaluation Criteria Fail Non-Technical Founders
The standard advice for hiring developers is: review their code, check their GitHub, assess their technical skills. If you're non-technical, you can't do any of that reliably.
You need a different evaluation framework — one that reveals quality and reliability through indirect signals that don't require technical knowledge to interpret.
The 8 questions below are designed to surface those signals. Each one asks about something specific that separates good agencies from mediocre ones.
The 8 Questions
Question 1: "Walk me through how you'd scope this project."
Ask this before they give you any pricing. Watch what happens.
What a good agency does: They ask you questions. A lot of them. Who are the users? What's the core workflow? What integrations do you need? What does success look like in 30 days? They're trying to understand your problem before they map it to a solution.
What a bad agency does: They give you a quote within 24 hours of a 20-minute intro call. They've mapped your project to a template before understanding your specific requirements. The quote looks professional, but it's not based on your actual project.
The depth of their questions tells you how much they'll understand your product before they build it.
Question 2: "What's explicitly out of scope in this engagement?"
Every agency says "no hidden fees." The meaningful version of this claim is a specific list of what creates additional charges.
Ask them to be specific: What happens if you want to add a feature mid-build? What if a third-party API changes and integration breaks? Who handles post-launch bugs? What about performance issues discovered after deployment?
What you want to hear: A specific, written list of what's out of scope. If they say "we'll discuss it if it comes up," that's a hidden fee dressed in polite language.
What this reveals: Whether they've done this enough times to know what the common dispute points are. Experienced agencies have clear boundaries because they've learned the hard way what ambiguity costs.
Question 3: "Do I own the code from Day 1, and on what terms?"
This question has three important sub-questions:
- Who owns the intellectual property during the build?
- When exactly is the code transferred?
- Are there any proprietary frameworks or tools in the build that require ongoing payments or the agency's involvement?
Some agencies retain IP until final payment. Some use internal frameworks that tie you to them. Some build on platforms with monthly licensing fees that aren't disclosed upfront.
What you want to hear: "The code is yours. We'll transfer the repository to your GitHub account at kickoff, or at latest at launch. We don't use proprietary frameworks that require us to maintain them."
If the answer is anything other than a clear, unambiguous yes, ask a follow-up.
Question 4: "What happens if the project runs over the timeline?"
Every agency promises to hit deadlines. The meaningful test is what accountability looks like when they don't.
Options:
- They extend for free (their risk, not yours)
- They charge extra (your risk)
- They reduce scope to hit the date (your product gets smaller)
- They offer a partial refund if materially late (skin in the game)
What you want to hear: A specific answer. Not "we always hit our timelines" — every agency says that, and it's not an answer to the question.
What this reveals: Whether they're confident enough in their process to put accountability in writing. Agencies that won't commit to a consequence for missed deadlines are telling you they don't fully believe in their own timeline estimates.
Question 5: "Can I see the last three products you shipped — live, clickable?"
Not logos. Not case study PDFs with screenshots. Actual live products you can open in a browser and use.
If an agency has shipped real products, they have live URLs they can share. If they show you mockups and Figma designs, that's not evidence of shipping — it's evidence of designing.
Follow-up: "Can I speak with the founder who commissioned that build?"
A 15-minute call with a previous client will tell you more than any testimonial on a website.
What this reveals: Whether they actually ship. A portfolio of live products is hard to fake. A portfolio of case study PDFs is not.
Question 6: "What AI models and tools do you use, and why do you make those choices?"
This question separates agencies that actually understand AI from ones that added "AI development" to their service offering because it's in demand.
A competent AI agency can explain:
- The difference between GPT-4o and Claude and when they'd choose each
- Why they'd use LangChain for one project and direct API calls for another
- What vector databases do and when you actually need one
- The tradeoffs between fine-tuning a model vs. prompt engineering
They should have opinions based on real experience, not just recite the names of tools.
What a bad answer sounds like: "We use the best tools for each project" or "it depends on your needs." These are non-answers that signal they don't have enough depth to have real opinions.
What a good answer sounds like: "We default to Claude for document processing because the long context window handles large files better. We use OpenAI for tasks where the ecosystem tooling matters more than context length. We avoid LangChain for simple integrations because the abstraction overhead isn't worth it."
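To make the "direct API calls vs. LangChain" tradeoff concrete, here's roughly what a direct integration amounts to. This is a minimal sketch, not a production implementation: the payload and header shapes follow Anthropic's Messages API, but the model name and prompt are illustrative, and the network call itself is shown only as a comment.

```python
import json

# Anthropic's Messages API endpoint (illustrative of a "direct API call")
API_URL = "https://api.anthropic.com/v1/messages"

def build_request(prompt: str, model: str = "claude-3-5-sonnet-latest") -> dict:
    """Assemble the JSON body for a single-turn chat request.

    The shape follows Anthropic's Messages API; the model name here
    is an assumption for illustration.
    """
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }

def build_headers(api_key: str) -> dict:
    """Headers the Messages API expects on every request."""
    return {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }

# The actual call is one line with any HTTP client, e.g.:
#   requests.post(API_URL, headers=build_headers(key),
#                 json=build_request("Summarize this contract"))

if __name__ == "__main__":
    print(json.dumps(build_request("Summarize this contract"), indent=2))
```

The point an experienced agency is making: for a simple integration like this, a framework adds a dependency and an abstraction layer on top of what is essentially one HTTP request, which is why "when would you use a framework and when wouldn't you" is a question with a real answer.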
Question 7: "What does the handover process look like after launch?"
The build is only half the engagement. The other half is whether you can operate and maintain what was built without depending on the agency.
Ask specifically:
- Will I receive the code repository?
- Will I receive all credentials, environment variables, and access to all services?
- Is there documentation for how the product works?
- Is there a walkthrough call where you explain what was built?
What you want to hear: A defined handover checklist. Repository transfer + credentials + documentation + walkthrough call at minimum.
What this reveals: Whether they treat the handover as a real deliverable or an afterthought. Agencies that don't have a defined handover process either expect ongoing dependency or don't care about your ability to operate independently.
Question 8: "What would you NOT build for me, and what are you not good at?"
This is the most revealing question on the list.
Good agencies know their limits. They'll tell you what they don't handle well — mobile app development, compliance-heavy builds, machine learning research, hardware integrations. They'd rather pass on work that's outside their competency than take your money and figure it out at your expense.
Bad agencies say yes to everything. The logic is simple: saying no means losing the deal. So they say yes and figure it out later — on your timeline and budget.
What you want to hear: A genuine list of things they're not set up to do well, delivered without embarrassment or qualification. The agency that tells you what it can't do is the one you can trust about what it says it can.
Red Flags to Watch For
These aren't automatic dealbreakers, but each one should trigger follow-up questions:
- No fixed timeline or fixed price — if they can't commit to either, who's taking the risk?
- Vague scope documentation — a bulleted list of "features" is not a spec
- No live product portfolio — case studies without live URLs are not evidence
- Technical jargon as a deflection — if the explanation gets more complex when you ask follow-up questions, they may be obscuring uncertainty
- Pressure to decide quickly — "this rate is only available this week" is a sales tactic, not a business reality
- No questions about your users — an agency that doesn't ask who your users are before scoping your product isn't thinking about the right problem
Green Flags That Signal a Good Agency
- They push back on your requirements during scoping — they're thinking about what you actually need, not just taking the order
- They have clear documentation for what's in and out of scope before you sign
- They can name previous clients willing to speak to you
- They have specific opinions about technology choices with reasoning behind them
- They tell you something you didn't know about your own problem during the scoping call
- They're honest about what they're not the right fit for
The Evaluation Call Structure
Here's how to structure a 45-minute evaluation call to get maximum signal:
Minutes 0–10: Let them ask you questions about the project. Evaluate the quality of their questions.
Minutes 10–25: Ask Questions 1–4 from the list above. Take notes on specificity.
Minutes 25–35: Ask Questions 5–6. Request live URLs and watch how they respond.
Minutes 35–45: Ask Questions 7–8. The last question especially — it tells you the most.
What to Do After the Call
If you talk to three agencies, you'll likely have one clear front-runner based on the quality and specificity of their answers. But before you decide, do this:
- Ask for a written scope document before you sign anything. The document will tell you whether their verbal answers match their written ones.
- Check one reference — a 10-minute call with a previous client is more valuable than any testimonial.
- Compare the out-of-scope lists — the agency with the most specific exclusions has thought about this the most carefully.
The right agency is the one that makes you feel informed, not the one that makes the best first impression.