How much have AI API prices actually dropped in the last year?

GPT-4-level capability that cost around $30 per million output tokens in early 2023 is now available for under $2 on comparable models. A workflow processing 500 emails per month that cost $50–$60 at 2023 rates can now run for under $5 with prompt caching applied correctly.

Is DeepSeek safe to use for business data?

It depends on your data. Some DeepSeek models route data through infrastructure subject to Chinese law. For workflows involving customer PII, contracts, or anything sensitive, verify data routing before using DeepSeek. For many SMBs, the smarter play is using DeepSeek-driven competition to negotiate better terms with US-based providers.

What is prompt caching and how do I use it?

Prompt caching lets providers reuse computation from repeated prompt prefixes, cutting your cost on those tokens by 50–90%. To use it, put all static context (system instructions, templates, background info) at the top of your prompt before the variable input. OpenAI and Anthropic both support this on their main models.

AI Prices Are Falling Fast. Here's What SMBs Should Do Now.

Why are AI prices dropping so fast right now?

AI pricing has fallen faster in the past year than almost anyone predicted. If you priced out an API-based workflow in early 2024 and shelved it because the costs didn't make sense, it's time to look again. The economics have shifted enough that tools which were marginal bets are now obvious ones.

Two specific developments are driving this, and understanding them helps you make better decisions about where to deploy AI in your business.

What did DeepSeek actually change about AI costs?

DeepSeek's release of its V3 model family forced a repricing moment across the industry. Their V3-Pro model delivers performance that benchmarks competitively with much more expensive models, at a fraction of the cost. According to reporting from Berea Online, DeepSeek's pricing on V3-Pro undercuts comparable Western models significantly, and the competitive pressure pushed providers like OpenAI, Anthropic, and Google to respond with their own cuts.

This isn't a niche development. When the cheapest credible option drops, everyone else adjusts or loses customers. That's what happened here, and SMBs are direct beneficiaries.

The competitive pressure DeepSeek created did more to lower AI costs for small businesses than any single product launch from a US provider.

For operators who don't want to route data through DeepSeek's infrastructure for policy or compliance reasons, the practical effect is still real: the price war they triggered brought costs down across the board, including on OpenAI and Anthropic APIs.

What is cache-hit pricing and why does it matter for repetitive business work?

Cache-hit pricing is the second big development, and it's underappreciated. Here's how it works: when you send a prompt to an AI API, the provider processes your input tokens to generate a response. If you send a very similar prompt again (same system instructions, same context), the provider can reuse the cached computation instead of starting from scratch. Cached tokens cost significantly less, often 50–90% less than standard input pricing.

For most business workflows, this is a big deal. Think about what repetitive AI tasks actually look like in practice:

Drafting responses to customer emails using a consistent system prompt
Running the same data extraction logic across new invoices each week
Summarizing support tickets using a template you've already written
Generating product descriptions from a fixed format

All of these tasks reuse large chunks of the same prompt context every single run. With cache-hit pricing, you pay full price once and a fraction of that for every repeat call. If you're running a workflow hundreds of times a month, the savings compound fast.

OpenAI introduced prompt caching for GPT-4o in late 2024, and Anthropic has offered it on Claude models as well. The feature exists. Most SMBs just haven't structured their prompts to take advantage of it.

How much cheaper has AI actually gotten in real numbers?

The price compression over the last 12–18 months is striking when you look at specific models. GPT-4-level capability that cost roughly $30 per million output tokens in early 2023 is now available for under $2 per million on comparable models, according to publicly available API pricing pages from OpenAI and Anthropic.

For context on what that means in practice: a workflow that processes 500 customer emails per month, averaging 800 tokens of input and 300 tokens of output per email, would have cost roughly $50–$60/month at 2023 pricing. At current rates, with caching on the system prompt, that same workflow runs under $5/month.

That's not a rounding error. That's a business case that didn't exist before.

Which AI tasks are now worth doing daily that weren't before?

The cost drop doesn't just make existing use cases cheaper. It makes new categories of use cases viable. When you're paying fractions of a cent per run, you can justify running AI on every transaction, every ticket, every lead, not just batching it weekly.

Here are the task categories where daily AI use now makes clear financial sense for most SMBs:

| Task Category | Example | Why It Works Now | |---|---|---| | Customer communication | Draft reply suggestions for support inbox | High prompt reuse = cache savings | | Document processing | Extract fields from invoices, contracts | Repetitive structure = cheap per-doc | | Lead qualification | Score and summarize inbound leads | High volume + low token count | | Internal reporting | Weekly summaries from CRM or ops data | Template-heavy = cacheable | | Content operations | First-draft generation from a brief | Consistent format = reusable prompt |

None of these are new ideas. What's new is that the cost-per-run no longer requires you to batch, throttle, or justify each call individually.

What should you actually watch out for when costs drop?

Cheaper isn't automatically better. A few things to keep in mind:

Model selection still matters. Cheaper models sometimes hallucinate more or follow instructions less reliably. The right move is to test the specific task you're automating on the model you're considering, not assume that cheap equals good enough.

Data routing is a real consideration. Some DeepSeek models route data through infrastructure subject to Chinese data law. If your workflows touch customer PII, contracts, or anything sensitive, verify where your data goes before optimizing for the lowest price.

Volume can create its own costs. Running AI on every event sounds great until you've got a runaway loop calling the API 10,000 times because of a bug. Set rate limits and cost alerts in your API dashboard before you automate anything at scale.

What we'd actually do

Audit one repetitive workflow you're already doing manually and price it out. Pick something with consistent structure (email replies, invoice processing, lead notes) and calculate what it would cost to run via API at current rates with caching. Most SMBs are surprised how low the number is.
Structure your prompts for caching. Put your static system instructions and context at the top of every prompt, before the variable input. This is the single easiest change that cuts API costs for repetitive tasks, often by 50% or more.
Don't optimize for price before you've validated the output. Run your chosen model on 20–30 real examples from your business before you commit to a provider or build automation around it. Cheap and wrong is worse than slightly more expensive and reliable.

If you want to work through this with other SMB operators who are building real workflows, not just reading about them, that's exactly what we do at skool.com/aiforbusiness.

AI Prices Are Falling Fast. Here's What SMBs Should Do Now.

Why are AI prices dropping so fast right now?

What did DeepSeek actually change about AI costs?

What is cache-hit pricing and why does it matter for repetitive business work?

How much cheaper has AI actually gotten in real numbers?

Which AI tasks are now worth doing daily that weren't before?

What should you actually watch out for when costs drop?

What we'd actually do

FAQ

Want this running in your business?

More on AI Strategy

Why Starbucks Killed Its AI Tool After 9 Months

Why Cheaper AI Means SMBs Will Spend More, Not Less

What Separates AI ROI From AI Waste?