The Short Answer
To cap what an n8n AI Agent can spend, route its language model through your TokenSense endpoint and set a budget in the TokenSense dashboard. Drop the TokenSense Chat Model sub-node into the Agent's 'Chat Model' slot (or point your existing model node's Base URL at https://api.tokensense.io). When the budget is exhausted, TokenSense returns a clean 402 and the Agent stops calling the model — automatically, with no code and no extra logic in your workflow.
This matters far more for an Agent than for a single AI node, because an Agent decides on its own how many model calls to make. A budget is the one thing that turns 'I hope it does not loop' into a hard ceiling you control.
Workspace Budget Guardrail
Why AI Agents Run Up Unpredictable Costs
A normal AI node makes exactly one model call per run, so its cost is easy to predict. An AI Agent is different: it reasons in a loop, calls tools, retries, and keeps going until it decides the task is done. A single trigger can fan out into anywhere from two to twenty model calls — and you do not know which in advance.
Put numbers on it. Say an Agent averages 8 model calls per run at about $0.02 a call, running 500 times a day. That is roughly $80 a day, around $2,400 a month — from one workflow. Now a prompt tweak makes the Agent loop on a tool it cannot satisfy, and the calls-per-run quietly doubles. You find out when the provider invoice arrives. Alerts tell you after the money is gone; a budget cap stops it while it is happening.
TokenSense draws a clear line here: quotas warn, budgets block. An over-quota request still goes through, so a soft limit never disrupts a live workflow. A budget-exceeded request is stopped immediately with a 402. The budget is the hard ceiling you set, in dollars.
Set Up Budget Enforcement for Your AI Agent
The setup is four steps and takes about five minutes. Nothing in your Agent's logic changes — you are only changing where its model calls are routed. Click through the walkthrough below, then follow the same four steps in your own n8n.
Set up the TokenSense node in n8n
Step 1 of 4In n8n, open Settings → Community Nodes and install the package n8n-nodes-tokensense.
n8n-nodes-tokensenseSelf-hosted today. The verified node is rolling out to n8n Cloud with n8n’s next release.
1. Add your provider key. In the TokenSense dashboard, go to Settings → Providers and paste your OpenAI, Anthropic, Google Gemini, xAI, or Mistral key. TokenSense uses it to forward the Agent's calls on your behalf; keys are encrypted and never exposed inside your workflow.
2. Point the Agent's model at TokenSense. Drop the TokenSense Chat Model sub-node into your AI Agent's 'Chat Model' slot and pick your model from the dropdown. On self-hosted n8n, install the verified n8n-nodes-tokensense community node first (Settings → Community Nodes). The verified node is rolling out to n8n Cloud with n8n's next release; until then, Cloud users get the same routing by setting the OpenAI Chat Model node's Base URL to their TokenSense endpoint.
3. Set a budget. In the dashboard, set a workspace budget (a total dollar cap across everything) or a per-project budget (a cap for one client or workflow group, on Pro and Agency plans). This is the dollar ceiling the Agent can never cross.
4. Test it. Run the Agent — the call appears in your TokenSense Logs within seconds, with its exact cost, model, and the workflow that made it. To prove the cap works, set a tiny budget and run again: once it is exhausted, the next call returns a 402 and the Agent stops cleanly.
For the full reference, see the budgets documentation and the n8n integration guide.
See Exactly Which Step Spent the Money
Setting a cap stops overspend; attribution tells you where the spend went. When your Agent runs on TokenSense nodes, every call is tagged automatically with three things — the workflow, the step, and the execution ID. So inside a single Agent run you see each step's cost: the Agent's own reasoning calls through the TokenSense Chat Model, and any models it delegates to as tools through the TokenSense AI Tool — not one lump sum for the whole workflow.
How TokenSense Attributes Costs in n8n
When your n8n workflow executes, the community node automatically tracks the name of each node and the execution number. Watch how TokenSense maps the exact cost of every single workflow step in real time.
#98231This is the part most setups cannot do. Workflow, step, and execution tagging is automatic with any TokenSense node — no headers, no code. n8n's built-in model nodes, and a plain Base URL swap, have no way to attach the step name and run ID, so they top out at request-level totals; Make, Zapier, or raw code can send the same metadata, but only by hand. That per-step, per-execution visibility — sitting alongside hard budget caps — is what routing-only gateways don't give you.
Give Each Client Their Own Agent Budget
If you run Agents for multiple clients, put each client in its own TokenSense project and give that project its own budget (available on Pro and Agency plans). Each client's Agent is then capped independently: when one client's budget is exhausted, their Agent stops at a 402 while every other client keeps running untouched.
That turns a nervous 'please do not overspend' into a number you can defend to a client, and a per-client cost line you can bill against. There is more on that in our guide to multi-client AI management for agencies.
Want to see what your Agents actually cost? Start free — the TokenSense Starter plan includes 10,000 requests a month, full cost tracking, and budget caps, and you can have your first Agent reporting its own cost in about five minutes.
