Commit e81ebdd

am-stead and subatoi authored
Updates "Optimize AI usage" to include additional best practices (#61635)
Co-authored-by: Ben Ahmady <32935794+subatoi@users.noreply.github.com>
1 parent 85f68dd commit e81ebdd

1 file changed

Lines changed: 75 additions & 24 deletions

content/copilot/tutorials/optimize-ai-usage.md

Original file line number | Diff line number | Diff line change
@@ -13,19 +13,13 @@ category:
1313

1414
When agents are well-scoped, well-instructed, and operating within clear guardrails, token efficiency improves as a natural outcome. High-quality agents complete tasks in fewer attempts, follow clearer workflows with less rework, and avoid expensive debugging and correction cycles.
1515

16-
This article outlines five strategies for improving both agent quality and {% data variables.product.prodname_ai_credits_short %} efficiency:
17-
18-
* [Choose the right model for the right task](#1-choose-the-right-model-for-the-right-task)
19-
* [Provide clear guidance in your prompts](#2-provide-clear-guidance-in-your-prompts)
20-
* [Research, plan, then implement](#3-research-plan-then-implement)
21-
* [Add deterministic guardrails](#4-add-deterministic-guardrails)
22-
* [Maintain a concise `copilot-instructions.md`](#5-maintain-a-concise-copilot-instructionsmd)
16+
Follow the strategies outlined in this article to improve both agent quality and {% data variables.product.prodname_ai_credits_short %} efficiency.
2317

2418
## 1. Choose the right model for the right task
2519

2620
Model choice is one of the fastest ways to improve both agent quality and cost efficiency, but it is often overlooked. A common pattern is to default to the most capable model for every task—but this often increases token usage without improving the outcome. In some execution-heavy scenarios, overusing reasoning models can reduce quality because the model may overthink the task or introduce unnecessary changes.
2721

28-
Choose the model based on the work at hand. {% data variables.copilot.copilot_auto_model_selection %} can also handle this automatically based on real-time system health and model performance.
22+
Choose the model based on the work involved:
2923

3024
* **Reasoning models**: Best for architecture decisions, complex debugging, system design, and tasks that require deeper analysis.
3125
* **Mid-tier models**: Best when the plan is already clear and the agent needs to execute efficiently.
@@ -35,6 +29,18 @@ Use as much capability as the task requires, and as little as necessary. Matchin
3529

3630
For a breakdown by model and task type, see [AUTOTITLE](/copilot/tutorials/compare-ai-models).
3731

32+
### Configure the reasoning level of the model
33+
34+
Some models also support configurable reasoning levels, which control how much the model reasons before it responds. A higher level can improve answers to complex problems, but it consumes more tokens and therefore more credits. Use the regular level by default and raise it only for harder tasks. Configurable reasoning is available in {% data variables.product.prodname_vscode %} and {% data variables.copilot.copilot_cli_short %} for supported models.
35+
36+
See [AUTOTITLE](/copilot/reference/ai-models/supported-models#models-with-extended-capabilities).
37+
38+
### Use {% data variables.copilot.copilot_auto_model_selection %}
39+
40+
{% data variables.copilot.copilot_auto_model_selection %} chooses a capable model for you, based on the intent of your task.
41+
42+
See [AUTOTITLE](/copilot/concepts/models/auto-model-selection).
43+
3844
## 2. Provide clear guidance in your prompts
3945

4046
Your prompt sets the direction for everything the agent does. When a prompt is vague, the agent has to infer intent, explore more context, and make judgment calls. That often leads to retries, scope drift, and unnecessary token usage.
@@ -49,33 +55,42 @@ This added guidance doesn't meaningfully increase token usage, but it can signif
4955

5056
For prompt engineering best practices, see [AUTOTITLE](/copilot/concepts/prompting/prompt-engineering).
5157

52-
## 3. Research, plan, then implement
58+
## 3. Keep your context lean
5359

54-
One of the biggest shifts in working effectively with agents is moving away from doing everything in a single session. When research, planning, and implementation all happen together, context grows quickly, irrelevant information accumulates, and agent quality degrades over time.
60+
{% data variables.product.prodname_copilot_short %} sends the context it has access to as input tokens, and that context adds up: open editor tabs, attached files, and the full back-and-forth of a long conversation all count as context.
5561

56-
Break work into clear phases:
62+
To keep context under control, consider doing the following:
5763

58-
* **Research:** Use the agent to explore the codebase, identify relevant files, and understand dependencies.
59-
* **Plan:** Create a detailed, structured plan or specification before making changes. This is where reasoning models are most valuable.
60-
* **Implement:** Execute against the plan using focused context and a model suited for execution.
64+
### Start a new conversation when you switch problems
6165

62-
Starting a new session between phases prevents carrying unnecessary context forward. A single session completed within a reasonable scope takes advantage of caching. Carrying forward context from earlier phases can increase token usage, introduce bias, and reduce clarity for the agent. Each phase should operate with only what it needs. For guidance on scoping sessions effectively, see [AUTOTITLE](/copilot/tutorials/cloud-agent/get-the-best-results).
66+
A long thread carries its entire history into every new request. When you move on to an unrelated task, start a new conversation. For example:
6367

64-
## 4. Add deterministic guardrails
68+
* In {% data variables.copilot.copilot_cli_short %}, use `/new` (or `/clear`).
69+
* In {% data variables.copilot.copilot_chat_short %}, start a new chat session.
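
To make the cost of a long thread concrete, here is a rough sketch of how input tokens add up when every request resends the whole history. The turn sizes are hypothetical, and the model deliberately ignores caching, attached files, and tool definitions, so treat it as an illustration of the trend rather than a billing calculation.

```python
def cumulative_input_tokens(turn_sizes):
    """Total input tokens sent when every request resends all earlier turns."""
    total = 0
    history = 0
    for size in turn_sizes:
        history += size   # the new turn joins the thread
        total += history  # the whole thread goes out as input
    return total


# Hypothetical sizes: ten turns of 1,000 tokens each in one long thread.
one_long_thread = cumulative_input_tokens([1000] * 10)

# The same ten turns split across two fresh conversations of five turns each.
two_short_threads = 2 * cumulative_input_tokens([1000] * 5)

print(one_long_thread, two_short_threads)  # prints: 55000 30000
```

Under these assumptions, splitting unrelated work into two conversations roughly halves the input sent, and the gap widens as the thread gets longer.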
6570

66-
Agents are non-deterministic and won't be correct every time, especially in multi-step workflows. Without guardrails, small errors can compound quickly: agents build on incorrect outputs, drift further from the goal, and make debugging more expensive and time-consuming.
71+
### Compact long {% data variables.copilot.copilot_cli_short %} sessions that you want to continue
6772

68-
Deterministic controls introduce clear pass/fail signals:
73+
When you need the thread to keep going but it has grown large, run `/compact` in {% data variables.copilot.copilot_cli_short %} to summarize the history and shrink the context window, optionally focusing the summary (for example, `/compact focus on the auth module`).
6974

70-
* **Unit tests** verify the agent's changes produced the expected behavior.
71-
* **Linters** enforce structure and consistency, preventing formatting issues, style drift, and avoidable cleanup work.
72-
* **Security scans** catch risky patterns early, before they are harder to unwind.
75+
In addition, you can use `/context` to check current usage at any time.
7376

74-
Together, these controls create a tight feedback loop: the agent makes a change, a test, rule, or scan evaluates it, and the agent adjusts before moving forward. This prevents long chains of incorrect changes, which are one of the biggest drivers of token waste.
77+
See [AUTOTITLE](/copilot/concepts/agents/copilot-cli/context-management).
7578

76-
Teams that invest in these guardrails see fewer retries, faster task completion, and more predictable agent behavior. They often reduce total token consumption even if individual steps use slightly more tokens upfront.
79+
### Give {% data variables.product.prodname_copilot_short %} a map of your project
80+
81+
A well-maintained custom instructions file, such as an `AGENTS.md` or `.github/copilot-instructions.md` file, gives agents a structural overview of your repository so they don't have to read large numbers of files just to orient themselves. See [AUTOTITLE](/copilot/reference/custom-instructions-support).
7782

78-
## 5. Maintain a concise `copilot-instructions.md`
83+
### Bring in only the tools you need
84+
85+
Large tool sets (for example, a full MCP server's worth of tools) add to the context on every request. Where it fits your workflow, enable only the toolsets relevant to the task.
86+
87+
See [AUTOTITLE](/copilot/how-tos/provide-context/use-mcp-in-your-ide/configure-toolsets).
88+
89+
### Take advantage of context caching
90+
91+
{% data variables.product.prodname_copilot_short %} reuses context you've already sent through caching, which lowers the cost of follow-up turns. However, cached context expires after a period of inactivity and isn't reused when you switch models mid-session. In both cases, the context is re-sent and billed again as fresh input tokens. To get the most from caching, keep related work in one continuous session and avoid switching models partway through.
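
The two cache-miss conditions described above can be sketched as a small toy model. Everything here is an assumption made for illustration: the `Turn` type, the model names, and the `cache_ttl_seconds` value are hypothetical, and the real expiry period is not stated in this article.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    model: str                     # hypothetical model name
    seconds_since_previous: float  # idle time before this turn


def fresh_context_turns(turns, cache_ttl_seconds):
    """Count turns whose context would be re-sent as fresh input tokens."""
    fresh = 0
    previous_model = None
    for turn in turns:
        cache_hit = (
            previous_model == turn.model
            and turn.seconds_since_previous <= cache_ttl_seconds
        )
        if not cache_hit:
            fresh += 1
        previous_model = turn.model
    return fresh


session = [
    Turn("model-a", 0),     # first turn: nothing cached yet
    Turn("model-a", 60),    # same model, short pause: cache reused
    Turn("model-b", 30),    # model switch: context re-sent
    Turn("model-b", 4000),  # long pause: cache expired
]
print(fresh_context_turns(session, cache_ttl_seconds=300))  # prints: 3
```

In this toy session, three of the four turns pay for fresh context. Staying on one model without long pauses would leave only the first turn uncached.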
92+
93+
## 4. Reduce repeated errors with a `copilot-instructions.md` file
7994

8095
Persistent instructions improve consistency across agent interactions, but their value depends entirely on how they are written. A `copilot-instructions.md` file at the repository level is the most direct way to encode this guidance. Personal and organization-level instructions can layer on top for broader consistency.
8196

@@ -99,3 +114,39 @@ The best instructions are short, specific, and grounded in real observed agent b
99114
Keep instructions updated as your codebase, architecture, standards, and workflows evolve. Because these instructions are included in the agent's context on every run, even small improvements can reduce repeated errors and lower wasted token usage over time.
100115

101116
For more information, see [AUTOTITLE](/copilot/how-tos/copilot-on-github/customize-copilot/add-custom-instructions/add-repository-instructions).
117+
118+
## 5. Research, plan, then implement
119+
120+
One of the biggest shifts in working effectively with agents is moving away from doing everything in a single session. When research, planning, and implementation all happen together, context grows quickly, irrelevant information accumulates, and agent quality degrades over time.
121+
122+
Break work into clear phases:
123+
124+
* **Research:** Use the agent to explore the codebase, identify relevant files, and understand dependencies.
125+
* **Plan:** Create a detailed, structured plan or specification before making changes. This is where reasoning models are most valuable.
126+
* In {% data variables.copilot.copilot_cli_short %}, use `/plan`.
127+
* In {% data variables.copilot.copilot_chat_short %} in {% data variables.product.prodname_vscode %}, select "Plan" from the agent dropdown, or type `plan` in the context window.
128+
* **Implement:** Execute against the plan using focused context and a model suited for execution.
129+
130+
Starting a new session between phases prevents carrying unnecessary context forward. Carrying forward context from earlier phases can increase token usage, introduce bias, and reduce clarity for the agent. Each phase should operate with only what it needs. For guidance on scoping sessions effectively, see [AUTOTITLE](/copilot/tutorials/cloud-agent/get-the-best-results).
131+
132+
## 6. Add deterministic guardrails
133+
134+
Agents are non-deterministic and won't be correct every time, especially in multi-step workflows. Without guardrails, small errors can compound quickly: agents build on incorrect outputs, drift further from the goal, and make debugging more expensive and time-consuming.
135+
136+
Deterministic controls introduce clear pass/fail signals:
137+
138+
* **Unit tests** verify the agent's changes produced the expected behavior.
139+
* **Linters** enforce structure and consistency, preventing formatting issues, style drift, and avoidable cleanup work.
140+
* **Security scans** catch risky patterns early, before they are harder to unwind.
141+
142+
Together, these controls create a tight feedback loop: the agent makes a change, a test, rule, or scan evaluates it, and the agent adjusts before moving forward. This prevents long chains of incorrect changes, which are one of the biggest drivers of token waste.
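
As one possible shape for this feedback loop, the sketch below runs a list of check commands and reduces them to a single pass/fail signal plus the failing output, which can then be handed back to the agent. The function name and the example commands (`pytest`, `ruff`) are assumptions for illustration. Substitute whatever test, lint, and scan commands your project already uses.

```python
import subprocess
import sys


def run_guardrails(checks):
    """Run each check command and return (passed, failures)."""
    failures = []
    for name, command in checks:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            # Keep the output so the agent can see why the check failed.
            failures.append((name, result.stdout + result.stderr))
    return (not failures, failures)


# Hypothetical checks: these assume pytest and ruff are installed.
EXAMPLE_CHECKS = [
    ("unit tests", [sys.executable, "-m", "pytest", "-q"]),
    ("lint", [sys.executable, "-m", "ruff", "check", "."]),
]
```

Calling `run_guardrails(EXAMPLE_CHECKS)` after each agent change gives a deterministic gate: continue only when `passed` is true, otherwise feed the failure output back before the next step.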
143+
144+
Teams that invest in these guardrails see fewer retries, faster task completion, and more predictable agent behavior. They often reduce total token consumption even if individual steps use slightly more tokens upfront.
145+
146+
## Next steps
147+
148+
In addition to improving agent efficiency, you can also monitor and manage your spending to get the most out of your {% data variables.product.prodname_ai_credits_short %}:
149+
150+
* **Use your dashboard and budget controls**. The "AI usage" page, under https://github.com/settings/billing, breaks down consumption across every feature and model, so you can see where your credits are actually going and adjust accordingly.
151+
* **Identify expensive patterns before they add up**. Within a {% data variables.copilot.copilot_cli_short %} session, use `/usage` to see session-level metrics and to spot expensive patterns as you work. In addition, `/chronicle tips` analyzes your recent session history and surfaces opportunities to use {% data variables.product.prodname_copilot_short %} more efficiently.
152+
* **Upgrade for a larger allowance**. If you regularly approach your monthly limit, a higher plan may be more economical than paying for additional usage, because higher plans include a larger {% data variables.product.prodname_ai_credit_singular %} allowance. See [AUTOTITLE](/copilot/concepts/billing/individual-plans#github-ai-credits-allowance-by-plan) and [AUTOTITLE](/copilot/how-tos/manage-your-account/view-and-change-your-copilot-plan).
