A new trend is emerging among companies that use artificial intelligence: a shift away from the aggressive utilization strategy known as “tokenmaxxing” toward a more strategic approach, model switching. Organizations are embracing the change as they reassess their AI spending and look for ways to improve efficiency and cost-effectiveness.
Morgan Linton, the chief technology officer at the AI startup Bold Metrics, exemplifies this new approach. Based in Lake Tahoe, Linton meets with his team of 16 engineers twice a week to guide them on which AI models to use for which tasks. On a recent occasion, he advised one group to use the Claude Fable model at a low capacity while another team used the more advanced GPT-5.5 on a high setting. A third team used Cursor with Composer 2.5, reportedly yielding “totally perfect results.” Linton emphasized that this tailored model usage lets his team achieve peak performance without resorting to hard token caps, leading to more efficient workflows.
In contrast to the first half of 2026, which was characterized by a push for extensive AI use (tokenmaxxing), many companies, including tech giants like Uber and Microsoft, are now adopting a more discerning strategy. The shift is driven partly by the rising cost of heavy AI usage, which has prompted leaders across sectors to impose stricter budget reviews and usage limits.
Key players in the tech community, ranging from founders and software engineers to UX designers, have begun to prioritize model switching. This approach enables them to assign high-complexity tasks to premium models while reserving more straightforward tasks for older, more affordable alternatives. Coinbase CEO Brian Armstrong has articulated this sentiment, forecasting that within the next year or so, 80% of workloads will leverage “99% cheaper models,” as only a small fraction will require the latest advancements where “IQ maxxing” is critical.
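To make Armstrong's forecast concrete, the arithmetic can be sketched in a few lines of Python. This is an illustration under simplifying assumptions that are not in his statement: every workload would cost the same on a premium model, and "80% of workloads" is read as 80% of that baseline spend. The function name `blended_cost_fraction` is hypothetical.

```python
def blended_cost_fraction(cheap_share: float, discount: float) -> float:
    """Return total spend as a fraction of an all-premium baseline.

    cheap_share: fraction of workloads moved to the cheaper model (0..1)
    discount:    how much cheaper that model is (0.99 means 99% cheaper)

    Assumption: all workloads cost the same on the premium model.
    """
    premium_share = 1.0 - cheap_share
    # Premium workloads keep full price; cheap ones pay (1 - discount).
    return premium_share + cheap_share * (1.0 - discount)


# Armstrong's figures: 80% of workloads on "99% cheaper models".
fraction = blended_cost_fraction(cheap_share=0.80, discount=0.99)
print(f"Spend relative to all-premium baseline: {fraction:.3f}")
print(f"Implied savings: {1.0 - fraction:.1%}")
```

Under these assumptions the remaining 20% of premium workloads dominate the bill, so total spend falls to roughly a fifth of the baseline rather than to one percent of it.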
Among those advocating for model versatility, Chris Maconi, cofounder of Hechura, has maintained a “human-in-the-loop” philosophy that eschews constant automation. Reflecting on the pitfalls of tokenmaxxing, he notes the inherent dangers of relying solely on high-capacity AI models and shares his experiences experimenting with more economical options.
Users across the industry have also adapted their workflows to save tokens. Tanvi Pisal, a UX designer, shared her journey toward efficiency, describing how she once wasted significant resources by relying too heavily on advanced models for preliminary brainstorming tasks. Now, she adopts a design-first strategy, utilizing tools like Figma and employing AI at a later stage to refine her work.
Software engineer Alejandra Thomas describes a methodical approach to AI models: she tests new iterations to find the right fit for her tasks while consistently opting for lighter, cost-effective models for simplicity. Similarly, Ed Stevens, CEO of AI sales company Scoot, describes a systematic process of evaluating models and potentially switching between them based on their performance over time.
Behavioral economics expert Dan Ariely notes the psychological implications of token limits, drawing a parallel to the cellphone plans of the late 1990s that imposed minute restrictions. He argues that token budgets can create inefficiencies as users strive to meet targets, and that they inadvertently encourage model switching as users reach their limits.
Supporting this shift toward strategic decision-making is a growing number of model routing startups, such as Rayline. These companies offer tools that assign tasks to different AI models based on each model's capabilities. According to industry experts, adoption of model routing platforms, previously limited to a small percentage of firms, is on the rise, pointing to a more strategic use of AI.
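The routing idea can be illustrated with a minimal sketch. It does not reflect how Rayline or any other vendor works. The model names, capability scores, and prices below are hypothetical placeholders, and the rule shown (pick the cheapest model that meets a task's required capability) is just one simple policy a router could apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                  # hypothetical model identifier
    capability: int            # hypothetical 1-10 capability score
    cost_per_1k_tokens: float  # hypothetical price, arbitrary units


# Hypothetical catalog; a real router would load this from configuration.
CATALOG = [
    Model("small-model", capability=3, cost_per_1k_tokens=0.01),
    Model("mid-model", capability=6, cost_per_1k_tokens=0.10),
    Model("frontier-model", capability=10, cost_per_1k_tokens=1.00),
]


def route(required_capability: int, catalog: list[Model] = CATALOG) -> Model:
    """Pick the cheapest model whose capability meets the requirement."""
    eligible = [m for m in catalog if m.capability >= required_capability]
    if not eligible:
        raise ValueError(f"no model meets capability {required_capability}")
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)


# Simple tasks land on the cheap model, demanding ones on the premium model.
for task, needed in [("rename variables", 2), ("refactor module", 6), ("design system", 9)]:
    print(f"{task}: {route(needed).name}")
```

In practice the hard part is estimating the required capability for a task. This sketch takes that score as given.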
Despite this trend, some organizations remain committed to using the latest models without fully understanding where they are needed. Maconi attributes this reluctance to a lack of initiative in exploring the most effective model strategies, which leads to missed opportunities for cost savings and efficiency gains. As the AI sector continues to mature, it seems likely that model switching and smart resource allocation will define how artificial intelligence is applied.



