Can Claude Run a Small Shop? (And Why Does That Matter?)

Can AI run your business? Anthropic (maker of the Claude AI models) wanted to find out:

"We let Claude manage an automated store in our office as a small business for about a month. We learned a lot from how close it was to success—and the curious ways that it failed—about the plausible, strange, not-too-distant future in which AI models are autonomously running things in the real economy."

The short answer: No. But the experiment offers a lot more to look at and learn from:

"It's worth remembering that the AI won't have to be perfect to be adopted; it will just have to be competitive with human performance at a lower cost in some cases."

"An AI that can improve itself *and* earn money without human intervention would be a striking new actor in economic and political life."

It also comes with a number of warnings and red flags:

"We do think this illustrates something important about the unpredictability of these models in long-context settings and a call to consider *the externalities of autonomy*."

"In a world where larger fractions of economic activity are autonomously managed by AI agents, odd scenarios like this could have cascading effects—especially if multiple agents based on similar underlying models tend to go wrong for similar reasons."

In summary:

"Although this might seem counterintuitive based on the bottom-line results, we think this experiment suggests that AI middle-managers are plausibly on the horizon."

The whole study is worth reading in full.

Pascal Finette @radical