Why AI Can’t Replace Experts (Yet): Word vs. World Models

I keep seeing the headlines. You have probably seen them too. They claim accountants, lawyers, and coders will be obsolete in a year.

If you ask a startup founder, they look at an AI-generated legal memo and say, “This is perfect. Lawyers are done.”

If you ask a lawyer, they look at the same memo and say, “If I send this, I get disbarred.”

I wanted to understand this disconnect. Why do outsiders think AI is ready to take over while experts see critical failures? I dug into the EverydayCPE course transcript and the underlying research from Latent Space to find the answer.

It comes down to a fundamental difference in how machines and humans operate: Word Models vs. World Models.

The Difference Between Sounding Right and Being Right

Large Language Models (LLMs) like ChatGPT are “Word Models.” They are statistical engines: given the words so far, they predict the most likely next word, using billions of learned parameters. Their goal is coherence, tone, and plausibility.
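
To make “word model” concrete, here is a minimal sketch of next-word prediction in Python. The candidate words and scores are invented for illustration; a real LLM scores tens of thousands of tokens with billions of parameters, but the mechanic is the same: pick whatever reads most plausibly next, with no check on consequences.

```python
import math

# Toy "word model": score a few candidate next words and pick the most
# plausible one. The candidates and scores are invented for illustration;
# a real LLM computes them from billions of learned parameters.
def next_word(candidate_scores: dict[str, float]) -> str:
    # Softmax turns raw scores into probabilities.
    exps = {w: math.exp(s) for w, s in candidate_scores.items()}
    total = sum(exps.values())
    probs = {w: e / total for w, e in exps.items()}
    # The model's whole objective: the most probable next word.
    # Nothing here asks "what happens after I say this?"
    return max(probs, key=probs.get)

print(next_word({"schedule": 2.1, "deadline": 0.3, "budget": -1.0}))
# -> "schedule": plausible text, with no view of the consequences.
```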

Experts possess “World Models.” They simulate the environment. They predict the consequences of an action. Their goal is survivability and outcomes.

I analyzed the course data to break down why this distinction stops AI from replacing true experts.

Artifacts vs. Moves

The course highlights a concept called “Artifacts vs. Moves.”

An outsider judges the Artifact: Is the email polite? Is the code grammatical? Does the memo look professional?

An expert judges the Move: Will this email get a response? Will this code crash when a user does something weird? Will this memo survive a judge?

The “Priya” Email Test

I looked at a specific example from the lecture regarding a Slack message to a colleague named Priya.

The AI Draft:

“Hey Priya, no rush, but whenever it fits your schedule, could you take a look at this?”

  • The Artifact: It is polite. It is grammatical. It looks like a good email.
  • The Reality: It sends the wrong signals. “No rush” means “ignore this.” “Take a look” is vague. Priya will likely never get to it.

The Expert Draft:

“Need 15 minutes before Friday or we are blocked.”

  • The Move: It is direct. It establishes stakes. It sets a deadline. It gets the result.

The AI optimized for text generation. The expert optimized for the outcome.

Chess vs. Poker: The Information Gap

To understand why LLMs fail at complex tasks, I looked at the analogy of Chess versus Poker.

Chess is deterministic. All pieces are visible. There is a “best move” for every state. LLMs are decent here because the logic is solvable.

Poker is probabilistic. You have hidden information. You have to model what your opponent is thinking. You have to bluff.

Most jobs are not Chess. They are Poker.

In the real world, you are dealing with imperfect information. You are dealing with human psychology.

  • Sales: You have to know when a client is lying about their budget.
  • Coding: You have to predict how a user will try to break your app.
  • Law: You have to anticipate the opposing counsel’s strategy.

LLMs struggle here. They can explain game theory perfectly, but they do not feel the pressure of a counter-move. They are static.
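
One way to see the gap: in chess you can score the position on the board, while in poker you can only compute an expected value against a belief about a hidden hand, and that belief must keep updating as the opponent adapts. Here is a minimal sketch of the poker side, with every number invented for illustration:

```python
# Poker-style decision under hidden information: should we call a bet?
# Unlike chess, there is no board position to score directly; we can
# only weigh outcomes against a belief about the opponent's hidden hand.
def call_ev(p_win: float, pot: float, cost: float) -> float:
    # Expected value of calling: win the pot with probability p_win,
    # lose the cost of the call with probability (1 - p_win).
    return p_win * pot - (1 - p_win) * cost

# Against a loose opponent, we might believe we win 45% of the time...
print(call_ev(0.45, pot=120, cost=20))  #  43.0 -> positive EV: call
# ...but if this opponent only bets big with a monster hand, the same
# bet should shift our belief, and the "best move" flips with it.
print(call_ev(0.10, pot=120, cost=20))  #  -6.0 -> negative EV: fold
```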

The Chatbot That Gave an 80% Discount

The lecture referenced a specific failure mode regarding a UK chatbot.

A user spent an hour talking to a customer service AI. They slowly walked the bot through logical circles. Eventually, they convinced the bot that it made sense to offer an 80% discount on a product. The bot generated a discount code.

Why did this happen?

The AI does not have a “World Model.” It does not know it is in a negotiation. It does not know it is being gamed. It just predicted the next polite, agreeable word until it agreed to give away the store.

Humans probe for weakness. We adapt. When we see a pattern, we exploit it. AI currently cannot model the fact that it is being modeled.
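
This is why the practical fix today is not a smarter word model but a hard constraint outside it: a business rule the bot cannot be talked out of. Here is a minimal sketch of that guardrail pattern; the 15% cap and the function name are hypothetical, for illustration only.

```python
# Guardrail pattern: the chatbot can *say* anything, but a plain business
# rule outside the model decides what actually gets issued.
MAX_DISCOUNT = 0.15  # hypothetical hard policy the bot cannot be argued out of

def issue_discount(model_proposal: float) -> float:
    # Clamp whatever the model was talked into down to policy.
    if not 0.0 <= model_proposal <= 1.0:
        raise ValueError("discount must be between 0% and 100%")
    return min(model_proposal, MAX_DISCOUNT)

# An hour of clever prompting convinces the bot that 80% "makes sense"...
bot_proposal = 0.80
# ...but the discount code comes from the clamped value, not the chat.
print(f"issued: {issue_discount(bot_proposal):.0%}")  # issued: 15%
```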

Key Takeaways

  • Outsiders vs. Experts: Outsiders judge AI on coherence. Experts judge AI on survivability.
  • Text is not the Product: A polite email that gets ignored is a failure. A direct email that gets a meeting is a success.
  • The Poker Problem: AI works well in perfect-information environments (Chess) but fails in hidden-information environments (Poker).
  • Static Vulnerability: LLMs do not realize they are being analyzed. They can be talked into bad decisions because they lack a model of the adversary.
  • The Future: True reliability requires models that simulate outcomes, not just language.

Want to earn CPE for this topic?

  1. Compare Options: See how we stack up against others in our 2025 Flexible CPE Guide.
  2. Understand the Format: Read how Nano-Learning works for CPAs.
  3. Check Your State: Ensure you are compliant with our State Requirements Guide.
  4. What is EverydayCPE?