Jensen Huang × Lex Fridman: The AI Revolution Behind NVIDIA’s $4 Trillion Market Cap

10 min read

When NVIDIA’s market cap reached $4 trillion, Jensen Huang sat down with Lex Fridman for a conversation of over two hours, ranging from Amdahl’s Law to human consciousness, from the darkest moment when the market cap plummeted to $1.5 billion to his “hoping to die instantly on the job.”

He is the longest-serving CEO of any tech company in the world, having led NVIDIA for 34 years. The company is now the most valuable publicly listed company in history, yet he says holding up a chip for photos is “still kind of adorable, but that’s not what’s in my mind anymore.” What’s on his mind is planetary-scale infrastructure.

Original video: https://youtube.com/watch?v=vif8NQcjVf0

Key Takeaways:

• When NVIDIA put CUDA into GeForce, costs increased by 50%, almost swallowing all profits. Market cap dropped from around $8 billion to $1.5 billion, taking a decade to recover. Jensen considers this the closest the company ever came to an existential decision.

• Jensen proposed four AI scaling laws (pre-training, post-training, inference at test time, agents). The core conclusion: Intelligence is ultimately determined solely by computing power. Token cost decreases by an order of magnitude each year.

• The Grace Blackwell architecture is designed for large language model inference; the Vera Rubin architecture is designed for agents—the latter started design two years before Claude Code and OpenClaw appeared.

• Jensen has collaborated with TSMC for thirty years, handling tens of billions of dollars in business without a single contract. In 2013, Morris Chang invited him to become TSMC’s CEO; he declined within 10 minutes.

• He believes the definition of programming has changed; the number of people who can program will expand from 30 million to 1 billion. The case of radiologists proves: after AI surpassed humans, the number of professionals in that field actually grew.

• Jensen’s management style: over 60 direct reports, no one-on-one meetings, all problems discussed collectively. Company structure should mirror product architecture.

• Regarding AGI, Jensen believes it has already been achieved, but “the probability of 100,000 Agents building NVIDIA is zero.”
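The “order of magnitude each year” claim about token cost compounds quickly. A minimal sketch with purely illustrative numbers (not NVIDIA data) shows why the effect is so dramatic:

```python
# Illustrative compounding only: if the cost per token falls by 10x
# each year, a fixed budget buys 10x more tokens every year.
tokens_per_dollar = 1          # arbitrary baseline at year 0
for year in range(3):
    tokens_per_dollar *= 10

print(tokens_per_dollar)       # 1000: three years of 10x/year = 1000x
```

In other words, if the trend holds, inference that is marginal today becomes economically trivial within a few product cycles.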

Lex asked: NVIDIA has expanded from designing single GPUs to simultaneously designing GPUs, CPUs, memory, networking, storage, power, cooling, software, racks, Pods, and even entire data centers. With so many complex variables, what’s the hardest part?

Jensen first explained why this is necessary. In the past, a problem could be accelerated by one computer, one GPU. Not anymore. You connect 10,000 computers and want the workload to run 1 million times faster. This is where Amdahl’s Law bites: if computation is only 50% of the total workload, then even accelerating the computation infinitely can at most double the total speed.

So when you distribute work across thousands of machines, every component becomes a bottleneck—CPU, GPU, network, switches, workload distribution. “This is an extremely complex computer science problem. We must mobilize all technologies. Otherwise, we can only scale linearly, or scale according to the already slowing Moore’s Law.”
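The 50%-means-at-most-2x claim above is just Amdahl’s Law evaluated at one point. A minimal sketch of the formula (speedup = 1 / ((1 − p) + p/s), where p is the accelerated fraction and s its speedup factor):

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the workload is
    accelerated by a factor s (Amdahl's Law)."""
    return 1.0 / ((1.0 - p) + p / s)

# If computation is only 50% of the workload, even a near-infinite
# accelerator caps the total speedup just under 2x.
print(amdahl_speedup(0.5, 1e12))   # just under 2.0
print(amdahl_speedup(0.5, 10))     # about 1.82, despite a 10x accelerator
```

This is why NVIDIA attacks every component at once: pushing p toward 1.0 (network, switches, CPU, workload distribution) matters as much as raising s on the GPU itself.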

This also directly explains Jensen’s management style. He has over 60 direct reports, almost all with engineering backgrounds—experts in memory, CPU, optics, GPU architecture, algorithms. No one-on-one meetings.

We don’t do one-on-ones. We present a problem and all of us attack it. Because we’re doing extreme co-design. And literally, the company is doing extreme co-design all the time.

Even when discussing a specific component, like a cooling solution, everyone is listening. The memory person might say ‘this won’t work for memory,’ the power person might say ‘power consumption can’t handle it.’ If someone feels they have something to contribute on a topic but hasn’t spoken up, Jensen pulls them back in.

Jensen believes most companies’ organizational structures make no sense. “Hamburger type, software type, car company type—they all look the same.” His logic is: A company is a machine that produces products; its architecture should reflect the product and its environment.

A Vera Rubin Pod requires 7 types of chips, 5 types of racks, and coordination with 200 suppliers, so the company must have 60 domain experts solving problems in the same room.

[Note: Vera Rubin is NVIDIA’s next-generation AI computing platform announced at GTC 2025, named after astronomer Vera Rubin, expected to ship in the second half of 2026. A Vera Rubin Pod contains about 40 racks, over 1,100 Rubin GPUs, with total computing power of 60 exaflops.]

The CUDA Bet—The Decision That Dropped Market Cap from $8B to $1.5B

Lex asked about the most critical strategic decision in NVIDIA’s history: putting CUDA on GeForce. Jensen said this was the company’s closest moment to an “existential threat.”

The background story is this. NVIDIA started as an accelerator company, optimizing for specific tasks. But the problem with specialization is that the market is too narrow; market size determines R&D investment, which in turn determines how much impact you can have in computing. So NVIDIA always wanted to become a “computing company,” but there is a fundamental tension here: being a better computing company and being a more specialized accelerator pull in opposite directions.

The first step was programmable pixel shaders. The second step was adding IEEE-compliant FP32 (32-bit floating-point) to the shaders, which allowed scientists and researchers to discover GPUs could be used for general-purpose computing. Then came adding C language (Cg) on top of FP32, evolving into CUDA.

But the key strategic decision was: Putting CUDA into GeForce, embedding CUDA in every consumer graphics card, whether users used it or not.

Jensen used the x86 example to explain why. x86 is perhaps the most criticized architecture, but it defines computing today. Those elegant RISC architectures designed by top computer scientists failed. There’s only one reason: installed base.

Installed base defines an architecture. Everything else is secondary.

At the time, GeForce was selling millions of units annually. Putting CUDA in meant putting a supercomputer in every PC user’s hands. NVIDIA simultaneously taught courses at universities, wrote books, promoted CUDA. Researchers and scientists discovered CUDA through gaming graphics cards—many were PC gamers themselves, many labs built clusters using PC components.

But the cost was devastating.

CUDA increased GeForce’s cost by about 50%, and NVIDIA was a company with only 35% gross margin at the time.

This meant CUDA almost swallowed all gross profit. Market cap dropped from around $6-8 billion to about $1.5 billion, stayed at that level for a while, slowly climbed back, all while carrying CUDA on GeForce. It took ten years.
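The margin arithmetic makes the claim concrete. A quick sketch with round hypothetical numbers (the 50% cost increase and 35% gross margin are from the interview; the $100 price is arbitrary):

```python
# Hypothetical round numbers, not NVIDIA's actual financials:
# at a 35% gross margin, cost is 65% of the selling price.
price = 100.0
cost = price * (1 - 0.35)          # 65.0
new_cost = cost * 1.5              # CUDA adds ~50% to cost -> 97.5
new_margin = (price - new_cost) / price

print(f"{new_margin:.1%}")         # 2.5% -- gross profit nearly wiped out
```

A 35-point margin collapsing to roughly 2.5 points is consistent with Jensen’s description of CUDA “almost swallowing all gross profit.”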

Lex pressed: How did you psychologically handle making this decision?

Jensen said he had to make the board understand the goal; the management team knew gross margin would be crushed. One could reasonably deduce—one day CUDA would enter workstations and supercomputers, where higher margins could be achieved. “You can convince yourself with reasoning that you can afford it, but it actually took ten years.”

You manifest a future and that future is so convincing, there’s no way it won’t happen. There’s a lot of suffering in between, but you’ve gotta believe what you believe.

[Note: CUDA has now iterated to version 13.x and is the de facto standard in global AI and high-performance computing. NVIDIA’s official figures put the CUDA developer ecosystem in the millions. OpenCL, which competed with CUDA, still exists but has far less influence in AI.]

Change Everyone’s Beliefs Before Announcing

Jensen then explained his unique leadership method. He never makes sudden big pivots—no massive layoffs, major reorganizations, new mission statements, new logos. When he learns about a new trend that starts influencing his thinking, he immediately starts talking about it in front of people around him—”This is interesting,” “What impact will this have,” “This will change that.”

He does this every day. To the board, to management, to employees. He’s gradually shaping everyone’s belief system, so when he one day says “Let’s go all-in on deep learning,” everyone’s reaction isn’t “What are you talking about?” but “What took you so long?”

He used the Mellanox acquisition as an example: On the day of the announcement, everyone felt it was obvious. Because he had laid the groundwork for a long time.

“On the day of the announcement, I want the employees’ reaction to be: ‘Jensen, what took you so long?'”

GTC itself is also his belief-shaping tool—not just for internal employees, but for the entire industry’s partners. At GTC, he synchronizes his world model with suppliers, developers, partners. “I don’t think there have been hundreds of CEOs attending a single keynote in history.” He treats GTC as collective path alignment for the entire ecosystem.

[Note: Mellanox is an Israeli high-speed networking technology company. NVIDIA announced its acquisition for about $6.9 billion in 2019, completed in 2020. This acquisition is considered a key step for NVIDIA entering the data center networking market. Mellanox’s InfiniBand and Ethernet technologies are now core components of NVIDIA AI systems; its networking business quarterly revenue exceeds $10 billion.]

Inference is Harder Than Reading—Four AI Scaling Laws

Lex asked if Jensen still believed in scaling laws. Jensen replied that not only does he still believe, there are now more scaling laws than before.

He listed four stages.

The first is pre-training scaling.

The larger the model, the more data, the smarter the AI. When Ilya Sutskever said something like “We’re running out of data,” the industry panicked, thinking AI had hit a wall.

[Note: Ilya Sutskever later left OpenAI and founded SSI (Safe Superintelligence Inc.). He expressed views in a 2024 public event that pre-training data might be nearing a bottleneck.]

Jensen thinks this is “obviously wrong.” Synthetic data will continue to expand training data scale. He pointed out an overlooked fact: Most data we use for teaching is already “synthetic”—people create it, consume it, modify it, regenerate it. AI can now synthesize massive new data based on real data. The result: Training is no longer limited by data, but by computing power.

The second is post-training scaling (fine-tuning, reinforcement learning, etc.).

The third is inference at test time scaling.

Jensen recalled that many people told him “inference is simple, chips will be small, no need for expensive NVIDIA stuff.”

Inference is thinking, and I think thinking is hard. Thinking is way harder than reading. Pre-training is just memorization and generalization, finding patterns. Thinking is reasoning, planning, searching, breaking new problems into solvable pieces, solving with first principles or past experience. How could that not require massive computing power?

The fourth is agent scaling.

A large language model can do research, query databases, use tools during inference, and more importantly, spawn numerous sub-agents. “The easiest way to scale NVIDIA is to hire more people, not make myself stronger.” Agent scaling is AI’s “hiring.”

These four stages form a cycle: Agents generate data, good data flows back to pre-training, refined in post-training, enhances inference at test time, deployed by agents.

Core conclusion: The scaling of intelligence ultimately depends on only one thing—computing power.

[Note: This conclusion has obvious interest alignment. NVIDIA is the world’s largest AI computing power supplier; the narrative “intelligence = computing power” directly benefits its business. However, from current trends, major AI labs are indeed increasing computing power investment at an unexpected pace.]

Drew OpenClaw’s Architecture Diagram Two Years Ago

Lex asked: Hardware can’t pivot in a week, and AI model architectures change every six months. How do you predict two to three years ahead?

Jensen gave a three-layer answer.

First layer is internal R&D.

NVIDIA does its own basic research