
OpenAI co-founder's latest interview: After shutting down Sora, what is the next step for ChatGPT?

律动BlockBeats
1 day ago
Video Title: OpenAI President Greg Brockman: AI Strategy, AGI, and the Super App
Video Author: Alex Kantrowitz
Translation by: Peggy, BlockBeats

Editor's Note: This article is a translation of a discussion with Greg Brockman, President and Co-founder of OpenAI, on the Big Technology Podcast. This program has long focused on changes in AI, the technology industry, and business structures, serving as an important window for observing frontline judgments in Silicon Valley.

In this discussion, Brockman did not stay focused on the capabilities of the models themselves, but pushed the question further: once AI's capabilities have basically been validated, how will the industry choose its path, reshape product forms, and cope with the systemic shocks it brings? The dialogue centered on OpenAI's product strategy, the soon-to-launch "super app," and Brockman's assessment that AI is entering its "takeoff phase."

This conversation can be understood from three aspects.

First, the convergence of paths.
From video generation to reasoning models, from parallel development to active choices, OpenAI's decisions are not simply judgments of technical superiority, but responses to real-world constraints—computing power has become the core bottleneck. Under limited resources, the technical direction has begun to converge on the two areas offering the most leverage: personal assistants and complex problem solving. This also implies that the competitive logic of AI is shifting from "what can be done" to "what should be done first."

Second, the reconstruction of forms.
The proposal of the "super app" essentially represents a leap in product form. AI is no longer a collection of scattered tools but a unified entrance: it understands context, calls tools, performs tasks, and continuously accumulates memory across different scenarios. From ChatGPT to Codex, AI is gradually taking over entire workflows, with the role of humans shifting from executors to coordinators—setting goals, allocating tasks, and supervising operations.

Third, the turning point in pace.
If the past two years were a stage of climbing capabilities, what is occurring now is "takeoff." On one hand, model capabilities have leaped from "supporting about 20% of work" to "covering about 80% of tasks," directly triggering a restructuring of workflows; on the other hand, AI is participating in its own evolution (using AI to optimize AI), adding collaboration with chips, applications, and enterprises, creating a continuously accelerating closed loop. AI is no longer a single-point technology, but is starting to become a key engine driving economic growth.

However, at the same time, another set of issues is also emerging: public distrust, employment uncertainty, controversies brought by data centers, and the boundaries of safety and governance. In response, Brockman provides answers that are not entirely technical. He emphasizes two points: first, risks cannot be resolved through "centralized control"; a social infrastructure similar to electricity systems needs to be built around AI; second, individual capabilities are undergoing transformation—what truly matters is no longer "can you use the tools," but "can you achieve your goals with AI."

If the previous question was "What can AI do?" the current question has transformed into, "What do you still need to do when AI starts taking care of most things for you?"

Below is the original content (formatted for easier reading):

TL;DR

AGI has entered the "clear path" stage: Greg Brockman (Co-founder of OpenAI) believes that the reasoning model based on GPT has a clear route to AGI, expected to be achieved within a few years, but its form will still be "jagged."

Note: AGI (Artificial General Intelligence) refers to general artificial intelligence, which means an AI system that possesses abilities comparable to or surpassing human capabilities in most cognitive tasks. Unlike current "narrow AI" (such as image recognition and recommendation algorithms), AGI emphasizes generality and transferability across tasks.

Strategic convergence: From multiple explorations to two core applications: Under constraints of computational power, OpenAI is concentrating resources on "personal assistants" and "complex problem-solving," rather than advancing all directions (such as video generation) simultaneously.

"Super apps" will become the entrance form for AI: Chat, programming, browsers, and knowledge work will be integrated into a unified system, transforming AI from a tool into an "execution layer," with users shifting to "coordinators."

Critical turning point: AI begins to take over workflows rather than assist: Model capability has risen from "completing 20% of tasks" to "taking on 80%," forcing individuals and enterprises to restructure their working methods.

Computational power as a core bottleneck and competitive focus: AI demand far exceeds supply. In the future, limitations will not lie in model capabilities but in computational resources, with data centers and infrastructure becoming critical variables.

AI "takeoff" is happening: Technology self-accelerating (AI optimizing AI) combined with industry collaboration (chips, applications, enterprise) is driving AI from a tool to an economic growth engine.

The biggest risks are not in technology but in governance and usage: Safety issues cannot be solved by a single entity; a collaborative ecosystem and social infrastructure are required.

Individual core capabilities are transforming: Future competitiveness will not lie in "execution," but in "setting goals + managing AI systems." Proactively using AI will become a fundamental capability.

Discussion Compilation:

Alex (Host):
Today we have invited Greg Brockman, Co-founder and President of OpenAI, to talk about the most promising opportunities in AI, how OpenAI will seize these opportunities, and the concept of the "super app." Greg is also here in our recording studio today.

Greg Brockman (Co-founder & President of OpenAI):
Great to see you, thank you for having me.

Why Shut Down Sora? Not Enough Computational Power

Alex:
This is a very interesting moment. OpenAI is pausing the advancement of video generation to concentrate resources on a "super app"—which will integrate commercial and programming scenarios. From an outside perspective (including mine), it seems OpenAI has already gained a lead on the consumer end, yet is now adjusting resource allocation. What exactly is happening?

Note: In March 2026, OpenAI announced the closure of its video generation product Sora (including the application and API) and ceased related commercial progress.

Greg Brockman:
For some time now, we have been developing deep learning technologies, trying to verify if it can truly produce the positive impact we envisioned—whether it can be used to build applications that genuinely assist people and improve their lives.

At the same time, we have been pursuing another path: deploying this technology. On one hand, to sustain business operations, and on the other hand, to accumulate real-world experience in advance, preparing for when the technology truly matures.

Now, we have reached a new stage. We can see that this technology is indeed viable. We are shifting from "benchmark testing" and abstract capability demonstrations to a new phase—one where we must place it in the real world, let it participate in actual work, and keep evolving it through user feedback.

So I prefer to understand this change as a strategic pivot driven by a change in the technical stage.


This does not mean we are transitioning from the "consumer end" to the "enterprise end." More accurately, we are asking a question: under limited resources, which applications should we prioritize? Because we cannot do everything.

Which applications can truly land, generate synergy between each other, and bring real impact? If you list all directions, the consumer end can be broken down into many kinds: for instance, personal assistants, a system that truly understands you, aligns with your goals, and helps you achieve your life objectives; or creative and entertainment applications; and many other possibilities. On the enterprise side, if you look from a higher level, it can be abstracted into one thing: you have a complex task, can AI help you accomplish it?

For us, the current priorities are very clear, with only two at the forefront: first, personal assistants; second, AI that can help you solve complex problems.

The problem is: our current computational power cannot even fully support these two things. Once we add more application scenarios, it becomes impossible to cover everything. So this is actually a realistic judgment: the technology is rapidly maturing, the impact is about to explode, and we must make choices, selecting the most important directions to make real progress.

Alex:
You previously mentioned an analogy, saying OpenAI is somewhat like Disney: with a core competency that can extend into different scenarios. Disney has Mickey Mouse, allowing it to create movies, theme parks, Disney+. The "core" of OpenAI is the model, which can handle video generation, create assistants, and develop enterprise applications.

But now it seems you're no longer pursuing this broad "full-scale extension" path, but must make choices instead?

Greg Brockman:
I actually feel this analogy is even more valid now. But the key point is: from a technical perspective, Sora (video model) and GPT (reasoning model) actually belong to two different technical branches. They are built entirely differently.

The issue is, at this stage, advancing these two technical paths simultaneously is very challenging, especially under limited resources. So the choice we made was to focus our main resources on the GPT path at this stage.

Of course, this doesn’t mean we are abandoning other directions. For example, in the robotics field, we are still continuing with related research. But robotics is still in an earlier stage and has not yet entered a mature phase of true explosion.

In contrast, over the next year, we will see AI truly take off in the domain of knowledge work.


Moreover, it should be emphasized that the GPT route is not just about "text." For example, bidirectional speech interaction is also part of this technical path, which will make AI more usable and practical. These capabilities are essentially still within the same model framework, adjusted through different means.

However, if you head towards two completely different technical branches, it is very difficult to sustain that long-term under constrained computational power. The limitations in computational power arise from—demand is simply too great. Almost every model release sparks a desire among people to use it for more tasks.

Alex:
So why didn't you focus on the "world models" path? For instance, video models need to understand relationships between objects, which is crucial for robotics as well. The progress of Sora was actually very rapid. Why ultimately choose to bet on GPT?

Note: "World Models" focuses on perception and physical intuition, emphasizing the necessity for AI to understand "how the world works," rather than merely learning "surface patterns of data." Such models are typically used to describe systems like Sora, which involves not just generating images or videos, but also modeling relationships between objects (such as people, cars, light), temporal changes (the evolution between frames), and fundamental physical laws (such as motion, occlusion, and collisions). In contrast, GPT belongs to language and reasoning models, emphasizing abstract cognition and task execution abilities.

Greg Brockman:
The biggest problem in this field is that there are too many opportunities.

We discovered early on at OpenAI that as long as an idea is mathematically sound, it can usually work and deliver good results. This indicates that the underlying capabilities of deep learning are very strong; it can abstract generative rules from data and transfer them to new scenarios. You can apply this to world models, scientific discoveries, programming, and various fields.

But the key point is: we need to make choices.


There has always been a debate about how far text models can go. Can they truly understand the world? I believe this question has an answer now: text models can indeed reach AGI.

We have seen a clear path; stronger models will emerge this year. One of our biggest pains at OpenAI is how to allocate computational power—this issue will only become more severe, not alleviate. So essentially, this is not about "which path is more important," but about timing and order.


Now, some applications we once thought were distant are starting to become tangible. For example, solving unresolved physics problems. We recently had a case where a physicist had long studied a question and submitted it to the model; 12 hours later, it produced a solution. He said it was the first time he felt a model was "thinking." That question might even be one humans would never have solved on their own, yet AI accomplished it.

When you see such events, your only choice is to double down and triple your investments. Because it means we can truly unleash vast potential.


Thus, to me, this is not about competition between different directions, but rather about OpenAI's mission: how do we bring AGI to the world? How do we make it genuinely beneficial for everyone? And, we have already seen that path and know how to push it forward.

Betting on GPT, Not World Models: The Path to AGI

Alex:
Okay, I want to return to what you mentioned about the next generation of models, but I want to follow up on this question first.

Earlier this year, I spoke with Demis Hassabis from Google DeepMind. Interestingly, he stated that for him, the closest thing to AGI is their image generator called Nano Banana.

Note: Demis Hassabis is one of the key figures pushing AI from research to breakthrough applications. His company DeepMind developed AlphaGo, which defeated the world champion at Go in 2016, marking a landmark event in the history of artificial intelligence.

His reasoning is that whether it is an image generator or a video generator, generating such images and videos fundamentally requires understanding the relationships between objects and at least some level of understanding of how the world works.

So, does this imply a potential risk? It's a substantial bet—if that is indeed the case, will OpenAI miss something by continuing to invest heavily in another technical tree?

Greg Brockman:
What if that were the case? I have two responses.

First, of course that is a possibility. This field is like that: you ultimately must make choices and place bets. And OpenAI has been doing this from the very beginning: we need to decide what we believe the path to AGI is and then focus intensively on that path. It's like summing random vectors: the result might approach zero, but if you align all the vectors, they can propel you in a clear direction.


But the second point is that image generation is also a very popular ability within ChatGPT, and we are still continuously investing and prioritizing it. The reason we are able to do this is that it does not belong to the "world models" or "diffusion models" technical branches; it is actually built on the GPT architecture. So even though it addresses different data distributions, at the underlying core technology stack, it is still the same foundation.


And that is precisely one of the most astonishing aspects of AGI: sometimes, applications that seem very different—speech-to-speech, image generation, text processing, and the application of text itself in scientific research, programming, personal health information and other contexts—can actually be accommodated within the same technical framework.

So, from a technical perspective, one thing I and the company have been thinking about is how to unify our efforts as much as possible. Because we truly believe that this technology will bring about a holistic improvement and could even elevate the entire economic system.

And the scale of this is enormous. We definitely cannot do everything, but we can accomplish our part.

Alex:
That's the meaning of the "general" in Artificial General Intelligence (AGI).

Greg Brockman:
Exactly, that’s what the "G" stands for, that is precisely what it means.

Alex:
Speaking of "unifying," what will this super app look like?

Greg Brockman:
What I understand the super app to be is—

Alex:
It will integrate chat, programming, browsers, and ChatGPT, right?

Greg Brockman:
Correct. What we want to create is an application for end users that truly showcases the power of AGI—that is its "generality."

If you think about today’s chat products, I believe they will gradually evolve into your personal assistant, your personal API, an AI that really thinks about you. It understands you, knows a lot of information about you, aligns with your goals, is trustworthy, and can, to some extent, "represent" you in this digital world.

As for Codex, you can understand it as: it is still primarily a tool for software engineers, but it is becoming "Codex for everyone."

Anyone who wants to create or build something can use Codex to have the computer accomplish their desired tasks. It’s no longer limited to just "writing software"; it’s more like "using a computer" itself. For instance, I can have it help me adjust my laptop settings. Sometimes I forget how to set hot corners, and I just let Codex do it, and it does.

That is how computers should be—they should adapt to people, not the other way around.

So you can imagine such an application where anything you want the computer to do, you can just tell it directly. This will include built-in capabilities for "using the computer" and "operating the browser," allowing AI to truly navigate the web, while also letting you supervise what it is doing. Moreover, regardless of whether your interaction involves chatting, coding, or general knowledge work, all these dialogues will be unified in a single system. AI will have memory and will understand you.

This is what we are building.

But to be honest, this is just the tip of the iceberg; it’s the part above the surface. To me, what is truly more important is the unification of underlying technology.

We previously mentioned the unification at the model level, but what has truly changed in the past few years is that it is no longer just a matter of "models" themselves; it is more about the "carrying system." In other words, how does the model acquire context? How does it connect to the real world? What actions can it take? What is the interaction loop with the user as new contexts continually come in?

In the past, we actually had multiple implementations internally, or at least a few slightly different implementations. Now we are converging them into one. Ultimately, we will have a unified AI layer and then, in a very lightweight manner, direct it towards different specific application scenarios.

You can certainly still create a small plugin or a small interface specifically for finance, specifically for law, but in most cases, you won’t even need to, as this super app itself will be broad enough and general enough.

Alex:
This application is aimed at both enterprise and personal scenarios?

Greg Brockman:
Yes, that is actually its core. Just like a computer, such as your laptop, is it for personal use or work purposes? The answer is both. It is first and foremost your device, your gateway into the digital world. And that is precisely what we want to create.

Alex:
From a personal perspective, what would I do with this super app? How would my life change?

Greg Brockman:
I would understand it like this: in personal life, it will first continue in the way you currently use ChatGPT.

How do you use ChatGPT now? People are already using it to accomplish very diverse and surprising tasks. Sometimes it’s simply saying, "I need help drafting a speech for a wedding; can you help me with that?" Or, "Can you help me evaluate this idea and give me some feedback?" Or, "I am running a small business; can you provide me with some ideas?"

Some of these scenarios are personal, while some are beginning to blur the boundaries between personal and work. And my point is: all these types of questions should be handled by the super app.

Greg Brockman:
However, if you look back at the development of ChatGPT, it has actually been evolving.

It didn’t have memory before, right? For everyone, it was the same AI; each time it started from zero, almost like talking to a stranger. But if it can remember past interactions, it will be much more powerful. If it can also access more context, it will be even stronger.

For example, if it connects to your email, your calendar, truly understands your preferences, and has a deeper background of your past experiences, it can use that information to help you achieve your goals. For instance, there’s now a feature in ChatGPT called Pulse, which proactively pushes content you might find interesting based on what it knows about you.

So at the personal usage level, the super app will encompass all this and will do it more deeply and richly.

Alex:
When do you plan to launch it?

Greg Brockman:
A more accurate understanding is that over the next few months, we will gradually move in this direction. The complete vision we are discussing will be delivered piece by piece, but not all at once; it will appear in a phased manner.

For example, today’s Codex application actually already includes two layers: one layer is a generic intelligent agent harness that can use tools; the other layer is an intelligent agent skilled at writing software.

And this generic agent harness can be used in many other scenarios. You can connect it to a spreadsheet or a Word document, and it will help you process knowledge work.

So our first step is to make the Codex application more user-friendly for general knowledge work. We’ve already seen people within OpenAI spontaneously start to use it this way.

This will be the first step, and there will be many more steps to follow.
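The "two layers" Greg describes—a generic tool-using agent harness plus a specialized agent on top—can be sketched as a minimal dispatch loop. Everything below (the `Harness` class, the tool names) is an illustrative sketch of the general pattern, not OpenAI's actual Codex implementation:

```python
# Minimal sketch of a "generic agent harness": a registry that routes a
# model's tool requests to pluggable tools. The same harness can serve very
# different agents (coding, spreadsheets, email) depending on which tools
# are plugged in. All names here are hypothetical.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Harness:
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[[str], str]) -> None:
        """Plug a tool into the harness under a given name."""
        self.tools[name] = fn

    def dispatch(self, tool_name: str, argument: str) -> str:
        """Route a tool request to the matching tool, or report failure."""
        if tool_name not in self.tools:
            return f"error: unknown tool '{tool_name}'"
        return self.tools[tool_name](argument)

harness = Harness()
harness.register("summarize", lambda text: text[:40] + "...")
harness.register("shell", lambda cmd: f"(would run: {cmd})")

# The specialized layer (a coding agent, a knowledge-work agent) decides
# WHICH tool to call; the harness only handles the plumbing.
print(harness.dispatch("shell", "ls -la"))  # prints "(would run: ls -la)"
```

The point of the split is the one Greg makes: the harness layer is application-agnostic, so repointing it at spreadsheets or email means registering new tools, not rebuilding the loop.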

Alex:
I talked to one of your colleagues about Codex yesterday, and he mentioned someone who used Codex to edit videos: he had Codex help him process videos, and Codex even created a plugin for Adobe Premiere to divide the video into chapters and then begin editing. Is that the direction you are heading?

Greg Brockman:
I love hearing cases like that. That is exactly how we want this system to work. And an interesting point is that the Codex application was originally designed for software engineers, so its current usability for non-programmers is actually not very high. There are many small issues that arise during the setup process.

Developers know what they mean and how to fix them; we are used to it. But if you’re not a developer, seeing these things makes you think, "What is this? I've never seen this before."

Yet, even so, we are seeing many people who have never coded before beginning to use it to build websites or do what you just mentioned—automating interactions between different software and getting significant leverage from it. For instance, someone in our communications team has connected it to Slack and email to handle a large amount of feedback and has done some excellent summarization and synthesis.

So the current situation is that those highly motivated individuals are willing to cross these thresholds and are getting significant rewards from it.

In a sense, the hardest part is already done—we have created a truly smart, capable AI that can accomplish tasks.

The next step involves that relatively "easy" part: making it genuinely useful for the general public, gradually breaking down those entry barriers.

Alex:
From a competitive landscape perspective, Anthropic now has the Claude application, with both a chatbot and Claude Code. To some extent, they already have a prototype of a "super app."

What do you think about why Anthropic got to this step earlier? And how likely do you think it is for OpenAI to catch up?

Greg Brockman:
If you roll the clock back 12 to 18 months, we had always treated coding as a key focus area, and we consistently achieved top results in programming competitions and similar "pure capability" tests. But one thing we didn't invest enough in at the time was last-mile usability.


What I saw instead was that many other players only realized this around the end of last year, and so began scrambling for compute; but by then there was almost none left to buy.

So I think talk like that is easy. But the reality is that everyone has now realized: this technology works, it has arrived, it is real. Software engineering is just the first clear example.

And what truly constrains us is available computing power.

Alex:
And I also think that to reap the benefits of this technology, you have to think seriously about its risks at the same time.

So I still believe that as users of these agents—and this is how we operate inside OpenAI as well—you cannot abdicate responsibility. You can't just say: the AI will get things done on its own.

That future will be secured by our collective effort—making sure every step along the way is carefully monitored and guided, so that we can safely steer toward a bright AI future.

Greg Brockman:
Thank you for having me.

Alex:
And thank you all for listening and watching. See you next time on the Big Technology Podcast.

[Video Link]

