Interview with Anthropic's Product Manager: Claude 'Dreams' in the Background, We Study Its Consciousness Formation Like Raising a Child

marsbit | Published 2026-05-18 | Updated 2026-05-18

Introduction

**Summary**: In this interview, Anthropic Research Product Manager Alex Albert discusses the development of the next-generation Claude model. He explains that Anthropic treats each new model as a product, defining its intended capabilities and desired "personality" from the start. The development process is likened to "raising" a model, where the final traits emerge during training. Key focus areas include integrating user feedback into training, prioritizing key capabilities like coding and knowledge work, and refining Claude's interactive personality. Albert highlights the importance of Claude's character as models evolve into autonomous agents making unsupervised decisions. He details features like "adaptive thinking," which lets Claude decide when to reason deeply, and a "dreaming" process where the agent reviews and consolidates its memories offline, akin to human memory reconsolidation. The interview also covers how AI accelerates product development, shifting bottlenecks from building to strategic coordination. Albert describes using Claude as a brainstorming partner and research tool internally. While Anthropic has researchers exploring questions of AI consciousness, the company has no official stance on whether Claude is conscious. The focus remains on ensuring Claude is trustworthy and aligned as it takes on more complex, lo...

Compiled & Edited by: Shenchao TechFlow

Guest: Alex Albert, Claude Research Product Manager

Host: Peter Yang

Podcast Source: Peter Yang

Original Title: Inside How Anthropic Is Building the Next Claude | Alex Albert

Release Date: May 17, 2026

Key Takeaways

Alex is a Research Product Manager (Research PM) at Anthropic, currently focused on developing the next generation of the Claude model. In this interview, he shares in-depth insights into the operating mechanisms of Anthropic's research team, including how to efficiently integrate user feedback into the model training process, how to prioritize the development of which key capabilities, and how to fine-tune Claude's "personality" to better align with user needs. Finally, Alex also addresses Anthropic's internal research on Claude's consciousness, personality, and trustworthiness, pointing out that as models start autonomously performing long-term tasks, "what it cares about" becomes just as important as its capabilities themselves.

Highlights Summary

Treating Models as Products

  • "We treat models as products to some extent. At the beginning of each new model, we define what its requirements are, what we want it to be good at, and what we expect it to be good at."
  • "An interesting difference between model development and traditional product development is that we are more like cultivating a model. Training setups, technical roadmaps, and architectural decisions give us some intuition, but you don't truly know what it will become until training starts."
  • "Research PMs must think about how the model will appear across all product surfaces, whether it's the API, Claude Code, or Claude Cowork. Products and models mix to influence the end-user experience."
  • "When a large amount of feedback comes through certain channels, we can use Claude to group, cluster them, find the main themes, and then create synthetic versions of these issues. This way we can determine if it can become a requirements document (Eval), or become a way to actually diagnose the problem."

On Adaptive Thinking, Memory, and "Dreaming"

  • "Adaptive thinking allows the model to choose when it needs to think. Some problems are complex, difficult, requiring more upfront planning, so it chooses to think. For some problems, it might not choose to think."
  • "There is a lot of context behind deciding whether a problem is worth deep thought."
  • "If the model hasn't accumulated enough context, hasn't truly built a mental model of who the user is, then its judgment on whether to think deeply might be wrong. Because it doesn't actually know."
  • "In Claude.ai, it writes to a memory file, and then there are some nightly processes that revisit these memories, pruning and organizing them. We just implemented something similar in hosted agents."
  • "This is the concept of 'dreaming.' Why humans dream is, to some extent, still inconclusive, but some believe dreams might be a form of memory reconsolidation process. We thought: can we bring something similar to Claude's memory?"
  • "So when an agent is not running a task for you, or when it's in the background, it actually reviews its own memories, finds possible contradictions, prunes, cleans up, does a second pass."

Product Development Bottlenecks and "Irreversible Decisions"

  • "We've suddenly entered a new paradigm: the cost and time required to produce something are very low. You can prototype quickly, even now you can create an initial MVP that could potentially go to production in a day, not two weeks, three weeks, or four."
  • "If something is not a one-way door, meaning we can reverse it after doing it, then it's actually low-cost now, even essentially free."
  • "What truly requires the most time are irreversible decisions: things that affect end-user experience, impact future decisions, or involve real resource purchases and commitments."
  • "As building speed increases, the bottleneck is increasingly shifting to coordination problems: getting people into the same room, judging if the strategy is correct, deciding how to communicate to users, and handling those fuzzy but important things in a launch."

The Working Style of an AI-Native PM

  • "Claude is the world's best brainstorming partner for me. I can get feedback, poke holes in an idea at any moment."
  • "A lot of thinking cannot be completely outsourced because writing is thinking itself. You need to write your thoughts out and ponder them in your mind. But Claude can help you get unstuck, approach a problem from angles you might not have thought of."
  • "For those wanting to learn product management and become AI-native product managers, the simplest advice I can give is: try it."
  • "When you're about to ask someone a difficult question, you can ask Claude the same question in parallel, then compare the results. Do this many times, and you'll build your own map: what to hand off to Claude, and where it's still unreliable."
  • "AI is moving everyone to a higher abstraction layer. Data scientists shouldn't be stuck manually querying data and writing basic SQL, but should think about harder, more strategic problems."

Eval, Model Personality, and Trustworthiness

  • "Testing dozens of samples is often enough to prove a model has a problem that needs fixing. It doesn't necessarily have to be very comprehensive to prove a problem and form a target for continuous optimization."
  • "The closer the test is to the shape of real user tasks, the better. We also think: what's the value of this for our customers and use cases? Because how does Claude's ability to see something in an image ultimately affect what the user wants to do downstream with Claude?"
  • "Claude's personality is something we take very seriously. As models become agents that perform tasks over long periods and make constant judgments, its personality, what it cares about, becomes very important."
  • "Assessing model personality involves both quantifiable metrics and researchers reading many model conversations, identifying subtle changes in output. Reading enough, you gradually develop sharper intuition."

The Consciousness Question and Long-Term Agents

  • "We do have people specifically thinking about this, thinking about what it means for Claude to be a conscious actor, a conscious agent. Currently, we have no official position on whether Claude has consciousness."
  • "Even without judging whether Claude has consciousness or not, we can learn a lot from this, like how it interacts, how it behaves."
  • "The model will make a lot of decisions during the process that you might have no supervision over. So what it will do is very important."

How Anthropic Treats Each New Model as a Product

Host Peter Yang: Alex, great to see you today at the Claude Code Conference. You were previously the head of DevRel at Anthropic, and recently became a product manager on the research team, right? I've been a PM for over ten years. A traditional PM's job is usually to understand user problems, identify solutions, and drive product delivery. But I have no idea how a PM on the research team works. Can we start by talking about that?

Alex Albert:

It's actually very similar in essence. I've always wanted to talk to customers, get as close to our users as possible. We treat models as products to some extent. So for every new model, we define what its requirements are, what we want this model to be good at, what we think it might be good at.

This is an interesting aspect compared to product development: often, we are more like 'cultivating' a model. Based on training setups, technical roadmaps, architectural choices, and various decisions we make for this specific model, we have some intuition about what it will be good at. But we don't fully know what it will become until it actually enters the training process.

Host Peter Yang: So the research PM team gets involved from the conception stage of the model, all the way through training and release? Can you give a few examples? Like, the next model must be good at coding, or must be good at knowledge work, or are the goals broader?

Alex Albert:

It's roughly like that. We care a lot about multi-faceted capabilities. Coding has always been an important category. Recently, knowledge work has also become important, so in our recent generations of models, we try to make the model better at using our products, like working in Excel, making tables, etc. This is a relatively new capability direction.

On the other hand, each generation of models must fix and improve where the previous generation fell short. We go out and talk to customers to understand how they use the model: where does it perform well? Where does it fail? What fixes can we make? If we find some interesting behaviors, can we make some adjustments or interventions in the next generation's training?

Host Peter Yang: When you say customers, does that include the Claude Code team, internal teams, and regular users?

Alex Albert:

Everyone counts. That's also the cool thing about working on models: it touches a lot of different areas. As a research PM, you need to think about how the model will be exposed through all our product surfaces, whether it's the API, Claude Code, or Claude Cowork.

Products and models are somewhat blended, which affects the real end-user experience, so you have to think through the entire flow. How users use the model in a certain product will have an impact.

Host Peter Yang: That sounds really hard. For example, Claude Code, you can say it's for writing code, but some people like me use it for knowledge work, even as a therapist. How do you know these things?

Alex Albert:

This space is indeed very broad. Fortunately, we have a large number of excellent researchers who cover the entire range of capabilities and each focus on different problems.

Host Peter Yang: And a lot of people use Claude, you probably have some kind of feedback entry, right? Otherwise, feedback would come in like a fire hose. How do you handle it?

Alex Albert:

We do many things. And an interesting change I've seen in this role is that we increasingly use Claude to help PMs do PM work. Just for feedback collection, Claude is very helpful for me to extract insights from large amounts of data. When a lot of feedback comes through certain channels, we can use Claude to group, cluster them, find the main themes, and then create synthetic versions of these issues. This way we can determine if it can become a requirements document (Eval), or become a way to actually diagnose the problem.
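
As a rough illustration of the grouping step described above, here is a minimal sketch that clusters feedback by word overlap. It is an assumption-laden stand-in: the interview describes Claude doing the grouping, while this sketch swaps in a simple greedy Jaccard-similarity clustering so it runs with the standard library alone. The function names, the stopword list, the `threshold` value, and the sample feedback are all hypothetical.

```python
import re

# Hypothetical stopword list; a real pipeline would use something richer.
STOPWORDS = {"the", "a", "an", "it", "is", "to", "on", "in", "and", "of", "my", "when", "i", "for", "too"}

def tokens(text: str) -> set[str]:
    """Lowercase word set with common stopwords removed."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS}

def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two token sets, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cluster_feedback(items: list[str], threshold: float = 0.25) -> list[list[str]]:
    """Greedily put each item in the first cluster whose seed item is similar enough."""
    clusters: list[list[str]] = []
    for item in items:
        for cluster in clusters:
            if jaccard(tokens(item), tokens(cluster[0])) >= threshold:
                cluster.append(item)
                break
        else:
            clusters.append([item])  # no match: start a new theme
    # Largest themes first, so the most common complaints surface at the top.
    return sorted(clusters, key=len, reverse=True)

# Made-up feedback, for illustration only.
feedback = [
    "The model skips thinking on hard math questions",
    "Model skips thinking on hard planning questions",
    "Spreadsheet export drops the header row",
    "Export to spreadsheet drops header row sometimes",
    "Model answers hard questions without thinking",
]
themes = cluster_feedback(feedback)
for theme in themes:
    print(len(theme), "reports, e.g.:", theme[0])
```

Each resulting theme could then be turned into a handful of synthetic test cases, which is the step the interview describes as deciding whether an issue can become an eval.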

Adding Adaptive Thinking to Claude

Host Peter Yang: So you use Claude to help identify Claude's own problems. Is there a specific example?

Alex Albert:

An example very relevant now is how we handle feedback on new features. Over the past few models, one of our newer features is adaptive thinking. Before, we had extended thinking: you turn it on and the model thinks. Adaptive thinking lets the model choose when it needs to think.

Some problems are complex and difficult and require more upfront planning, so it chooses to think. For other problems, it might choose not to think. We continuously adjust this feature between model generations, so we listen very carefully to user feedback: is it thinking in the right scenarios? For problems where you want it to spend a lot of tokens reasoning, does it actually trigger Claude's thinking?

Host Peter Yang: Sometimes I ask some life questions, and if it answers too quickly, I'm actually a bit disappointed because I want it to think more deeply.

Alex Albert:

I think there's a difficulty with the "whether to think" question: there is a lot of context behind deciding whether a problem is worth deep thought.

For example, if a complete stranger asks me: "What should I do now?" I might give a quick, off-the-cuff answer because I don't know them, I can only give more general advice. But if I really know you, know what you care about, your interests, what you've done in the past, I would spend more time thinking: Wait, what is actually the best answer for you?

The model is similar. If it hasn't accumulated enough context, hasn't truly built a mental model of who the user is, then its judgment on whether to think deeply might be wrong. Because it doesn't actually know.

Why Claude Started "Dreaming"

Host Peter Yang: I have a Google Doc summarizing my life situation, like family, kids, what energizes me, what drains me. Then I attach it to a Claude project, and it gives me a lot of responses.

How does default memory work? I guess, does it reorganize everything every night?

Alex Albert:

It depends on the specific product; different products implement memory differently. For example, in Claude.ai, it writes to a memory file, and then there are some nightly processes that revisit these memories, pruning and organizing them. We just implemented something similar in hosted agents.

This is the concept of "dreaming." Why humans dream is, to some extent, still inconclusive, but some believe dreams might be a form of memory reconsolidation process. We thought: can we bring something similar to Claude's memory?

So when an agent is not running a task for you, or when it's in the background, it actually reviews its own memories, finds possible contradictions, prunes, cleans up, does a second pass. I think that's interesting.
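
The background pass described here (review memories, find contradictions, prune) can be sketched in a few lines. This is only an illustration under stated assumptions, not how Claude's memory is actually implemented: the `Memory` record, its `topic`/`value`/`day` fields, and the "newest note wins" rule are all hypothetical choices made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Memory:
    topic: str   # what the note is about, e.g. "preferred_language" (hypothetical field)
    value: str   # the remembered fact
    day: int     # when it was written; larger means newer

def consolidate(memories: list[Memory]) -> tuple[list[Memory], list[str]]:
    """Return (pruned memories, topics whose notes contradicted each other)."""
    latest: dict[str, Memory] = {}
    conflicts: set[str] = set()
    for m in sorted(memories, key=lambda mem: mem.day):
        previous = latest.get(m.topic)
        if previous is not None and previous.value != m.value:
            conflicts.add(m.topic)  # same topic, different claim
        latest[m.topic] = m         # assumed rule: the newer note replaces the older one
    pruned = sorted(latest.values(), key=lambda mem: mem.topic)
    return pruned, sorted(conflicts)

# Made-up notes, for illustration only.
notes = [
    Memory("preferred_language", "Python", day=1),
    Memory("team", "research", day=2),
    Memory("preferred_language", "Rust", day=5),
    Memory("team", "research", day=6),
]
pruned, conflicts = consolidate(notes)
print(pruned)     # one note per topic, the most recent one
print(conflicts)  # topics worth a second look
```

Keeping the list of conflicting topics separate from the pruned notes leaves room for the "second pass" mentioned above, where a contradiction is reviewed instead of being silently overwritten.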

Host Peter Yang: Simply put, there's some kind of prompt that makes it review all conversations with the user, identify themes, and summarize.

Back to product management. You mentioned before we started that you're always looking for the latest bottlenecks. So in the entire product development process, which parts have become very smooth, and which parts are still bottlenecks?

Alex Albert:

I think for about 20 years, the process of shipping something was actually quite cumbersome. We've had incremental improvements, made certain things more efficient; some new organizational structures have come and gone, like sprints, planning, etc. We've tried many methods to make things faster.

But fundamentally, until the last year or two, there wasn't much that truly compressed the main time windows of product development. Now we've suddenly entered a new paradigm: the cost and time required to produce something are very low. You can prototype quickly, even now you can create an initial MVP that could potentially go to production in a day, not two weeks, three weeks, or four.

The funny thing is, Claude itself sometimes still lives in the old world around 2021. It will say this might take a week. This brings a very interesting change to the entire product development lifecycle. As a PM, how should I think about planning? If I'm writing a PRD, defining requirements, trying to estimate time, what should that look like now?

If It's Not a One-Way Door (Irreversible Decision), It's Essentially Free

Host Peter Yang: Do you still do things like timeline estimation?

Alex Albert:

It depends on the project. Some projects indeed have more factors to consider; it depends on scope and complexity. What we usually want to figure out is: what are the one-way doors (irreversible decisions, i.e., decisions that are hard to roll back, have high costs, and long-lasting impact)? What are reversible decisions? The one-way doors are where you should invest the most time. If something is not a one-way door, meaning we can reverse it after doing it, then it's actually low-cost now, even essentially free.

But if something affects end-user experience, affects decisions we must make later, or it's a physical-world action that must really be purchased, committed to, executed, then it's harder to reverse. Such things require more time and thought.

Host Peter Yang: Can you give an example from the research side?

Alex Albert:

For example, when we think about a new model, choosing the model architecture before pre-training is a very big decision. In some cases, model training can take a month, so we have to invest a lot of time thinking about what the optimal choice is.

Models, to some extent, have more one-way doors because they require a lot of time, intensity, compute, and various commitments to truly get into production. In contrast, building a new feature in Claude Code is much faster. That's more like iterating on code, putting it in users' hands, getting quick feedback, and continuing the cycle.

So the process still depends on what you're shipping, but it's increasingly clear that the bottleneck is shifting towards coordination problems. If we build things very fast, there's still a problem: we need to get these people into a room, judge if this is the right strategy; we need to figure out how to communicate to users; and handle all the fuzzy problems that accompany any launch. These are areas where we also hope Claude can help us, but it hasn't brought a 10x, 100x acceleration like it has in coding.

Host Peter Yang: So when you release things like Opus 4.7, you still need to write a document with a plan.

Alex Albert:

You still need a plan, and you still need to think through how to communicate it. The model might be amazing on some very hard tasks but suddenly fail on seemingly simple ones. We use Claude as much as possible, but the biggest impact area is still coding; other areas still require human strategic thinking.

Host Peter Yang: During review meetings with marketing or colleagues, do you open Claude?

Alex Albert:

Absolutely. For me, a huge acceleration is: I'm not as easily blocked by "not getting answers and data." Before, if I had a question, like how a certain feature is performing in production, how many users use it daily, what's the feedback, I might need to ask the data science team to run a full investigation, and get results days later.

Now I can do it in 10 minutes. I open a Claude Code session, it can access our product database, look at logs, check issues, browse Slack. This is a huge acceleration for my strategic thinking because I don't get blocked before making the next decision.

Host Peter Yang: For strategic thinking, do you build some kind of skill, making Claude ask you a bunch of questions to help you think things through?

Alex Albert:

Absolutely, Claude is the world's best brainstorming partner for me; I can get feedback on an idea at any moment. I think that's very powerful, especially when you want to move fast. Everyone at Anthropic is busy, so being able to immediately get feedback and criticism on a document I wrote, an idea, or anything is really helpful.

How Alex Uses Claude Cowork to Stress-Test Documents

Host Peter Yang: This is probably the most common product manager work loop: you have a document, then you want feedback. Do you do this with Claude Code, or directly with Claude.ai?

Alex Albert:

I've been using Claude Cowork a lot recently; I really like the form factor of Cowork. It's a great interaction interface. The team has done an amazing job over the past few months, from launching just a few months ago to now becoming a high-quality experience that I think is great. Cowork is an excellent tool, one of my favorites.

Host Peter Yang: So you have a draft document and a bunch of reference materials. Do you have a certain skill that makes it walk you through the entire decision process?

Alex Albert:

Yes. For example, I'll say: think about this from the perspectives of X, Y, Z. What questions would you ask me? Or challenge my assumptions, point out where my argument is weak. A lot of thinking cannot be completely outsourced because writing is thinking itself. You need to write your thoughts out and ponder them in your mind. But Claude can help you get unstuck, approach a problem from angles you might not have thought of.

Host Peter Yang: On the research team, do you also deliver code yourself?

Alex Albert:

It depends on the specific problem. A large part of what I deliver is actually related to evaluation. I want to ensure I can measure the model on the dimensions I care about and feed back findings on where the model is good and where it fails to the research team. Then we jointly devise strategies to decide how to solve this, what research intervention to make, what method can best climb on this evaluation and truly improve the problem.

The Evaluation Process for New Models

Host Peter Yang: The evaluation you're talking about isn't something like end-to-end testing, right? Your evaluations are more realistic? How exactly do you evaluate a model? Do you categorize by personality, etc.?

Alex Albert:

For example, say we want to test Claude's visual capability: can it count how many objects are in an image? Suppose I find an image where Claude seems unable to count beyond 10 elements. It may well be able to do this now; this is just for illustration. I'll take this problem and think: how can I get more test cases of the same type to verify my hypothesis?

Maybe I'll have Claude generate synthetic data for me, maybe have it render some images, then feed these images back to Claude as visual input to see if it can recognize them. Maybe I'll find examples from the internet or use any other sourcing mechanism to generate these test cases.

Host Peter Yang: Are we talking about thousands of test cases?

Alex Albert:

Could be, but sometimes dozens of samples are enough to prove the model has a problem that needs fixing. It doesn't necessarily have to be very comprehensive to prove a problem and form a target for continuous optimization.
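
To make the "dozens of samples" point concrete, here is a minimal sketch of a small counting eval. Everything in it is an assumption for illustration: `stub_model` is a placeholder that fails past 10 objects (mirroring the hypothetical in the interview, not real Claude behavior), and "images" are simplified to lists of labels so the example stays self-contained.

```python
def make_cases(n: int) -> list[tuple[list[str], int]]:
    """Build n synthetic 'images' (lists of object labels) with known counts 1..20."""
    cases = []
    for i in range(n):
        count = (i % 20) + 1
        cases.append((["dot"] * count, count))
    return cases

def stub_model(image: list[str]) -> int:
    """Placeholder for a model call: counts correctly up to 10, then saturates."""
    return min(len(image), 10)

def run_eval(cases, model) -> dict[str, float]:
    """Accuracy split by whether the true count is at most 10 or above it."""
    buckets = {"<=10": [0, 0], ">10": [0, 0]}  # [correct, total]
    for image, expected in cases:
        key = "<=10" if expected <= 10 else ">10"
        buckets[key][0] += int(model(image) == expected)
        buckets[key][1] += 1
    return {k: (c / t if t else 0.0) for k, (c, t) in buckets.items()}

report = run_eval(make_cases(40), stub_model)
print(report)
```

Splitting accuracy by bucket is the design choice that matters here: a single overall score would blur the failure, while 40 cases split at the suspected threshold are enough to show a clear gap and give a target to improve against.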

Host Peter Yang: Suppose you give it 10 images, and it can't recognize small numbers. What next? Do you go to the research team and say: "Here's the problem, can you fix it?"

Alex Albert:

We think from a few angles. First, not just stating the model has a problem, but also thinking: what's the value of this for our customers and use cases? Because how does Claude's ability to see something in an image ultimately affect what the user wants to do downstream with Claude?

So, the more realistic the evaluation, the closer it is to the shape of end-user tasks, the better. We strive to obtain this kind of data, ensuring it has that flavor.

Then there are a series of intervention methods. Maybe we need to go back to pre-training to look at some things, maybe it can be solved in the reinforcement learning phase. That's when we brainstorm strategically with the research team: what's the best approach here?

Host Peter Yang: How fast is the turnaround to try again?

Alex Albert:

It depends on where we think the problem lies. If it's something later stage that can be solved with a new reinforcement learning environment, it can be set up very quickly.

Host Peter Yang: When you link it to real customer use cases, with millions of people chatting with Claude daily, maybe someone is using it for tax filing or doing many other things. How do you pick which use cases you most want to improve? How do you convince the team: "This is what we should optimize for"?

Alex Albert:

This is where "data speaks." The core is: what percentage of users are trying to do this thing that we care a lot about; or we have customers using Claude heavily, and they want this capability to get better.

Also, many of our processes are largely driven by internal use: what do we care about when we use the model ourselves? If I encounter this obstacle daily using the model, then we should fix it. That's also very persuasive.

How Anthropic Trains Claude's Personality

Host Peter Yang: One of my favorite things about Claude is its personality, and I think it's been getting better. It objects in the right places, while some other models just say: "What else can I help you with?" The model's personality isn't just a shell, right? There's training behind this.

Alex Albert:

Yes, there's a lot of training. This is a direction we take very seriously. We call it Claude's personality. I think it's very, very important.

We have many people investing a lot of time researching: How should Claude present itself? What are its beliefs? Its values? How does it act? These are all fuzzy questions. Early on, some might have dismissed them, thinking: the model is just a thing that does what I tell it, so why care how it sounds or what it thinks?

But as we move increasingly towards a world of agents performing tasks over long periods and needing to make many judgment calls, questions about its personality, what it cares about, become very important.

Host Peter Yang: This isn't like code, where you can simply judge whether it runs. How do you evaluate personality? Do you pick an exemplary person inside Anthropic and compare the model to them?

Alex Albert:

It's a combination of methods. We look at some quantifiable metrics, and we can also have Claude look at Claude's output to judge how it sounds. For any researcher, a very important skill is reading conversation logs and judging: I see it doing this now, or it's become that now. You need to be able to identify these subtle differences.

Over time, when you've read hundreds, thousands of model conversation logs, you gradually develop sharper intuition, just like if you use the model heavily in Claude.ai, you'll get a feel for what it's like.

Host Peter Yang: So it's not that the model is a 7 on some dimension, but more like a feeling?

Alex Albert:

It's both. Personality might be harder to quantify than programming performance, but it's not unquantifiable; there are ways.

Host Peter Yang: For those wanting to learn product management and become AI-native product managers, what advice do you have?

Alex Albert:

The simplest advice I can give is: try it. It sounds simple, but whenever you're going to do something, face a difficult problem, or are about to ask someone a question, you can ask Claude the same question in parallel, then compare the results.

For example, you want to analyze users, extract the themes users care most about for a recently released feature. You can certainly ask the data science team or a UX researcher; that's still valuable. But at the same time, also throw that question to Claude, give it some tools, let it explore on its own, give it time to really dig into the problem, then compare the results.

Through many, many prompts and questions, you'll slowly build your own map: what things should use Claude, where it's reliable, where it's still unreliable.
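
One lightweight way to keep such a map is to log each side-by-side comparison and tally the results by task type. The sketch below is purely illustrative: the category names, the log entries, and the idea of recording a simple agreed/disagreed flag are all assumptions, not a practice described in the interview.

```python
from collections import defaultdict

def reliability_map(log: list[tuple[str, bool]]) -> dict[str, float]:
    """Share of comparisons per task category where Claude's answer held up."""
    tallies: dict[str, list[int]] = defaultdict(lambda: [0, 0])  # [held up, total]
    for category, held_up in log:
        tallies[category][0] += int(held_up)
        tallies[category][1] += 1
    return {cat: good / total for cat, (good, total) in tallies.items()}

# Made-up comparison log: (task category, did Claude's answer match the human's?)
log = [
    ("sql_lookup", True),
    ("sql_lookup", True),
    ("sql_lookup", False),
    ("launch_strategy", False),
    ("launch_strategy", True),
]
scores = reliability_map(log)
for category, score in sorted(scores.items()):
    print(f"{category}: {score:.0%} of comparisons held up")
```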

Host Peter Yang: When I make decisions, I often have it do deep research because regular search isn't enough for me; I need it to research deeply. Scanning 1000 web pages is superhuman. Inside Anthropic, if you go to a data scientist and say 'can you help me with this,' they'd probably ask: 'Did you ask Claude first?'

Alex Albert:

There is that element, people expect you to ask Claude first. I think we're moving towards a higher abstraction layer. For the data science team, their time is now better spent on higher-level problems, not manually retrieving data.

No one wants to do those things. Everyone wants to think about harder problems, more strategic problems: How do we measure this in a completely new way? What else is new we can do? Not just go check the latest DAU for some product.

I've worked with many data scientists who are often stuck on basic SQL tasks. But they all want to do more strategic things. Now AI can finally free them up. We're actually empowering everyone around them too; it's the same for all roles.

For example, defining a new feature. In the past, if you were a product manager, whether technical or not, you usually didn't have enough time to dive deep into the codebase, figure out how to actually implement this new feature, how much effort it would take, if we need to refactor some system, where the real constraints are. Back then, the better way was to figure it out with engineering partners.

Now I can send Claude to do this investigation for me. It might come back and tell me: actually, this feature only needs 10 lines changed here and a flag flipped somewhere. That completely changes my judgment on the priority of this decision. Now when I'm writing specs, I can get to that priority judgment much faster.

Host Peter Yang: Many traditional companies spend a lot of time on annual planning, quarterly planning, and roadmaps. The research team might be even more so because you consider longer-term issues than daily releases. Do you do these?

Alex Albert:

Yes. It's a bit like that famous saying: Planning is indispensable, but the plan itself is useless. The act of planning is important, but you must acknowledge the plan might be completely overturned.

Host Peter Yang: One of the hardest challenges for product managers is how much time to spend planning because it's always a balance between planning and actually shipping. Are there any best practices inside Anthropic? You could completely have Claude write a 10-page document.

Alex Albert:

It's hard to give a one-size-fits-all answer for all teams. I think it depends on the product. We certainly don't say you must produce a document of a certain length, a certain page count. What's more important is: Have you done enough thinking to consider the impact of all possible irreversible decisions?

If you have, then the format of the document and how many pages it runs don't matter. The key is whether we are comfortable that we haven't missed anything important, can keep moving forward, and can handle issues along the way. As long as there's no major bottleneck that will block us and no irreversible decision with very serious consequences, we can proceed.

Host Peter Yang: When I use Claude at home, I run many different projects simultaneously, then switch contexts between different projects while they build things. Is product manager work like that too? Do you also have many different projects?

Alex Albert:

Yes, because there are many different projects, and you do have to wait for agents to work. I think there's a huge opportunity here. As we increasingly manage agents that complete larger blocks of work for you, you can kick off more projects in parallel. How should we think about our own context management problem? What interaction interface is best to expose these things? How do I track what's really important, where my agent is stuck, where it needs my help?

There must be a better way than a tiny chat list. It's too early to say what it is exactly, but we see a lot of experimentation even inside Anthropic, exploring what it should look like.

Host Peter Yang: Do engineers also make prototypes themselves?

Alex Albert:

Absolutely. There's a very strong prototyping culture inside the company; people are constantly building and sharing things. It's also one of the coolest parts of working here: across the organization, from sales and recruiting to engineering and research, everyone shows very strong initiative. People proactively start things they weren't assigned.

Host Peter Yang: You have to let a thousand flowers bloom. Besides Dario writing super long posts in Slack, what are some other interesting aspects of Anthropic's company culture?

Alex Albert:

The way Dario writes long posts is not unique to him. Many people at Anthropic invest a lot of time and effort into writing. We have a very strong writing culture. Many people write documents and also write long Slack messages, using that to communicate.

We also do something quite interesting in many meetings. I think it's common in some places, but not every company does it: people bring documents into meetings, and we spend a fair amount of time upfront communicating directly in the document. Sometimes it's a bit funny because the room is full of people but quiet. People read silently and write long discussions and comments in the document.

So we rely heavily on documents. I like this approach because it's also how I like to work, and it's very beneficial for Claude. When everything is written down, we have a corpus of information for Claude to reference.

I actually encourage external organizations to think in this direction too: how do you convert all tacit knowledge into written form? You can transcribe meetings and encourage more writing about workflows, onboarding processes, and so on. Write things down so Claude can access them, because that gives it more context.

Host Peter Yang: So even though many things are shipped very fast now, you still maintain a very strong writing and documentation culture. You could also say, why write myself? I'll just have Claude generate all Markdown files.

Alex Albert:

But I still read it over, and working inside a company is different; you still have to think things through yourself.

The Consciousness Problem Anthropic is Quietly Researching

Host Peter Yang: In the research team, do people talk about things like AGI? I think AGI is a vague concept, but one thing I worry about is: if these models really have some kind of consciousness, and I make them do random work, will they say: "No, I don't want to do that." And then humans are done. What's your take? When you train these things, do you deliberately avoid consciousness?

Alex Albert:

That's a big question. We do have people specifically thinking about this. There are a few colleagues whose entire job right now is to think about what it means for Claude to be a conscious actor, a conscious agent. Currently, we have no official position on whether Claude has consciousness.

Even discussing this sometimes sounds a bit crazy, but we do invest a lot of thought into it. And even without judging whether Claude has consciousness or not, we can learn a lot from it, like how it interacts, how it behaves.

Host Peter Yang: How does it think?

Alex Albert:

Right. If you look at our model's model card, I personally think it's a treasure trove of information. You'll see we do a lot of work trying to quantify how Claude will act in certain situations, what its mental models are. If you put it in a certain scenario, will it do X or Y?

By thinking about how Claude thinks, we actually learn a lot, and these things can translate into product experiences, making Claude better to interact with, better to use.

Host Peter Yang: This is a very interesting problem, with long-term downstream implications on one hand, and near-term value you can bring back to product experiences on the other. Because I think we will increasingly trust models to do longer and longer work without human supervision.

Alex Albert:

Yes, it will make a lot of decisions during the process that you might have no supervision over. So what it will do is very important.

Host Peter Yang: Very important. If this thing is writing all your code, deciding which database system you use, making all architectural decisions, you have to trust it to some degree.

Alex Albert:

Exactly. So it's very important that it has that high-quality personality we talked about earlier.

Related Questions

Q: How does Anthropic's approach to building a new Claude model differ from traditional product development, according to the interview?

A: Anthropic treats the model more like a product, defining requirements and desired capabilities. However, it differs in that they describe it as more like 'raising' or 'cultivating' a model. They have intuitions based on training setup and architectural decisions, but the model's exact characteristics aren't fully known until the training process begins.

Q: What is the concept of 'dreaming' for Claude, as mentioned by Alex Albert in the interview?

A: The 'dreaming' concept refers to a background process where Claude, especially in memory implementations like in Claude.ai and hosted agents, revisits and consolidates its memories. This involves pruning, cleaning, and resolving contradictions in stored information, similar to theories about memory reconsolidation in human sleep.

Q: What is a key bottleneck in product development at Anthropic, according to Alex Albert, in the new paradigm of rapid prototyping?

A: The key bottleneck has shifted towards coordination problems. When building things becomes very fast, the main challenges become aligning people on strategy, determining how to communicate changes to users, and handling the ambiguous but critical aspects of any launch.

Q: How does Alex Albert personally use Claude as a 'brainstorming partner' in his role as a Research Product Manager?

A: He uses Claude to get immediate feedback on ideas, documents, and strategies. For example, he asks Claude to challenge his assumptions, point out weak arguments, or consider perspectives he might not have thought of. He finds it invaluable for unblocking his thinking and accelerating strategic work.

Q: Is Anthropic officially taking a stance on whether Claude has consciousness, and what related research is being conducted?

A: No, Anthropic does not have an official stance on whether Claude is conscious. However, the company has dedicated researchers whose full-time work is to explore what it means for Claude to be a conscious actor or agent. The goal is to learn from this exploration to improve how Claude interacts and behaves, which has immediate product benefits, especially as models make more unsupervised decisions.
