Nicholas

Simulating Humans at Scale: Simile's Joon Sung Park


The race to build superintelligence is producing models that keep getting better at objective problems, but not at behaving like actual people. Joon Sung Park, founder and CEO of Simile and creator of Stanford's "Smallville" generative agents study, argues that simulating human society requires a fundamentally different kind of model. He frames today's frontier models as the "CPU of intelligence" (rational, superhuman at problems with right answers) and Simile as creating the "GPU of intelligence," built to encode the diversity of people's values, preferences, and tastes. His team simulated 1,000 Americans and predicted their behavior 85% as accurately as people reproduce their own answers. CVS uses Simile for concept testing; some customers simulate their own earnings calls. Joon's larger bet: a "CERN of human society" that could one day model bank runs, climate cooperation, or the early signals of a collapsing democracy. Hosted by Sonya Huang, Sequoia Capital.

Published Jun 16, 2026
Uploaded Jun 16, 2026
File type
Podcast

Full transcript


AI-generated transcript with timestamped sections.

0:00-1:46

[00:00] I am somebody who is quite inspired by science fiction. [00:04] When you read science fiction [00:06] that covers societies that have progressed far enough in their technological maturity, you always see two pillars: [00:13] you have some version of AGI, [00:15] and you have some version [00:17] of simulations [00:18] that really help guide the society. I do see an opportunity today to really take the first crack at building the simulation. I would not have said that even five years ago, but that is a conviction that we have built up over the years as we have gone deep into this research. [00:52] Today, we're delighted to have Joon, founder and CEO of Simile. Simile is an applied AI lab simulating human behavior and societies, and I'm very excited to have you here to discuss what you're building. [01:04] Same here. Thank you for having me. [01:05] Okay, take me back to April 2023, Stanford, California, specifically Smallville, Stanford, California. What was that? [01:13] So Smallville was a project that we were running at Stanford. We made this observation that large language models [01:21] can now encode a lot of human behavior that is embedded in their training data from the web, social media, and so forth, and that if you probe at the right angle, you can actually get a lot of micro-behaviors out of these models. Given a very specific description of a situation, what would person X do? And it would actually generate really interesting behaviors. We found that to be so interesting, and we found that to be the ingredient that we had been waiting for

1:51-3:42

[01:51] where we decided: if we push this as far as possible, what would a society created by these agents look like? So we basically created generative agents, [02:01] which pair a generative AI model with memory, planning, and reflection to create this lived experience of agents living in a small town. So Smallville was basically a game town with 25 agents living in it. [02:14] Individual agents had a persona description, but they would actually wake up in the morning, do their routines, go to work, and have relationships, sort of like people would, and they would actually show emergent phenomena, like having parties and so forth. So that was the experiment that we ran. What was the most surprising thing to come out of the experiment? [02:33] So one of the surprising things was that the simulation itself actually takes place the day before Valentine's Day. [02:42] So you actually see [02:45] one of the agents thinking, well, I run a cafe. She's a cafe owner. Her name's Isabella. [02:50] She thinks, it would be great if I could throw a Valentine's Day party where we invite a lot of friends and customers. So you actually see her on the day before Valentine's Day going around, gathering materials for the party, and telling her customers, hey, we're going to have this party. Please come. [03:06] And on Valentine's Day, you actually see this emergent party form, with all these agents coming to the party. [03:13] Did anyone not get invited? Well, some of the people did get the invitation, but they forgot. That's one thing that did happen. Some of the agents did not explicitly get invited, but we had one agent who got the invite, Klaus, who decided to ask his crush out on a date. So he would actually bring his date, and they would have a party at this cafe. So, quite surreal. So how did you end up building Smallville in the first place? Were you studying human psychology and social behavior? Or was this coming from the...

3:42-5:17

[03:42] customer back, or was it coming from the technology out? [03:45] So my co-founding team has been excited about simulations, and we saw the vision of simulation fairly early on. [03:51] So my career as a researcher at Stanford really started back in 2020. [03:56] That was the year when GPT-3 was about to come out. It wasn't quite there yet, [04:01] but it was just about to come out. We were starting to get its first demos. [04:04] And in my first year, we wrote this paper called "On the Opportunities and Risks of Foundation Models" [04:10] alongside many of the Stanford researchers, and it was led by one of my co-founders, Percy Liang, who is now the head of the Center for Research on Foundation Models at Stanford. [04:18] And when we were writing that, the part that I was really focused on was, well, here's a new class of models that we have not seen in the past, [04:24] models that can be very generalizable in ways we didn't quite have before. [04:29] And I got to thinking, well, if [04:32] we imagine the kind of interaction we can create with these models, what would that be? [04:36] And many of my colleagues back then were surprised that these models could do classification or simple generation. [04:42] And that was really incredible to see, because these models weren't really taught to do that. But the part that was surprising to me wasn't that these models can do that, because from an interaction perspective, we've known how to do this for a long time. [04:55] The interesting part was, well, these models can actually encode human behavior. What does that mean if we were to push this as far as possible? [05:04] Part of the research tradition that I come from includes what we call social computing, [05:09] and social computing within human-computer interaction really has to do with this idea of how we can build a better technological platform

5:17-6:48

[05:17] that would enable social interactions and collaboration. One of the most difficult challenges of building a social platform [05:24] is not necessarily testing the UI/UX of the system. It's more about, when you have tens of people, [05:32] millions of people, and down the line billions of people, how do all these people come together [05:37] to create emergent phenomena, both good and bad, and how can we design for that scale? [05:43] And so far, we haven't really had a tool that would enable us to test for that. The only way we test it today is to basically field test it. You release your prototype and see what happens, [05:53] and sometimes it actually comes at a real cost. [05:56] Obviously, there's a [05:57] high cost in terms of human hours and the time it takes. [06:00] But at the same time, if you have a bad design, imagine you have a feed on social media [06:05] that is more likely to propagate certain negative emotions. [06:09] Obviously that is something that we want to avoid, [06:11] but today this gets tested in the field. [06:14] So we wanted to see whether we could create [06:17] a simulation that would let you test for this. So in 2022, a year before generative agents, [06:24] we worked on a paper called Social Simulacra, which really was the precursor to the agents paper that we ended up writing. The core thesis was: imagine you're building a subreddit. You're a designer of a subreddit. You want to see what people might do in the subreddit, which is a surprisingly hard task, even for practiced designers. [06:41] And we basically decided, hey, we have this model, it seems unique. Let's use this model to create simulations of the entire subreddit.

6:48-8:27

[06:48] So you define the goal, you define the moderation strategies, [06:52] and you populate it with thousands of, back then we didn't call them agents, [06:56] we called them personas, so you populate it with thousands of personas. [07:00] This is basically the 2022 version of notebook, and it's quite interesting that it actually came back. And when we saw that, we actually got a lot of really important insights out of this. [07:11] What are the good behaviors? We simulated a community where the entire idea was for people to discuss places to sightsee in Pittsburgh. [07:21] And all of a sudden, you start to see these personas collaborate and discuss, hey, XYZ places are amazing. Do you want to go on a trip together? And they actually plan those trips live in the simulated subreddit. [07:34] So that's how we got excited. We saw the vision, the excitement, and the potential applications fairly early on. [07:41] But then the work that we had to do was demonstrating how we can go beyond simple personas to create complex agents [07:47] that can actually think over time, because we want to simulate the longitudinal aspect of our society, and then validating that these simulations are actually accurate in practice. [07:57] Was there a point of model evolution at which you felt like, [08:00] okay, we're there, the models are good enough for us to actually have a faithful representation of human society? [08:08] So, [08:09] Social Simulacra was built with GPT-3 when it came out, and it was very janky. It had no instruction tuning. It did not follow your instructions. So just to have it listen to you and do what you wanted it to do, you had to do some weird tricks with prompting and so forth. But you could actually see the promise.

8:27-10:03

[08:27] The model had actually encoded a lot of human behavior, and you could see the trajectory. And when we wrote the generative agents paper, it wasn't quite ChatGPT, but we now had instruction tuning. So we could build much more complex agents that can reason about their memory. [08:41] That wasn't really possible when we did Social Simulacra. And since then, of course, the models have improved. [08:47] So where we are today is that the models, at the foundation level, have reached the point where we can actually imagine building these kinds of applications. [08:57] Now, the part that I do think is quite interesting here: [09:00] today, if you look at many of the large language model companies, whether it's OpenAI, Anthropic, or many of the new labs that are getting formed, [09:08] the models they are creating [09:10] are models that I would consider their North Star to be something similar to, let's build a superintelligent machine. [09:16] These machines are meant to be rational. [09:19] And these machines are supposed to be really amazing at technical problems that have an objective answer. [09:25] So maybe that's not even the best simulation of [09:27] true human society then. Turns out people are irrational. We have a lot of subjective values, preferences, and tastes. [09:35] So you actually start to see a divergence between model size going up [09:40] and the performance in its ability to predict and simulate human behavior. [09:45] So we have sort of plateaued, [09:47] with the current modeling paradigm, in our ability to really simulate humans. [09:52] So [09:53] it is sort of at a good foundational starting level. [09:55] But to make it really amazing, we do need the next frontier that is more geared towards actually modeling people's diversity. Very interesting.

10:03-11:36

[10:03] At what point did you realize that what you did with Smallville could become a company? [10:10] Right. [10:11] So, [10:12] again, the promise of applications was something that I was very much inspired by early on, with Social Simulacra and so forth. [10:20] But the part that I realized over time is that research [10:24] and a company have very different functions. [10:27] Research is an amazing vehicle [10:30] if you want to basically do breadth-first search. [10:32] You are in a lab, [10:34] surrounded by a really smart set of people, and each of the researchers owns a small piece of a thesis. [10:40] And they go explore, and some of those theses blossom into amazing research products. But we're not necessarily known for finishing our job. We're not usually the ones to bring that research impact to the real world. [10:54] A company [10:55] is a machine for depth-first search. [10:56] You have a conviction about an area, you find a hill that you want to climb. [11:01] This is the vehicle that lets you put together resources and an amazing group of people [11:07] to go after a singular vision without hesitation. [11:10] And we got that conviction, I would say, about half a year after Generative Agents. After the original Generative Agents paper, we got so much inbound interest, initially from social scientists who wanted to run their experiments and all their RCTs on our platform. Then very soon after, many of the Fortune 500 companies, whose board members and CEOs sometimes visit Stanford, saw this demo, and they started asking,

11:36-13:08

[11:36] well, we go run all of these surveys and experiments, and there are so many research questions about the market that we cannot answer today. [11:43] Can we run that in simulation? [11:45] That started to really intrigue me because that showed a clear line towards real-world impact for research, and it's not always the case that we have that kind of opportunity. [11:55] So that is when we decided we wanted to validate that the simulations are accurate. [12:00] So we went out and created simulations of 1,000 people [12:03] from the US population. We demonstrated that using our architecture and the models, we can predict people's behaviors 85% as accurately [12:12] as people replicate their own answers. When we saw that, we thought, okay, this is something that we feel comfortable [12:18] providing to our users as a platform for simulating their really important decisions. So that's when the co-founders, myself, [12:27] Percy, as well as Michael Bernstein, who was a researcher and my advisor at Stanford, got together. Both of them were actually my advisors. So the three of us had been working together for five years, and now at this point, it's six years. But that's when we got together to have the initial conversation of, can this be a company? [12:42] Amazing. Maybe walk me through a customer engagement end to end today. Who's a canonical customer, and which department? They come to you. What are they asking you? And what product or service do you deliver to them? Right. So maybe an example that I can give to make this concrete: [13:04] CVS has been partnering with Simile for the past, I would say, nearly half a year.

13:09-14:49

[13:09] And they've been an amazing partner. [13:11] The way we initially got in touch: our main buyer at CVS [13:15] is a senior VP who leads Human Insights, [13:19] and the original story there was [13:23] he basically read my paper that validated the agent simulations and thought, we have to bring this to CVS, [13:30] because today [13:31] we are bottlenecked by the number of questions we can field test, [13:35] and we're also bottlenecked by truly the physics of human society. [13:39] It's one thing to run surveys and experiments. [13:43] It's a totally different thing if down the line you actually want to simulate the entire market and map out all the second-order impacts of the decisions you suggest to your leadership. [13:53] So he'd been looking around for a solution, and his cousin happened to know me [13:57] and basically told our buyer, Sri, [14:01] that the authors of the paper were actually looking to start something. So that's how we got connected. [14:06] In this particular engagement, [14:08] usually the way this goes is our customers [14:12] are very much used to working with polling companies or panel companies today. They go [14:18] and basically tell these companies, XYZ are the populations that we're interested in better understanding; [14:24] can we go run a research study on these topics? That initial stage looks very similar for Simile. [14:32] So our buyers come and they tell us, "We want to better understand XYZ population." [14:37] Then Simile goes out, and through our partnerships with vendors, we have a strategic partnership now with Gallup, for instance, which is a polling and panel company, we go out and work with our vendors to...

14:49-16:20

[14:49] actually reach out to real humans. So these simulations are grounded in real data. We reach out to those people, [14:55] collect data [14:57] that we believe is efficient and generalizable about each person. So imagine you have 15 minutes: [15:02] what are the magical questions you can ask these people during that time? [15:06] We collect that data, [15:08] use that data to create agents, or simulations, of these people that can basically be used to answer a large number of questions that go way beyond the original domain. [15:17] We load them onto our platform, and it's basically a SaaS product. [15:21] Our customers come and they can basically ask any question about a group of people of interest to them. [15:26] So interesting. It reminds me of autonomous vehicles: [15:30] you go and collect a bunch of data from the road, and then you're able to augment it with simulation. Is this a similar concept, or are there big differences in what you're doing? It is a similar concept in the sense that, of course, [15:43] with [15:44] self-driving vehicles, you want to create a model [15:48] that is based on real-world physics, but you want that model to be generalizable beyond your training data. [15:54] It needs to generalize to different locations with different weather conditions. [15:58] It's a very similar concept, where we want to reach out to real people, and for these people, we want to understand something fundamental about them in a way that we can encode into the model. [16:11] I would have thought that the large language models would be such a good representation of, [16:16] you know, the whole world, that you could almost narrow it down. You could tell

16:20-17:53

[16:20] Claude, you are a 34-year-old woman living in a, you know, bicoastal metropolitan area, and it would be able to give a faithful representation. So I'm actually surprised that you go out to Gallup. Can you explain why you have to go out and collect any real-world data at all? [16:37] Yeah. One of the big questions here is the question around the say-do gap. [16:42] There are things that people say, [16:43] and then there are things that people actually do, [16:47] and the gap there is real. [16:49] And a lot of the large language models are trained on attitudinal data. [16:53] Fundamentally, it is the things that people have said online. [16:56] That covers a large share of their training data. [17:00] So one of the things that Simile's simulation platform does [17:03] is actually close that gap. [17:05] So a lot of the data that we end up collecting is by nature behavioral. [17:10] It also includes data from literally asking questions like, just tell me the story of your life. [17:17] Turns out, if we understand a person's life story, the kind of data you get from it is what we consider to be the long-tail information about this person. [17:25] It's not about [17:28] what you've done in this particular moment, and it's not about very broad questions like, what's your view on politics? [17:34] It's about where you grew up, and what were some of the difficult decisions you had to make in life. [17:39] And what's interesting about this data is it's an amazing way to build a translational layer between attitudes [17:45] and behavior. [17:47] So we combine these kinds of datasets, but fundamentally that's the gap that we want to close. What sort of behavioral data do you have?

17:53-19:25

[17:53] So Simile does run a lot of experiments. As for the kinds of models that we have trained, for instance, we have a huge repository of RCTs, randomized controlled trials that were run in social scientific contexts or around pricing studies. [18:08] So one of the models that we are training is basically a foundation model of human behavior in quite a literal sense. [18:15] We have other behavioral signals from RCTs. [18:18] Can we actually encode that into the model [18:20] so that the end outcome is a model that can basically predict the results of any RCT? [18:25] Yeah. [18:25] At the same time, one of the conversations that we keep having with our customers, and that we're very excited by, [18:31] is that our customers come in, see that potential, and their mind goes to, wow, [18:36] we have 90 million customers, let's say, here at CVS. [18:40] How can we leverage this kind of data [18:42] to create better simulations? So there's also a conversation around how we can, in a responsible and ethical way, [18:49] leverage existing data [18:52] that our customers hold in-house, [18:55] then use that to create an augmented version of Simile's model. That, of course, is going to be more fine-tuned, specific to the population of these customers. [19:05] But that's the kind of data that we will be leveraging. I see. And are you doing these interviews typically by voice? Is it a survey that people fill out? What's the modality? So it's a huge breadth. [19:15] The quick answer here is it's both. [19:19] Interviews are fantastic if you want to get [19:23] the long-tail information about people.

19:25-20:58

[19:25] In the original study that I conducted back in 2024, we literally asked, tell me the story of your life. [19:33] Now, the way we do it is we are training our own model. It's a reinforcement learning loop, but basically imagine the objective function here is, how can you spend a minimal amount of time [19:44] to get the maximum amount of visibility into this person? [19:48] So that is one of the things that we do: training an interviewer that is not really asking for factual information or about an experience with a particular platform, but just, what are the life stories that people have [20:00] that can be used to train our own model for these agents? [20:03] And then, [20:05] for the more factual or more [20:08] discrete-choice questions, there are surveys and so forth. These are also very efficient, both time- and data-efficient, because people can fill out many questions in a short period of time. So we do leverage them, for instance, if you just want a broad understanding of people's viewpoints on certain topics, certain policies, [20:26] and things like that. [20:27] You describe yourself as an applied [20:29] AI lab. How do you think about where you want to build your own models versus where you want to rely on other existing models? [20:37] So in terms of building our own model, [20:39] the core thesis here is there is an amazing model to be built [20:44] that really encodes the diversity of people's values, preferences, and tastes [20:50] in ways that a simply rational model cannot. [20:53] So one way I pose this: imagine that today's models

20:59-22:34

[20:59] are akin to the CPU of intelligence. It's a single model trained on amazingly rational data [21:05] that is amazing at solving very complex objective questions. [21:10] Simile's model is much more akin to something closer to the GPU of intelligence, [21:18] where the idea is we don't actually need a model that is superhuman. [21:22] At Simile, in fact, we want a model that's as human as possible, [21:26] but we want to make sure that these models, [21:28] at the level of individual subunits, can represent the real viewpoints of different subpopulations. [21:34] So where we see that gap, [21:35] that's when we go develop our own model. [21:38] But at the same time, we do leverage frontier models, for instance, as a way to coordinate the research. Frontier models are amazing at coming up with a research plan. [21:47] So that's where those models do get leveraged. [21:50] Very interesting. [21:51] Are people typically coming to you with questions around new product launches, how they should be marketing their companies, [22:01] pricing, all of the above? So it is all of the above. Our customer journey usually does, however, start with very concrete use cases and problems they are trying to solve. [22:12] Concept testing is a big one. [22:14] It's also a very straightforward one. They have a new concept, a new product idea, or a new marketing message they want to test, [22:20] and they want to hear from their users what they would think about XYZ. [22:24] This is one way for them to quickly test those ideas. [22:27] And then the promise they quickly see is, well, right now we're very much in the practice of testing five to ten different ideas at most.

22:35-24:05

[22:35] But what does it look like for us to instantly test thousands of different ideas [22:39] across a thousand different subpopulations? [22:41] That's the initial vision they see. [22:44] Then we really get into the nitty-gritty details of, well, where does the simulation go from here? [22:50] They pretty soon start asking, well, can this be used to do product testing, not by simply submitting an image, [22:58] but imagine [22:59] basically asking these agents, go experience this product for 10 minutes [23:03] and tell us about what you experienced, what you saw. So you're basically adding a temporal dimension. [23:08] Then you go into things like multi-agent simulation. [23:12] Some of our customers very routinely ask us [23:16] to simulate their earnings call. This is a use case that surprised me at first, but it is also a surprisingly common ask, [23:23] because of course, the CEOs and board members always need to think about, hey, how are we going to design our earnings call? How would the audience react? So that is something that we also do. And this is very much a multi-agent simulation. You know, it seems like there are so many use cases that could potentially be tested once you have a simulated customer population. Right. Yeah. What's [23:46] the value of [23:48] research and testing in sim versus just, like, [23:51] let's say you have a new product concept that you want to test. Why not just go run a thousand Facebook ads and actually get the click-through rates on this stuff? Isn't that real-world data almost more useful than the simulated data on how people might behave that you then correct for with your own models?

24:05-25:36

[24:05] So it's a great question. And I think to some extent the answer has to do initially with scale, [24:12] and then down the line, with the truly new capability that comes because you can simulate interactions. [24:17] The scale question is actually quite straightforward. Yes, you can run Facebook ads and Facebook testing. [24:24] But the kind of experiment that you can run in simulation is actual behavior simulation at scale. [24:30] You can basically pull in any number of users; [24:32] it doesn't even have to be bounded by the population that's available on Facebook. [24:37] It is also much more representative, [24:39] because only certain groups of people will actually respond to online experiments. [24:44] But with Simile, one of the key promises of the model that we are creating is that it is representative. [24:49] We do the hard work of actually getting a representative set of people [24:54] and then collecting the data that would represent them properly. [24:58] So scale and representativeness are something that many of our users do not have easy access to. [25:04] This is actually one of the common asks that we get, or common pain points that we have heard, [25:08] where the question that many of these people have isn't, what questions do we ask these people? It's, in the first place, how can we get to the population that we're excited to talk to? [25:19] That's a huge bottleneck. [25:20] Then down the line, you can really start to imagine, and this is something that some of our most forward-looking customers are now going into, [25:28] which is, [25:29] what are all the downstream implications of the decisions that you make? [25:33] It's not just about whether, imagine you have this particular product,

25:36-27:08

[25:36] do you like it or not? Would you pay for this or not? [25:40] It's not necessarily just that initial question that we want to answer and be done. [25:46] We want to understand: imagine you're a car company. [25:49] You launch an electric vehicle in this market. [25:53] Maybe the electric vehicle does really, really well. So we can help you do concept testing around the marketing and the product for the electric vehicle. [26:00] But what does that do to the perception [26:03] of, let's say, your non-electric vehicles? [26:05] Does that change the market perception? [26:08] Then what does it mean for the rest of the product line? [26:11] And how do you balance those kinds of second-order impacts [26:14] of your decision [26:15] in a way that is more evidence-based? [26:17] Today, there's no way to test for this, but you can run it in simulation. [26:21] So really going beyond simply asking one question at a time, to thinking about the long-term implications of your decisions, is something that our customers are quite excited by. I'd love to understand how you think about how predictive your model is [26:35] at actually simulating real human behavior. I imagine you have a lot of evals on this. I guess, what is your North Star metric? [26:41] How do you guys do on that? [26:44] And [26:45] what do you think is the theoretical limit? [26:47] It's a great question. So a theoretical limit, and let me just start from there, certainly does exist, in the sense that in humans there's genuinely a lot of randomness: if you ask me the same question twice, I'll actually answer it slightly differently. [26:59] So there is certainly that degree of randomness in human behavior. [27:03] However, there are a lot of gains in performance that we can have even today

27:08-28:39

[27:08] in the way we are predicting people. [27:10] So the measurement that we do [27:12] is at the level of the population: we measure the distribution of responses [27:17] if it is more quantitative. We measure total variation distance, which basically shows how close the distribution of the simulated responses is to the ground truth. [27:28] And that is a metric that we run across all the use cases that our customers have. [27:33] And we have a certain threshold that we believe is good enough for decision-making. [27:38] A TVD of, let's say, less than 0.15, we believe, is quite strong evidence for making a decision. [27:45] So that is the North Star that we want to hit for this class of use cases that are more quantitative, more question-and-answer. This also covers RCTs, which are many of the core use cases our customers have. [27:57] Now, there's actually a really interesting question to ask around, well, [28:02] what about multi-agent simulation? [28:04] What about all the downstream implications that we're going to be simulating? What does [28:08] the evaluation of those look like? Yeah, and then do you daisy-chain errors? You know, if this one is 85% accurate, and then this agent is telling another agent something, do you accumulate errors as you go towards multi-agent? Exactly. And one of the core theses here is we basically see two categories of simulations. One is what I would consider to be simulations that converge. [28:31] The other category is simulations that diverge. [28:35] And sometimes they actually coexist. It's really about what research questions you have.
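
The total variation distance (TVD) check mentioned in this segment can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Simile's implementation: the function name `total_variation_distance`, the constant name `TVD_THRESHOLD`, the categorical answer format, and the sample responses are all hypothetical. Only the metric itself and the 0.15 threshold come from the conversation.

```python
from collections import Counter

# Threshold quoted in the conversation; the constant name is hypothetical.
TVD_THRESHOLD = 0.15


def total_variation_distance(observed, simulated):
    """Return the TVD between two lists of categorical responses.

    TVD is half the sum of absolute differences between the two
    empirical response distributions: 0.0 means identical
    distributions, 1.0 means they share no answers at all.
    """
    if not observed or not simulated:
        raise ValueError("both response lists must be non-empty")
    p = Counter(observed)   # counts per answer in the human sample
    q = Counter(simulated)  # counts per answer in the simulated sample
    n_p, n_q = len(observed), len(simulated)
    categories = set(p) | set(q)
    # Counter returns 0 for answers that appear in only one sample.
    return 0.5 * sum(abs(p[c] / n_p - q[c] / n_q) for c in categories)


# Made-up example data: 100 human answers vs. 100 simulated answers.
human_answers = ["yes"] * 60 + ["no"] * 30 + ["unsure"] * 10
simulated_answers = ["yes"] * 50 + ["no"] * 35 + ["unsure"] * 15

tvd = total_variation_distance(human_answers, simulated_answers)
print(f"TVD = {tvd:.3f}, below threshold: {tvd < TVD_THRESHOLD}")
```

In this made-up example the shares differ by 10, 5, and 5 percentage points, so the TVD is about 0.10, which would fall under the 0.15 bar described above.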

28:39-30:12

[28:39] For questions that converge, it doesn't actually matter if you have a little bit of [28:44] error. Now, the error obviously cannot be so dramatic [28:47] that it's completely detached from reality. But you actually are okay, even if the errors do compound over time, because the pull towards the convergence is strong enough that you'll actually understand where everything would fall. [28:59] A good example here actually is if you simulate a network of people, [29:03] then that network will always have [29:07] a hub that gets formed. [29:08] This is what network scientists would call a scale-free network, for instance. This is actually what powered Google, too. [29:14] Well, the core [29:16] observation of PageRank was that it doesn't matter how these networks actually get formed: you actually see some web pages [29:23] that get exponentially more links attached to them. [29:26] This is a very fundamental behavior in humans that we also see in simulated networks. [29:33] And that convergence always happens as long as you are replicating human behavior with a certain threshold of accuracy. [29:40] Now, there are then questions that generally do diverge. It's like your classical questions like, [29:45] was World War I inevitable? [29:47] Or was it not? [29:49] And there [29:51] it is sometimes difficult to run the same simulation over time and get the same exact outcome. [29:57] Imagine you're running a [29:59] This is not something that Simile is necessarily going into right now, but imagine you're running a simulation of an election. [30:06] Will the same person win the election every time? There are a lot of downstream implications of every single decision that does happen.
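
The hub formation mentioned earlier in this segment can be reproduced with a toy model. The sketch below uses preferential attachment, the textbook mechanism for growing a network with hubs. The episode does not say which mechanism Simile's simulated networks follow, so the mechanism, the function name, and the parameters are all assumptions for illustration.

```python
import random
from collections import Counter


def preferential_attachment(n_nodes=2000, seed=0):
    """Grow a network one node at a time; return each node's degree.

    Each new node links to one existing node chosen with probability
    proportional to that node's current degree.
    """
    rng = random.Random(seed)
    # `endpoints` lists both ends of every edge, so a uniform draw from it
    # selects a node in proportion to its degree.
    endpoints = [0, 1]  # start from a single edge between nodes 0 and 1
    for new_node in range(2, n_nodes):
        target = rng.choice(endpoints)
        endpoints.extend([new_node, target])
    return Counter(endpoints)


degree = preferential_attachment()
max_degree = max(degree.values())
median_degree = sorted(degree.values())[len(degree) // 2]
```

Changing the seed changes which node becomes the hub, but a few highly connected nodes alongside a majority with a single link should appear in every run. That is the sense in which such a simulation converges even though individual runs differ.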

30:12-31:43

[30:12] So it does diverge. [30:14] There, the core evaluation is around confidence. [30:18] So imagine you run the simulation 100 times. [30:20] How many of those times do the results come out to be x? [30:24] And how can we actually use that to basically run [30:27] a bootstrap resampling to calculate the confidence around the simulations? Those are some of the questions that we do ask. [30:35] And a huge part of the power of simulation is then to show, when it diverges, the [30:42] diversity of possible outcomes, [30:44] so that people can actually look, [30:46] understand the causal mechanism of how we got to those outcomes, [30:49] and prepare for those futures. [30:51] So those are some of the implications of divergence in simulations. [30:55] Are there any mathematical descriptions of why something would converge or diverge? I'm imagining if you have, like, an average function, maybe you converge. And then if you're splitting outcomes to a binary, then maybe you diverge. But yeah. [31:08] So the intuition, I think, is close. And technically, this is also a research topic. [31:13] So Simile is certainly a company where we do go deep into this research topic, [31:18] in the sense that [31:20] I see simulation as a field that's akin to day one of developing inferential statistics. [31:26] Scientists in inferential statistics actually had to do a lot of discussion and research over time [31:31] to decide that p [31:33] less than 0.05 is actually evidence that is strong enough for science. [31:37] Simile [31:39] is working on setting the same kind of threshold and standards for the rest of the field.
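
The idea from the start of this segment, running a simulation 100 times and bootstrapping the results to get a confidence estimate, can be sketched as follows. This is a generic percentile bootstrap, not a description of Simile's method. The function name, the 62-out-of-100 figure, and the 95% level are assumptions for illustration.

```python
import random


def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the share of runs that produced X.

    `outcomes` holds one entry per simulation run: 1 if the run ended in
    outcome X, 0 otherwise. Returns (observed_share, lower, upper).
    """
    rng = random.Random(seed)
    n = len(outcomes)
    shares = []
    for _ in range(n_resamples):
        # Resample the runs with replacement and recompute the share.
        resample = [outcomes[rng.randrange(n)] for _ in range(n)]
        shares.append(sum(resample) / n)
    shares.sort()
    lower = shares[int((alpha / 2) * n_resamples)]
    upper = shares[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(outcomes) / n, lower, upper


# Hypothetical result: outcome X occurred in 62 of 100 simulation runs.
runs = [1] * 62 + [0] * 38
share, lower, upper = bootstrap_ci(runs)
```

The width of the interval gives a rough sense of how much the 62% figure could move if the same 100-run experiment were repeated.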

31:43-33:16

[31:43] So that is the intuition. [31:45] I think that's exactly the right intuition in terms of how to actually make a robust mathematical equation around what's going to happen when. It is a real research frontier for simulations. [31:56] Thank you for being nice with my vibe-mathing. I'm curious, it seems like a lot of the Fortune 500 is coming to you. I'm wondering whether there are use cases beyond the existing corporate ones, you know, great mysteries of our society that might become solved. And for example, I'm wondering about economics, central bank decisions. Oftentimes, like, [32:18] I personally believe that in macro, nobody knows nothing. And oftentimes a lot of the issues come about from human psychology. So to me, macroeconomics is a function of simulating human behavior at scale. I'm thinking even in the venture capital use case, we often debate internally, [32:38] you know, does value accrue to this company or not? You could run the simulation of all the different layers of the AI stack and almost figure out where durability and value accrue. Like, if you had a kind of perfect [32:50] simulator of human behavior, there's so much more you could do than serving the Fortune 500. Do you agree with that? And then if so, are you serving [32:59] governments, you know, and the like? Yeah. [33:03] So it's interesting. When we were still researching in this area, the way I actually got [33:10] my advisors back then, Michael and Percy, excited about this was I basically told them, look,

33:17-34:51

[33:17] if we do this right, [33:18] there's a Nobel Prize to be won there. [33:20] And I truly believe that. [33:23] It's also not surprising in that your classical [33:27] economics simulations, [33:29] things like agent-based models [33:30] that really pioneered our understanding back in the day: the kind of topics they studied was, how does segregation happen? What are the causal mechanisms for segregation? So scholars like Thomas Schelling would actually build agent-based models that are extremely simple and rudimentary, but that showed something deep [33:47] about human [33:48] macro behaviors. And he of course went on to win a Nobel Prize. [33:53] I see the same opportunity here, [33:55] but in an augmented way. [33:56] Back in the day, the agent-based models were very much deterministic in some sense, [34:01] where basically, in this simulation of, [34:04] let's say, the model of segregation from 30 years ago, an individual agent was simply a red dot or a blue dot. And every game iteration, they would look around their corner, see how many of their neighbors are of the same color, [34:16] and if that number goes below a certain threshold, [34:19] then they will decide to move to a new location. That was it. [34:21] But now we can actually create real agents that replicate the full richness of individuals and run the same kind of simulations. [34:29] So there are the kinds of questions we can ask that go beyond simply the commercial use cases. [34:35] For instance, [34:35] in the context of macroeconomics, the questions that I did get asked from economists were things like, [34:41] when does a bank run happen? [34:44] Or questions like climate change. One of the core blockers to solving that issue is the collective action problem of many nations.
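
The red-dot and blue-dot segregation model described in this segment is simple enough to sketch in a few lines. The version below is a minimal illustration. The grid size, the share of empty cells, the 30% threshold, and the rule that unhappy agents move to a random empty cell are assumptions chosen for the example, not details given in the episode.

```python
import random


def schelling(size=20, empty_frac=0.1, threshold=0.3, max_rounds=50, seed=0):
    """Toy Schelling-style model; returns mean same-color neighbor share
    before and after the agents have moved."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(size) for c in range(size)]
    n_empty = int(len(cells) * empty_frac)
    n_agents = len(cells) - n_empty
    values = (["red"] * (n_agents // 2)
              + ["blue"] * (n_agents - n_agents // 2)
              + [None] * n_empty)
    rng.shuffle(values)
    grid = dict(zip(cells, values))  # None marks an empty cell

    def same_color_share(pos, color):
        r, c = pos
        neighbors = [grid.get((r + dr, c + dc))
                     for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                     if (dr, dc) != (0, 0)]
        occupied = [n for n in neighbors if n is not None]
        if not occupied:
            return 1.0  # an agent with no neighbors counts as content
        return sum(n == color for n in occupied) / len(occupied)

    def mean_similarity():
        shares = [same_color_share(p, col)
                  for p, col in grid.items() if col is not None]
        return sum(shares) / len(shares)

    before = mean_similarity()
    for _ in range(max_rounds):
        # Agents whose same-color share is below the threshold are unhappy.
        unhappy = [p for p, col in grid.items()
                   if col is not None and same_color_share(p, col) < threshold]
        if not unhappy:
            break
        rng.shuffle(unhappy)
        for pos in unhappy:
            # Every agent flagged at the start of the round moves to a
            # randomly chosen empty cell.
            empties = [p for p, col in grid.items() if col is None]
            dest = rng.choice(empties)
            grid[dest], grid[pos] = grid[pos], None
    return before, mean_similarity()


before, after = schelling()
```

Comparing `before` and `after` shows how much more clustered the grid becomes even though each agent only asks for a minority of same-color neighbors, which is the macro pattern the simple rule is known for.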

34:51-36:25

[34:51] Can we simulate that? [34:53] Or what are the signals [34:55] of a democracy that is about to collapse? [34:58] Can we understand the origin story of the monetary system? [35:03] These are the kind of simulations that I do believe [35:06] ought to be the North Star state of this field. [35:10] And it is sort of interesting to imagine what that would actually look like in practice, right? Because these would involve very large-scale simulations with many agents interacting with each other. [35:20] I do see a future where, [35:23] and today this is not the case, today our simulation is quick and fast to run, [35:28] but what about a simulation that actually takes $100 million to run once [35:33] and could take many months to run, [35:35] but when we run it, [35:36] it solves one of the fundamental questions of our society? [35:40] That I do think is genuinely a very exciting possibility for this field. [35:44] I'm even thinking politics, for example, could be forever changed. Today everyone has an agenda of how they say some policy change will impact things. Why don't we just run the simulation? And there's all the downstream implications. And not just what's going to happen this year, but what does it mean in the next five to 10 years? [36:05] Exactly. [36:06] Fascinating. I was going to close by asking you what makes you excited about the future. Is it what we just talked about or is it something else? I am somebody who is quite inspired by science fiction, and [36:18] when you read science fiction that covers societies that have progressed far enough in their technological maturity,

36:25-37:57

[36:25] you always see two pillars: [36:27] you have some version of AGI, [36:30] and you have some version [36:32] of simulations [36:33] that really help guide the society. [36:36] I do see an opportunity today [36:39] to really take the first crack at building the simulation. [36:42] I would not have said that even five years ago. [36:45] But that is a conviction that we have built up over the years as we are going deep into this research. [36:50] And what's exciting is there's a clear use case today that can serve our users. [36:55] But then there's a lot of innovation that is yet to come [36:58] that I do think will build up to actually building a simulator that's akin to the CERN [37:03] of human society. [37:05] And one of the things that one of my co-founders, Percy, sometimes says is, [37:09] as you look at the greatest scientific innovations, [37:12] they often start from an amazing measurement. [37:16] The Hubble Telescope really changed the trajectory of how we understand the universe. [37:21] Simulation can be that [37:22] for human society. [37:24] So the thing that does excite me: there's a lot of focus on natural sciences, [37:28] but how can simulation really unlock our understanding of humanity and social sciences? [37:34] And how can we actually use it to make our society a better place? [37:38] That's exciting. [37:40] I remember reading somebody was excited about, you know, a small but, [37:45] you know, breathtaking chance that [37:47] the field of economics, as we know it, may actually [37:50] become solved by simulation. And I'd extend that not just to economics, but [37:54] kind of everything that deals with

37:57-38:43

[37:57] human behavior [38:00] and social sciences, which ultimately is everything around us. [38:03] Truly. Wonderful. Thank you so much for joining today and sharing the story of both Smallville and what you're now up to at Simile. I really enjoyed the conversation. Same here. Thank you for having me. [38:14] Music.
