Transcript
WEBVTT
00:00:06.290 --> 00:00:10.316
Welcome to The Index Podcast, hosted by Alex Cahaya.
00:00:10.316 --> 00:00:35.935
Plug in as we explore new frontiers with entrepreneurs, builders and investors shaping the future of the internet.
00:00:35.996 --> 00:00:39.981
Hey everybody, and welcome to Everything Bagel. I'm your co-host, Alex Cahaya, and I'm joined by Bidhan Roy, founder of Bagel Network.
00:00:42.810 --> 00:00:51.082
Today, we're going to explore the future of open source AI with our special guest, Greg Osuri, who's the founder of Akash, which is a decentralized compute marketplace.
00:00:51.082 --> 00:00:53.429
Greg, thanks so much for being here today.
00:00:53.429 --> 00:00:54.451
I appreciate you joining us.
00:00:54.451 --> 00:00:55.942
Great to be here, Alex.
00:00:55.942 --> 00:00:56.966
Thank you so much for having me.
00:00:57.659 --> 00:01:26.945
So for some context, I've known Greg for a while, since, I think, before Akash actually launched, because we were pretty early investors back in the day. One of the things I love about both Greg and Bidhan is they're, like, open source purists. You know, I got red-pilled on open source by my partner, Brian Fox, who's, you know, the author of the Bash shell, the GPL licenses, and both Greg and Bidhan have been massive advocates of open source and are an inspiration to me in that realm.
00:01:26.945 --> 00:02:11.002
Greg, I know you've been following Bagel, but I'm sure one of the things you love about them is that they ship really innovative AI software, open to everybody. Most recently they figured out how to use ZKPs, zero-knowledge proofs, to prove who contributed what to an AI model, and that is what enables them to power monetizable open source AI, which is huge. They open sourced the white paper and all the code that shows it actually working. Just to give you an idea of the order of magnitude here, they reduced the time that it takes researchers to do the same thing from, like, thousands of hours to, like, two seconds, with a team of, like, three researchers. There are nine people on their team total.
00:02:11.002 --> 00:02:13.649
So I've been, like, yeah, really inspired by that.
00:02:13.649 --> 00:02:19.056
We've been talking a lot about that on the show. A decent part of the last episode that we just recorded
00:02:19.096 --> 00:02:21.103
was, like, what's the insight?
00:02:21.103 --> 00:02:28.555
Like how do you get to those insights that lead to these kinds of big innovations when there's teams that have, you know, 10 times the amount of funding and 10 times the headcount?
00:02:28.555 --> 00:02:35.748
But then this little team, this underdog, the David and Goliath kind of story, ships open source, kind of like the DeepSeek thing that just happened.
00:02:35.748 --> 00:02:43.343
I really feel like what they did is very similar to that, where it's way cheaper, way faster, and came from this way less funded, smaller team.
00:02:43.625 --> 00:02:44.725
It's funny how that works.
00:02:44.725 --> 00:03:21.284
I wrote something in 2019, or I think 2018, a post that went a little viral, that talked about how the amount of funding is inversely correlated with progress in the early stage, not the later stage, especially the seed stage. I wrote it saying, I mean, there historically hasn't been a single success story that was overfunded and actually innovated. I don't want to name names in crypto, because if I say something it's going to cause a lot of scare, but here's a classic.
00:03:21.284 --> 00:03:21.806
Do you remember?
00:03:21.806 --> 00:03:24.509
There used to be an app called Color.
00:03:25.110 --> 00:03:29.724
I vaguely remember this, like, vaguely. 2011?
00:03:29.724 --> 00:03:30.888
It was like a social media thing.
00:03:31.588 --> 00:03:36.673
Yes, it was all social media back then and we had companies that raised $50 million.
00:03:36.673 --> 00:03:38.995
I think they raised $15, $20, $30 million.
00:03:38.995 --> 00:03:39.635
Back then, it was a lot of money.
00:03:39.635 --> 00:03:42.501
There were companies that raised $100 million.
00:03:42.501 --> 00:03:44.544
I don't even remember some of the names.
00:03:44.544 --> 00:03:54.007
There hasn't been a single success story that came out of companies that raised millions and millions of dollars. In crypto, it's very classic.
00:03:55.000 --> 00:03:56.507
Now, Solana is a classic story.
00:03:56.507 --> 00:03:58.566
Ethereum is a classic story.
00:03:58.566 --> 00:04:03.771
Pretty much any team, I mean, that is considered top in terms of usage.
00:04:03.771 --> 00:04:11.661
Forget the market cap, because market cap may not be the actual way to estimate progress. But in terms of actual usage, Solana raised, what,
00:04:11.661 --> 00:04:13.606
$30 million before they launched?
00:04:13.606 --> 00:04:15.651
I mean, it's significantly lower to actually build.
00:04:15.651 --> 00:04:20.430
I remember because I was there for the first round. Ethereum raised, what,
00:04:20.430 --> 00:04:24.086
$18 million in their ICO? Significantly lower to actually get started.
00:04:24.086 --> 00:04:29.913
Atom, similar, right? $15 million or so, just enough to get you started.
00:04:29.913 --> 00:04:33.906
And most of the successful teams always have significantly lower funding.
00:04:34.468 --> 00:04:37.579
Because my thesis is once you have money, you have distractions.
00:04:37.579 --> 00:04:50.740
Now you're on the radar to spend the money, because people are not going to give you money without, you know, having conviction that you're going to deploy the capital, because, you know, the opportunity
00:04:50.740 --> 00:04:52.904
cost of money is fairly high, right?
00:04:52.904 --> 00:04:56.293
So you can't just like sit and have money waiting in the bank.
00:04:56.293 --> 00:05:01.055
You get a lot of pressure from investors, and usually they have controlling authority,
00:05:01.055 --> 00:05:03.702
you know, to a degree, and then you're screwed.
00:05:03.702 --> 00:05:05.567
So screwed.
00:05:05.608 --> 00:05:11.047
But with teams that have less funding and smaller sizes, your coordination cost is much lower.
00:05:11.047 --> 00:05:20.362
Similar to how, I believe, Jeff Bezos has a saying that at Amazon there is no single team that is too big to share a pizza.
00:05:20.362 --> 00:05:27.668
If you cannot share a pizza with a team, that team is too big to work.
00:05:27.668 --> 00:05:41.064
So the pizza team size is a good way of thinking about it: small teams actually make a lot of progress, whereas large, humongous teams with overfunding typically don't make progress, because they have too much distraction.
00:05:41.064 --> 00:05:41.908
That's a thesis.
00:05:41.908 --> 00:05:47.730
But here's a very classic example here of how Bagel is innovating with less funding.
00:05:47.730 --> 00:05:54.249
Because they're more resourceful, they're more ruthless when it comes to focus on the user and the market.
00:05:54.249 --> 00:05:59.408
In Akash, right, we raised $1.8 million to launch what is a billion-dollar chain.
00:05:59.408 --> 00:06:07.050
Right, but a lot of that comes from our hyper-market focus and hyper-ruthless execution, and that comes with less money.
00:06:08.420 --> 00:06:13.170
So, Greg, since we were already talking about Akash, some insight:
00:06:13.170 --> 00:06:29.473
I'd be interested in hearing what worked in the early days of Akash, what kind of problem space you were exploring, which is usually called the problem maze or idea maze, and how you stumbled upon the exact problem space you're working on right now.
00:06:29.814 --> 00:06:46.502
Well, the world was very different when we began in 2017, you know, when we published a paper in 2017. For me, coming from a Silicon Valley background, the situation was so weirdly different when it comes to policy and when it comes to what you can and cannot do legally.
00:06:46.502 --> 00:06:56.367
My first shock in terms of how to build this company came when talking to lawyers. I think it was Cooley, which was the innovator
00:06:56.367 --> 00:07:14.026
when it comes to... Marco Santori, I think, created this thing called the SAFT, right, the SAFT document, based on the SAFE. And we were like, okay, we're just going to build a product, and, you know, it's going to be a web-based product. And how do we get these tokens for people to use?
00:07:14.026 --> 00:07:16.136
We're just going to charge them with a credit card.
00:07:16.136 --> 00:07:18.341
I was told that I'll go to jail if I do that.
00:07:18.341 --> 00:07:29.504
So that was the first shock, because I grew up, you know, raised on the whole notion that credit cards are the default way you purchase something on the internet.
00:07:29.504 --> 00:07:36.365
When I was told that you cannot use credit cards to sell, I was like, okay, can we throw an ad?
00:07:36.365 --> 00:07:37.228
How do you get users?
00:07:37.228 --> 00:07:38.411
Can we do Google Ads?
00:07:38.411 --> 00:07:41.286
Like, no, crypto is banned from Google.
00:07:41.286 --> 00:07:43.031
You cannot use crypto, cannot do anything.
00:07:43.031 --> 00:07:52.875
So the whole notion of user engagement and demand generation and payments that I knew was wrong in crypto.
00:07:52.894 --> 00:08:02.247
So a lot of things are very different now, right? I mean, you can launch a meme coin and go to billions of dollars of value and you're still okay.
00:08:02.247 --> 00:08:05.072
In fact, you didn't get presidential immunity.
00:08:05.072 --> 00:08:12.209
So the lessons are different, right, and we had to go and do quite a lot of work to even get to using credit cards on Akash.
00:08:12.209 --> 00:08:18.625
Now we have credit cards on Akash, legally, without going through the money services license and all that stuff.
00:08:18.625 --> 00:08:21.749
And also what's the definition of a security and what's a commodity.
00:08:21.749 --> 00:08:37.308
I mean, these days it's a lot more relaxed, right? It got to a point where truly decentralized companies were getting investigated, at least, but the current administration has taken a very different policy position when it comes to crypto.
00:08:37.308 --> 00:08:43.827
So, like, I almost have to go back to 2017 and rethink what I could have done,
00:08:43.827 --> 00:08:54.529
you know, that I was not allowed to from a legal standpoint. And that's why we survived eight years, right, without getting in trouble in any way, because we went by the book.
00:08:58.845 --> 00:09:07.500
So one of the things we did right, across the board, was building a large community based on users and not on speculators.
00:09:07.500 --> 00:09:19.711
So even before we had a token in the market, we had a series of events, we call them testnets, where, I mean, remember, Alex, I think your company was involved too.
00:09:19.711 --> 00:09:25.254
Actually, we ran it for years. Yeah, it was 2018.
00:09:25.254 --> 00:09:30.587
We ran testnets from 2018 through 2019 and 2020, and we had all kinds of cool things.
00:09:30.587 --> 00:09:31.770
We had like challenges and whatnot.
00:09:31.770 --> 00:09:35.727
Every time you complete a challenge by using the product, you get some credits.
00:09:35.727 --> 00:09:38.583
You can exchange the credits for tokens that do not exist yet.
00:09:38.583 --> 00:09:54.499
So we were able to bootstrap our community, a pretty selective community, because they have to be technical: they have to do a deployment, they have to use a command line, they have to be a provider, they have to be a validator, they have to do things on the network, and then you get some rewards, right?
00:09:54.499 --> 00:10:04.451
And when we launched, we had a community that, you know, we airdropped tokens to, and that bootstrapped our community. That, I think, in my definition, was one of the best things.
00:10:04.451 --> 00:10:08.125
We kickstarted this whole notion of airdrops in a very different way.
00:10:08.125 --> 00:10:17.368
Things are different now in terms of how people are launching airdrops, but I think incentivizing your early users, not speculators, is a very, very good way of launching things.
00:10:17.368 --> 00:10:22.340
We went open source first, like day one, even before we launched.
00:10:23.039 --> 00:10:25.240
When you see my background, I'm very open source.
00:10:25.240 --> 00:10:29.523
I open source from the LICENSE to the README.md.
00:10:29.523 --> 00:10:30.864
That's how I open source.
00:10:30.864 --> 00:10:34.865
Would I do it from launch or would I not?
00:10:34.865 --> 00:10:40.927
If you ask me, I would still do open source, but it comes down to the comfort level of the team.
00:10:40.927 --> 00:10:43.208
Right, because I've been doing open source for a long time.
00:10:43.208 --> 00:10:45.230
I'm comfortable doing open source.
00:10:45.230 --> 00:10:51.173
I'm not embarrassed by my code, and, you know, it is code. But I understand there's quite a lot of, like, restrictions.
00:10:51.173 --> 00:10:59.495
When you go to someone else on my team, being like, hey, we want to open source, they get very, very, you know, uncomfortable.
00:10:59.495 --> 00:11:05.099
And I think you've got to make sure you have entire-team buy-in, not just founder buy-in, when it comes to open source.
00:11:05.099 --> 00:11:13.528
And it's very, very important, because otherwise that might impede your progress: people will be afraid to ship code in the open because they feel judged.
00:11:13.528 --> 00:11:17.370
So there are all kinds of emotional aspects that you've got to deal with in open source.
00:11:18.139 --> 00:11:30.511
And third, I think, if I were to redo things, right, we obviously launched a sovereign-state chain on Cosmos, and Akash is the first Cosmos chain, because there was little option.
00:11:30.511 --> 00:11:33.370
The other option was Ethereum, which is unusable to date.
00:11:33.370 --> 00:11:34.615
It's still unusable.
00:11:34.615 --> 00:11:43.260
Like, you can't expect people to pay $30 in gas fees to make a deployment. That's, you know, way more expensive than the actual deployment
00:11:43.260 --> 00:11:44.143
you're paying for with GPUs.
00:11:44.143 --> 00:11:51.315
So a shared-state chain ecosystem was non-existent beyond Ethereum.
00:11:52.000 --> 00:11:58.013
If I were to do it today, I would most likely do it on Solana or one of these shared-state chain systems.
00:11:58.013 --> 00:12:09.970
When you do sovereign, yes, you have the benefits of control, but if you ask yourself deeply, deeply, do you really need that control in the early stages or can that control come later?
00:12:09.970 --> 00:12:22.023
I would most likely choose a more modular system, with a mechanism that lets me start off with a shared state but transition to a sovereign state if I need to.
00:12:22.023 --> 00:12:32.149
A classic example is SQL, right? The reason you want to use SQL is that SQL is a standard that can be used in MySQL or Postgres.
00:12:32.149 --> 00:12:40.604
All you need to do is a SQL dump if you're using a shared database, and a SQL import to a sovereign, completely controlled database,
00:12:40.624 --> 00:12:42.549
if you need to, right? It's a similar analogy.
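The dump-and-import portability described here can be sketched in a few lines. This is only an illustration, not anything Akash ships: it uses Python's built-in sqlite3 module in place of MySQL or Postgres, and the table name deployments, its columns, and the helper names dump_sql and import_sql are made up for the example.

```python
import sqlite3


def dump_sql(conn):
    """Return the database contents as a list of plain SQL statements."""
    return list(conn.iterdump())


def import_sql(statements):
    """Replay dumped SQL statements into a fresh, separately owned database."""
    target = sqlite3.connect(":memory:")
    target.executescript("\n".join(statements))
    return target


# "Shared" database: stands in for state hosted on infrastructure you do not control.
shared = sqlite3.connect(":memory:")
shared.execute(
    "CREATE TABLE deployments (id INTEGER PRIMARY KEY, owner TEXT, gpus INTEGER)"
)
shared.executemany(
    "INSERT INTO deployments (owner, gpus) VALUES (?, ?)",
    [("alice", 2), ("bob", 8)],
)
shared.commit()

# "Sovereign" database: the same state, rebuilt from the dump alone.
sovereign = import_sql(dump_sql(shared))
rows = sovereign.execute(
    "SELECT owner, gpus FROM deployments ORDER BY id"
).fetchall()
print(rows)
```

The point of the sketch is that the dump is just standard SQL text, so the target does not need to know anything about where the data used to live.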
00:12:42.549 --> 00:12:44.586
I mean, blockchain, I think, should be...
00:12:44.586 --> 00:12:45.846
It's a key-value pair system.
00:12:45.846 --> 00:12:50.889
You should be able to interoperate, technically, from one key-value pair system to another.
00:12:50.889 --> 00:12:58.207
I mean, there are nuances in terms of transactions and block space and whatnot, but ultimately, if you remove all the wrappers, it's just a key-value pair state system.
00:12:58.207 --> 00:12:59.124
That's your,
00:12:59.124 --> 00:13:02.025
you know, it's an immutable key-value pair.
00:13:02.025 --> 00:13:07.163
It's immutable and key-value based.
00:13:07.182 --> 00:13:07.524
Yes, it says right.
00:13:07.524 --> 00:13:10.474
So, one of the lessons, yes: that would save you quite a lot of security budget that you can repurpose for incentives.
00:13:10.474 --> 00:13:13.369
And one more thing we did absolutely right:
00:13:13.369 --> 00:13:15.940
I think a lot of them don't get these incentives.
00:13:15.940 --> 00:13:21.153
So people think of using incentives to bootstrap a network.
00:13:21.153 --> 00:13:22.716
We did the opposite.
00:13:22.716 --> 00:13:24.923
We're using incentives to grow the network.
00:13:25.866 --> 00:13:28.283
Now, is it the right way, the wrong way?
00:13:28.283 --> 00:13:29.988
That really depends on what you're trying to do.
00:13:29.988 --> 00:13:31.150
Right, it's...
00:13:31.150 --> 00:13:32.221
There's no silver bullet.
00:13:32.221 --> 00:13:34.229
You know, that answer is not that helpful.
00:13:34.229 --> 00:13:41.288
But in a scenario like, let's say, Helium, you need the network even before you get product-market fit.
00:13:41.288 --> 00:13:50.025
So there's no way around it, you have to bootstrap the network, because the resources don't exist. But with something like Akash, there's an abundance of compute everywhere.
00:13:50.025 --> 00:13:51.907
They've got 7 million data centers.
00:13:51.907 --> 00:13:56.207
11 million of them are over 1 megawatt data centers.
00:13:56.207 --> 00:14:00.572
There's an abundance of compute everywhere, so you don't need to bootstrap.
00:14:00.572 --> 00:14:03.488
People don't need money to go buy compute units; they already have them.
00:14:07.085 --> 00:14:14.166
So now the question is: do you want to incentivize early for that compute to come on board, or do you want to experiment and understand the behavior before you incentivize?
00:14:14.166 --> 00:14:25.950
I think we chose the latter, and that's working out really well, because, you know, most compute networks today straight up pay, you know, in tokens to have compute on the network.
00:14:25.950 --> 00:14:33.111
Like, I would literally give you block rewards, every block, for you to go put your compute on the network.
00:14:33.721 --> 00:14:40.629
The problem is, you know, you don't know the quality of the compute, how good it is for your users.
00:14:40.629 --> 00:14:45.889
Does that compute fit your market, right?
00:14:45.889 --> 00:14:47.413
How would you know?
00:14:47.413 --> 00:14:48.764
By measuring utilization rates.
00:14:48.764 --> 00:15:02.845
So if you have high utilization rates for a certain type of compute, that obviously is in high demand. On Akash, for example, H200s, which are the best GPUs to run DeepSeek, are 100% utilized.
00:15:02.845 --> 00:15:05.490
That tells you that H200s are in high demand.
00:15:05.490 --> 00:15:08.322
H100s were at pretty high utilization.
00:15:08.322 --> 00:15:10.967
Now they're at 70%. A100s, similar, right?
00:15:10.967 --> 00:15:16.081
So we know that H100s, A100s, H200s and 4090s have high demand.
00:15:16.081 --> 00:15:17.884
We know that V100s,
00:15:17.884 --> 00:15:19.269
the older chips, have low demand.
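Measuring demand by utilization, as described here, comes down to a simple ratio per GPU type. The sketch below is a hedged illustration, not Akash's actual telemetry: the GpuPool type, its field names, and the hour counts are hypothetical, chosen only so that the H200 and H100 figures echo the 100% and 70% mentioned in the conversation. The V100 number is invented.

```python
from dataclasses import dataclass


@dataclass
class GpuPool:
    model: str
    leased_hours: float     # GPU-hours actually rented in the window (hypothetical)
    available_hours: float  # GPU-hours offered in the window (hypothetical)


def utilization(pool):
    """Fraction of offered GPU-hours that tenants actually leased."""
    if pool.available_hours <= 0:
        return 0.0
    return pool.leased_hours / pool.available_hours


def rank_by_demand(pools):
    """Sort pools from most to least utilized, as a rough proxy for demand."""
    return sorted(pools, key=utilization, reverse=True)


pools = [
    GpuPool("V100", 200, 1000),
    GpuPool("H100", 700, 1000),
    GpuPool("H200", 1000, 1000),
]
ranked = [pool.model for pool in rank_by_demand(pools)]
for pool in rank_by_demand(pools):
    print(f"{pool.model}: {utilization(pool):.0%}")
```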
00:15:19.889 --> 00:15:22.086
Now the question is how do you incentivize each chip?
00:15:22.086 --> 00:15:34.663
So our incentive model now is, instead of straight up paying for the compute, we guarantee utilization, because we know that if you have H200s you don't need incentives.
00:15:34.663 --> 00:15:38.092
I mean, utilization is 100%; you don't need to be incentivized.
00:15:38.092 --> 00:15:44.408
But we do know that if you have H100s, there's volatility in utilization.
00:15:44.408 --> 00:15:46.634
Sometimes it's high, sometimes it's lower.
00:15:47.201 --> 00:15:52.091
So you go to a provider and say, I can guarantee you 80% utilization if you get under that,
00:15:52.091 --> 00:15:56.488
I mean, provided that you have high-quality compute and, as a provider, you pass all these marks.
00:15:56.488 --> 00:16:00.746
If you're underutilized, we'll make sure we increase utilization.
00:16:00.746 --> 00:16:01.789
How? Well,
00:16:01.789 --> 00:16:06.299
just lower the price, so we can subsidize the cost to the tenant.
00:16:06.299 --> 00:16:13.812
Because we know for sure from our data that if we lower the price for H100s to, like, 99 cents, they're gone.
00:16:13.812 --> 00:16:17.368
So we have the product market fit.
00:16:17.368 --> 00:16:19.567
Now can we throw in a discount?
00:16:19.567 --> 00:16:20.926
Everybody has discounts, right?
00:16:20.926 --> 00:16:22.606
Cloud providers do provide discounts.
00:16:22.606 --> 00:16:25.344
So if we throw in an extra discount, they're gone.
00:16:25.344 --> 00:16:25.965
We know that for sure.
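The guaranteed-utilization idea can be written down as a small decision rule. This is a sketch under stated assumptions, not Akash's incentive code: the function name plan_incentive and the list prices are hypothetical, prices are in cents per GPU-hour, and it assumes the provider keeps receiving the list price while the network covers the gap whenever the tenant is charged the discounted price. Only the 80% guarantee and the 99-cent H100 price come from the conversation.

```python
def plan_incentive(utilization, guaranteed, list_price_cents, discount_price_cents):
    """Pick the tenant price and the per-GPU-hour subsidy for one GPU pool.

    Assumption: the provider is still paid the list price, so the subsidy is
    the difference between the list price and what the tenant is charged.
    """
    if utilization >= guaranteed:
        # Demand already meets the guarantee, so no subsidy is needed.
        return {"tenant_price_cents": list_price_cents, "subsidy_cents": 0}
    tenant_price = min(list_price_cents, discount_price_cents)
    return {
        "tenant_price_cents": tenant_price,
        "subsidy_cents": list_price_cents - tenant_price,
    }


# Fully utilized pool: nothing to subsidize.
h200 = plan_incentive(1.00, 0.80, 300, 250)
# Pool below the 80% guarantee: drop the tenant price to 99 cents.
h100 = plan_incentive(0.70, 0.80, 149, 99)
print(h200, h100)
```

Under this rule the subsidy only flows to hardware that has already shown demand but is temporarily underused, which is the difference from paying block rewards for any compute that shows up.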
00:16:26.485 --> 00:16:32.022
So, incentivizing after understanding: we came to this conclusion after we saw how Akash works.
00:16:32.022 --> 00:16:34.587
There's no way in hell we would...
00:16:34.587 --> 00:16:47.931
I mean, we could draw up the math, model all that stuff in a room, but ultimately the customer's behavior is what's going to be the most valuable, most accurate input
00:16:47.931 --> 00:16:49.259
you need to design incentives.
00:16:49.259 --> 00:16:58.510
So designing incentives post-launch, in the growth phase, post-PMF, I believe, is working out really well for Akash.
00:16:58.980 --> 00:17:00.065
The numbers are very clear.
00:17:00.065 --> 00:17:01.168
Our growth is very clear.
00:17:01.168 --> 00:17:09.175
Our utilization, too. Not only revenue growth: we also measure earnings per GPU, and we also measure utilization per GPU, right?
00:17:09.175 --> 00:17:14.688
So across-the-board utilization right now is 70%, and that was, what, 10% when we began.
00:17:14.688 --> 00:17:16.972
Over the last 12 months it went really high.
00:17:16.972 --> 00:17:23.809
Earnings per GPU right now are about $20 per day, compared to $10, roughly.
00:17:23.809 --> 00:17:25.532
That was about 12 months ago.
00:17:25.532 --> 00:17:27.839
So we clearly see an upward trend, right?
00:17:27.839 --> 00:17:38.662
So those are some of the ways we look at how we design incentives, instead of blindly following. And I think, like Filecoin... I mean, I can comment on incentive structures based on the outcomes of each DePIN.
00:17:38.662 --> 00:17:44.301
I can tell you with a degree of confidence what went right and what went wrong.
00:17:44.301 --> 00:17:47.451
I have a lot of opinions on different incentive models at this point.
00:17:47.960 --> 00:18:08.105
Akash was far and away one of the best investments, seed investments, angel investments, I've ever made. You and I haven't actually had this conversation before, but I've really thought about those early days when I first met you, and the feeling that I got about you as a person and the team, and the vision you guys had for Akash.
00:18:08.105 --> 00:18:10.977
For me, investing is all about pattern recognition.
00:18:10.977 --> 00:18:17.267
Right, it's like seeing these patterns and feeling that feeling and knowing to have conviction and take action on that conviction.
00:18:17.267 --> 00:18:20.730
It's as much a science as an art.
00:18:20.730 --> 00:18:23.105
I get that same feeling with Bidhan.
00:18:23.527 --> 00:18:30.290
What strikes me about what you just said is that you guys both approach these problems from first principles and with certain constraints in mind.
00:18:30.290 --> 00:18:36.112
Right, and it goes back to the earlier conversation we were having about being very capital efficient.
00:18:36.112 --> 00:18:40.449
Right, like super capital efficient, and open source from day one.
00:18:40.449 --> 00:18:44.375
It's like these three things are very common between both of you.
00:18:44.375 --> 00:18:55.832
In the last conversation that Bidhan and I had, last week, I asked him what the insight was that led you guys to uncover this zero-knowledge proof innovation. The whole market,
00:18:55.832 --> 00:19:00.258
all these researchers that were looking at how to execute this, were focused on burning compute,
00:19:00.258 --> 00:19:10.826
on measuring whether compute was used. But really their insight was: instead of looking at the burning of the compute, we're going to look at whether the model actually got improved.
00:19:10.826 --> 00:19:17.290
And when they switched to measuring that, because that's what the zero-knowledge proof is proving, whether the model improved by a certain amount,
00:19:17.290 --> 00:19:19.084
it changed everything.
00:19:19.084 --> 00:19:20.890
Right, that was the big difference.
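The shift from "was compute burned" to "did the model improve by a certain amount" can be stated as a plain check on evaluation loss. The sketch below only illustrates the statement being verified. It is not Bagel's protocol, and it contains no zero-knowledge proof. The toy linear model, the function names mean_squared_error and improved_enough, the data, and the min_gain threshold are all assumptions made for the example.

```python
def mean_squared_error(weights, data):
    """Average squared error of a linear model over (features, target) pairs."""
    total = 0.0
    for features, target in data:
        prediction = sum(w * x for w, x in zip(weights, features))
        total += (prediction - target) ** 2
    return total / len(data)


def improved_enough(base_weights, new_weights, eval_data, min_gain):
    """Accept a contribution only if it lowers eval loss by at least min_gain."""
    base_loss = mean_squared_error(base_weights, eval_data)
    new_loss = mean_squared_error(new_weights, eval_data)
    return (base_loss - new_loss) >= min_gain, base_loss, new_loss


# Held-out evaluation set for the target function y = 2x (toy data).
eval_data = [([1.0], 2.0), ([2.0], 4.0)]

accepted, base_loss, new_loss = improved_enough([1.0], [2.0], eval_data, min_gain=1.0)
rejected, _, _ = improved_enough([1.0], [1.1], eval_data, min_gain=1.0)
print(accepted, base_loss, new_loss, rejected)
```

A check like this says nothing about how much compute went into the new weights, only whether the result is better on the agreed evaluation data.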
00:19:21.180 --> 00:19:28.444
So you measure that some work has been done by actually seeing the loss rate reduce?
00:19:28.444 --> 00:19:31.713
How do you measure that the model improved?
00:19:32.665 --> 00:19:33.932
Yeah, I can take that up.
00:19:33.932 --> 00:19:40.055
So first of all, before going into that: Greg, I totally agree with the incentive structure you described.
00:19:40.055 --> 00:19:45.396
Marc Andreessen had a quote, I think recently, or famously,
00:19:45.396 --> 00:19:54.451
that if a system is not working and you put more money into it, that actually makes it worse, right?
00:20:00.457 --> 00:20:04.181
So if a network is not working and you just put more incentive into it, that makes it worse.
00:20:04.201 --> 00:20:07.484
It's a rule of thumb, of course, like there are exceptions all the time.
00:20:07.484 --> 00:20:10.633
No, there are no exceptions, actually, it's a first principle at this point.
00:20:10.633 --> 00:20:11.796
Yeah, it is a first principle.
00:20:14.345 --> 00:20:24.134
So if a network does not have users and you give some incentive, some of them show up, but they will leave after the incentives dry up anyway. You cannot keep that tap on perpetually; it does not work.
00:20:24.134 --> 00:20:28.451
You have to figure out what works and what doesn't without incentives.
00:20:28.451 --> 00:20:36.126
That's when you know what's useful for the customer, and then you can supercharge that with the incentives.
00:20:36.126 --> 00:20:38.111
That's how I see it as well.
00:20:38.111 --> 00:20:40.415
So totally, I see eye to eye with you on that.
00:20:40.415 --> 00:20:46.238
And now, going back to the verification protocol that Alex was mentioning.
00:20:46.238 --> 00:20:49.769
So basically, what we have done:
00:20:49.769 --> 00:21:03.547
we figured out a modular structure for models, where the contributors are building a model together, but they are not training a monolithic dense transformer together.
00:21:03.547 --> 00:21:20.196
Instead, they're providing modular contributions, which are called adapter layers, like LoRAs. They provide those, and they stack on top of each other like Lego pieces, and all those Lego pieces come together and build a model.
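The Lego-piece picture can be illustrated with the standard LoRA formulation, where each adapter is a pair of small matrices B and A whose product is added to a frozen base weight matrix. This is a hedged sketch, not Bagel's architecture: it assumes the adapters combine additively, uses plain Python lists rather than a tensor library, and the helper names matmul, add and apply_adapters plus all the numbers are made up.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def apply_adapters(base, adapters):
    """Return base + sum of B @ A over all (B, A) low-rank adapter pairs."""
    merged = [row[:] for row in base]  # copy, so the frozen base stays untouched
    for b, a in adapters:
        merged = add(merged, matmul(b, a))
    return merged


base = [[1, 0], [0, 1]]               # frozen 2x2 base weights
adapter_one = ([[1], [0]], [[0, 2]])  # rank-1 contribution from one contributor
adapter_two = ([[0], [1]], [[3, 0]])  # rank-1 contribution from another
merged = apply_adapters(base, [adapter_one, adapter_two])
print(merged)
```

Each contributor only has to produce and ship the small B and A matrices, which is why a single contribution can be trained in one place and is small enough to reason about on its own.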
00:21:20.196 --> 00:21:25.311
So it's a fundamentally different approach of seeing this thing.
00:21:25.311 --> 00:21:34.548
A lot of people, a lot of teams, very talented teams, are trying to reduce the communication overhead across data centers to be able to train a dense, monolithic transformer model.
00:21:34.548 --> 00:21:36.592
We looked at it a different way.
00:21:36.592 --> 00:21:48.536
We looked at it in a way like, why can't we just make the architecture itself modular? That is very much in line with the industry trend at the moment, because MoEs are all the rage right now.
00:21:48.536 --> 00:21:55.652
DeepSeek is a mixture of experts, and that is a modular architecture: they keep the transformer core and modularize it for efficiency.
00:21:55.652 --> 00:22:08.008
We did the same, and what that enabled is, first of all, each contribution can come from one data center, so you don't need that much communication to begin with.
00:22:08.008 --> 00:22:14.497
And second, the contributions are small enough that you can do ZK verification of them.
00:22:14.497 --> 00:22:25.588
So when the person, or developer, or data center develops this modular adapter, they run a ZK verification on top, and it's way less overhead.
00:22:25.588 --> 00:22:40.414
But even that was higher than what would be acceptable in a production environment, because in a production environment you want seconds, not minutes, not hours, not days, and the previous examples of verified training took weeks.
00:22:40.414 --> 00:22:42.689
So what did we do?
00:22:42.689 --> 00:22:53.871
We looked into how this works, and we saw that the previous attempts at ZK-verifying these contributions were tied to compute.
00:22:53.871 --> 00:22:55.770
They were trying to verify
00:22:55.810 --> 00:23:06.036
whether compute was burned. Then, like, we... I personally have been in machine learning for more than 10 years, and my team has more than 40 papers published in machine learning together.
00:23:06.036 --> 00:23:14.050
So we have extensive experience training models in traditional ML, and we know that's not how it works.
00:23:14.050 --> 00:23:23.969
In traditional machine learning, you do these massive training runs, you get the result, you look at the evals, and if they don't match up to your standard, you just throw them out.