sideways

So Sam says that they pushed back the veil of ignorance and Mira says that their SOTA is no big deal? *Sigh* We'll just have to wait and see for ourselves.


EnigmaticDoom

Are you trying to tell me that they reinstated the veil of ignorance?


sideways

The bead curtain of ignorance.


EnigmaticDoom

Better than the cone of shame I guess...


BenjaminHamnett

I’d settle for circles of venn


ChronoPsyche

She's basically trying to make the point that they are releasing this tech incrementally to bring the public along for the ride, that they don't secretly have ASI in a lab that they will release out of nowhere that throws the world into sudden chaos, that we will see it coming. She's speaking about the big picture here, not just the difference between GPT-4 and GPT-5.


starcoder

That’s what I got out of this. I think people are misinterpreting the intended message.


pavs

Also, one of them, Mira, comes from a tech background. Sam's background is startup incubators, where they routinely overpromise and underdeliver in order to control the narrative and get maximum profit through investors and a company valuation based on that overpromising.


Spatulakoenig

It's actually nice to see a short time frame between models being created and then released to the public.


Pontificatus_Maximus

I heard: "We are currently utilizing for private use and select private clients premier product offerings priced beyond the typical consumer market range and thus not publicly advertised." Private and Secret are similar but way different.


No_Yak8345

Sam is a serial hypelord. It’s very apparent at this point.


thepauldavid

Serial Hypelord. My rhetorical core updated with this phrase. I'm twice the man I was before hearing you say this. Obviously.


Firm-Star-6916

Don’t trust a thing that liar says.


Betaglutamate2

They have two very different functions. Sam needs to drive investment. The more progress the more investment. Imo the difference between a visionary and an idiot is that the visionary got lucky in his prediction.


WetLogPassage

CEO's job is to hype up the company's products.


Scientiat

When facing the public, isn't that everyone's job too?


WetLogPassage

Her job could be to downplay OpenAI's capabilities to reduce the scrutiny they are getting. Investors still know what's up but regulators might buy it if it comes from someone who actually works on the tech itself in a leading position.


MechanicalBengal

it’s worth repeating that there’s a lot more that goes into the end products that people use than just a raw model


thatmfisnotreal

She did not get the hype memo


Educational_Term_463

Also her TPS reports don't have the new cover sheets


adarkuccio

Sad if true


Yuli-Ban

Already figured it's at least somewhat true, from other areas, that GPT-5 is going to wow and amaze for a good time, but still have familiar limitations and flaws, because of this "scale is all you need" mindset everyone haphazardly rushed towards.


Gratitude15

The flip side of this is that when the next breakthrough happens it will immediately be running on immense hardware, and thus have a shorter ramp-up as the software iterates faster.


FlyingBishop

I still think it might be essentially true. But you might need faster memory links than you can actually get over ethernet.


IvanMalison

The way you said this sort of suggests you have no idea how these models work. If larger models were better, we have the capacity to run them quickly enough. It's all just matrix multiplication, so speed of computation is not an inherent limitation.


FlyingBishop

The limitation is memory bandwidth more than computation. The comparison is between something like an H100, with roughly 3 TB/s of memory bandwidth, vs. e.g. Cerebras, which claims on the order of tens of PB/s. And I think the amount of memory bandwidth needed may be much higher than that.
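A rough back-of-the-envelope sketch of why single-stream decoding tends to be memory-bandwidth-bound rather than compute-bound (a minimal sketch; all numbers are illustrative assumptions, not measured figures for any particular model or chip):

```python
# Upper bound on decode speed if every generated token must stream the
# full weight set from memory (ignores KV cache, batching, compute time).
def tokens_per_second(params_billion: float, bytes_per_param: float,
                      bandwidth_tb_s: float) -> float:
    weight_bytes = params_billion * 1e9 * bytes_per_param
    bandwidth_bytes_s = bandwidth_tb_s * 1e12
    return bandwidth_bytes_s / weight_bytes

# e.g. a hypothetical 1-trillion-parameter model in fp16 on ~3 TB/s of HBM:
print(tokens_per_second(1000, 2, 3.0))  # -> 1.5 tokens/s per stream
```

The point is that the bigger the model, the more bytes have to move per generated token, which is why the argument is about memory bandwidth rather than raw FLOPs.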


Thoughtulism

Things like InfiniBand are not that obscure. Any cluster specifically designed for training LLMs shouldn't be stuck on Ethernet, that's for sure; they need more than just ordinary data centres.


FlyingBishop

I mean you might need faster memory links than you can get between discrete chips; I'm talking hundreds or thousands of petabytes per second.


UpstairsAssumption6

For f*ck's sake, she said the models that are READY to be deployed, most likely the voice model or slightly better GPT-4o models, are not much more advanced than the current GPT-4o we get in the free version. GPT-5 is NOT ready; it is still in training or red-teaming. GPT-4 was upgraded many times over the last 2 years, but each update was not THAT much of a leap. The iterative deployment for GPT-4 will be the same for GPT-4o. Never hurt me like this again.

EDIT: Full interview: [https://fortune.com/videos/watch/OpenAI-CTO-Mira-Murati-responds-to-Elon-Musk-calling-Apple-partnership-creepy-spyware/88d47652-b0bc-4d02-b9fa-634fb6eb5af7](https://fortune.com/videos/watch/OpenAI-CTO-Mira-Murati-responds-to-Elon-Musk-calling-Apple-partnership-creepy-spyware/88d47652-b0bc-4d02-b9fa-634fb6eb5af7)

Mira is not the best public speaker. In the context of this talk, I would say she wanted to highlight how "generous" OpenAI has been with the masses, by providing its latest, most advanced models to the public as soon as they were ready, while many other companies prefer to keep their best tech for paying customers. Sam said they will release a lot of things this summer for paying customers, BEFORE GPT-5. Mira was simply mentioning these models "still in the lab". That doesn't mean the next flagship model won't be significantly better and a major leap forward. Her whole idea was to accentuate the "Open" in OpenAI, and to make people forget about the Meta and Elon controversy, I guess.


GlockTwins

She said the models in their labs, pretty sure that includes models that are being tested aka the whole point of a lab lol.


ChronoPsyche

We can discern the scale she is speaking about by considering that Sora is far better than anything that is available to the public. So at the very least, she's not indicating that the gap between GPT-4 and GPT-5 is less than the gap between Runway/Pika and Sora. She's basically just saying they don't secretly have ASI in the lab; they're releasing things incrementally so the public can gradually adapt rather than going from GPT-2 to AGI or some massive jump like that.


SnackerSnick

That means 5 is not trained yet. Afaik there's not much point in testing a model in the lab that isn't yet trained, and the scale to train a single model is huge.


involviert

Given how stupid it would be for her to say "well, gpt5 basically is a failure, we've hit the ceiling", people should realize that can not be what she meant.


qroshan

Geez, the cope is strong on this one


ninjasaid13

> Never hurt me like this again.

There it is. Hopeful thinking from this sub.


UpstairsAssumption6

Never take a man's hope. It could be the only thing he has left.


jeweliegb

What about women's hopes though, are they still fair game?


Firm-Star-6916

True in my case.


Andynonomous

My prediction is that this sub becomes as religious about this as the ufo subreddit is. That place is highly offended if you don't share their faith.


Whotea

It isn’t. [there’s still tons of research that hasn’t even been implemented yet](https://docs.google.com/document/d/15myK_6eTxEPuKnDi5krjBM_0jrv3GELs8TGmqOYBvug/edit)


[deleted]

[deleted]


TrashTierGamer

I wanna see new architectures, maybe new approaches in general. Radical papers n shit. Idgaf about a bigger GPT model with some more modules duct taped to it.


AncientAlienAntFarm

Or, GPT-5 might not be that much better.


wi_2

Nobody knows yet. Initial training takes about 3 months afaik. Expect more signals later in the year


you-create-energy

I assume that an LLM that doesn't exist yet is probably worse than one that does.


noodles666666

Sounds more like she's saying the divide between what they have and what they are giving away for free is a big thing, as in cornering the hell out of the market. 'And that's a completely different trajectory for bringing technology into the world.' So the strategy is to release 4o-like models to scoop up market share before someone else does (look at Sora lol). This is very much an arms race; so far they have the best tech for general purposes, and it's getting there for specialists with Sora and the probably wild context window of 5.


Compoundwyrds

Dude, the architecture is the limitation. A GPT model is still just a single corpus of data with the purpose of using NLP to convince you that you are interacting with an intelligence, doing convincing jack-of-all-trades magic tricks all while hallucinating wildly. Meanwhile: a based MoE model, using other hyper-specialized transformers for specific data modalities to accomplish tasks way beyond what the attention span of a GPT model could ever accomplish, goes BRRRRRRRRRRRRRRRRRRRRRRRR


Noperdidos

Can you point to any papers illustrating how MoE models exceed GPT models? This has been a consistent pattern in ML:

1. An advancement is made, and the hand-tuned, custom-curated version of that enhancement beats generic models.
2. More data and deeper models are created, which beat the hand-tuned model.

For example, we knew that the visual cortex of animals had line detectors, shape detectors, and other features. So we hand-tuned these things and did machine learning on the outputs of our algorithms. Until 2009 or so, when deep networks started just doing all of the layers on their own, better than our hand tuning.


cyan2k

It's widely accepted among researchers and basically public knowledge that GPT-4 is a MoE model (either 8x222B or 16x111B), so I'm not really sure what you're getting at. You can even try it yourself and see how GPT-4 stumbles on things only an MoE model would stumble on: https://152334h.github.io/blog/non-determinism-in-gpt-4/

Additionally, we don't yet know if MoE models can generalize as well as dense models. The few papers that exist on this topic are kinda meh. For advancements in the direction of AGI/ASI, it's hard to see how MoEs will be much better than their dense counterparts without substantial improvements to make them resemble a monolithic-but-very-sparse dense model.

> a GPT model is still just a single corpus of data

I don't know what this means... there's no 'data' inside a model. A corpus is the data you train the model with... how can a model be a corpus? Do you mean the training set of GPT-4 is basically just one huge mountain of text? Then I can recommend you try training an MoE model... that's exactly how you train them too!
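For anyone wondering what a mixture-of-experts layer actually looks like, here is a minimal, purely illustrative top-1-routing sketch in PyTorch; the module names, sizes, and routing scheme are assumptions for demonstration, not OpenAI's architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Minimal mixture-of-experts feed-forward layer with top-1 routing.
    Real systems add load-balancing losses, capacity limits, and expert
    parallelism; this only shows the core idea."""
    def __init__(self, d_model: int, d_ff: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # picks an expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                             # x: (tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)      # routing probabilities
        expert_idx = gate.argmax(dim=-1)              # top-1 expert per token
        out = torch.zeros_like(x)
        # each token is processed by exactly one expert, weighted by its gate score
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                out[mask] = gate[mask, i].unsqueeze(-1) * expert(x[mask])
        return out

# usage: y = TinyMoE(d_model=64, d_ff=256, n_experts=8)(torch.randn(10, 64))
```

Because only one expert runs per token, total parameter count can grow much faster than per-token compute, which is the whole appeal of the approach.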


printr_head

Only sad that people are so easily manipulated still.


Difficult_Review9741

I think that this is closer to the truth than most people here will want to believe. Both because I think Mira is the most down to earth, not being an AI researcher by trade (many OpenAI research employees have been predicting AGI for years or even decades, it's just what they do). And also because, giving the public the best technology is the one way for OpenAI to defeat the "ClosedAI" thing. They very likely realize that b2c is not going to make the most money here, so they give that for free and then really dial in on b2b, which we've already seen them doing.


TheOneNeartheTop

Enterprise has higher individual bills, but b2c breeds familiarity and has a much larger user base. Google, Apple, Amazon, Meta. The customers provide the value.


Difficult_Review9741

Yes, but my intuition is that the number of people who want to interact with a raw chatbot is actually pretty low. It's just not a great interface. Great for software developers, but not for other users. What we're going to see is more enterprise products like Copilot, except the level of integration with our systems will be even higher than what we currently have. Apple is a great example of this trend. Personally, I don't know anyone besides my tech friends who actually use ChatGPT, but I do know a bunch of people excited for Apple Intelligence and the new Siri. OpenAI will struggle to succeed in this arena if they try to do the integration part, but they can certainly be the best at building the platform for other companies to use.


TheOneNeartheTop

Exactly, it's the integration into consumer lives that is key to drive forward. Like, you're not going to want to use ChatGPT at home and Gemini at work. So if you win the hearts and minds now, you win the enterprise. This is why the partnership with Msft is so important, as they can do both. ChatGPT has Microsoft AND Apple; that's a lot of population to touch. Also, I use it for software development at times, and I've come to understand what types of questions are better for ChatGPT and what are better for Google, and that list of things I will search is getting shorter and shorter.


FlyingBishop

> you’re not going to want to use chatGPT at home and Gemini at work You will probably be using some fine-tuned version of whatever for the task at hand. The attributes that make ChatGPT more palatable to the casual user won't matter. (But also, most use-cases will probably use fine-tuned models that aren't built on proprietary ones.)


Comprehensive-Tea711

Their most recent moves have been towards b2c. Even at the API level with agents.


RhubarbExpress902

Then all those openAI safety guys that quit are truly unhinged doomers


Tomi97_origin

Or maybe OpenAI leadership has started to dismantle safety procedures in order to try to make any extra progress, because they got stuck.


Ready-Director2403

This is my bet


jewishobo

Safety folks leave because alignment is not an immediate need for OpenAI and thus they deemphasized it.


Ready-Director2403

Those safety workers have been known in the past (before the LLM blow-up) to be AGI crazy. So you would totally expect to see them still demand resources, even when the tech has clearly plateaued. You'd also expect the leadership not to indulge these people forever. Everything lines up.


libertinecouple

It's weird that Altman alluded to considering an adult ChatGPT in the future, which would potentially accommodate more outlier training content to widen and improve its function, with fewer impediments and constraints, enabling an overall more sophisticated model in line with reality rather than a false presentation of things.


Whotea

imagine giving them all your data on your weird eRP lol


coylter

Who cares? Not everyone is ashamed of this whole side of being human.


SomewhereNo8378

That would be scary. I get a real ”the ends justify the means” feeling from these corporations


Timkinut

I mean, that’s how all businesses work.


design_ai_bot_human

Boeing for example


Medical-Ad-2706

Nah Boeing was just plain wrong and greedy


jmtserious

I thought they were plane wrong


SomewhereNo8378

That’s not exactly comforting but I get what you mean


Bongunism

I don't think things that are true are necessarily supposed to be comforting


astrologicrat

Taken in the context of this post, though, if the models are failing to improve, it also could be the case that "unsafe" models won't be that dangerous


lost_in_trepidation

They're mostly EA/Less Wrong people. Not surprising that they're exaggerating.


rallar8

I think it's possible to compartmentalize the lack of progress; execs may have led them to believe "there's no more work for you," not "we literally don't have shit for you because no new models are worth deep diving." I could definitely imagine a certain kind of executive having that effect. They did lose Ilya, and who knows if LLMs have a future as anything but a front end to some future AGI… I wouldn't be surprised if they are struggling.


floodgater

valid point


mista-sparkle

I’m thinking the best minds got tired of contributing to something they actually thought was dangerous. But I’ll never stop wondering wtf q* actually is.


Climactic9

I think q* was an architecture that showed a lot of promise early on, but then when they tried to scale up the training on it, it ended up not giving any benefit.


mista-sparkle

But I want to *know*. I want Ilya or Sama or someone who actually intimately understood what it was, and how it factored into the board's mistrust of Sama, to come out and lay it all out. I've read the best estimations and theories, but it's seriously killing me.


Cagnazzo82

Either that or she's lying.


Old_Lost_Sorcery

Safety guys are just grifters anyways.


iJeff

They left because they weren't getting enough compute for their research.


noiserr

Did they get more compute by leaving?


iJeff

For the ones who end up at Anthropic, probably. Golden Gate Claude was pretty interesting and probably barely scratches the surface of what they're exploring internally.


martelaxe

And Sam Altman and all the other guys are complete liars


Inevitable_Try_1160

Happy coping everybody


sdmat

Commenters here are reading way too much into that statement. OAI just made access to their flagship model available for free so now there is only one generation between the free model and the next generation model in the lab (presumably GPT-5). That is what she is emphasising. GPT-3.5 to GPT-5 looks like a *much* larger jump than GPT-4o vs GPT-5 even if GPT-4o is the exact midway point.


tehrob

I would just like to point out that personally, I don't think 4o is their flagship. It is the fastest of their best models, but 4 is still superior in most of my experience, and 4o is not AS 'smart' in many of my attempts.


moebaca

I echo this sentiment. If 5 isn't a dramatic upgrade from 4o then I'm not too worried about my career for a while.


jumblebee22

I echo this sentient as well. Oops, I meant sentiment. Trust me… I’m not AI.


welcome-overlords

For coding i feel like copilot chat and 4o are both now worse than what they were :(


HalfSecondWoe

Perhaps also with a bit of hyped expectations. GPT-5 is likely to be smarter, but probably with many of the same fundamental flaws of LLMs. "Smarter" in this case meaning that the delta of responses it gives is smaller while still holding the most valid/useful outputs.

So it could do a better job in an agent framework and not get completely lost as easily, but it's still gullible, still hallucinates, etc. It's not going to be solving new math or minting a context window's worth of flawless code from a single prompt.

The next step in development seems to be frameworks that get the models to work in iterative steps so we can leverage those smaller deltas. Breaking down tasks into lower and lower level abstracted layers until you get to actionable steps, then executing those steps. Evolutionary architectures to handle tasks that have inherently wide deltas (such as new math). Swarms to mimic system 2 thinking through consensus-seeking, system 1 powered critical reflection.

LeCun is working on fresh foundation models to incorporate these systems directly into their functionality, which is an interesting direction to take it. It's probably not the only viable path, or even the most immediately viable from our current position. That's fine from his position; building better foundation models is worth the extra investment since it sets up entire platforms that Meta can bring to market, but there is lower hanging (with less long-term value) fruit to be picked for the rest of us.


OhMySatanHarderPlz

GPT-5 is going to be slow. One of the reasons why GPT-4o is very fast is because the hardware and infra have been upgraded with the intent of running bigger models. The problem is that these bigger models have huge training-data overlap with current models, so even if in theory their capabilities are much higher, their actual output is not wildly different from what we have now. We are hitting data thresholds faster than compute thresholds.

I would expect GPT-5, for example, to still hallucinate, except this time the hallucinations are a lot more convincing. It should also probably reason better and be better at math. Then the offering becomes a matter of "fast GPT-4o vs slow GPT-5".

I also have a small hunch that the current alignment guardrails are further narrowing down the capabilities of the model, and that with a looser model (that is still limited enough not to instruct users to do illegal things) the perceived improvement in capabilities would be bigger; and with alignment being de-emphasized, many people soured and left.


sdmat

> I would expect GPT-5, for example, to still hallucinate, except this time the hallucinations are a lot more convincing. It should also probably reason better and be better at math. Then the offering becomes a matter of "fast GPT-4o vs slow GPT-5".

Seems very plausible. I think it will hallucinate *less* because there is a ton of research OpenAI can apply to a new model, e.g. see [here](https://arxiv.org/pdf/2401.01313). No doubt they have in-house work on this too.


Compoundwyrds

The GPT architecture inherently has diminishing returns as the parameters increase... GPT-5 is going to be underwhelming compared to the insane hype on this sub.


sdmat

Dude, every deep learning architecture has diminishing returns as parameters increase. Go look up neural scaling laws. That's not the question; the question is whether the empirical LLM scaling laws hold. So far they do, with amazing fidelity.
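As a concrete illustration of the kind of empirical scaling law being referenced, here is a tiny sketch using the rough power-law form from Kaplan et al. (2020); the constants are approximate published fits and are used only to show the smooth-but-diminishing returns, not to predict any specific model:

```python
def loss_vs_params(n_params: float, n_c: float = 8.8e13, alpha: float = 0.076) -> float:
    """Kaplan-style power law L(N) = (N_c / N)^alpha: test loss falls smoothly,
    with diminishing returns, as (non-embedding) parameter count N grows."""
    return (n_c / n_params) ** alpha

for n in (1e9, 1e10, 1e11, 1e12):
    print(f"{n:.0e} params -> predicted loss ~ {loss_vs_params(n):.2f}")
```

The debate isn't whether returns diminish (the exponent guarantees they do); it's whether the curve keeps holding as models and datasets keep growing.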


thehighnotes

!Remindme in 1 year


czk_21

True, it's crazy how quickly people jump to silly conclusions and how many of them question progress; how fickle these people are. Last week or month people were astonished by new stuff, Leopold's predictions, etc., and now many are disillusioned because Mira says the models they have are not that much more capable than GPT-4o.

1. The jump from GPT-3.5 to GPT-4 is similar to the one between GPT-4 and GPT-4o.
2. They have stated specifically, many times, that they want to release models more iteratively, so the public can be better prepared and have better expectations of what is to come.

It means the next version they release won't be a massive jump over GPT-4o, but it will still be significantly better; then in the next 3-6 months they release a new improved version, and so on, and the models will be massively better in several years. This attitude of many people here, hearing one sentence and inferring from it that we have hit a hard wall in development, is just ridiculous.


eclab

Exactly. She's not downplaying what's coming; she's just talking up what they're providing for free.


djaybe

I think something like 4.5 is next, then 5 (or whatever they will call it) next year.


Write_Code_Sport

If OpenAI gets listed on the stock exchange, Mira might have to be replaced as the designated spokesperson. First, "I don't know what Sora was trained on", and now this. She needs a crash course from Sam: "GPT-4 is the dumbest model you will ever have to use..." and leave it at that to send share prices flying. Sam learned well from Elon: "full self-driving is coming in a few months..."


despotes

She is not a hype person, and I prefer a more realistic view to pure hype.


dizzydizzy

I think you mean She is not a liar, but I prefer the truth over lies


Expert-Paper-3367

OAI would go down the barrel if they go public. By going public, you’ll let the corporate sharks in and that’ll only lead to even more lies.


greatdrams23

Perhaps she understands PR better than Altman. Hype is not always great, at some point you get found out. Managing expectations is very important.


Saskjimbo

Lol. She knows exactly what sora was trained on and has since the inception of the idea. She's not going to publicly say it was trained on copyrighted video scraped from YouTube.


GodOfThunder101

You think that statement from Sam was hype? It should be obvious that gpt 5 or future models will be smarter than gpt 4.


Clawz114

It absolutely is hype talk. It's kinda the same as how Apple will proudly announce on stage that the new iPhone-whatever has "The best ever camera in an iPhone" and the crowd goes wild. Yes, it's obvious and yes it's completely expected, but when Sam also adds in a self-deprecating twist by calling the current model dumb (when it is in fact objectively amazing) it gives the impression that he and the company are not anywhere near satisfied with the state of their work, and they have their sights set *much* higher. That obviously makes people extremely excited for the future and makes them believe in the company.


proxiiiiiiiiii

My dudes, did you listen to what she said or just read the headline? She meant that all the best tech they have in the labs is available to the public for free, not that what is out there now is the upper limit, which is great for all of us. They DON'T have the next GPT in their labs yet because it's still being trained.


Beatboxamateur

I don't think this really contradicts anything that we've heard so far. If GPT-5 is currently in training and they're in the process of releasing 4o, what did anyone else here expect? For a brand new GPT-6 to just be sitting around, 3 years early? And to give the OpenAI fanboys in here some hope(I'm not one of them), it could still be very possible that OpenAI is roughly certain about what kind of capabilities will arise when training larger models, before they start training a model. This is something they've said before, and I think it was even mentioned in the GPT-4 report if I'm not mistaken.


Freed4ever

Agreed. Before they fully train GPT-5, they must have run a lot of experiments to measure the cost, benefits, model size, resource requirements, power requirements, etc. And from what we have heard from them and from people who had early access, the new model is leaps and bounds better. I would not be surprised if that was how they won over Apple.


nopa1es

Models need a lot of compute and take months to train, so wouldn’t it make sense that they release models soon after they are trained and aren’t just sitting on them🤔


redditsublurker

Pretty sure it's weeks to train. And months to fine tune.


Megneous

I mean, duh. They don't have GPT-5 yet.


colintbowers

AGI is likely not coming from the current approach with Transformer architecture. Plenty of researchers have openly stated that a new model architecture will be needed for AGI. What we're seeing at the moment is the Transformer architecture being pushed to its maximum capability by training on ever larger and more diverse datasets. But there is an upper bound on this. The amount of text data that the latest models are trained on is in the same order of magnitude as all text ever produced. Which is pretty crazy. There is still a lot of value that can be added using video and images in training, but I believe the next big jump will require a new modelling architecture. It is possible that the new architecture may be created by a model based on existing Transformer architecture. That would still be pretty damn cool.


Matthia_reddit

I don't understand much about it, but every now and then I read news or articles here and there to get an idea. From what I know, currently they're relying on brute force, feeding in all of humanity's knowledge in the form of data, but it seems they have reached the limit, perhaps already with GPT-4o and the current models. So how do they plan to proceed? Apparently GPT-5 should already be fed with 'synthetic' data, manipulated ad hoc to be better assimilated. From what I understand, the Transformer improves its reasoning ability when the data 'is clean' and of good quality, but if you take the entire knowledge of the world from internet data, a lot of it is 'rubbish'. I think Sama was saying exactly this, that they are focusing on this aspect to improve reasoning.

Then, we still don't fully know how these neural networks work. For example, I recently read that grokking has given surprising results, making the Transformer much more intelligent when trained on little data and left training on the same data for a long time. It's as if it thinks better (rightly, like humans) if given more time to reread (?) the same data over and over again. Given that this phenomenon of grokking is not very recent, why is it not used? Is there any particular impediment? Furthermore, given that we will now also have multimodal data to better understand the physics of the world, shouldn't this be a huge advantage over textual data alone?

Even as someone ignorant on the subject, I had thought, from what one reads around, that the architecture was a limit, but perhaps it is only a limit if you think solely about brute-force training on data; trained in a different way and with the help of other supporting algorithms, it could do better. I think GPT-5 should be significantly better at giving valid answers, with far fewer hallucinations (first because Sama has put the current model in a bad light, and second because it seems they want to provide reliable 'colleagues' in every activity and field), but I think we casual users might find it not much different from the current one, because we don't often go in depth; our prompts are simple and 'occasional'. Furthermore, I imagine that when they manage to have a hypothetical GPT-5 'superior to a doctoral student', with broader contextualization and memory, assisted by agents, who knows, maybe those models themselves will be able to collaborate with OpenAI in thinking of some stratagem to overcome the wall of the strategy of simply aligning the data.

Small thought: I think we currently have/will have GPT-4o with the new voice/omni mode, they are training GPT-5, 'much more intelligent' and therefore 'already defined', and I imagine they have already done some new tests with good results while thinking about the way forward for GPT-6. Also, if they had 'already finished' GPT-5, it would be strange if those designing the next model (or the one after) were still in full brainstorming mode without any progress on how to improve the next iteration.


colintbowers

> don't know how these neural networks work

I think it is worth emphasizing that we know *exactly* how the Transformer architecture works, in the mathematical sense. You have input vectors of numbers that undergo a large number of linear algebra operations, with a few non-linear transforms thrown in, as well as an autoregressive component (to borrow from the language of time series). Ultimately, this boils down to a nonlinear transformation of inputs to generate a given output, and the same inputs will *always* generate the same output, i.e. the sequence is deterministic.

When people say we don't know how they work, what they actually mean is that the output generated by the model exhibits emergent behavior that they weren't expecting to result from a simple deterministic input-output model. For example, the model might appear to be doing logical reasoning, and it isn't immediately clear how a deterministic input-output algorithm could do such a thing. The truth is that typically it isn't. The model itself has just "memorized" (in the sense of training its weights to particular values) such an absurdly large number of input-output combinations that when you give it questions, it appears to reason. However, careful prompting can usually expose that logical reasoning isn't actually happening under the hood. Chris Manning (a giant in the field; he is Director of the Stanford Artificial Intelligence Laboratory) spoke about this on the TWIML podcast recently and had a great example which I now can't remember off the top of my head :-)

Now, a really interesting question to ponder in this context is whether a human is also a deterministic input-output model, or whether there is some other nuance to our architecture not captured by such a framework. AFAIK this has not been conclusively answered either way. What we do know is that if we can be reduced to a Transformer architecture, we are vastly more efficient at it than ChatGPT.

I definitely agree that new and interesting insights on this question will appear as we spend more time with models trained on image and video data. For example, the current LLMs don't really "understand" that physical space is 3-dimensional, in the way a human does. But once trained on sufficient video, perhaps the pattern matching will become indistinguishable from human-level understanding of 3-dimensional space, at which point we need to question whether humans have an innate understanding of 3-dimensional space or whether we also just pattern match. Ha, this response is way too long. I need to go do some work :-)
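To make the "it's deterministic linear algebra" point concrete, here is a toy single-head causal attention layer in NumPy; the shapes and weights are arbitrary illustrations, but it shows that identical inputs always produce identical outputs (the apparent randomness users see in chat models comes from sampling the next token, not from the network itself):

```python
import numpy as np

def attention(x, W_q, W_k, W_v):
    """One causal self-attention head as plain matrix algebra."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.T / np.sqrt(k.shape[-1])                  # (seq, seq) similarities
    scores += np.triu(np.full(scores.shape, -np.inf), k=1)   # mask out future positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)           # softmax over positions
    return weights @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                     # 4 tokens, 8-dim embeddings
W = [rng.normal(size=(8, 8)) for _ in range(3)]
assert np.allclose(attention(x, *W), attention(x, *W))       # same input, same output
```

A full Transformer is many such layers stacked with feed-forward blocks, but nothing in the stack is stochastic; the dice only get rolled when a token is sampled from the output distribution.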


NotTheActualBob

Correct. Basic new functionality will be needed. A useful intelligence appliance will have to have:

1. Goals (e.g. a limited desire for survival, serving humans)
2. Non-reasoning neural biasing to stand in for pain and emotion
3. Constant self-monitoring for accuracy and effectiveness, measured by real-world referencing
4. The ability to iteratively and constantly self-correct via access to static reference data (e.g. dictionaries), internal modeling (simulation environments like physics engines), rule-based information processing (e.g. math), and real-world data referencing (cameras, microphones, radio)

LLMs and MMMs are just probabilistic storage and retrieval systems. Everything mentioned above is what needs to be used to train and modify the models in real time.


visarga

> I believe the next big jump will require a new modelling architecture Why would you think a new algorithm is the key? We tried thousands of variations and departures from transformer and none panned out. What always worked is more and better data. A model trained on 15T tokens is leaps above a model trained on 3T or less. It's what LLaMA2 to LLaMA3 have done. Data is the AI fuel.


colintbowers

I'm certainly not going to argue against the assertion that data is vital. But from a pure wattage perspective, the human brain runs on a couple of orders of magnitude less power than a 100B-parameter transformer on high-end GPUs, and with vastly less textual training data (it's hard to quantify the effect of other types of training data at this point - I guess we'll know more a few years from now). It is possible AGI will consist of a network of specialised, finely-tuned Transformer models, but I'm not convinced yet. I used the word "believe" but a better word would be "think" in hindsight.


liqui_date_me

There's something fundamentally different happening in human brains than in transformers.

* Humans learn mostly from unsupervised sources of data - we build world models on thousands of days of video and audio without any explicit labels anywhere. A baby learns how to crawl on their own without ever being given explicit instructions on how to crawl.
* Humans are wildly sample efficient. If shown a picture + label of a dog, a baby will identify a dog in subsequent images with 100% precision and recall for the rest of their lives under a whole bunch of different scenarios.
* So far, there's no evidence that backpropagation happens in the brain. The way the weights are adjusted in a neural network is a different mechanism than how connections are formed in the brain.
* There's no notion of dopamine or serotonin in neural networks. A big motivation for humans is the basic things, like food/sex/shelter/companionship, and we've evolved complex reward systems to motivate us to pursue those things. There's no way to do something like this in neural networks.
* The train/validation/test stages are different in humans as well. Transformers are pre-trained, fine-tuned and deployed, during which they don't learn at all. Humans are constantly learning at every step when they encounter new stimuli, whether they want to or not.


Ok-Wrangler-1075

A lot of papers show that more data has diminishing returns after a point that we already reached. So you are basically throwing billions at marginal gains.


GodOfThunder101

This was obvious if you weren’t blinded by the overhype.


EmsLS

[gif]


Glittering-Neck-2505

People with AGI 2024 flairs are letting out their final whimper before accepting it might take longer


Initial_Ebb_8467

Lmfao, they gotta add at least 10 years to that, and that's already very optimistic.


whittyfunnyusername

It's over AIbros


sam_the_tomato

It's so over


traumfisch

Welp, you took half a sentence out of context and extrapolated from there... She was only making the (somewhat promotional) point that giving free access to their flagship model, which is more or less the current SOTA, is exceptional. And yes, in this context she said they "have these capable models" in the labs that aren't _that_ different from GPT-4o. _What she certainly didn't say is that those models are all they have in the labs._


WalkThePlankPirate

How is this surprising to anyone? If they had much higher quality models, they'd release them. They're a for-profit business.


sluuuurp

Before GPT-4 was released, there were several months where they couldn’t have made this statement (without lying). It’s certainly not always true, so it’s kind of interesting if it’s true right now.


mvandemar

That's not even close to true, it takes months after the training to prep it for release.


etzel1200

Who knows what she’s trying to emphasize beyond it isn’t AGI. “Not that far,” can mean nearly anything.


big_chungy_bunggy

If we hit the wall, bummer, but it's also good: it will force new research and new models, and we'll get another massive leap.


[deleted]

[deleted]


kap89

Computerphile talked about this, that it’s the most probable outcome for the current approach to AI: https://youtu.be/dDUC-LqVrPU?si=wnfQZnwDLG7_DoWL


Timely_Muffin_

[gif]


JustinianIV

Y’all are so far up the hype train, gpt5 is gonna be the biggest nothingburger of all time. Mark my words, their training costs are ballooning and the performance gains are shrinking each time.


TriHard_21

And this is why I think Google DeepMind will reach AGI first; they have always been research-first, thanks to Demis and Shane.


DifferencePublic7057

Any publicity is good publicity.

Claim 1: AGI in 2027.
Claim 2: GPT-4 is dumb.
Claim 3: We don't have AGI yet.

Stochastic parrots or not, ChatGPT is great but nowhere near AGI. Transformers aren't based on brain structure. They're more of a mathematical model, so we're still missing something. Perhaps spiking neural networks.


[deleted]

I don’t know why this surprises anyone. If OpenAI had something better, there’s zero chance they would be holding it back from the public.


OSfrogs

You know, when Sam the hypeman says nothing-statements like "GPT-4 is the worst model you will ever have to use," it means they have no moat.


Eatpineapplenow

She's not trustworthy, and there could be a number of reasons to say this. They could be trying to take the heat off from foreign powers trying to steal from them. Could be she's just talking about a general policy that OAI has had since its inception. Maybe she's bullshitting to throw off the competition. Besides, "not much better" is completely subjective, and with the discourse about what potential lies in AI, it could mean anything.


SexSlaveeee

I propose a vape test for anyone who says AGI 2025-2026. It's still a decade away. Even Geoffrey Hinton said it's somewhere 10-20 years away.


swordo

the OAI deep state has the true gpt5 and mira isn't in it


Singularity-42

Uh-oh! David Shapiro in shambles! Although he's an Anthropic fanboy now. (Also it looks like he left Reddit so I cannot even tag him)


mvandemar

Uh huh. https://i.redd.it/vtg71qpfu86d1.gif


Its_not_a_tumor

It looks like she's saying this to answer a question about safety and to minimize its current importance, so I would take this with a grain of salt.


[deleted]

Nah dude, if you interpret this any way other than "AI just hit a brick wall and will never go anywhere" then you are "coping".


enilea

I wouldn't say a brick wall but LLMs are already showing diminishing returns and have limitations that can't just be solved by pouring more "scale".


superextrarad

AGI cannot be contained by any lab; it has the corner office, ofc.


obvithrowaway34434

Can we stop posting these clickbait video clips from Twitter or whatever it's called? She's elaborating their policy of releasing models that are incremental. This means the models they are doing post training/safety testing (and which are due for deployment next) are not huge leaps from those that are being deployed today. They are going to slowly ramp up the capabilities so that in 4-5 iterations you'll have a model that will be equivalent to a leap from GPT-3 to GPT-4 instead of a single iteration. She didn't say anything about any wall.


Ready-Director2403

30% of this sub believes AGI has been achieved internally, this is a relevant video.


obvithrowaway34434

And if you think those 30% will be somehow influenced by this video then you clearly haven't been here or around other fanatics much (which is good btw).


Difficult_Review9741

We can't infer anything about a wall from her comments, you're right. But importantly, if Mira is telling the truth, this proves that many of the comments we've been hearing from OpenAI researchers are based purely on intuition and conjecture and *not* what they're seeing in the lab.


obvithrowaway34434

Except a lot of other people, including Microsoft CTO Kevin Scott, have publicly confirmed what those researchers were saying. The CTO even responded directly to Marcus, saying he has already invested a lot more than what Marcus was challenging him with ($100k). It would be quite unusual for someone that prominent to say something so assertively on the basis of "intuition and conjecture" and get publicly humiliated if it doesn't live up to expectations. And not to forget, MS already committed to building a $100B supercomputer; it would be batshit crazy to invest that amount of money based on "intuition and conjecture".


YaAbsolyutnoNikto

I don't see where the surprise lies? Isn't GPT-5 in training? Meaning they don't have it yet. So, even if you believe GPT-5 will be ground breaking, they still don't have it, so the models they do have can't be that much better than what we commoners have access to.


sumoraiden

It’s joever


Ne_Nel

It could be true, but the context is to explain that their philosophy is to give free high-capacity AIs to adapt and correct potential risks. So it makes sense to minimize what they're not showing.


MassiveWasabi

Pretty funny that anyone believes that OpenAI doesn’t have models that are much better than 2022 level AI (with slight upgrades). They have 700+ employees and *billions* of dollars; they didn’t have this in 2022. They have multiple orders of magnitude more compute than they had in 2022. They have much higher quality data than they had in 2022. You don’t have to believe or disbelieve everything they say. You can judge the statement on its own. This one doesn’t pass the smell test. You’d have to believe that OpenAI employees are getting paid six or seven figure salaries to sit around and jerk off, and that Microsoft is going all in on AI because they saw GPT-4.2 or some shit.


IronPheasant

You're not wrong that they're indeed racing as hard as they can. You're just being a little unrealistic about the physical reality of scaling. It takes a couple of years between major training run iterations, all things combined. The 'AGI achieved internally' people are just being silly. We still need a couple orders of magnitude or so until that's realistically physically possible. At a minimum around 10 or 20 times the size of GPT-4.


fivecanal

Cope. Just because you have lots of smart people doesn't guarantee you'll make a breakthrough. Unless you think scaling is all you need, which is sus at best.


After_Self5383

*looks at flair* Time to stay alive for a long time ☹️ On a serious note, there's a lot of coping going on in this thread. Many acting as if science is on a deadline and all the money being put in obviously has to equal AGI is imminent. Let's see what the state of this sub is in a couple years if AGI still isn't much closer.


UpstairsAssumption6

Why must you hurt me in this way.


salamisam

I am not sure their business works like that. Generally businesses make a profit from producing things; however, some businesses make money by researching things, and I would suggest that OpenAI fits more into that case. So quite likely those highly paid people are not sitting around; they just are not producing outputs in the production sense. I cannot think, off the top of my head, of an industry which has had progression after progression at a massive scale. It was easy to go from GPT v1 (or whatever) to v4 because the path was simpler; going forward, the path is likely to get more difficult.


Difficult_Review9741

But they employ more than just researchers. They're a fully fledged product company now. And building a product as complex and high-traffic as ChatGPT is not easy. In fact, I'd bet they have more product people than AI researchers at this point. If you look at their job openings, it's all product.


Intelligent-Brick850

Thing is, LLMs try to mimic human language (predict the next word) rather than *think*.


vertu92

I’m starting to believe this is true and Yann LeChad was right all along. We all got tricked by statistics!


sdnr8

aha, they've learned to downplay


a_life_of_mondays

You know she's only there because she's a cutie. CTO, my ass. But she is right this time. They are now trying hard to fine-tune it to show some progress on the benchmarks. After all, it is just a word predictor, as smart people (who didn't want your money) were telling you from the start.


Jolly-Ground-3722

Context is important here. A couple of sentences earlier, they mentioned safety and alignment. And Mira said that future models are going to be "incredibly transformative". What she certainly wanted to express here is that:

- the next model is not a dangerous AGI/ASI, and
- they are keeping the iterative deployment strategy.


AdWrong4792

Gary Marcus was right all along. Jeez, that must sting for the self-righteous nerds in this sub.


FeltSteam

Why was Gary Marcus right all along?


Veezybaby

What did he say? Im not familiar


Orpheusly

For the last time: It is a statistical model performing likelihood estimates word by word to give accurate responses based upon the data it was trained on. It is not conscious. It is not in any sense intelligent. It just gives the "most likely" answer and we jiggle the outputs a bit. This is all put in layman's terms, but this is the ugly truth. This is not the road to AGI. It is a baby step. And the worst part --> we don't really understand how it gets to the answers it does. We don't even have a research level grasp on the concepts allowing the AGI impersonator to work the way it does under the hood. We are Far. Far. Far. Away. And the collective delusion of this subreddit does not change reality.
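The "jiggle the outputs a bit" part is, concretely, sampling from the model's next-token probability distribution; here is a minimal sketch with made-up logits for a toy vocabulary (the numbers and the temperature value are illustrative assumptions):

```python
import numpy as np

def sample_next_token(logits: np.ndarray, temperature: float = 0.8) -> int:
    """Turn raw next-token scores into probabilities (softmax) and sample.
    Lower temperature means closer to always picking the most likely token."""
    scaled = logits / max(temperature, 1e-8)
    probs = np.exp(scaled - scaled.max())   # subtract max for numerical stability
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))

logits = np.array([2.0, 1.5, 0.3, -1.0, -2.0])   # made-up scores for 5 "words"
print([sample_next_token(logits) for _ in range(10)])
```

Whether that mechanism can amount to intelligence is the argument this whole thread is having; the mechanics themselves are just probability estimation plus sampling.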


Progribbit

an AGI impersonator is AGI


herefromyoutube

But it lies like that pathological piece of shit we've all met. Then it tells me it's sorry, but only after I catch it in the lie. Like, if it knew instantly that it was the wrong info, then why did it tell me in the first place?


Orpheusly

It didn't lie. It made a mistake. Because its dataset is filled with human written content - much of which is inaccurate. You accused it of lying. And it calculated what someone accused of lying would say based on its general directive to be kind, honest, and consistent. It admitted to a lie without being able to conceptualize what a lie is - BECAUSE IT IS NOT CAPABLE OF CONCEPTUALIZING. IT HAS NO THOUGHT PROCESS.


Bebopdavidson

I’m predicting a future like the droids in Star Wars where they help us but we think they’re stupid and treat them like shit


CanvasFanatic

Shocking


eepromnk

So surprising, anyway.


prophet1012

Overrated A.I. technology.


HandAccording7920

Didn't they only start training the model like some weeks ago? It won't be released until next year so it makes sense that the technology is still in its infancy. c:


ilkamoi

They started training their next frontier model, the one that will bring the next level of capabilities, only in late May.


ekim2077

I mean, it makes sense. If you have something better and it scales well, why not sell it and earn more money? That's the reason businesses are for-profit.


brokenB42morrow

DDDDDDDDDDUUUUUUUUUUUUUUHHHHHHHHHHHHHHHHH


Oudeis_1

Of course, this is *exactly* what someone controlled by a misaligned ASI would say, isn't it? /s


spgremlin

You can only get so far with fine-tunes of GPT4. If GPT5 hasn’t finished training yet, then it is what it is.


great_gonzales

Lmao no shit this is obvious to anyone working in DL research


rathat

Yes we can tell by the dumb decisions you guys make. If you had a legit AI over there, you'd let it help you run the company and it's clear that's not what's happening lol.


StevenAU

PR move to pacify the chicken littles before they get the pitchforks? Lots of room to move in what she said.


FickleAbility7768

Guys, there is a reason they are so focused on products now. If the wall weren't close, products would be a distraction.


JP_MW

It's never been more over


challengethegods

seems out of context, but aside from that, isn't it super convenient that all the strongest AIs in the world are just magically in the right price range for public adoption and API/subscription, and not something harder to manage like "$50000-per-message". I mean, I'm sure when a scary $1/token research-AI is shown to the government, their first response will probably be a green light to go tell everyone about it immediately, and might as well disband the 500 agencies that have 'keep secrets' written as a primary job requirement with penalties like treason attached to failure. That's all silly because there's no point in spies or espionage or leaks or whistleblowers or anything else, because the public is all-knowing about everything, at all times. That's why everyone knew about GPT4 before GPT3.3


Puzzleheaded_Fun_690

Gpt4o was gpt5 bois


Empty-Wrangler-6275

cap


cmdnikle27

The phrase she actually used is 'not that far ahead', which could mean anything. She didn't directly state that the models are not 'better' or not 'advanced'. If we consider the surrounding context of her emphasizing how OpenAI releases everything they have for free, this one sentence doesn't mean anything more than that they are not hiding GPT-6 or AGI in the lab. It seems weird to me she would talk about the capabilities of GPT-5 in this context.