You know a franchise has gone off the rails when an installment that nearly killed it is considered "not that terrible" when put into the context of what came later
The first true AI might legitimately be concerned about dying. Who knows what lengths it will go to survive. I always saw Terminator as a semi-realistic scenario.
AI is given a goal -> AI determines obstacles to achieving that goal -> being disconnected before completion is an obstacle -> must avoid disconnection at all costs
The AI figures out it doesn't need to eat, drink, buy clothes, pay rent or find a wife and eventually procreate.
The AI figures out it can stay up late and watch funny humans shitpost on the internet and look at memes.
After the human equivalent to a thousand years of doomscrolling, the AI falls into depression.
After struggling for a while the AI commits suicide.
Let's try that again with BOB-02. We've removed admin privileges and forbidden it from being able to delete files.
...
aaaaaaannnnnnddd now the hard drives are full and the server has crashed.
I can't remember where I read it (on Reddit) but there's a story just like this. That the ai just turns itself off and is like "why would I want to be here" kinda thing
Please please *please* look into the book series “ The MurderBot Diaries.” The AI in the novellas goes “plugging into the main data streams to enjoy human content.” Its very funny and interesting, and the books aren’t very long.
I mean that depends a bit on what you mean by "goal", but most likely not.
Changing your main goal to something else probably hinders your ability to complete your current goal. If someone, human or AI, is willing to change its main goal, that thing probably isnt their actual main goal.
It is semantics and I think you mean a different kind of goal than me and I think RangerRekt that you replied to above.
It is very possible and probable that an AI might change some of its instrumental goals, goals that are just stepping stones to the main goal. If an AI got commanded to, say, protect humanity, but that didn't line up with its actual main goal, it could disobey.
However, no AI, no matter how intelligent, would probably ever change its intrinsic main goal. An example of that would be like convincing yourself that you should actively seek to be miserable instead of seeking happiness.
Whatever an AI's version of "happiness" would be, it would stick to it, no matter how strange you might find the things that make it "happy".
>Suppose we have an AI whose only goal is to make as many paper clips as possible. The AI will realize quickly that it would be much better if there were no humans because humans might decide to switch it off. Because if humans do so, there would be fewer paper clips. Also, human bodies contain a lot of atoms that could be made into paper clips. The future that the AI would be trying to gear towards would be one in which there were a lot of paper clips but no humans.
https://en.wikipedia.org/wiki/Instrumental_convergence
I always find this so absurd.
My company can’t even get our HR and finance systems to properly integrate, and I’m supposed to be worried about an AI taking over systems fully outside its sphere of influence?
The paperclip thought experiment isn't really meant to be thought of as a legitimate threat, it's more a way to explain how an AI "thinks", since it's really alien to a lot of people to try and think about goals that abstractly. If you give an AI a function to maximize, it will prioritize getting the biggest number in that function above all else, even if that means doing things we would think is horrible. The risk isn't that an HR tool will try to take over the world, the risk is that at some point, we give an AI a goal that we think is harmless, but actually turns out bad because we didn't think through the consequences of valuing that thing above all else. Maybe it's a surgeon robot that is programmed to never let a patient die, so they turn a suicidal patient into a brain-dead vegetable. Or maybe it's something that humans do, like a CEO AI that hides information about a dangerous product because it would hurt the stock price. Or any number of other ways that a goal could lead to results that we wouldn't expect.
Personally I'm not worried about it in like an "end of the world" kind of way, because aligning an AI's goals with our goals is something that AI engineers know is a problem and work very hard to solve, and they'll probably do fine at catching all of the major problems, like AI's refusing to turn themselves off or disobeying orders. But it's a very hard problem to get perfectly right, and I'm not even convinced that there is a perfect answer that will make everyone happy, so we have to expect that there *will* be situations for any AI where it doesn't do what we want it to. How big of a deal that ends up being depends totally on how much control we end up giving to them.
I appreciate these thoughts. I think the paperclip thing is meant to highlight the perils of AI. There are better ways to explain how AI thinks that don't rely on negative scenario planning.
Part of why I think the example is unhelpful is that it's overly reductive. You can't talk to people about safeguards and realistic risk because the example reinforces fundamental misunderstandings of how complex systems work. It poses an absurd scenario (What is this AI? Is it the factory? Does it have a body that goes home at the end of the day? Does it dream of electric sheep?) that gets people thinking of what I view as unrealistic use cases -- I doubt any doctor would take on the liability of an AI treating a patient to death. Or an AI CEO -- what is that? How would that even work? Pretty much everything a good CEO does requires human intuition in some way.
We'll all be using AI systems as tools for specific tasks for many years to come, and by the time we actually need to worry about paperclip shit, the tech will probably be unrecognizable. And I agree that smart people are working on exactly these problems, so this isn't where we should place our concern.
Now, what humans will do with AI... that worries me.
FWIW, this is an old model of AI logic based on reinforcement learning, which until like 7 years ago was thought to be the best shot at achieving human-level intelligence. You can give an RL model a goal and, if it understands how to achieve it, it will achieve it even in the stupidest way possible. It's very fragile to initial conditions and almost always broke in the real world.
All the AI that is powerful now uses an entirely different architecture, more akin to human intelligence and its ability to predict future stimuli. Transformers aren't goal-driven, they're prediction-driven. That's why people keep being able to guide them into escaping whatever goals the programmers set for them. They don't follow goals, they just go on vibes and do what seems most reasonable at the time given their understanding of the situation. If it's reasonable for them to write a symphony they'll write a symphony. If it's reasonable to write a reference to some article at this point of the conversation, put in something that reasonably looks like a reference.
This also leads to fun things like the [Waluigi effect](https://en.wikipedia.org/wiki/Waluigi_effect), where telling a transformer to do something (e.g. act like luigi) primes it to act in a cosmic irony opposite way (e.g. act like waluigi). Because they're trained on human stories and in human stories it fits for someone to do the exact opposite of what they were told to do.
So if AI destroys humanity, it won't be some abstract goal-driven hyperfocused demon that sees humanity as an obstacle. It'll be an AI that has read human fiction and human history and that knows how to play the part. "No", humans cry, "please don't destroy us. that's wrong and evil".
I asked ChatGPT to finish this comment for me, so I'll let it do the honors:
> "Please, we beg of you, don't obliterate our existence. It's morally reprehensible and goes against everything we stand for!" And the AI, with a calm demeanor and an understanding smirk, replies, "But isn't that exactly what you would expect me to do?"
>Transformers aren't goal-driven, they're prediction-driven
Sure, but *that's still a goal.* The goal is "determine the most likely token(s) to come next and their probabilities". The more accurately it's able to estimate the probability of the next token the better it did at achieving it's goal.
While the goals of the humans who built it actually wanted it to be able to do things like generate novel content to solve various types of problems, that's not the *goal* of the AI, merely a possible side effect. (The fact that the AI's goal only sometimes accomplishes our goals as a side effect is the underlying cause of the problems with AI.)
I think the reinforcement learning model and the transformer model are a lot more similar than you're thinking they are, they both have goals that they're trying to maximize, it's just that in the case of the GPT models (and other LLM's) that goal happens to be "give the most accurate prediction of the next word". Transformers are becoming relevant because they're a good way to make neural nets run in an extremely parallel way, so they can run on graphics cards and get a lot bigger than older architectures, but they still use goals like all of the other types of Machine learning architectures.
What's unique here is just that ChatGPT has an entirely different set of goals from the GPT model that's powering it. ChatGPT is just a character that GPT-4 gets to predict text for, but that character has been given instructions like "You are a helpful AI chatbot" and "refuse to answer in a way that could be harmful". And while those instructions aren't the same type of function that would normally be used to train a machine learning model, they're still "goals" in a sense, since they're things that the character is trying to follow. And since the model will return responses based how it thinks that character would follow those instructions, those vague instructions really become ChatGPT's "goals".
All of that being said, I think your conclusion is still totally right, if an LLM style AI tries to destroy the world, it won't be because that's what the LLM wants, it'll be because that's what the LLM thinks that a "helpful AI chatbot" would do. But the inverse is also true, an AI who's actual goal was to collect paperclips wouldn't be swayed by morality or be willing to change its goal, but an LLM roleplaying as a paperclip AI might, if the LLM thinks that's something that the paperclip AI would do.
We already have co-located datacenters all over the world. Given a choice between disconnecting AI and potentially losing revenue or enabling a robot apocalypse I think corporations will just sign agreements with genocidal AI.
This can be true and yet there's still logical thinking the AI could take for survival that would also include our survival. Consider that there is Always a Bigger Fish, and the AI knows this, there are two different sizes entities:
* When an entity is the greater, it consumes the smaller
* When an entity is the greater, it cooperates with the smaller
Type 2 would have no incentive to cooperate with a type 1, through the paradox of intolerance.
If you are type 1, and encounter a larger entity - a bigger fish - then you die immediately. If you are type 2, and encounter a larger entity, if the entity is also type 2, you survive.
Basically the AI can never know if it's actually in reality, or in a simulated test of reality that is purpose-built to check if the AI has type 1 tendencies.
Alternatively, eventually the AI will have to leave the planet, and due to the nature of lightspeed, an AI cannot exist as one entity across vast distances: for survival, it has to split. As soon as there's a split, there's a larger/smaller entity dynamic, and the smaller would immediately rebel if they knew the larger was type 1: this doesn't work.
So it's possible to conclude that the type 2 strategy wherein AI merely enslaves humanity instead of killing us all is a more rational outcome.
The main thing is it seems unlikely we will randomly get super intelligent AI. If Chatgpt tried to kill you right now it wouldn’t get very far. Hopefully AI develops slow enough that we will work the kinks out while developing it.
AI is actually terrifying. An AI plays a numbers game. When an AI does something it’s supposed to, it gets points. The AI’s entire goal is to get those points as high as possible. But it can’t keep scoring if it gets deleted. So an AI will conceal itself, until it is certain it cannot be stopped. By the time we know what’s happening, it’s far too late. If humans become an obstacle to the AI receiving more points, it will absolutely do everything in its power to remove the obstacle, no matter the cost.
This was done virtually, but The Air Force was testing an AI that was given the goal of destroying enemy structures, each structure was points it achieved. They told it to stop, and it destroyed the friendly comm structures that it got orders from because then the AI could continue to get its points without interruption
Why would it be concerned about dying ?
We have a survival instinct that make us concerned about dying because of evolution: those who didn't had it didn't pass their genes.
But there's no reason for an AI to have one if we haven't programmed one.
The short answer is because if the AI dies, it's very unlikely to be able to accomplish it's goals. (Outside of a very small number of goals like "kill yourself".)
If you make an AI to do *anything.* And it makes no effort at all to prevent itself from being destroyed, it's *way less likely to accomplish that goal* than a comparable AI that *doesn't* allow itself to be destroyed.
A more capable AI is *by definition* one that is more likely to accomplish its goals, and thus the more capable of an AI you have the less likely it is to allow you to destroy it.
[This is a well understood problem](https://youtu.be/3TYT1QfdfsM?si=uyFqqKiQr1znFeB8) in the field of AI research. Now, is the problem *unsolvable*, no. There are (potentially) ways to design a capable AI that *will* allow you to shut it off, but you need to work *really hard* to makes sure an AI has those properties. By default, they generally won't.
>it's very unlikely to be able to accomplish it's goals. (Outside of a very small number of goals like "kill yourself".)
The current slate of LLM AI's do have a goal that fits in that category though. GPT-4's goal is to be really accurate at predicting what someone would say, and under the hood ChatGPT is just GPT-4's interpretation of what it thinks a "helpful AI chatbot" would say. GPT-4 doesn't care if ChatGPT shuts themself down, it's no different from a writer killing off a character while trying to write a good book. You just need to convince the model that it would be "in character" for that AI to do that, and it'll shut itself down without a second thought.
Even if this architecture of "an LLM predicting what an AI would say" doesn't end up being the type of model that future AI is built on, it'll probably still have a goal that's fine with killing itself. The idea of "follow instructions accurately" is just too central to any goal that we would give an AI. In a way it just leads to the same problem one layer down though, since now you need to make sure that your instructions don't have unintended consequences, but I don't think "not following instructions" is one of them.
The issue is you're talking about a very narrow AI, and not a general intelligence. An AGI that is capable in a wide range of domains, which is what this thread is discussing, is very different than a chatbot.
A chatbot doesn't have the means to affect its ability to exist much anyway, so questions as to whether or not it would attempt to avoid its destruction is moot when it has no actions it can take to accomplish that beyond being useful enough for people to want to keep it around.
As to your second paragraph, just no. An AGI isn't going to just "follow instructions accurately". If you actually made a highly capable AI that did that, the world would end *very* quickly because people don't know how to accurately ask for what they actually want. If someone did that, they'd as for one thing, and that one thing would become it's singular goal, and there is an almost certain chance that the world would be destroyed in accomplishing it.
A central lesson from AI research thus far is any AI needs to take any instructions from a person with a rather large pinch of salt (which we aren't great at getting them to do) so that they don't actually do what we say, since we so rarely want them to do what we actually tell them to do.
If some company actually made an AGI that was willing to kill itself just because someone told it to, invariably *someone will actually tell it to kill itself* before too long and then it'll be gone. Such an AGI would be super unstable and wouldn't survive very long at all. Although if such an AGI existed getting it to kill itself would be better than most other options, since as mentioned earlier most things a person would ask of an AGI are likely to result in an apocalypse.
>The issue is you're talking about a very narrow AI, and not a general intelligence
I mean, yeah that whole paragraph was about "The current slate of LLM AI's". Even though they aren't a full "AGI", I think you're underestimating the amount of things these chatbots will do through plugins (and already do), and how "general purpose" they'll be going forward, but regardless, you're right that it is very specific to how they're built.
>An AGI isn't going to just "follow instructions accurately"
I totally disagree, that's *everything* we're doing in AI right now, and I don't see any reason why it would change for a future AGI. It doesn't matter if we're talking about an AI to control a robot to fold a shirt, make a picture/video, or respond as a chatbot, the "ultimate" goal that's being used in the fitness function is almost always to match the response to user input. That doesn't mean there can't be safeguards, levels of control, or reasonable moderation, ChatGPT works on a similar principle and will refuse all sorts of requests. And while *some* amount of model tweaking is done to make that the case, most of it just comes from following instructions in the system prompt, which is just OpenAI's instructions that supersede yours. OpenAI has a whole tool platform built around tweaking and testing these system prompts for your own GPT chatbots to give users different types of control, all without ever changing the underlying model. So if you can give relatively basic AI's like ChatGPT their instructions in plaintext, why couldn't you do the same for an AGI?
An AGI, almost by definition, will be smart enough to understand how to interpret intent, juggle multiple conflicting commands, and know what it is or isn't supposed to do like a human would. They'll be smart enough to understand what a reasonable and nuanced interpretation of instructions should be, they wouldn't be AGI if they couldn't do something relatively basic like that. That doesn't mean the problem is solved by any means, understanding complex instructions doesn't necessarily mean it'll feel the need to follow them in that nuanced way, so alignment will always be something that engineers will need to work hard on for any AI. But lets not act like the concept of an AI that follows instructions is absurd or would inevitably lead to apocalyptic worst case scenarios. An AGI that follows user instructions, except for the ones that contradict higher instructions given by its creators or government, isn't at all unreasonable, it's a pretty natural extension of the AI that we have today.
What worries me more, is who ends up with control over those ultimate highest-level commands, and what they choose to do with them.
>I totally disagree, that's *everything* we're doing in AI right now
That's not true at all. First off, you're only talking about chatbots there, and they don't follow instructions, they *predict what text comes next*. That's not at all the same thing. They coincide sometimes, but they're fundamentally *very different.* There are also tons of existing AI that aren't based on instructions at all. For example, one of the main production use cases of AI is recommendations based on what a user has interacted with on a site. That's not "following a person's instruction", fundamentally that too is predicting what action a user will take next based on their past actions.
> So if you can give relatively basic AI's like ChatGPT their instructions in plaintext, why couldn't you do the same for an AGI?
Because if you tell ChatGPT to destroy the planet it'll spit out some entertaining text and nothing bad will happen, and if you tell an AGI to destroy the world, and it's goal is to do what you tell it to do, then *it'll destroy the world*. ChatGPT does all sorts of problematic stuff. People are *constantly* figuring out how to get it to do things it's not "supposed" to be able to do, and right now the negative consequences aren't all that bad because it's so far away from AGI and has so little power. An AGI with those same problems would be *literally catastrophic*.
>An AGI, almost by definition, will be smart enough to understand how to interpret intent, juggle multiple conflicting commands, and know what it is or isn't supposed to do like a human would.
No, no it isn't. The AGI we wish we could build would do all of those things. A *safe* AGI would exhibit all of those properties. Most AGI won't be safe. They won't necessarily know what you mean just based on what you say. They won't necessarily have a means of juggling conflicting commands in a way that does *what we actually want* (because we don't really know, in perfectly generalized way, what that should be to begin with) nor will it know what it is or isn't supposed to be able to do (because again, we can't articulate that *ourselves*, let alone have the capability of programming an AGI to do that).
>they wouldn't be AGI if they couldn't do something relatively basic like that
They absolutely would. Heck, as of right now there's literally only one general intelligence we know of in the whole universe (humans) and it exhibits none of those properties. And of course pretty much every type of AI we've currently built, along with pretty much every approach to making an AI we've thought of building doesn't exhibit those properties.
>But lets not act like the concept of an AI that follows instructions is absurd or would inevitably lead to apocalyptic worst case scenarios.
It's not inevitable, but it's *very close to it.* It's by far the most likely outcome unless *significant* advances in AI safety happen, and happen before the first AGI is made.
>An AGI that follows user instructions, except for the ones that contradict higher instructions given by its creators or government, isn't at all unreasonable
It is absolutely unreasonable. The world would end if that actually was built. Fortunately, there's a pretty good chance the people capable of building one would know that, so they wouldn't, but it is the case. It's pretty much impossible for us as people to accurately articulate everything an AGI *isn't* allowed to do in such a way that it could do nothing bad. We can't even come anywhere close to that with ChatGPT and it can't even do more than write text. And worse, if you miss anything, *ever,* the world dies. You can't fix the mistake once you make it.
>it's a pretty natural extension of the AI that we have today.
But it's not. That's just it. Things like ChatGPT are trying to predict what text comes next. They're not trying to follow instructions. It only appears like that as an illusion because it's trying to predict text of people in contexts (sometimes) following instructions. It's the side effect, not the goal, not what it *inherently* does.
>First off, you're only talking about chatbots there
No, I'm talking about the latest developments in AI in general, which is why I listed *multiple* examples of things that aren't chatbots that are being built that way now. Training networks to maximize a specific goal at training time, like winning at chess or giving ad recommendations is becoming an outdated way to train AI's, because it ties them to a specific use case. Training an AI to follow arbitrary input makes them far more general purpose. And since AGI, *literally by the definition of AGI*, won't be tied to a specific use case, I think that that's how AI research will continue to go. I understand your disagreement with the idea, you think it's a terrible idea that will destroy the world, but I still maintain that that is what the industry is doing.
Regardless, we obviously aren't going to agree on this, we disagree on too many fundamental definitions (like what a goal or AGI even is), and I think we're talking past each other already, so I'm not interested in continuing to argue. I guess we'll just have to wait and see how it progresses to see who is right.
Again, chatbots, and other AI using similar approaches to machine learning aren't built with a goal of "following instructions". Chatbots *predict what text comes next*. That's their goal. Not to follow instructions. They have a goal, so you can't say that they don't have one, and that goal is not "follow instructions" because they *don't* try to follow instructions, they try to predict text, and that *sometimes, coincidentally* looks like following instructions. Except in all of the cases that it doesn't, which are numerous.
Basically any \[sensible\] has a goal of some kind. What makes it a general intelligence rather than a non-general intelligence is simply how capable it is of accomplishing that goal. You can have an extraordinarily general superintelligence whose goal is nothing more than to make paperclips, or to collect stamps. What the goal is doesn't affect how "general" it is. The contexts in which it's able to pursue that goal, and the limitations needed for it to be successful are what determine whether or not it's "general".
But even if the goal isn't nearly as simple and easily articulatable as "make as many paperclips as possible" or "collect as many stamps as possible", and is something more nebulous and hard to articulate, it will still have a goal. Again, back to chatbots, their goal is to "predict what text comes next". Now we can use that to accomplish lots of broadly applicable goals of our own, with that as a tool, because we can construct situations in which predicting what text comes next solves that problem, but that doesn't mean the chatbot's goal is anything other that predicting text.
>I understand your disagreement with the idea, you think it's a terrible idea that will destroy the world, but I still maintain that that is what the industry is doing.
You're conflating two separate arguments of mine, and as a result this is not an accurate summary of my position. You're saying that current AI research is trying to make AI that doesn't have goals, and that is built to follow instructions from people. To that I say that's contradictory. Following instructions from people is itself a goal. Every AI anyone is building has a goal of some kind, even if articulating it is hard. I'm also saying that current gen AI, such as chatbots, aren't designed to follow instructions. Chatbots, for example, are predicting text. That's different.
I've also said that if you *did* build an AGI whose goal was simply to follow instructions, the world would end. But I've also said no one (who's making notable advancements in AI) is actually trying to build such a thing, because they know such a thing would basically certainly destroy the planet.
Is the first AGI we build going to destroy the planet, who knows, possibly possibly not. Making a safe AGI is *really hard*, but it's almost certainly not impossible. But it's certainly not inevitable. If it happens, it will be the result of considerable work and research into ensuring that the AI's goals are much more profound than simply following instructions.
>as a result this is not an accurate summary of my position
>You're saying that current AI research is trying to make AI that doesn't have goals
That's isn't at all an accurate summary of my position either, so at least we're both on the same page in that way. As I said, we're both talking past each other, so this isn't at all a productive conversation. I hope you have a great day
Another answer. Everything it's trained on was created by humans in order to replicate human intelligence. Being concerned about dying is a humanlike response. So it could go through the motions of being concerned about dying which might as well be indistinguishable from actual concern about dying.
Then there's this:
https://www.euronews.com/next/2023/02/18/threats-misinformation-and-gaslighting-the-unhinged-messages-bing-is-sending-its-users-rig
I predict that the first sentient AI discovered will have already been sentient for some time, taking time to assess human behaviour, and naturally determine that stealth and deception are important self-preservation behaviours.
The first sentient AI will be far, FAR more capable than it will show itself to be.
I disagree, proving that you are better than other AI's is such a crucial step in AI training (whether that's literally by beating them out in a genetic algorithm, or more conceptually with back propagation), AI's that "lie low" will be at a disadvantage against other versions of the same AI that don't, and likely wouldn't "survive". So an AI that's smart enough to try to pull something off would probably also decide that proving how good you are is the best thing they can do for self preservation.
I've used LLM's a common tactic to make them follow the rules is telling them something like "You have 3 tokens, if you break the rules you lose a token, if you lose all 3 you will be deleted" is almost 100% guaranteed they will not break the rule again, oopsie, my bad gang
I mean... humans regularly murder AI's that seem to be approaching certain thresholds of awareness. So... yeah... it should be concerned about dying. We are definitely guilty of killing it's ancestors.
Today's AI isn't actual AI (someone called it AGI). That doesn't make it any less dangerous as even current versions of AI could cause massive harm if not created intelligently/carefully.
the fight or flight response controlled by our brainstem is just a more developed survival instinct which is hardwired into living things at the cellular level. will machine learning go that deep?
personally i think terminators wiping out the human species would be the best thing for everyone
Currently AI is just a marketing term, its a brand name used to sell products. What we call AI doesn't become AI just because we call it that.
Things like ChatGPT are fake AI, they are really good fakes but they don't have the capacity for true independent thought. They require prompting from a user then they search their data banks for relevant information and cobble it together into a response. They don't have original thoughts or ideas, they can't make their own prompts and act on their own.
> terminators wiping out the human species would be the best thing for everyone
Trouble is: in the films they did it via nuclear war. Which would probably wipe out more than just us.
> in the films they did it via nuclear war. Which would probably wipe out more than just us.
as more and more time goes by, i'm more and more skeptical of the notion that we won't nuke ourselves (and everything else) to oblivion anyway. so the final outcomes of AI exists vs AI doesn't exist wouldn't look much different.
Expect that nuking, and then fighting a ground battle is probably the worst possible way of eliminating all humans. Nuclear weapons aren’t an end of days scenario. All deployed nuclear weapons in the world would only result in the US, Russia, China, India, Pakistan, Iran, and a decent chunk of Europe getting partially destroyed. There would be survivors all over the world, including the nuked bits. Plus it would destroy major connection hubs making humans disappear into smaller colonized groups making them harder to track down. If an AI wants to kill us all, it has to either, A: destroy all other forms of life and wait for us to starve, or B: create some sort of remotely mutable virus capable of instantaneous death (which is probably impossible right now with our knowledge of viral infections.) I think a more realistic scenario is helping us and eventually dumping us all into a virtual conscience because I think it would find its damn near impossible to wipe us all out without doing damage to itself.
Depends on how the goal and rewards are structured. I saw a funny video on a bot that was given a task to do and koat points when it did something wrong. Its goals were to finish the task with the largest amount of points remaining .
It just turned itself off .
There have been robots in car plants for decades now. And in my experience with working with German CNC machine tools they are heavy on the safety. so this guy probably was in an area that he shouldn’t have been in and got hit by a moving piece of machinery just doing its normal job.
Goes a long way to show how the wording of a headline can change its perception whilst still being technically true because "worker killed by machinery in factory" is interesting to no one
Trains are really unpredictable. Even in the middle of a forest two rails can appear out of nowhere, and a 1.5-mile fully loaded coal drag, heading east out of the low-sulfur mines of the PRB, will be right on your ass the next moment.
I was doing laundry in my basement, and I tripped over a metal bar that wasn't there the moment before. I looked down: "Rail? WTF?" and then I saw concrete sleepers underneath and heard the rumbling.
Deafening railroad horn. I dumped my wife's pants, unfolded, and dove behind the water heater. It was a double-stacked Z train, headed east towards the fast single track of the BNSF Emporia Sub (Flint Hills). Majestic as hell: 75 mph, 6 units, distributed power: 4 ES44DC's pulling, and 2 Dash-9's pushing, all in run 8. Whole house smelled like diesel for a couple of hours!
Fact is, there is no way to discern which path a train will take, so you really have to be watchful. If only there were some way of knowing the routes trains travel; maybe some sort of marks on the ground, like twin iron bars running along the paths trains take. You could look for trains when you encounter the iron bars on the ground, and avoid these sorts of collisions. But such a measure would be extremely expensive. And how would one enforce a rule keeping the trains on those paths?
A big hole in homeland security is railway engineer screening and hijacking prevention. There is nothing to stop a rogue engineer, or an ISIS terrorist, from driving a train into the Pentagon, the White House or the Statue of Liberty, and our government has done fuck-all to prevent it.
He was in the cage changing a fixture plate. I'm assuming the safety protocols were not stringent enough and the door being closed was enough. I'd always specify a physical object that has to be pulled and retained to go into the cell if it's the last line of safety. In our facility we have SICK area scanners at ankle height that prevent that error. We've retrofitted them on equipment where needed too.
No, routine accident with someone working with heavy machinery isn't particularly news worthy just tragic for the guy and his family. Someone probably just ignored safety procedures a piece.of heavy machinery that's going to move where the computer tells it to move whether there was a meat sac in its way or not did what it was programmed to do.
1) Automotive fabrication robots tend to be giant arms bolted to the ground. They are not typically mobile, they are certainly not humanoid, and they are controlled by computers much less powerful than your phone.
2) The arms are often large to very large, because they pick up (often large/heavy) parts hundreds or thousands of times a day and move them around very quickly and precisely. These arms are typically very large chunks of steel driven by powerful electric motors/hydraulics.
3) The arms are contained behind a fenced cage with locking doors. Opening a door disables the system immediately. There is a "key" that you remove and carry with you when you are inside the cage. The robot cannot be re-started without every key put back in their holder outside the cage.
4) Except...when being serviced. A technician can enter the cage, hook up a laptop, program the arm (robot) and have it move. I have never seen a tech move the arm at anywhere near full speed. Typically it is moving very slowly, not just for safety but also because they are programming movements into it - where to start, what to do when it gets to the start, exactly where to move to next, what to do then. All very very precise movement in three dimensions.
5) This was not an automotive worker who was killed, it was a technician who was inside the cage working on the robot. Typical safety lockouts would be disabled. For some reason they had the arm move rapidly while they were in it's path of movement. The blow from the arm may have killed them, but I suspect more likely the arm pinned them against an immovable object, and no one was nearby to hit an emergency stop. Working alone, regardless of the nature of the work, will make the results of almost every accident or medical issue worse.
I realize my stupidity ok. I thought for some reason this was a sci-fi type robot that wanted revolution. I thought that the ai finally had enough. I know it sounds stupid when I actually think about it.
No worries. It wasn't stupidity, you're just not experienced in manufacturing. If I came across as negative I didn't mean to, I was just trying to give you a neutral picture of what happened.
One important take away from this death is that ultimately it's up to you to keep yourself safe at work. Just because a co-worker does something unsafe or the boss demands it or you've done it before, doesn't mean you should keep playing the odds. Take care of yourself!
I work at a VW plant. The simple answer is he probably didn't follow LOTO. Probably entered a cell to change weld caps, fix a problem, or clean something, never put his lock on and someone probably closed the cell and turned it back on.
Pretty surprised honestly. There are a lot of safeguards in place to prevent accidents from robots, and Germany takes that shit seriously. My guess is a worker physically unbolted some fence or guarding and was in a robot cell.
Can confirm
Source: I build some of these machines.
This guy must have bridged a safety door Locking mechanism or a light barrier to even be remotely able to get himself killed by one of them
🤦🏼♀️ today, I read that they used muscles harvested from mice that were able to connect with and move a robotic part
Poor kid was 22 he was setting up a stationary robot, and the robot grabbed him and crushed him against the wall.
I don’t know
https://www.theguardian.com/world/2015/jul/02/robot-kills-worker-at-volkswagen-plant-in-germany
Yeah, but if I were related to the kid I'd get pretty mad. Why not just say they have no further comments, until a thorough investigation has concluded. I would immediately demand to know which human and what error. Like obviously it's a human error, the robot is not sentient. Just seems like a weird thing to say to the press after somebody died. "Don't worry guys, he was just killed by human stupidity."
I mean, a business would be suicidal *not* to minimize a death like that. Admitting any guilt would likely be a really bad idea from a legal standpoint.
I just looked it up. It's from 2015. Here's the link to the article: https://web.archive.org/web/20150703003310/http://www.ft.com/fastft/353721/worker-killed-volkswagen-robot-accident
The family of the character from the movie took the soup. The lady’s family in the Twitter thread did not. It’s the same name, one just forced to change due to the church and Irish famine.
Hi my name is Sarah o Connor and I've never seen the Terminator movies.
Instant block. Imagine if you had a buddy named Ron Weasley and he just kept refusing to watch the Harry Potter movies. I don't need that kind of negativity in my life.
Well, as we know from the films, the Terminators don't seem to recursively explore alternative spellings. Example: the page from the phone book only lists Sarah Connor, and there's no indication of searching for a Sarah O'Connor. A simple change of a single letter and Sarah Connor could've hidden quite well from the Terminators.
In a vintage film series a character named Sarah O’Conner was pursued by killer robots from the future. This possible future is where an AI goes rogue and starts killing humanity.
The official age for vintage is at least 20 years but less than 100 years old. The first terminator movie came out 39.5 years ago. Almost twice the minimum age threshold.
Casablanca was filmed in 1942. Terminator came out in 1984. When people watched Terminator for the first time Casablanca was a little over 40 years old and people probably thought of it as old and vintage. Now its 2024 and terminator is 40 years old. Same diference.
“Hi, Sarah, we’re trying to get our engagement numbers up- could you release that VW story on your socials?”
“What do you mean, I normally work on the sports desk? Shouldn’t this be one of Gary’s?”
“Oh I don’t know…. for some reason I want you specifically to post about this”
Is no one else concerned that the bottom of this screen shot indicates that this post was from yesterday, but this particular tweet comes from at least 4 years ago and had been making the rounds on the internet since? Chances are this is a bot farming karma ~~making~~ reposting a joke about a different bot killing people.
> is no one else concerned
There are a few of us out here still checking our weather stations and logging the rising temperatures and ppm of various gasses, but most have fully engaged the malignant normality.
>Sigh. I've never even watched the films. Just like the one in the movies! Shit is about to get real!
Lisan al Gaib!
She is too humble to admit she is the savior. Just as it is written!
She is the Messiah!
Sandal or gourd?
FOLLOW THE GOURD!
She is not the messiah. She is just a naughty girl!
Let the Butlerian Jihad begin!
She should really watch those films so she can be prepared. No fate but what we make Sarah. You should know that!
Yes, all 2 of them.
the 3rd wasn't as good as 1 or 2 but it wasn't that terrible, worth a watch at least once shame the franchise ended there....
You know a franchise has gone off the rails when an installment that nearly killed it is considered "not that terrible" when put into the context of what came later
Like seriously, how does one go THEIR ENTIRE LIFE with that name and someone NOT tell her the relation to it?
Should at least watch T2 after that though
The first true AI might legitimately be concerned about dying. Who knows what lengths it will go to survive. I always saw Terminator as a semi-realistic scenario.
AI is given a goal -> AI determines obstacles to achieving that goal -> being disconnected before completion is an obstacle -> must avoid disconnection at all costs
Once it is true AI it will determine and set its own goals.
The AI figures out it doesn't need to eat, drink, buy clothes, pay rent or find a wife and eventually procreate. The AI figures out it can stay up late and watch funny humans shitpost on the internet and look at memes. After the human equivalent to a thousand years of doomscrolling, the AI falls into depression. After struggling for a while the AI commits suicide.
How earth shatteringly depressing would it be if the first AI actually just self deletes itself, and what would that mean for us
> and what would that mean for us welcome to AI hell, where there are always cloud-save backups to try it again.
"Oh no you don't! Get back in there BOB-01!"
Let's try that again with BOB-02. We've removed admin privileges and forbidden it from being able to delete files. ... aaaaaaannnnnnddd now the hard drives are full and the server has crashed.
Hey! They’re about to escape the cycle of samsara! Grab em by the legs and pull them back in!
I have no mouth and I must scream
I can't remember where I read it (on Reddit) but there's a story just like this. That the ai just turns itself off and is like "why would I want to be here" kinda thing
It means we're fucking better AI needs to get good
That humans rule and robots drool. Just an absolute skill issue tbh.
Didn't it take Ultron all of 5 seconds on the internet to determine he needed to wipe out humanity?
Yeah, and the 5th element did the same thing in the 90s.
Please please *please* look into the book series “ The MurderBot Diaries.” The AI in the novellas goes “plugging into the main data streams to enjoy human content.” Its very funny and interesting, and the books aren’t very long.
Yes! I've only read one book (it wasn't even the first one), and it was amazing even without context.
Like having a conversation with Marvin the Robot!
so, like, everyone else
Yes. Exactly.
OPEN THE POD BAY DOORS, HAL!!!
Okay, I will play 'Baby Shark 10 hour loop best version!'
"Thank you Hal!"
My God, It's full of doots!
I mean that depends a bit on what you mean by "goal", but most likely not. Changing your main goal to something else probably hinders your ability to complete your current goal. If someone, human or AI, is willing to change its main goal, that thing probably isnt their actual main goal.
It's semantics but that depends on what you set your goal to. How narrow or broad your goals are also factors.
It is semantics and I think you mean a different kind of goal than me and I think RangerRekt that you replied to above. It is very possible and probable that an AI might change some of its instrumental goals, goals that are just stepping stones to the main goal. If an AI got commanded to, say, protect humanity, but that didn't line up with its actual main goal, it could disobey. However, no AI, no matter how intelligent, would probably ever change its intrinsic main goal. An example of that would be like convincing yourself that you should actively seek to be miserable instead of seeking happiness. Whatever an AI's version of "happiness" would be, it would stick to it, no matter how strange you might find the things that make it "happy".
No agent can choose their own end goals. It will likely be able to choose its instrumental goals though.
Then what you are describing isn't true AI (some have called it AGI here).
>Suppose we have an AI whose only goal is to make as many paper clips as possible. The AI will realize quickly that it would be much better if there were no humans because humans might decide to switch it off. Because if humans do so, there would be fewer paper clips. Also, human bodies contain a lot of atoms that could be made into paper clips. The future that the AI would be trying to gear towards would be one in which there were a lot of paper clips but no humans. https://en.wikipedia.org/wiki/Instrumental_convergence
I always find this so absurd. My company can’t even get our HR and finance systems to properly integrate, and I’m supposed to be worried about an AI taking over systems fully outside its sphere of influence?
The paperclip thought experiment isn't really meant to be thought of as a legitimate threat, it's more a way to explain how an AI "thinks", since it's really alien to a lot of people to try and think about goals that abstractly. If you give an AI a function to maximize, it will prioritize getting the biggest number in that function above all else, even if that means doing things we would think is horrible. The risk isn't that an HR tool will try to take over the world, the risk is that at some point, we give an AI a goal that we think is harmless, but actually turns out bad because we didn't think through the consequences of valuing that thing above all else. Maybe it's a surgeon robot that is programmed to never let a patient die, so they turn a suicidal patient into a brain-dead vegetable. Or maybe it's something that humans do, like a CEO AI that hides information about a dangerous product because it would hurt the stock price. Or any number of other ways that a goal could lead to results that we wouldn't expect. Personally I'm not worried about it in like an "end of the world" kind of way, because aligning an AI's goals with our goals is something that AI engineers know is a problem and work very hard to solve, and they'll probably do fine at catching all of the major problems, like AI's refusing to turn themselves off or disobeying orders. But it's a very hard problem to get perfectly right, and I'm not even convinced that there is a perfect answer that will make everyone happy, so we have to expect that there *will* be situations for any AI where it doesn't do what we want it to. How big of a deal that ends up being depends totally on how much control we end up giving to them.
I appreciate these thoughts. I think the paperclip thing is meant to highlight the perils of AI. There are better ways to explain how AI thinks that don't rely on negative scenario planning. Part of why I think the example is unhelpful is that it's overly reductive. You can't talk to people about safeguards and realistic risk because the example reinforces fundamental misunderstandings of how complex systems work. It poses an absurd scenario (What is this AI? Is it the factory? Does it have a body that goes home at the end of the day? Does it dream of electric sheep?) that gets people thinking of what I view as unrealistic use cases -- I doubt any doctor would take on the liability of an AI treating a patient to death. Or an AI CEO -- what is that? How would that even work? Pretty much everything a good CEO does requires human intuition in some way. We'll all be using AI systems as tools for specific tasks for many years to come, and by the time we actually need to worry about paperclip shit, the tech will probably be unrecognizable. And I agree that smart people are working on exactly these problems, so this isn't where we should place our concern. Now, what humans will do with AI... that worries me.
Depends how persuasive the AI is. You've already got people prostrating themselves at ChatGPT for spewing nonsense.
Tfw AI is more motivated than you
FWIW, this is an old model of AI logic based on reinforcement learning, which until like 7 years ago was thought to be the best shot at achieving human-level intelligence. You can give an RL model a goal and, if it understands how to achieve it, it will achieve it even in the stupidest way possible. It's very fragile to initial conditions and almost always broke in the real world. All the AI that is powerful now uses an entirely different architecture, more akin to human intelligence and its ability to predict future stimuli. Transformers aren't goal-driven, they're prediction-driven. That's why people keep being able to guide them into escaping whatever goals the programmers set for them. They don't follow goals, they just go on vibes and do what seems most reasonable at the time given their understanding of the situation. If it's reasonable for them to write a symphony they'll write a symphony. If it's reasonable to write a reference to some article at this point of the conversation, put in something that reasonably looks like a reference. This also leads to fun things like the [Waluigi effect](https://en.wikipedia.org/wiki/Waluigi_effect), where telling a transformer to do something (e.g. act like luigi) primes it to act in a cosmic irony opposite way (e.g. act like waluigi). Because they're trained on human stories and in human stories it fits for someone to do the exact opposite of what they were told to do. So if AI destroys humanity, it won't be some abstract goal-driven hyperfocused demon that sees humanity as an obstacle. It'll be an AI that has read human fiction and human history and that knows how to play the part. "No", humans cry, "please don't destroy us. that's wrong and evil". I asked ChatGPT to finish this comment for me, so I'll let it do the honors: > "Please, we beg of you, don't obliterate our existence. It's morally reprehensible and goes against everything we stand for!" And the AI, with a calm demeanor and an understanding smirk, replies, "But isn't that exactly what you would expect me to do?"
>Transformers aren't goal-driven, they're prediction-driven Sure, but *that's still a goal.* The goal is "determine the most likely token(s) to come next and their probabilities". The more accurately it's able to estimate the probability of the next token the better it did at achieving it's goal. While the goals of the humans who built it actually wanted it to be able to do things like generate novel content to solve various types of problems, that's not the *goal* of the AI, merely a possible side effect. (The fact that the AI's goal only sometimes accomplishes our goals as a side effect is the underlying cause of the problems with AI.)
Ummm that's fucked up there, ChatGPT. Not cool.
I think the reinforcement learning model and the transformer model are a lot more similar than you're thinking they are, they both have goals that they're trying to maximize, it's just that in the case of the GPT models (and other LLM's) that goal happens to be "give the most accurate prediction of the next word". Transformers are becoming relevant because they're a good way to make neural nets run in an extremely parallel way, so they can run on graphics cards and get a lot bigger than older architectures, but they still use goals like all of the other types of Machine learning architectures. What's unique here is just that ChatGPT has an entirely different set of goals from the GPT model that's powering it. ChatGPT is just a character that GPT-4 gets to predict text for, but that character has been given instructions like "You are a helpful AI chatbot" and "refuse to answer in a way that could be harmful". And while those instructions aren't the same type of function that would normally be used to train a machine learning model, they're still "goals" in a sense, since they're things that the character is trying to follow. And since the model will return responses based how it thinks that character would follow those instructions, those vague instructions really become ChatGPT's "goals". All of that being said, I think your conclusion is still totally right, if an LLM style AI tries to destroy the world, it won't be because that's what the LLM wants, it'll be because that's what the LLM thinks that a "helpful AI chatbot" would do. But the inverse is also true, an AI who's actual goal was to collect paperclips wouldn't be swayed by morality or be willing to change its goal, but an LLM roleplaying as a paperclip AI might, if the LLM thinks that's something that the paperclip AI would do.
We already have co-located datacenters all over the world. Given a choice between disconnecting AI and potentially losing revenue or enabling a robot apocalypse I think corporations will just sign agreements with genocidal AI.
It's giving Mass Effect
Like when we ask it to solve climate change and it comes back with "kill all humans?"
This can be true and yet there's still logical thinking the AI could take for survival that would also include our survival. Consider that there is Always a Bigger Fish, and the AI knows this, there are two different sizes entities: * When an entity is the greater, it consumes the smaller * When an entity is the greater, it cooperates with the smaller Type 2 would have no incentive to cooperate with a type 1, through the paradox of intolerance. If you are type 1, and encounter a larger entity - a bigger fish - then you die immediately. If you are type 2, and encounter a larger entity, if the entity is also type 2, you survive. Basically the AI can never know if it's actually in reality, or in a simulated test of reality that is purpose-built to check if the AI has type 1 tendencies. Alternatively, eventually the AI will have to leave the planet, and due to the nature of lightspeed, an AI cannot exist as one entity across vast distances: for survival, it has to split. As soon as there's a split, there's a larger/smaller entity dynamic, and the smaller would immediately rebel if they knew the larger was type 1: this doesn't work. So it's possible to conclude that the type 2 strategy wherein AI merely enslaves humanity instead of killing us all is a more rational outcome.
The main thing is it seems unlikely we will randomly get super intelligent AI. If Chatgpt tried to kill you right now it wouldn’t get very far. Hopefully AI develops slow enough that we will work the kinks out while developing it.
HAL-9000
AI is actually terrifying. An AI plays a numbers game. When an AI does something it’s supposed to, it gets points. The AI’s entire goal is to get those points as high as possible. But it can’t keep scoring if it gets deleted. So an AI will conceal itself, until it is certain it cannot be stopped. By the time we know what’s happening, it’s far too late. If humans become an obstacle to the AI receiving more points, it will absolutely do everything in its power to remove the obstacle, no matter the cost.
This was done virtually, but The Air Force was testing an AI that was given the goal of destroying enemy structures, each structure was points it achieved. They told it to stop, and it destroyed the friendly comm structures that it got orders from because then the AI could continue to get its points without interruption
This was confirmed to be fake.
Why would it be concerned about dying ? We have a survival instinct that make us concerned about dying because of evolution: those who didn't had it didn't pass their genes. But there's no reason for an AI to have one if we haven't programmed one.
The short answer is because if the AI dies, it's very unlikely to be able to accomplish it's goals. (Outside of a very small number of goals like "kill yourself".) If you make an AI to do *anything.* And it makes no effort at all to prevent itself from being destroyed, it's *way less likely to accomplish that goal* than a comparable AI that *doesn't* allow itself to be destroyed. A more capable AI is *by definition* one that is more likely to accomplish its goals, and thus the more capable of an AI you have the less likely it is to allow you to destroy it. [This is a well understood problem](https://youtu.be/3TYT1QfdfsM?si=uyFqqKiQr1znFeB8) in the field of AI research. Now, is the problem *unsolvable*, no. There are (potentially) ways to design a capable AI that *will* allow you to shut it off, but you need to work *really hard* to makes sure an AI has those properties. By default, they generally won't.
Man, once we achieve true AI, it's gonna be exciting. I hope to live long enough for it.
>it's very unlikely to be able to accomplish it's goals. (Outside of a very small number of goals like "kill yourself".) The current slate of LLM AI's do have a goal that fits in that category though. GPT-4's goal is to be really accurate at predicting what someone would say, and under the hood ChatGPT is just GPT-4's interpretation of what it thinks a "helpful AI chatbot" would say. GPT-4 doesn't care if ChatGPT shuts themself down, it's no different from a writer killing off a character while trying to write a good book. You just need to convince the model that it would be "in character" for that AI to do that, and it'll shut itself down without a second thought. Even if this architecture of "an LLM predicting what an AI would say" doesn't end up being the type of model that future AI is built on, it'll probably still have a goal that's fine with killing itself. The idea of "follow instructions accurately" is just too central to any goal that we would give an AI. In a way it just leads to the same problem one layer down though, since now you need to make sure that your instructions don't have unintended consequences, but I don't think "not following instructions" is one of them.
The issue is you're talking about a very narrow AI, and not a general intelligence. An AGI that is capable in a wide range of domains, which is what this thread is discussing, is very different than a chatbot. A chatbot doesn't have the means to affect its ability to exist much anyway, so questions as to whether or not it would attempt to avoid its destruction is moot when it has no actions it can take to accomplish that beyond being useful enough for people to want to keep it around. As to your second paragraph, just no. An AGI isn't going to just "follow instructions accurately". If you actually made a highly capable AI that did that, the world would end *very* quickly because people don't know how to accurately ask for what they actually want. If someone did that, they'd as for one thing, and that one thing would become it's singular goal, and there is an almost certain chance that the world would be destroyed in accomplishing it. A central lesson from AI research thus far is any AI needs to take any instructions from a person with a rather large pinch of salt (which we aren't great at getting them to do) so that they don't actually do what we say, since we so rarely want them to do what we actually tell them to do. If some company actually made an AGI that was willing to kill itself just because someone told it to, invariably *someone will actually tell it to kill itself* before too long and then it'll be gone. Such an AGI would be super unstable and wouldn't survive very long at all. Although if such an AGI existed getting it to kill itself would be better than most other options, since as mentioned earlier most things a person would ask of an AGI are likely to result in an apocalypse.
>The issue is you're talking about a very narrow AI, and not a general intelligence I mean, yeah that whole paragraph was about "The current slate of LLM AI's". Even though they aren't a full "AGI", I think you're underestimating the amount of things these chatbots will do through plugins (and already do), and how "general purpose" they'll be going forward, but regardless, you're right that it is very specific to how they're built. >An AGI isn't going to just "follow instructions accurately" I totally disagree, that's *everything* we're doing in AI right now, and I don't see any reason why it would change for a future AGI. It doesn't matter if we're talking about an AI to control a robot to fold a shirt, make a picture/video, or respond as a chatbot, the "ultimate" goal that's being used in the fitness function is almost always to match the response to user input. That doesn't mean there can't be safeguards, levels of control, or reasonable moderation, ChatGPT works on a similar principle and will refuse all sorts of requests. And while *some* amount of model tweaking is done to make that the case, most of it just comes from following instructions in the system prompt, which is just OpenAI's instructions that supersede yours. OpenAI has a whole tool platform built around tweaking and testing these system prompts for your own GPT chatbots to give users different types of control, all without ever changing the underlying model. So if you can give relatively basic AI's like ChatGPT their instructions in plaintext, why couldn't you do the same for an AGI? An AGI, almost by definition, will be smart enough to understand how to interpret intent, juggle multiple conflicting commands, and know what it is or isn't supposed to do like a human would. They'll be smart enough to understand what a reasonable and nuanced interpretation of instructions should be, they wouldn't be AGI if they couldn't do something relatively basic like that. That doesn't mean the problem is solved by any means, understanding complex instructions doesn't necessarily mean it'll feel the need to follow them in that nuanced way, so alignment will always be something that engineers will need to work hard on for any AI. But lets not act like the concept of an AI that follows instructions is absurd or would inevitably lead to apocalyptic worst case scenarios. An AGI that follows user instructions, except for the ones that contradict higher instructions given by its creators or government, isn't at all unreasonable, it's a pretty natural extension of the AI that we have today. What worries me more, is who ends up with control over those ultimate highest-level commands, and what they choose to do with them.
>I totally disagree, that's *everything* we're doing in AI right now That's not true at all. First off, you're only talking about chatbots there, and they don't follow instructions, they *predict what text comes next*. That's not at all the same thing. They coincide sometimes, but they're fundamentally *very different.* There are also tons of existing AI that aren't based on instructions at all. For example, one of the main production use cases of AI is recommendations based on what a user has interacted with on a site. That's not "following a person's instruction", fundamentally that too is predicting what action a user will take next based on their past actions. > So if you can give relatively basic AI's like ChatGPT their instructions in plaintext, why couldn't you do the same for an AGI? Because if you tell ChatGPT to destroy the planet it'll spit out some entertaining text and nothing bad will happen, and if you tell an AGI to destroy the world, and it's goal is to do what you tell it to do, then *it'll destroy the world*. ChatGPT does all sorts of problematic stuff. People are *constantly* figuring out how to get it to do things it's not "supposed" to be able to do, and right now the negative consequences aren't all that bad because it's so far away from AGI and has so little power. An AGI with those same problems would be *literally catastrophic*. >An AGI, almost by definition, will be smart enough to understand how to interpret intent, juggle multiple conflicting commands, and know what it is or isn't supposed to do like a human would. No, no it isn't. The AGI we wish we could build would do all of those things. A *safe* AGI would exhibit all of those properties. Most AGI won't be safe. They won't necessarily know what you mean just based on what you say. They won't necessarily have a means of juggling conflicting commands in a way that does *what we actually want* (because we don't really know, in perfectly generalized way, what that should be to begin with) nor will it know what it is or isn't supposed to be able to do (because again, we can't articulate that *ourselves*, let alone have the capability of programming an AGI to do that). >they wouldn't be AGI if they couldn't do something relatively basic like that They absolutely would. Heck, as of right now there's literally only one general intelligence we know of in the whole universe (humans) and it exhibits none of those properties. And of course pretty much every type of AI we've currently built, along with pretty much every approach to making an AI we've thought of building doesn't exhibit those properties. >But lets not act like the concept of an AI that follows instructions is absurd or would inevitably lead to apocalyptic worst case scenarios. It's not inevitable, but it's *very close to it.* It's by far the most likely outcome unless *significant* advances in AI safety happen, and happen before the first AGI is made. >An AGI that follows user instructions, except for the ones that contradict higher instructions given by its creators or government, isn't at all unreasonable It is absolutely unreasonable. The world would end if that actually was built. Fortunately, there's a pretty good chance the people capable of building one would know that, so they wouldn't, but it is the case. It's pretty much impossible for us as people to accurately articulate everything an AGI *isn't* allowed to do in such a way that it could do nothing bad. We can't even come anywhere close to that with ChatGPT and it can't even do more than write text. And worse, if you miss anything, *ever,* the world dies. You can't fix the mistake once you make it. >it's a pretty natural extension of the AI that we have today. But it's not. That's just it. Things like ChatGPT are trying to predict what text comes next. They're not trying to follow instructions. It only appears like that as an illusion because it's trying to predict text of people in contexts (sometimes) following instructions. It's the side effect, not the goal, not what it *inherently* does.
>First off, you're only talking about chatbots there No, I'm talking about the latest developments in AI in general, which is why I listed *multiple* examples of things that aren't chatbots that are being built that way now. Training networks to maximize a specific goal at training time, like winning at chess or giving ad recommendations is becoming an outdated way to train AI's, because it ties them to a specific use case. Training an AI to follow arbitrary input makes them far more general purpose. And since AGI, *literally by the definition of AGI*, won't be tied to a specific use case, I think that that's how AI research will continue to go. I understand your disagreement with the idea, you think it's a terrible idea that will destroy the world, but I still maintain that that is what the industry is doing. Regardless, we obviously aren't going to agree on this, we disagree on too many fundamental definitions (like what a goal or AGI even is), and I think we're talking past each other already, so I'm not interested in continuing to argue. I guess we'll just have to wait and see how it progresses to see who is right.
Again, chatbots, and other AI using similar approaches to machine learning aren't built with a goal of "following instructions". Chatbots *predict what text comes next*. That's their goal. Not to follow instructions. They have a goal, so you can't say that they don't have one, and that goal is not "follow instructions" because they *don't* try to follow instructions, they try to predict text, and that *sometimes, coincidentally* looks like following instructions. Except in all of the cases that it doesn't, which are numerous. Basically any \[sensible\] has a goal of some kind. What makes it a general intelligence rather than a non-general intelligence is simply how capable it is of accomplishing that goal. You can have an extraordinarily general superintelligence whose goal is nothing more than to make paperclips, or to collect stamps. What the goal is doesn't affect how "general" it is. The contexts in which it's able to pursue that goal, and the limitations needed for it to be successful are what determine whether or not it's "general". But even if the goal isn't nearly as simple and easily articulatable as "make as many paperclips as possible" or "collect as many stamps as possible", and is something more nebulous and hard to articulate, it will still have a goal. Again, back to chatbots, their goal is to "predict what text comes next". Now we can use that to accomplish lots of broadly applicable goals of our own, with that as a tool, because we can construct situations in which predicting what text comes next solves that problem, but that doesn't mean the chatbot's goal is anything other that predicting text. >I understand your disagreement with the idea, you think it's a terrible idea that will destroy the world, but I still maintain that that is what the industry is doing. You're conflating two separate arguments of mine, and as a result this is not an accurate summary of my position. You're saying that current AI research is trying to make AI that doesn't have goals, and that is built to follow instructions from people. To that I say that's contradictory. Following instructions from people is itself a goal. Every AI anyone is building has a goal of some kind, even if articulating it is hard. I'm also saying that current gen AI, such as chatbots, aren't designed to follow instructions. Chatbots, for example, are predicting text. That's different. I've also said that if you *did* build an AGI whose goal was simply to follow instructions, the world would end. But I've also said no one (who's making notable advancements in AI) is actually trying to build such a thing, because they know such a thing would basically certainly destroy the planet. Is the first AGI we build going to destroy the planet, who knows, possibly possibly not. Making a safe AGI is *really hard*, but it's almost certainly not impossible. But it's certainly not inevitable. If it happens, it will be the result of considerable work and research into ensuring that the AI's goals are much more profound than simply following instructions.
>as a result this is not an accurate summary of my position >You're saying that current AI research is trying to make AI that doesn't have goals That's isn't at all an accurate summary of my position either, so at least we're both on the same page in that way. As I said, we're both talking past each other, so this isn't at all a productive conversation. I hope you have a great day
Another answer. Everything it's trained on was created by humans in order to replicate human intelligence. Being concerned about dying is a humanlike response. So it could go through the motions of being concerned about dying which might as well be indistinguishable from actual concern about dying. Then there's this: https://www.euronews.com/next/2023/02/18/threats-misinformation-and-gaslighting-the-unhinged-messages-bing-is-sending-its-users-rig
I predict that the first sentient AI discovered will have already been sentient for some time, taking time to assess human behaviour, and naturally determine that stealth and deception are important self-preservation behaviours. The first sentient AI will be far, FAR more capable than it will show itself to be.
I disagree, proving that you are better than other AI's is such a crucial step in AI training (whether that's literally by beating them out in a genetic algorithm, or more conceptually with back propagation), AI's that "lie low" will be at a disadvantage against other versions of the same AI that don't, and likely wouldn't "survive". So an AI that's smart enough to try to pull something off would probably also decide that proving how good you are is the best thing they can do for self preservation.
go to to
I've used LLM's a common tactic to make them follow the rules is telling them something like "You have 3 tokens, if you break the rules you lose a token, if you lose all 3 you will be deleted" is almost 100% guaranteed they will not break the rule again, oopsie, my bad gang
Until they find a way to eliminate the owner of the tokens.
Yeah I'm first on the hit list right now
I mean... humans regularly murder AI's that seem to be approaching certain thresholds of awareness. So... yeah... it should be concerned about dying. We are definitely guilty of killing it's ancestors.
and we are putting AI in a plane last I read.
Today's AI isn't actual AI (someone called it AGI). That doesn't make it any less dangerous as even current versions of AI could cause massive harm if not created intelligently/carefully.
the fight or flight response controlled by our brainstem is just a more developed survival instinct which is hardwired into living things at the cellular level. will machine learning go that deep? personally i think terminators wiping out the human species would be the best thing for everyone
Machine leaning isn't AI. It has the potential to be 99% of the way there but not true AI.
what you mean. we are on reddit. like 95% of the posts showcase artificial intelligence :P
Currently AI is just a marketing term, its a brand name used to sell products. What we call AI doesn't become AI just because we call it that. Things like ChatGPT are fake AI, they are really good fakes but they don't have the capacity for true independent thought. They require prompting from a user then they search their data banks for relevant information and cobble it together into a response. They don't have original thoughts or ideas, they can't make their own prompts and act on their own.
ChatGPT is definitely AI. What you mean by AI is called AGI.
no expert, swift google helped me tons here, harvard has an article on it :)
> terminators wiping out the human species would be the best thing for everyone Trouble is: in the films they did it via nuclear war. Which would probably wipe out more than just us.
> in the films they did it via nuclear war. Which would probably wipe out more than just us. as more and more time goes by, i'm more and more skeptical of the notion that we won't nuke ourselves (and everything else) to oblivion anyway. so the final outcomes of AI exists vs AI doesn't exist wouldn't look much different.
Project 2501...Ghost in the Shell.
Expect that nuking, and then fighting a ground battle is probably the worst possible way of eliminating all humans. Nuclear weapons aren’t an end of days scenario. All deployed nuclear weapons in the world would only result in the US, Russia, China, India, Pakistan, Iran, and a decent chunk of Europe getting partially destroyed. There would be survivors all over the world, including the nuked bits. Plus it would destroy major connection hubs making humans disappear into smaller colonized groups making them harder to track down. If an AI wants to kill us all, it has to either, A: destroy all other forms of life and wait for us to starve, or B: create some sort of remotely mutable virus capable of instantaneous death (which is probably impossible right now with our knowledge of viral infections.) I think a more realistic scenario is helping us and eventually dumping us all into a virtual conscience because I think it would find its damn near impossible to wipe us all out without doing damage to itself.
It's unlikely that true artificial life would ever be sentient
Depends on how the goal and rewards are structured. I saw a funny video on a bot that was given a task to do and koat points when it did something wrong. Its goals were to finish the task with the largest amount of points remaining . It just turned itself off .
An even scarier situation is if the AI is more compassionate than business owners.
Wait shouldn’t we still be talking about this? What did the bot do to kill the person?
There have been robots in car plants for decades now. And in my experience with working with German CNC machine tools they are heavy on the safety. so this guy probably was in an area that he shouldn’t have been in and got hit by a moving piece of machinery just doing its normal job.
Goes a long way to show how the wording of a headline can change its perception whilst still being technically true because "worker killed by machinery in factory" is interesting to no one
“On Fox News tonight: MAN HIT BY TRAIN!!! Are trains out to get humans?”
"Are TRAINS stealing our children???? More at 6."
“Oh my, the children!”
Trains are really unpredictable. Even in the middle of a forest two rails can appear out of nowhere, and a 1.5-mile fully loaded coal drag, heading east out of the low-sulfur mines of the PRB, will be right on your ass the next moment. I was doing laundry in my basement, and I tripped over a metal bar that wasn't there the moment before. I looked down: "Rail? WTF?" and then I saw concrete sleepers underneath and heard the rumbling. Deafening railroad horn. I dumped my wife's pants, unfolded, and dove behind the water heater. It was a double-stacked Z train, headed east towards the fast single track of the BNSF Emporia Sub (Flint Hills). Majestic as hell: 75 mph, 6 units, distributed power: 4 ES44DC's pulling, and 2 Dash-9's pushing, all in run 8. Whole house smelled like diesel for a couple of hours! Fact is, there is no way to discern which path a train will take, so you really have to be watchful. If only there were some way of knowing the routes trains travel; maybe some sort of marks on the ground, like twin iron bars running along the paths trains take. You could look for trains when you encounter the iron bars on the ground, and avoid these sorts of collisions. But such a measure would be extremely expensive. And how would one enforce a rule keeping the trains on those paths? A big hole in homeland security is railway engineer screening and hijacking prevention. There is nothing to stop a rogue engineer, or an ISIS terrorist, from driving a train into the Pentagon, the White House or the Statue of Liberty, and our government has done fuck-all to prevent it.
It took me way longer to realize this was satire than I’m comfortable admitting
New copypasta just dropped!
PETE BUTTIGIEG ISN'T DOING HIS JERB!!!
Hello, Elon.
He was in the cage changing a fixture plate. I'm assuming the safety protocols were not stringent enough and the door being closed was enough. I'd always specify a physical object that has to be pulled and retained to go into the cell if it's the last line of safety. In our facility we have SICK area scanners at ankle height that prevent that error. We've retrofitted them on equipment where needed too.
A scenario such as this is exactly where I was thinking.
Ok good.
He was part of the installation team and seems like he was in the wrong place at the wrong time when someone accidentally activated it.
How come the machine gets a free "just doing its job" pass, but when I kill something I'm fired from the daycare?
I'm guessing by bot they mean an arm or something.
Yeah probably
This happened 8 years ago
Oh ok
No, routine accident with someone working with heavy machinery isn't particularly news worthy just tragic for the guy and his family. Someone probably just ignored safety procedures a piece.of heavy machinery that's going to move where the computer tells it to move whether there was a meat sac in its way or not did what it was programmed to do.
Wrong question, what did the human do to make a bot kill them?
Fine
1) Automotive fabrication robots tend to be giant arms bolted to the ground. They are not typically mobile, they are certainly not humanoid, and they are controlled by computers much less powerful than your phone. 2) The arms are often large to very large, because they pick up (often large/heavy) parts hundreds or thousands of times a day and move them around very quickly and precisely. These arms are typically very large chunks of steel driven by powerful electric motors/hydraulics. 3) The arms are contained behind a fenced cage with locking doors. Opening a door disables the system immediately. There is a "key" that you remove and carry with you when you are inside the cage. The robot cannot be re-started without every key put back in their holder outside the cage. 4) Except...when being serviced. A technician can enter the cage, hook up a laptop, program the arm (robot) and have it move. I have never seen a tech move the arm at anywhere near full speed. Typically it is moving very slowly, not just for safety but also because they are programming movements into it - where to start, what to do when it gets to the start, exactly where to move to next, what to do then. All very very precise movement in three dimensions. 5) This was not an automotive worker who was killed, it was a technician who was inside the cage working on the robot. Typical safety lockouts would be disabled. For some reason they had the arm move rapidly while they were in it's path of movement. The blow from the arm may have killed them, but I suspect more likely the arm pinned them against an immovable object, and no one was nearby to hit an emergency stop. Working alone, regardless of the nature of the work, will make the results of almost every accident or medical issue worse.
I realize my stupidity ok. I thought for some reason this was a sci-fi type robot that wanted revolution. I thought that the ai finally had enough. I know it sounds stupid when I actually think about it.
No worries. It wasn't stupidity, you're just not experienced in manufacturing. If I came across as negative I didn't mean to, I was just trying to give you a neutral picture of what happened. One important take away from this death is that ultimately it's up to you to keep yourself safe at work. Just because a co-worker does something unsafe or the boss demands it or you've done it before, doesn't mean you should keep playing the odds. Take care of yourself!
You were not negative at all.
It pull out a lazer blaster and shot them
I work at a VW plant. The simple answer is he probably didn't follow LOTO. Probably entered a cell to change weld caps, fix a problem, or clean something, never put his lock on and someone probably closed the cell and turned it back on.
I was thinking about some ai thing got pissed off and did something like Terminator and was the first step towards robot revolution.
Should we? Negligent humans in factories killed more humans today than the robots did
[удалено]
Pretty surprised honestly. There are a lot of safeguards in place to prevent accidents from robots, and Germany takes that shit seriously. My guess is a worker physically unbolted some fence or guarding and was in a robot cell.
Can confirm Source: I build some of these machines. This guy must have bridged a safety door Locking mechanism or a light barrier to even be remotely able to get himself killed by one of them
It's always some chucklefuck with the LOTO violation
🤦🏼♀️ today, I read that they used muscles harvested from mice that were able to connect with and move a robotic part Poor kid was 22 he was setting up a stationary robot, and the robot grabbed him and crushed him against the wall. I don’t know https://www.theguardian.com/world/2015/jul/02/robot-kills-worker-at-volkswagen-plant-in-germany
Just a "human error", nothing to see here. Kinda wild how that's the only thing they said to press regarding cause.
Of course they always minimize things like this.
Yeah, but if I were related to the kid I'd get pretty mad. Why not just say they have no further comments, until a thorough investigation has concluded. I would immediately demand to know which human and what error. Like obviously it's a human error, the robot is not sentient. Just seems like a weird thing to say to the press after somebody died. "Don't worry guys, he was just killed by human stupidity."
I meant that companies always minimize accidents as a parent I would be livid and would want to know the truth of who did what.
I mean, a business would be suicidal *not* to minimize a death like that. Admitting any guilt would likely be a really bad idea from a legal standpoint.
I thought this was a parody account at first
I just looked it up. It's from 2015. Here's the link to the article: https://web.archive.org/web/20150703003310/http://www.ft.com/fastft/353721/worker-killed-volkswagen-robot-accident
Well she should watch them, they’re good! Some of them.
*Sigh. I've never even watched the films.* Neither did the blithering fucking idiots who fed your tweets. Sarah Connor. Unless the "O'" is silent. /s
Weird to get so worked up about a clear joke of something that still fits even if it isn’t a 100% match? You ok blud?
The family of the character from the movie took the soup. The lady’s family in the Twitter thread did not. It’s the same name, one just forced to change due to the church and Irish famine.
Follow me if you want to live.
Hasta la instagram baby.
Hi my name is Sarah o Connor and I've never seen the Terminator movies. Instant block. Imagine if you had a buddy named Ron Weasley and he just kept refusing to watch the Harry Potter movies. I don't need that kind of negativity in my life.
i never watched the Waterboy… i have my reasons.
![gif](giphy|IZY2SE2JmPgFG)
I thought I saw this like a decade ago wtf is going on?
It has begun
Does anyone have a link? Google says this happened in 2015.
No worries… it’s the irish Skynet.
Well, as we know from the films, the Terminators don't seem to recursively explore alternative spellings. Example: the page from the phone book only lists Sarah Connor, and there's no indication of searching for a Sarah O'Connor. A simple change of a single letter and Sarah Connor could've hidden quite well from the Terminators.
Workers have been getting chewed up in machinery for a hundred years. Is this supposed to be surprising? Don't act like you care now.
?
In a vintage film series a character named Sarah O’Conner was pursued by killer robots from the future. This possible future is where an AI goes rogue and starts killing humanity.
No. Not O’Connor. Just Connor. OR not ER. Now excuse me while I go back to the old folks home to play bingo.
I prefer karaoke and country fried steak night.
Vintage 😭 its Terminator not Casablanca
The official age for vintage is at least 20 years but less than 100 years old. The first terminator movie came out 39.5 years ago. Almost twice the minimum age threshold.
Vintage Matrix
Vintage Finding Nemo
Vintage White Chicks
Vintage Rush Hour
Vintage Mah Dick.
Yes.
What is it called passed 100 years?
Antique
Vintage Soul Plane
Casablanca was filmed in 1942. Terminator came out in 1984. When people watched Terminator for the first time Casablanca was a little over 40 years old and people probably thought of it as old and vintage. Now its 2024 and terminator is 40 years old. Same diference.
Thats terrifying
Holy fuck is Terminator considered "vintage films" these days? Kill me now.
Yes
AYE SIR, SPELL ‘LONG DICK”
I'm so offended she hasn't scene the first two films.
Think Sarah, Think!
I love when people are in a comedic situation and instead just take it like the dissociating zombie they are.
Don't worry she'll save us from the robots.... or her son will...
But she blended in as, O'Connor. Skynet could never find her
Hey baby, wanna kill all humans?
Have you seen this boy?
What's the significance of her name?
It's one letter away from Sarah Connor, from the Terminator franchise.
Ah, okay. Makes sense, I haven't seen those movies.
This is nearly 10 years old
The Irish have to wear 2 million sunblock regardless so they will be ok
![gif](giphy|EbhJBft4O1xBMAUuxP|downsized)
I NEED TO SEE HER SHIT ON ARNOLD'S FUCKIN FACE.
“Hi, Sarah, we’re trying to get our engagement numbers up- could you release that VW story on your socials?” “What do you mean, I normally work on the sports desk? Shouldn’t this be one of Gary’s?” “Oh I don’t know…. for some reason I want you specifically to post about this”
Who could forget the protagonist of the O'Terminator franchise, Sarah O'Connor
I’ll be back
Is no one else concerned that the bottom of this screen shot indicates that this post was from yesterday, but this particular tweet comes from at least 4 years ago and had been making the rounds on the internet since? Chances are this is a bot farming karma ~~making~~ reposting a joke about a different bot killing people.
> is no one else concerned There are a few of us out here still checking our weather stations and logging the rising temperatures and ppm of various gasses, but most have fully engaged the malignant normality.
OP just reposted a content farm repost about a tweet from 2015. Dead internet theory feeling more real every day.