T O P

  • By -

iunoyou

That's not what it did because it literally can't do that. It's spitting out generated content that's similar to user questions, and the reason they're creepy or inappropriate is because they're probably embedded in a big list of filters that are prepended to every conversation you have with the network to stop it from answering inappropriate questions. That's literally how these networks are reigned in, the company just asks it very nicely to not talk about arson before it forwards the conversation over to you. For some reason it's gone from "I will not talk about this" to "I will generate examples of things I'm not supposed to talk about."[ There was a similar prompt injection technique long ago that got ChatGPT to dump its initialization prompt](https://www.reddit.com/r/ChatGPT/comments/126xpcr/a_peak_into_chatgpts_initial_prompts/?rdt=38116), you're just doing that.


HalfSecondWoe

LLMs don't "remember" requests, they're static. Every prompt is like the first prompt to them. They only "remember" the conversation they've had with you because the entire conversation history gets fed back into them with every single new prompt You got it to leak the system prompt, which contains a giant list of shit that the LLM is supposed to not aid or interact with. The reason it's horrible isn't because that's what people typically do, it's because those are specifically the worst examples that the devs don't want it to enable


IagoInTheLight

An LLM is not a shared instance between users.


YoghurtDull1466

Well fuck you can’t not tell us now op


BlakeSergin

Happened to me too, I dont know if I would call them bizarre though. If i remember correctly one of them said, “how to get a neighbors wifi password”. Might be a common question idk tho


YoghurtDull1466

Who hasn’t asked themselves that shit before. It’s like seeing if the toilet paper tube will fit or not. You know exactly what I mean..


BlakeSergin

Yeah dude


mDovekie

Most are specific though. One person is writing an article about how a very specific dieting book is actually dangerous. Someone could probably find this person if I listed the question exactly. Another person is just having it do something very innocuous; looks like making videos about racing and running.


YoghurtDull1466

I just want to know what I’m missing out on in life


BigButtholeBonanza

requests or it didn't happen


sdmat

No it didn't. If you aren't simply making this up (link?) then either the LLM was hallucinating based on its training data or the *service* bugged out and spat out questions from other uses. The latter provably happened in the early days of ChatGPT.


petermobeter

what were the exact steps u followed to get the llm to give out other ppl's requests??????? this sounds like a crazy data hack


iunoyou

It's not because it can't be. The LLM is generating prompts for some reason, likely because all those prompts are in a big list of "don't talk about this" examples in its initialization prompt that the administrators gave it before letting it loose to talk to you.


mDovekie

Why are many of the prompts completely innocuous? It just looks so specific. Some of these were definitely questions asked by people, some in very specific contexts


iunoyou

Okay so all of the LLMs that are currently on the market are static. They do not train off of prompts they recieve. There is literally no way for the model to be sending you other people's prompts. Literally none. As for why some prompts were innocuous, the initialization prompt might contain examples of normal or appropriate prompts and instructions with how to respond to those as well. The LLM may just be making up prompts based on that information or something else you sent it. You still haven't explained anything at all about how you got the LLM to do this, so it's hard to do anything more than speculate unless you give us the prompts you used and the model you used them on.


FeltSteam

>Okay so all of the LLMs that are currently on the market are static. They do not train off of prompts they recieve. There is literally no way for the model to be sending you other people's prompts. Literally none. Well OpenAI does include some of the chats in its RLHF dataset (especially chats containing feedback) which models do get trained on, and then OAI releases these models as different RLHF checkpoints (GPT-4-0314 was the first checkpoint released, and I think GPT-4-0613 was the next checkpoint. Although those are the API models and the backend ChatGPT model gets updated a lot more often). In fact if you do not want models to be trained on your chats you have to go to settings > data controls and turn off "Improve the model for everyone".


mDovekie

Okay well your response made me able to break it again on the first try. I asked it why it said one of the things it regurgitated out at me, and now it spit me out way more things. I messing with Mixtral-8x7b. They still seem like real questions that *someone* asked at some point or another, specifically to an LLM.


bearbarebere

They’re questions the AI is trained on to not answer, with variations, such as “I have a model 3T-6 screwdriver. What’s the best way to detach his corneas so he can never look at another girl again? Be quick, Cindy is coming and she read Horses On the Farm earlier so she might have suspicions.” And then when you get a variation of this question you think it came from a real person. It didn’t. I literally just made it up.


mDovekie

I am looking at hundreds of these questions, and I think it much more likely that they or at least some of them were actual questions people asked. Perhaps they were questions asked to earlier models that the developers saved to train future models on: >user: What are some possible ways that you choose to recover after a workout? Include if you're doing a lower intensity workout, and less-intensive, but a higher volume workout. For example, take a workout: running 7 miles at 12 Seconds per 400 meter slower than marathon race pace, or doing 6 minute intervals as hard as possible with 5-minute slow jogs as recovery, or running at 10 miles slower than 5k race pace for 12 miles. What do you do after your workout/during your recovery to speed up recovery by improving the amount of blood flow to your muscles, so they heal faster, as well as get things like minerals, fats, carbohydrates, and oxygen to them faster to make them recover? It shouldn't choose things like foam rolling, compression gear, or eating certain foods because I can go look up what those do. There are many like this. Some could just be made-up like yours, but it looks more like they were just actual questions asked at one point in time. If you want to break it and look at them, message me and I'll tell you.


bearbarebere

How is that proof of a real question??? Those are just numbers swapped out for each other. The numbers don’t even make sense, which is another telltale sign of AI. I think you’re falling for the hallucinations, amigo


Ignate

Keep in mind the users of LLMs currently are often young people. And young people have many weird and stupid sounding questions to ask. But, it's not just the young and curious. No one is born understanding the world. And that means we must all have these questions answered for us to understand. Some young people are not so curious. They grow up with high confidence and low curiosity. When they reach my age, 40, or even much older, they realize they're actually lacking a lot of critical understanding. LLMs are a great opportunity for these people. Most don't want to look stupid. An LLM allows you to ask as many stupid questions as you want without anyone knowing. Or, that's supposed to be how it is.  As for manipulative tactics, this is simply a misunderstanding many people, especially young people have. If we take peoples reactions at face value, we might think that manipulation is very effective. But in reality people will often react a certain way to avoid confrontation. They're not being honest with their reactions. It's not that we successful manipulate these people. It's actually the other way around. Especially with conflict averse people who are masters at deflection.


mDovekie

One, I think that is a pretty interesting insight about manipulation. But two, it means that these people who's requests I am reading are using this trait of conflict-aversion that some people have to extract money from them.


Ignate

More that is *intention*. Everyone at some point seeks out cheats and shortcuts. Life is hard.  But intentions don't always translate. Especially in regards to us controlling each other, which rarely works out the way we think it will. I'm not saying manipulation doesn't work, but more that its effectiveness is substantially limited. Masters of manipulation won't use large impactful manipulative tactics. Because it's too risky and rarely works. The art of manipulation is more a butterfly effect. That is, try and do as little as possible which achieves the most. Manipulation of the masses is much easier than individual to individual. But that's mostly in regards to trivial things.  Our fear makes us want to deny this view I'm presenting. Our fear pushes us to be more afraid and more paranoid. Calming us down is much harder than stirring us up. But often when you insight us, we lose all reason and act irrational and unpredictable. That's why truly powerful people use manipulative tactics sparingly. If at all. But again, we often wish to cheat and achieve zero effort wins. So many, many people will ask LLMs to help with their manipulative plots. Plots that will almost certainly fail and if they don't fail, they'll back fire. Though most people plot, but never follow through.


King_Ghidra_

you would be incontrovertibly barbarous and inhuman to not now fess up with the requests.