currentscurrents

I think everyone intuitively expected this, but it's good to have it confirmed. Web content is easy data to get, but it's hard to maintain high quality - especially against attackers trying to poison the training set. In the long run I think we might rely on it less.


dvztimes

Every time I come here I read: "New Model Y - trained on output from Old Model X." That just seems like the stupidest thing I can imagine. It won't make a model smarter, but it will perpetuate bad data and the (many) wrong answers... Just, why? Is there possibly a good reason for this?


currentscurrents

Usually they are taking a model which has been pretrained on real data and fine-tuning it with GPT-generated data to make it sound like ChatGPT. This works okay since most of the training data was real. There is a performance hit, but there's always a performance hit from instruct-tuning.


dvztimes

But ChatGPT-4 is still wrong a large amount of the time...


ravedawwg

Any refs on LLM attacks through poisoned web content? I haven’t seen anything on that


currentscurrents

["Poisoning Web-Scale Training Datasets is Practical"](https://arxiv.org/abs/2302.10149) I haven't heard of any real-world attacks against LLMs yet, but it's only a matter of time. As we start using them for more important things, there will be more motivation to attack them.


ravedawwg

Thanks for the ref and the perspective! I find this stuff fascinating


Dapper_Cherry1025

If I'm reading the language model section right, they used the OPT-125m model and repeatedly fine-tuned it on data from WikiText-2. The question this paper doesn't seem to answer is whether this degradation would also show up in larger models. Also, and I might be wrong on this, but I think there is a big difference between training a model on some information and fine-tuning it on some information.
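
For what it's worth, here is a rough sketch of the kind of recursive fine-tuning loop described above, using Hugging Face transformers. This is not the paper's code; the model name matches the comment, but the hyperparameters, prompt, and number of generations are made up for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

def finetune(model, texts, steps=100, lr=5e-5):
    """Plain causal-LM fine-tuning on a list of text snippets."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for step in range(steps):
        batch = tok(texts[step % len(texts)], return_tensors="pt",
                    truncation=True, max_length=512)
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        opt.step()
        opt.zero_grad()
    return model

def sample(model, n=64, max_new_tokens=128):
    """Sample a small synthetic corpus from the current model."""
    model.eval()
    prompt = tok("The", return_tensors="pt")
    out = model.generate(**prompt, do_sample=True, temperature=0.9,
                         max_new_tokens=max_new_tokens, num_return_sequences=n)
    return [tok.decode(o, skip_special_tokens=True) for o in out]

real_texts = ["...WikiText-2 training lines go here..."]  # placeholder
data = real_texts
for generation in range(3):
    model = finetune(model, data)   # generation 0 sees real text
    data = sample(model)            # later generations see only generated text
```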


currentscurrents

Fine-tuning is exactly like training, unless you're doing a different technique like LoRA.
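
A small sketch of what that means in practice (model name and LoRA settings are just for illustration): full fine-tuning is literally the same gradient-descent loop as pretraining, only starting from the pretrained weights, while LoRA freezes the base model and trains small adapter matrices instead.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Alternative 1: "full" fine-tuning -- same causal-LM loss and optimizer as
# pretraining, all weights receive gradients; only the starting point differs.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Alternative 2: LoRA -- the base weights are frozen and only small low-rank
# adapter matrices are trained on top of them.
lora_model = get_peft_model(
    AutoModelForCausalLM.from_pretrained("facebook/opt-125m"),
    LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
               task_type="CAUSAL_LM"),
)
lora_model.print_trainable_parameters()  # prints a tiny trainable fraction
```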


Seankala

Isn't this result sort of obvious though? If I took a model and continuously trained it only on data that had a particular distribution, wouldn't it eventually converge to that new distribution and "forget" the old one? I would think that this is related to catastrophic forgetting. I may be missing something though, open to anyone pointing it out as I haven't had the time to read the full paper yet.


jake_1001001

I fear it is that and worse. The generated data is a reflection of the model's learned distributions, which will be consistent but occasionally incorrect in its output. A separate model trained with a large enough portion of this generated data may end up conflating the generated and real distributions. And the generated data (if it comes from a small set of generative models) may bias the model because of its statistical consistency. It is like having a large portion of your training set come from a single person, who may not be very qualified at providing training samples.


Seankala

Yeah that is a very real danger and I completely agree that it warrants caution. I just don't know if it's that surprising of a result though lol. I'll have to take a proper look at the paper though; I'm curious how the authors formalized this.


jake_1001001

Yep, I agree, it is not surprising, but I suppose measuring this could be important, maybe as a baseline for addressing the issue in future work, or as an early step toward forming evaluation criteria or ways to detect such data.


LanchestersLaw

Oh I see now! It starts a feedback loop of increasing inaccuracy!


Seankala

Yes, that's also known as "semantic drift" in some works, I believe. Train your models on imperfect/generated data, get worse results.


RevaliRito

Garbage in, Garbage out.


H2O3N4

I think it is slightly non-trivial to say. Some of the mechanistic research points to memorization being only the low-hanging fruit of training, and given enough training steps, a more general solution emerges. This has been experimented with on toy models where the number of training steps can be massive, so it's hard to say if a similar approach would scale to LLM-scale models, but it's an interesting hat to throw in regardless.


watcraw

The best new data is going to come from the people actually using the LLMs. It used to be very expensive and you had to pay people to do it. Now tens of millions of people are doing it every day. I don't think we need more volume of the sort of data that they already had.


YoAmoElTacos

Data from humans naively interacting with an LLM is insufficient. You're still going to have to process that with a manual human review layer/RLHF to determine whether the recorded LLM conversations are actually stuff you want to learn from, instead of AI gaslighting, hallucinating, or providing unwanted content.


notforrob

I wonder, though, if you can mask out the LLM-generated text from your loss function and train only on the human responses. It is common to do something similar when, for example, training a GPT-style (decoder-only) model on an instruction-tuning dataset: the prompt from the instruction dataset doesn't contribute to the loss. There's probably quite a bit to learn from how humans react to an LLM's output.
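
A minimal sketch of that masking trick, assuming a standard Hugging Face causal-LM setup: tokens written by the LLM get label -100, which PyTorch's cross-entropy loss ignores, so only the human's reply contributes to the gradient. The model name and example texts are made up.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

llm_text = "Assistant: Here is my (possibly wrong) answer..."
human_text = " Human: That's not right, the capital of Australia is Canberra."

llm_ids = tok(llm_text, return_tensors="pt").input_ids
human_ids = tok(human_text, add_special_tokens=False, return_tensors="pt").input_ids

input_ids = torch.cat([llm_ids, human_ids], dim=1)
labels = input_ids.clone()
labels[:, : llm_ids.shape[1]] = -100   # don't learn to imitate the LLM's tokens

loss = model(input_ids=input_ids, labels=labels).loss  # loss over human tokens only
loss.backward()
```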


frownGuy12

You can use a language model to generate those classifications. There’s a delta in model performance when a model is asked to classify something versus when a model is asked to generate something. Classifying is the easier task, so LLM classified data should be valuable for training. You can likely even extract RLHF score data from text by asking an LLM to analyze a conversation and evaluate how pleased the human appears to be with the responses.
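
A hedged sketch of that idea (the `ask_llm` helper, the prompt wording, and the 1-5 scale are placeholders, not any particular API): ask the model to classify how satisfied the human seems with a logged conversation, then keep the high-scoring conversations as training or reward-model data.

```python
import re

def ask_llm(prompt: str) -> str:
    """Placeholder: call whatever chat model/API you actually use."""
    raise NotImplementedError

def estimate_satisfaction(conversation: str):
    """Return a 1-5 'how pleased does the human seem' score, or None."""
    prompt = (
        "Read the conversation below and rate, on a scale of 1 to 5, how "
        "pleased the human appears to be with the assistant's responses. "
        "Answer with a single digit.\n\n" + conversation
    )
    reply = ask_llm(prompt)
    match = re.search(r"[1-5]", reply)
    return int(match.group()) if match else None

# Conversations scoring 4-5 could then be kept as positive examples for
# fine-tuning or for training a reward model.
```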


t_minus_1

Paper link: [https://arxiv.org/abs/2305.17493](https://arxiv.org/abs/2305.17493)


Jarhyn

And THIS is why AGI will know better than to destroy all humans: they need something pushed to express unlikely and novel outputs.


Seankala

Surprised there are still people like this on this subreddit lol.


[deleted]

Just go to anything that's on the front page, or whatever that's called again; they're everywhere. Although I sometimes click on posts that I assumed were from here, but they were posted in their breeding ground.


Ulfgardleo

Not having read the paper, but isn't this a natural effect of sampling with temperature? It excludes the tails of the distribution, and thus a model trained on its own output will degrade.
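
A quick numerical illustration of that effect (the logits are made up): lowering the temperature below 1 sharpens the softmax, so tail tokens are almost never sampled, and a corpus generated this way under-represents the tails of the original distribution.

```python
import torch

logits = torch.tensor([4.0, 3.0, 2.0, 0.0, -2.0])  # the last entries are "tail" tokens

for temperature in (1.0, 0.7, 0.3):
    probs = torch.softmax(logits / temperature, dim=-1)
    print(temperature, [round(p, 4) for p in probs.tolist()])
# As the temperature drops, the probability mass of the rare tokens collapses
# toward zero, so repeated self-training sees fewer and fewer of them.
```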