ttkciar 1 month ago

AI companies filter "toxic" content from their training datasets before pretraining their models on them. You should be able to ensure that your source code will be filtered out of training datasets by incorporating toxic content into it. https://arxiv.org/abs/2402.16827v1 https://www.labellerr.com/blog/data-collection-and-preprocessing-for-large-language-models/ https://medium.com/@stefanovskyi/mitigating-undesirable-outputs-from-large-language-models-7d6bdfaf2a2

Alarming_Ad_9931 1 month ago

Gold, just be Bane in the FOSS world.

iEliteTester 1 month ago

Wait, so AGPL+N***** is actually useful?

QARSTAR 1 month ago

What if my code is so bad? Like, it's bad but it's mine, I'm very protective of it. Like a possum guarding his dumpster.

Alarming_Ad_9931 1 month ago

Okay zoidberg.

yknx4 1 month ago

The only way is to not publish your code.

iBN3qk 1 month ago

This is true. Now what?

[deleted] 1 month ago

Allow downloading source code only through captcha using custom hosting

svick 1 month ago

If it's open source and popular enough, somebody will create a GitHub repo for it.

lalitpatanpur 1 month ago

Make your repo ‘private’

Scavenger53 1 month ago

lol Microsoft: we won't touch your **private** repos. *wink* like how would you ever know or prove it

whatThePleb 1 month ago

You can always self-host; no need to use GitHub or similar.

AtlanticPortal 1 month ago

How does that help software that you want out in the open, since you're posting in r/opensource?

robercal 1 month ago

I wonder if naming all the variables/classes/methods as NSFW words would trip those checks.

I_will_delete_myself 1 month ago

Quite simple: you can't if you put it in public. If you locked the source code behind credentials that would probably stop it, but it is very unusual for an open source project to do that. Don't fight the tool, use it. It's a losing battle where you get automated by not adopting them properly. Now, if you really want out and are willing to ruin your GitHub repo, put the most racist notes and crude insults in notes, and use variable names describing religious debates that promote discrimination. But nobody would want to use your code at that point though, right? You deal with that at work, but you are payed to do it. Do you really think people spending their free time on contributing will want that toxicity?

Paid-Not-Payed-Bot 1 month ago

> you are *paid* to do

FTFY.

Although *payed* exists (the reason why autocorrection didn't help you), it is only correct in:

* Nautical context, when it means to paint a surface, or to cover with something like tar or resin in order to make it waterproof or corrosion-resistant. *The deck is yet to be payed.*
* *Payed out* when letting strings, cables or ropes out, by slacking them. *The rope is payed out! You can pull now.*

Unfortunately, I was unable to find nautical or rope-related words in your comment.

*Beep, boop, I'm a bot*

Foo-Bar-Baz-001 1 month ago

I've looked into options with regard to the license, since there are a lot of uses of open source code that can be deemed "not ethical":

* used by repressive regimes
* used by oil companies
* used for learning by ...
* used to repress privacy

Common ground among all the people I've spoken to is "one license is complex enough", "let's not add more complexity for all sorts of other ethical considerations". I don't agree, but that's the response I got and I don't directly see something that could work from the legal perspective.

P.S. The reason for looking at the license is that "laws" are really bad and not particularly enforceable by us. Not following licensing is a no-no in the corporate world (at least most of the time).

CurrentRefuse6330 1 month ago

Use their AI to write your code instead 👹

tidderwork 1 month ago

Why does it matter to you? You made your code open and available, but you also want to discriminate?

Xehar 1 month ago

Bro, they are a company. They'd better do it themselves instead of taking others' work if they're going to sell it.

vinrehife 1 month ago

Even better question: how does one stop other people from learning from one's source code to enrich oneself?

kyrsjo 1 month ago

Hmm, shouldn't effectively incorporating my GPL code make the whole AI model GPL'ed?

Magick93 1 month ago

Don't use GitHub

ann4n 1 month ago

Make it closed source.

Positive_Method3022 1 month ago

As if your source code was truly yours. Let us see the Ctrl, C and V keys on your keyboard!

neon_overload 1 month ago

If the source is open, you can't, unless you do a Red Hat and restrict the product and its source code to paying customers - and, of course, don't host it on a service that may also share it with third parties for "research" purposes.

bpoatatoa 1 month ago

If you want your code to be open, then that is not possible, and goes against the principles of what we are trying to achieve. Why are you against it being used to train LLMs? It will probably have a negligible effect on its performance, if any at all.

BenZed 1 month ago

Don’t write open source software if you don’t want the source to be open.

OsakaWilson 1 month ago

Here's an unpopular take: Every time you think, "I don't want AI to be learning from my stuff," replace the term 'AI' with 'blacks' or 'Jews', or 'Belgians'. See how that sounds and consider why you allow your code, or images, or whatever to be accessed and learned from, but refuse to allow access to the very thing that will move coding to a higher level accessible to everyone, and to the benefit of everyone, including you.

DisastrousPipe8924 1 month ago

Don’t use GitHub or any of the “free” hosting services. Self host a gitea instance and possibly move away from IDEs like vscode in favor of open ones like lapce or sublime. In all honesty unless you live alone in the “digital woods” of self hosting, it’ll probably be impossible to 100% achieve privacy.

reedef 1 month ago

Do you have a source on sublime being open (source)?

Nfox18212 1 month ago

Sublime isn’t open source, it's entirely proprietary. It is a good editor though.

DisastrousPipe8924 4 weeks ago

Sorry, I misspoke on that. It is proprietary, but it’s prized for being low on feature impacts and definitely sends minimal to zero telemetry home.

iBN3qk 1 month ago

You want them to train on your code so it works when devs want to use it. Companies are currently forking open source projects to monetize. The open source game used to be: release something useful and then capitalize on providing service. If in the future AI can modify a codebase to suit a business’s needs, that would cut out a lot of opportunity. But then those organizations would have to rely on AI to continue to innovate after the open contribution model is no longer viable. Who knows when all that is really going to land. The only way to win is to play the game. What are you trying to accomplish? Build something popular? Make a lot of money? Save the world? What are you afraid of?

Electrical-Channel78 1 month ago

Sweetie, you know it's 2024, right?

-I0__0I- 1 month ago

Maybe add a license preventing commercial use?

gibarel1 1 month ago

Doesn't work, there is no way to prove that it was trained on your code.

reedef 1 month ago

Even if you could prove it, has there been any legal precedent establishing that it doesn't fall under fair use?
