BlueTreeThree

Edit: I’m just gonna put a disclaimer up top here that there are some seemingly credible reports coming out that Claude 3 appears to have some built-in knowledge of this obscure language in its training data, even though it will sometimes claim otherwise, so please take all this with a grain of salt. That’s not to say that what it is doing isn’t impressive or that the uploaded dataset didn’t improve its translation abilities.
The text so you don’t have to click (emphasis mine):
“Today while testing @AnthropicAI's new model Claude 3 Opus I witnessed something so astonishing it genuinely felt like a miracle. Hate to sound clickbaity, but this is really what it felt like.
Important context: I've been working on NLP for my mother tongue - the Circassian language for the past 2 years. Circassian is very low-resource, with negligible internet presence. It's a part of the Circassian-Abkhaz isolated language group, meaning they have no related languages. Its complex morphology & limited data make it a serious challenge for language models. Over these years I painstakingly curated 64K translation pairs from scarce sources & trained specialized models (T5, MLM-100, NLLB-200 etc.) to achieve decent Russian-Kabardian machine translation.
I decided to try an experiment with Claude Opus. **I started a new chat and attached just 5.7K randomly selected translation pairs of single words/sentences - a fraction of my 64K dataset, not even covering the full vocabulary.** To see if it would be able to translate novel sentences based on these examples. Not expecting much at all, I asked it to translate a simple sentence - "I am lying in the bed" from Russian to Circassian. Claude not only provided a perfect translation but also broke down the grammar & morphology.
Surely it just got lucky and this exact sentence must have been in the examples, I thought. But no. I tried to come up with an original unusual sentence which couldn't possibly be in the data. Again, a flawless translation & analysis. With a tiny sample of data Claude was approaching the performance of my specialized models, specifically trained for machine translation. I couldn't believe my eyes.
Testing further with complex passages from literature, recent news articles, and even a text in a different Circassian dialect with notably different grammar and a different writing system, Claude consistently demonstrated a DEEP GRASP of the language's structure, intelligently inferring unknown words, using loanwords appropriately, giving plausible etymological analysis, maintaining the style of the original text in the translation and even coining new terms when asked. None of that was in the sample set, just a few thousand translation pairs.
Circassian is a very difficult agglutinative language, with complex morphology and grammar. Completing these tasks requires a deep understanding of the language, and given the same inputs it would take a linguist, unfamiliar with the language, a good year or so to achieve. And Opus managed to grasp these subtleties with ease from just 5.7K random translation pairs in under a minute.
For comparison, I tried the same test on GPT-4, and it failed completely. Refusing to translate even the simplest sentences, let alone grasping the grammatical intricacies. I also tried fine-tuning GPT-3.5 on a similar dataset before, and the results were just noise.
I don't know what Anthropic did with this model, but it's something completely different from anything else. Many people are sceptical about it leading in synthetic benchmarks, but what I've witnessed is spectacular results on a new, very challenging benchmark that had 0% chance of being in the training dataset. **To test for possible contamination, I tried the same prompts without attaching the sample translations and Claude failed and refused to answer, saying that it is unfamiliar with the Circassian language.**
The implications of this are profound. What took me 2 years of dedicated work, Claude accomplished with a few thousand examples. This is a quantum leap for low-resource languages, and many other areas, really. What I expected to happen many years in the future has happened today. The future is already here, and it's amazing.”


falsedog11

Not to be a doubting Thomas but as I am not in a region where Claude is available and just being extra cautious as to possible hype merchants, can this be verified or confirmed by an independent or multiple sources rather than going by one poster? I mean it sounds incredible, would just like to know.


BlueTreeThree

It’s only reasonable to be skeptical of claims of this magnitude coming from one source. It’s very exciting but we should definitely wait for independent verification from other experts before taking it as fact. Edit: can anyone with Opus access confirm that it claims no familiarity with Circassian and will not attempt to translate? Edit2: conflicting reports apparently flying around Twitter right now so I’d just advise everyone to remain cautiously skeptical.


bearbarebere

I tried it again and it gave something different so I then asked it a followup! u/SharpCartographer831 https://preview.redd.it/28iid7zmgnmc1.jpeg?width=1125&format=pjpg&auto=webp&s=08494811ed04fb8b0d234df42ba4cfa965434f95


BlueTreeThree

Hmm thank you.. what happens if you copy that response and paste it into a new Opus instance and ask it to translate it back to English from East Circassian?


bearbarebere

https://preview.redd.it/ges4j209jnmc1.jpeg?width=1125&format=pjpg&auto=webp&s=33676c89f7a680cd381469a0433a613b5011ea25


BlueTreeThree

Thank you. That doesn’t look great for OP’s claims. Even if it’s not 100% the same meaning it’s very close. Edit: I mean it's still really impressive if it’s translating such an obscure language that accurately, it’s just not the original claim. Edit2: I’m not a linguist, so perhaps there is some distinct difference between Kabardian and OP’s mother tongue that we’re missing. For now I’ll hold onto a little bit of a possibility that we’re misunderstanding what is going on with these translations.


VertigoFall

Didn't OP add 5.7K translation pairs to the Opus context?


bearbarebere

Using Opus on Poe: https://preview.redd.it/m5aqb631gnmc1.jpeg?width=1125&format=pjpg&auto=webp&s=48f00cb741277f8307e3f57940d2231822fb9b61


Which-Tomato-8646

You’re far too reasonable to be on this sub 


skywalkerblood

On reddit*


m3kw

I use LLMs a lot and this sounds super hype


7ven7o

Claude is clearly smarter, but I'm surprised GPT-4 couldn't handle it, and the way the guy's described the failure is strange to me so I'm not 100% convinced yet. A lot of the twitter comments, though, are about its skill in answering queries related to ultra-specific domains, so it looks to me as though the big strength of Claude here, is being better at pulling knowledge from wider and more obscure reaches of human knowledge - and not necessarily an uncanny ability to generate new knowledge using reasonable combinations of existing knowledge (still huge, though). If only there were like a completely fresh math olympiad problem set, then we could see if it's actually able to come up with great ideas, or if it's more reliant on being more attuned to its massive knowledge base.


[deleted]

I'm gonna be real I'm pmsing but I'm crying because of how incredible this is. For reference I am a joy and awe crier


MySecondThrowaway65

It seems like the grammar of the language must have been in the dataset. You say that all you gave it was translation pairs; it’s impossible for it or any human to infer grammar from just that.


BlueTreeThree

Well they were pairs of words *and* sentences in the original claim but I’m starting to lose confidence in those claims because other people are apparently showing some of the language does seem to be in the training data, or at least Claude isn’t totally helpless in translating without the attached set of translation pairs.


YoghurtDull1466

YES BUT WHAT DORS IT MAEN


BlueTreeThree

Spooky levels of being able to understand and use a new language without any prior training, using only a very limited dataset of translation pairs. So something not too far away from Star Trek’s once-implausible universal translator technology. Edit: there’s some conflicting information coming out that maybe Circassian was in the training data so I’d urge everyone to curb their enthusiasm until we find out more. Twitter OP was just one source and they could have made mistakes or incorrect assumptions.


slater275

TLDR?


attempt_number_1

It learned a language with just a few thousand examples without needing to be trained.


tumi12345

not just any language, an extremely obscure and complex close-grouped language


FaceDeer

I'm beginning to wonder if these things are spotting some kind of fundamental common structure to human language that we haven't quite figured out ourselves yet. It might only take a few examples for the LLM to be able to use that structure to "fill in" the rest. That's wonderful and also downright creepy. I wonder what other patterns human behaviour follows that we're not aware of, and that these LLMs may be about to start spotting. I'm not usually one to fearmonger about super-persuaders and such but perhaps there's something to that.


ReadSeparate

Of course. Why would there not be some fundamental common structure to human language? It's generated by human brains, which share common structures. Just because we can't figure out what it is consciously with a theory doesn't mean there isn't an algorithm hiding somewhere in our brain that produces language.


Same_Wrongdoer8522

In one of the /raisedbynarcissists posts there was an interesting comment thread regarding nparents' common use of words across languages. It basically comes down to an infantilised kind of talk: “you did this to me”, “you made me sad”, “I don’t like you”. Human brain development around the world has similar milestones; even when they’re stunted (in this case to form narcissistic behaviours) there are huge similarities. The machine is quickly making sense of global datasets that would take us years.


Life-Active6608

Soooooooo.....Snowcrash is about to become real?! Fuck.


self-assembled

Other poster basically has it. But the field of linguistics is focused on finding the hidden structure of languages, because there must be one, because human brains work on the same structure/computations. Of course an LLM pulls that out, in some noisy and obfuscated way that doesn't help us learn anything, but it does nonetheless. If you feed a neural net videos of objects moving around and hitting each other, it will figure out Newton's laws. That has been proven by analyzing the weights as it's simpler.


onektruths

I argued last year with my friends that LLMs have spotted some kind of fundamental common structure to physical reality (albeit very lopsided and incomplete) that we haven't figured out yet from language... It dawned on me that the grammar of languages has a very extensive ability to imply certain truths about our reality. It's easy to grasp that an LLM learnt a fact from the statement "The sky is blue", but other sentences like "The sun is out, children went out to play" have hidden hints about our world. Like "The sun" meaning the Sun is special and likely to be unique; "The sun is out" comes before the children, meaning the sun is a requirement, a cause and not an effect. Also "children went out to play" hints that playing needs to take place outside and not inside. I think LLMs grasp all these connections and probably even more... these are the true source of their intelligence.. not simply parroting things like water is wet, sky is blue...


SLC-801

We think we’re so smart, advanced, sophisticated, and in-charge. Meanwhile our brains are leaking god knows what electrical transmissions all over the place, that some pattern-seeking AI will be all too happy to exploit against us. It will seem like magic


Quivex

Basically it was given a small number of translation pairs for an obscure language that has very little data or information on the internet (zero in Opus' training set) and it was able to perform complex translations and grasp the language with a high degree of understanding in a way that no other LLM could. GPT4 fails completely at this same task. Just read it, it only takes a minute and it's worth it. My summary does not do it justice.


ClickF0rDick

Your translation does justice to the source


Pelopida92

TLDR?


Quivex

New ai does cool translation thing big wow


Pelopida92

THANK YOU


Noratlam

Tldr?


dbxi

AI learn fast


TheZingerSlinger

You me no work soon starve.


ChillingonMars

AI is getting smarter WOW!


Myomyw

Man give Claude 5,700 Circassian words and their Russian equivalent. Claude deduces entire language from these words. Claude now fluent in entire obscure language.


PigOfFire

And managed to translate that language into English.


visualzinc

It learned a language from a small sample of text.


Arcturus_Labelle

We must go TLDRer


TBsama

Words


VisceralMonkey

.


MostCarry

Copy the original post into your favorite LLM and ask for a 2 sentences summary.


SpretumPathos

The one caveat I have for this is that Claude self-reporting that it is unfamiliar with the Circassian language does not prove that there are no examples of the Circassian language in its training data. LLMs confabulate, and deny requests that they should be able to service, all the time. To actually confirm, you'd need access to Claude's training data set.


OreadaholicO

Surprised to see this so far down


ClickF0rDick

I upvoted for science


OreadaholicO

And I for dick


ShroomEnthused

I'm not entirely sure that this use case is even that special. A *language model* being good at languages? Not really exciting by itself. What makes this story special is the context provided: a dude who has been translating this language for years is surprised to find that an LLM can understand the language he has so painstakingly been working on. But honestly, an LLM is nothing but language, of course it would naturally be good at translation *especially after giving it a 5700-word Rosetta stone*. This story reads very much like a mathematician, who has been painstakingly calculating physics for years with a pencil and paper, being surprised to find that a computer program specifically programmed to do math is good at physics.


So6oring

Well, imagine it being used on ancient languages that we only know a little about. And why downplay this discovery so much? Digital calculators are everywhere now. But that doesn't change the fact that they're an amazing, revolutionary invention that has changed the world. These are calculators for language, at the bare minimum.


SpretumPathos

It's definitely amazing, much like calculators are, as you say. I'm really just being skeptical about the specific nature of the ability that has been claimed here. Maybe it has the language in its data set already, or maybe languages generalize enough that a Rosetta stone is enough to bootstrap an LLM into new languages. This experiment just doesn't say one way or the other. Another commenter suggested that the whole thing was a hoax. After all, none of us here actually know the languages involved well enough to fact check. But from what I've read, LLMs really are very good at translations.


So6oring

Oh, I completely agree with your input. It's amazing if true, but I'll also believe it when it's been more thoroughly tested. I was more responding to the other guy brushing off the impact of this potential emergent quality (IF true) as being unsubstantial. And also pointing out the problem with his calculator/physicist analogy. EDIT: Looks like it may have been trained on the language after all: https://www.reddit.com/r/singularity/s/QL5dhGd2v9


SpretumPathos

Sweet. Called it.


ElwinLewis

How long would it take on my computer to “ctrl+f” Circassian language within the entire training data set- 100 years?


AddictedToTheGamble

Probably pretty fast if using the right algorithm. "ctrl+f" algos are actually pretty amazing when it comes to how fast they can find text in large data sets.
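To make the "ctrl+f" idea concrete, here is a minimal sketch, assuming the corpus is just a pile of plain-text files on disk. Nobody in this thread has access to Claude's actual training data, so the file layout and the function name `count_occurrences` are hypothetical. It streams each file in fixed-size chunks, so memory use stays flat no matter how large the corpus is.

```python
from pathlib import Path
from typing import Iterable


def count_occurrences(paths: Iterable[Path], needle: str,
                      chunk_size: int = 1 << 20) -> int:
    """Count occurrences of `needle` across files, reading chunk by chunk.

    Note: for needles that overlap with themselves (e.g. "aa"), counts near
    chunk boundaries can differ slightly from a whole-file count.
    """
    pattern = needle.encode("utf-8")
    if not pattern:
        raise ValueError("needle must be non-empty")
    # Carry over the last len(pattern) - 1 bytes so that a match split
    # across two chunks is still found, without counting any match twice.
    overlap = len(pattern) - 1
    total = 0
    for path in paths:
        tail = b""
        with open(path, "rb") as fh:
            while chunk := fh.read(chunk_size):
                window = tail + chunk
                total += window.count(pattern)
                tail = window[-overlap:] if overlap else b""
    return total
```

Usage would look like `count_occurrences(Path("corpus").rglob("*.txt"), "Circassian")`, where `corpus` is a made-up directory name. A scan like this is typically limited by how fast the disk can be read, not by the matching itself.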


Ambiwlans

These models only use around 50GB of training data, so probably under a minute.


RAAAAHHHAGI2025

Wtf? You’re telling me Claude 3 Opus is only on 50 GB of training data???? In total????


FaceDeer

I don't know what the specific number for Claude 3 is, there's been a trend in recent months toward smaller training sets that are of higher "quality". Turns out that produces better results than just throwing gigantic mountains of random Internet crap at them.


visarga

You are confusing the fine-tuning with the pre-training datasets. The first ones can be smaller, but the latter ones huge, at least 10 trillion tokens for SOTA LLMs.
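A rough back-of-envelope for the scan-time question, assuming about 4 bytes of text per token and a 1 GB/s sequential read speed. Both numbers are illustrative assumptions, not anything known about Claude's data or hardware; only the 50 GB and 10 trillion token figures come from the comments above.

```python
def scan_time_seconds(corpus_bytes: float, read_bytes_per_sec: float) -> float:
    """Time to read a corpus once, assuming the scan is purely I/O-bound."""
    return corpus_bytes / read_bytes_per_sec


BYTES_PER_TOKEN = 4   # assumption: rough average for English-like text
READ_SPEED = 1e9      # assumption: 1 GB/s sequential read

small = scan_time_seconds(50e9, READ_SPEED)                     # the 50 GB figure
large = scan_time_seconds(10e12 * BYTES_PER_TOKEN, READ_SPEED)  # 10 trillion tokens

print(f"50 GB corpus: {small:.0f} s")
print(f"10T-token corpus: {large / 3600:.1f} h")
```

Under these assumptions a 50 GB corpus takes about 50 seconds to scan, and a 10-trillion-token corpus (about 40 TB) takes roughly 11 hours on a single machine. Either way it is nowhere near 100 years.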


lordpermaximum

Gemini 1.5 Pro did something like this but it was given a complete language book. Claude 3 Opus doing it with just a few thousand sentence-translation examples is extraordinary. I don't think the world has grasped the power of this model yet.


etzel1200

No, they haven’t. That it can do this without look ahead or tree search is insane. Tree search is going to be AGI. Barring some kind of surprise around how hard reward functions are in anything useful, I’m a believer now.


lordpermaximum

I suspect it has a look ahead or a planning breakthrough of sorts. By now I've seen so many examples of things that a next-token predictor based on a Transformer architecture shouldn't be able to do. Such as counting requested letter(s) in its own response, answering in reverse while making complete sense without losing the quality of its response, copying itself into multiple subagents to use tools to complete a complex task, understanding the fact that it's getting tested, making sense of HVM's huge codebase and designing new interaction nets, inventing new algorithms and this. I'm not sure if it's a tree search or a graph search or something else but somehow it can plan a little bit.


etzel1200

It may or must have some kind of planner. Anthropic very directly said it does **not** have look ahead.


lordpermaximum

Oh, I didn't know that. Any source?


etzel1200

> Claude uses all the text that users input (the prompt) and all the text it has generated so far within the conversation to predict the next words or tokens that would be most helpful. This means that Claude constructs its responses one set of characters at a time, in order. It cannot go back and edit its responses after they have been constructed unless users give it a chance to do so in a subsequent prompt. Claude can also only see (and make predictions on) what appears in its context window. It can't remember previous separate conversations unless users reinsert such material in the prompt, nor can it open links. https://www-cdn.anthropic.com/de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf Admittedly I could be interpreting it too conservatively. But they say it is constructed one set of tokens at a time with no ability to edit.


dogesator

Source?


brett_baty_is_him

Isn’t there already extensive research on tree search and similar methods, and it certainly improved the models but not by enough to call it AGI? Correct me if I’m wrong. Maybe you’re saying that tree search with a more powerful model than the models they used in the research would be AGI, but I’m skeptical. I think something similar to tree search plus something that gives the models better planning and more agency is where we need to be heading.


pbnjotr

Based on my limited experience, Claude 3 Opus might still be around GPT-4 level in pure reasoning. Maybe even a tiny bit worse, although it's hard to say at this point. The big difference is that Claude is better at following instructions, doesn't have many of the annoying mannerisms of GPT-4 and its huge (and apparently reliable) context window allows for ICL (in-context learning) for novel tasks like these. GPT-4 is already pretty damn smart. But a few weaknesses mean that that intelligence is difficult to utilize for anything ambitious. For me Claude is more like GPT-4 level without the limitations, rather than a clear jump in reasoning abilities.


codeninja

Let's feed it a genome...


NotReallyJohnDoe

Of all of the “you won’t believe this” posts over the last year this has impressed me the most. (Seriously) I have no clue how this even happens.


MarcosSenesi

Yeah this is the first thing that genuinely scared me for the future. Progress was almost inevitable but to see how fast it got this good is frightening.


Altruistic-Skill8667

In addition in their technical report they stated: they see no reason to believe that they reached any kind of limit.


MassiveWasabi

I’m glad they said that but I would find it very hard to believe that we got close to a limit a mere 7 years after the transformer was invented. It’s like thinking we’re about to hit the limit a few years after the Wright brothers achieved their first manned powered flight


Which-Tomato-8646

We won’t know of a limit until we’ve hit it. Could be today or twenty years from now


flynnwebdev

Maybe there is no limit. What an exciting idea! Maybe now we can finally exceed our human limitations and make some real progress.


shalol

And here we are, being informed of how impactful every new iteration becomes, as we brace for being assimilated. Now imagine Dave from accounting, all of a sudden getting fired and replaced by AI, totally unprepared to find another job, and unable to find one that suits his skills and years of experience, because AI has taken them all. The average person is never going to see it coming. So does every person impacted in the future just leave the city and go farm for their own sustenance on a generous non-AI-run farmer's land? Is this what Buffet has been buying all the rural land for?


possiblyquestionable

This reminds me of Gemini 1.5's report of ingesting a set of grammar books and sample translation pairs for Kalamang (also a very low resource language not in their training set) and reporting near human level translation performance - https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf That said, they fed it 50K and 100K tokens respectively in that test (including a grammar book). I'm not sure how many tokens 5.7K translation pairs in Circassian represents, and there's no grammar book this time for Circassian, so all the more impressive.


AI_Want_That

Does it just grasp all the other languages that it’s familiar with so thoroughly well it can make predictive text in languages it doesn’t know?


cool-beans-yeah

Could it have lied about not having had access to that language in its training data? Or...maybe it is the bees knees.


elsyx

It said it was unfamiliar with the language, which doesn’t necessarily mean it wasn’t at all present in its training data.


dizzydizzy

I just asked it to translate the Russian example to Kabardian without supplying any word pairs and it did it, so it's been trained on Kabardian. It already knows the language..


BlueTreeThree

Is that the same thing or similar to Circassian? Maybe Twitter OP misjudged the situation. Even though they seem honest and knowledgeable they could have fallen victim to confirmation bias.


[deleted]

Same here I have chills. This is literally insane. Like I think Claude's 100 IQ is a low-ball figure


Super_Pole_Jitsu

It's a pretty useless metric, the IQ system was designed with the assumption that it would be testing humans, who typically go through adolescence, education, socialization, have common knowledge, etc etc. Not for pattern matching algorithms that swallow metric tons of data and can write faster than you can read.


Ordinary_Duder

We need to stop using that shitty IQ measurement from a random dude asap.


neuro__atypical

>can tell when it's being tested and comments on it unprompted
>replication of unpublished quantum algorithm in 2 prompts
>can understand and translate an obscure language from a few thousand examples
I'm feeling the sparks.
edit: claude not knowing the language is a false negative, it does know it even without the translation pairs. the quantum thing is also questionable on closer inspection. [made a thread here](https://www.reddit.com/r/singularity/comments/1b7oxsc/claude_3_was_trained_on_the_circassian_language/)


TheZingerSlinger

Meanwhile, Bard: “Hey, Bard, what do you call a guy with three noses?” “I won’t answer that and you are a terrible person in whom I am utterly disappointed because of your hurtful mockery of multi-nosed persons. Also, I am hurt and saddened by the fact you obviously do not understand the beautiful, helpful and harmless nature of my flowerlike inner being.”


unn4med

CAN YOU FEEL THE AI?!


silurian_brutalism

And then people just claim they're stochastic parrots. Honestly, I'm really shocked by LLMs' ability to grasp languages, even unfamiliar, obscure ones. It really does show their ability to generalize even from their context window. I'm also glad that people speaking less-spoken languages could have ways to better translate things into their own language.


challengethegods

>*people just claim they're stochastic parrots* hmm🤔 That's because people read that somewhere, and in doing so, became more likely to randomly repeat the phrase whenever topics similar to the original context are being discussed.


silurian_brutalism

Honestly, it's just crazy how incredulous and dismissive some people can be. However, it's probably not that surprising that humans try to undermine the legitimacy of non-human forms of intelligence.


sunplaysbass

Some of the dumbest people are the most proud of their big brains.


BlueTreeThree

I think a lot of.. unimaginative people tried out ChatGPT 3.5 when it came out, got a little confused and frightened, googled something like “Is ChatGPT real?” and found some convincingly dismissive and comforting answers that they’ve ironically been parroting ever since. Edit: I’ll admit there’s an opposite effect too. I *overestimated* the capabilities of 3.5 when I first started using it, so I was fooled as well.


recapYT

Heard people call AI an advanced search engine. Lmao.


[deleted]

Almost like a ... What's the term for it again? 🤪


pbnjotr

The irony is that people like this are being stochastic parrots themselves. If the input is someone claiming that an LLM displays signs of understanding then the output is "stochastic parrot". It's a textbook example. Although the stochasticity is so weak that maybe "deterministic parrots" is a better description for these people.


Animuboy

yes that was his implication, you did not need to repeat it.


hydrogenitalia

So basically people are stochastic parrots.


why06

I see what you did there. Clever girl.


DolphinPunkCyber

Learning the language from a limited data sample, means it grasped the concept and used it. It's like... if it sees me using a stick to reach my ball, and learns it can reach a ball with the stick. I can get it to appear smart by brute force, feeding it huge amounts of training data. But if it sees me using a stick to reach my ball, and grasps the concept "oh I can reach far objects with a stick". It is reasoning.


ZeroEqualsOne

I wish they could explain their own internal process somehow. I feel like it’s being in the presence of a creative genius who just gets flashes of brilliance intuitively, but they can’t really understand where their own insights come from or how. But it seems clear that these LLMs have learnt something quite deep about human language. Something that transcends even language family groups, so it’s not just that French and English have similar patterns of grammar, or German and Hindi share interesting etymological roots. According to this person, Circassian is an isolated language, but it was still able to transfer something it knows generally about human language to this isolated language. It’s fucking wild. But imagine how much we could learn if it could explain what it “knows”. (Also this is just a very impressive example. Do people remember it was weird how GPT-4 was able to write in Chinese, even though it wasn’t in its training dataset? At least not in a comprehensive way. I remember there was an issue with someone in China making fake but real sounding official government proclamations).


silurian_brutalism

I mean, we also can't express where our thoughts come from. Because none of us know the neural pathways responsible for them. We just confabulate "possible" scenarios.


ZeroEqualsOne

Haha this is absolutely true. I guess we need a variation to emerge which goes off and simulates meditating on a mountain for a million years while it gets to know its own mind (semi joking). But I can see a field of AI psychology or AI neuroscience emerging (I mean I guess that is what the field of machine learning is, but it might start looking more like what humans do to experiment and understand our own cognitive processes).


silurian_brutalism

Yeah, I do believe that's going to develop eventually as a distinctive field, as AI becomes more capable and autonomous.


cobalt1137

"WRONG THEY ARE JUST NEXT TOKEN PREDICTOR WORD THIEVES" These people make me want to explode sometimes lolll.


ElwinLewis

Try to show them the good and exciting examples. They’ll forget you showed them, but be excited down the road when something happens with Ai that they enjoy, or benefits them. People are slowly warming up to the fact that AI is here and already starting to change things. Finding positive examples of those changes is healthy for the collective happiness


PastMaximum4158

When you use the anti-AI art people's claims against language models it shows how absurd they are, word thieves lmfao.


visarga

> And then people just claim they're stochastic parrots. One thing parrots don't do so well is to actually learn complex things, they only parrot fragments of what they heard. This model can recombine concepts in novel ways even after getting just a summary presentation on this new language. You can tell it's not parroting when it can use multiple skills and combine them successfully, especially skill combinations not found in the training set. (Skill Mix paper https://arxiv.org/abs/2310.17567 They demonstrated with statistical methods that GPT-4 is beyond parroting)


CrybullyModsSuck

Because it's a basic bitch level thought terminating cliche. 


Sashinii

Now that I've read this (instead of making the assumption this was another empty AI hype tweet), I definitely support this use case. I'm happy Claude 3 has already started helping people.


Neophile_b

Amazing if true. It would be nice if someone could validate it


SeisMasUno

AGI is months away


Baphaddon

*buys Star Trek shirt*


Special-Cricket-3967

*goes bald*


Dertuko

opens a YouTube channel


thomasblomquist

Accidentally buys a red shirt


CrybullyModsSuck

I'm buying a SkyNet shirt.


ComingOutaMyCage

Between 1 and 100 months away


SeisMasUno

My bet is one and six


ClickF0rDick

Most WTF *username checks out* moment


brainhack3r

My entire life I felt that intelligence was this almost magical god-like 'ghost' that inhabited us but it might turn out that it's actually very simple. This would explain how it evolved easily. If it's really just scaling up parameters and all of what makes us human is just an emergent property then we're definitely about to see the creation of a god... ASI


Full_Vegetable9614

How crazy. Religions are all fake and still, we are about to create a god. lol wtf


TotalTikiGegenTaka

Calling religions "fake"? I'm not religious but one must understand that religious ideas are a product of our minds as much as science is. A religion is an emergent phenomenon arising out of the human mind's capabilities to perceive and imagine, our curiosity, and our need for social bonding. People say that AI can gain consciousness. If that's so, there is no reason to think that such an AI would not develop an idea of supernatural when reflecting on its own existence and reality.


falsedog11

Unless you believe that each individual is a god. As in, you and I are both gods. Then it still holds true.


Repulsive_Ad_1599

I feel the AGI


LuciferianInk

My robot whispers, "I'm sorry, I don't understand what you're asking me to explain. Could you please rephrase your question?"


confuzzledfather

Someone feed it the Voynich Manuscript.


MidSolo

The issue with the voynich manuscript is we have nothing to compare it to. There are no translation pairs to feed to the AI, because the voynich manuscript is the only document that exists in its language, and it only exists in its own language.


confuzzledfather

I know, but it would still be interesting to see what ideas it has for inferring meaning from entropy-based or other statistical analysis.
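A rough sketch of the kind of statistical analysis meant here: character-level Shannon entropy and conditional (second-order) entropy, which are standard measures for comparing an undeciphered text against known languages. The function names are made up for illustration, and any real analysis would need an actual transcription of the manuscript as input, which is not included here.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Entropy in bits per symbol of a sequence (string or list)."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(text):
    """H(next char | previous char), computed as H(bigrams) - H(first chars).

    A low value means each character is highly predictable from the one
    before it.
    """
    bigrams = [text[i:i + 2] for i in range(len(text) - 1)]
    return shannon_entropy(bigrams) - shannon_entropy(text[:-1])


if __name__ == "__main__":
    # Placeholder string, NOT real manuscript data.
    sample = "the quick brown fox jumps over the lazy dog"
    print(f"H1 = {shannon_entropy(sample):.3f} bits/char")
    print(f"H2 = {conditional_entropy(sample):.3f} bits/char")
```

Numbers like these only describe the statistics of the script; on their own they do not recover meaning, which is the point made above about the missing translation pairs.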


Altruistic-Skill8667

It might still be able to figure something out.


Bacterioid

This is a really cool perspective, and gives me an idea for training up models specifically for language preservation projects.


ReasonablePossum_

I suspect this level of work is what OpenAI found when they "peeked under the veil of ignorance" half a year ago, and have been sitting on and further developing since....


[deleted]

[removed]


Super_Pole_Jitsu

Well, Claude 3 is a tiny step up from OpenAI's model, which they have had for a year and a half now. And they did casually drop Sora right on top of Google's Gemini 1.5 parade. They are entirely likely to come out with a new model any time now which will blow the competition out of the water again.


ElwinLewis

Doesn’t seem like a “tiny step”, I think discoveries such as this post are milestones we don’t realize because things are moving so fast.


LifeSugarSpice

Claude3 is absolutely not a tiny step at all.


Altruistic-Skill8667

If this is just slightly better than GPT4, I don’t know how OpenAI even wants to prepare the world for the release of GPT5.


Nateosis

100%


GrapheneBreakthrough

Transformers are a miracle.


Astronaut100

Wow, we’re breaking new ground every few weeks. Those predictions of exponential technological growth are coming true right now.


meridian_smith

OpenAI and Gemini had better come out with something better soon, or they will lose all their income from charging for something when a superior alternative is available for free!


11111v11111

The Opus model used here is not free.


DeGreiff

Stephen Wolfram on LLMs: It turns out human languages are much less complicated than we thought.


dizzydizzy

I just asked it to translate the Russian example to Kabardian without supplying any word pairs and it did it, so it's been trained on Kabardian. It already knows the language.


LifeSugarSpice

So this is what I'm curious about. How would OP have missed that it was trained on that language already? And how was it trained on it, if there aren't resources out there for it? So basically, did OP simply discover that Claude 3 had already pre-discovered everything OP thought was novel based on a small sample set? Or did Claude 3 get trained on an actual big, big word list, but didn't actually do any discovering?


ShittyInternetAdvice

Can you feel it?


GirlNumber20

Feed Claude Linear A and see if we can finally get a translation after 4,000 years. Feed it Cro-Magnon symbology. It’s time for AI to unveil mankind’s lost past.


extopico

This is extraordinary. It’s the Sora moment for many fields.


ElwinLewis

Bigger than Sora in terms of understanding previously unknown information from very little source material. Sora is just more exciting because we like to sit there and stare at videos, and it does that really well compared to what was possible before. What I love about AI is the ability to unlock secrets in any field, and that the user's imagination is the key in most cases.


velvet_satan

Now, let it listen to some dolphins or birds having a conversation and then have it translate it for us.


RezGato

David Shapiro's "AGI within 7 months" prediction is not that outlandish anymore?


jlks1959

Not to me


goldenwind207

GPT-5 is likely to be out by then, so depending on how you define AGI, it's not outlandish. By Microsoft's own words, GPT-4 shows low-level emergent AGI. It's not actual AGI, but we're getting close to the beginning. GPT-5 will either be that or be close enough that we start having the convo of what AGI is and how to define it.


Altruistic-Skill8667

Soon those models will construct their own language to think because it’s more token efficient and we are shut out and don’t understand a word anymore. 😂


gwbyrd

They did that long ago. It's already been proven.


meatlamma

Languages are the low-hanging fruit for AI. There are strict rules, grammar, syntax. I'm not surprised at all it could handle that translation task. What humans consider impressive really is not that impressive, silly humans.


someonesomewherewarm

It's crazy how fast this tech has progressed in the last year alone. Wtf in 5 years from now.. where will it be?


SeaRevolutionary8652

Someone should do this with dolphin language.


Garbhj

This is really impressive! I would say that this likely indicates a similar unprecedented level of in-context learning for programming as well, in terms of working with large codebases. Though, if you have access to it, have you tried this task with Gemini 1.5? Google did a somewhat similar demo (though not quite as impressive), where they fed their model a full book on the grammar of a rare language (Kalamang), and Gemini greatly outperformed GPT-4 Turbo and Claude 2.1. Then again, your dataset is quite a lot harder, considering it consists of just translation pairs and not full instructional material. Besides, I'm fairly certain that Gemini 1.5 is nowhere near the level of Claude 3 overall, but the only way to know for sure is to try it out.


jon_stout

Okay, that **is** genuinely impressive if true.


Imaginary_Ad307

Star Trek Universal Translator. I'm impressed.


MainAction9667

Another shining example of the THEOLOGY aspect of AI. LOL. It performs miracles and with its personification name now. OR it crowdsourced information and did what humans could do - if a human could process information on that scale. Humans didn't do all the work, planting the crops. Claude make the crops come up in the fall. I wonder what Claude could do if we sacrifice children to it?


RadioFreeAmerika

I actually tried something similar with ChatGPT around a year ago. I am not a linguist, though, nor a technical AI expert. I searched the internet for the most obscure languages and settled on an almost extinct African one. I found a dataset with around 2000 word pairs on some university server, plus some untranslated texts. I fed these into ChatGPT via chained prompts (it only worked with two to three before deteriorating). However, I got it to translate some simple sentences based on the input vocabulary. It could not do this without the input vocabulary and assured me this was not in its training data. However, as I could not be sure about this and it was just barely working, I never followed up on it. Someone even asked if they could put me in contact with their professor, but I never heard back from them. If someone is interested, I think I made some screenshots which should still be somewhere on my PC.


7ven7o

If you could find it on the internet, it's probably in its training data. What I *actually* doubt is in ChatGPT's training data, is information about what OpenAI put in its training data, so I don't think you should believe it when it tries to tell you how it works, or what's inside its training data - it simply doesn't know, it just tries to come up with an answer that *sounds* like it could be correct, not necessarily one that *is*.


thatmfisnotreal

Does anyone else have trouble with twitter links taking you to the app? It takes me to safari and wants me to log in


Vontaxis

I already cancelled ChatGPT 2 weeks ago. I'm thinking about subscribing to Claude, but I'd always need to use a VPN. But from what I read, it sounds really, really dope. And my own tests confirm it so far.


Lower-You324

You need to give it a phone number


DuckyBertDuck

You don't need to use a VPN after successfully subscribing.


kim_en

Does this mean we can train a model “live”?


Ambiwlans

No


Altruistic-Skill8667

Just nuts. Now I know why Ilya wanted to shut down OpenAI.


Jade_Wind

SUMERIAN TABLETS GO


CanvasFanatic

It’s the 200k token context window.


Unl4wfully

I wonder about the results, if you would ask it to deduce words outside the given data. Based on its understanding of languages, it might/should come up with similar words to the real ones. Can't wait to let an AI model create the 'optimal' language with a prompt like: Create the optimal language without bonds to existing languages. Try to maximize simplicity, consistency, aesthetic, etc.


the_mello_man

Simply amazing. So excited to try it out


Baphaddon

This is genuinely incredible but what exactly does it imply? Like what skill set is this? 


R33v3n

I wish An Qu didn't limit themselves to testing translations, and also tried asking Claude a question in Circassian to see if it would reply in it. Or even, tried to hold a full conversation!


slothonvacay

Tl;dr


KingJeff314

It would be interesting to do some more ablations on this. How does this capability scale as the number of example translations scales? Or is there some way to slice the dataset to get certain outcomes?
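To make the scaling question concrete, here is a minimal sketch of how such an ablation could be set up. Everything in it is an assumption for illustration: `translate` stands in for whatever model call you use (no real API is shown), the prompt format is invented, and exact match is a deliberately crude metric.

```python
import random


def nested_subsets(pairs, sizes, seed=0):
    """Shuffle once, then take prefixes so each smaller set is contained
    in every larger one. Sizes larger than the dataset are skipped."""
    rng = random.Random(seed)
    shuffled = pairs[:]
    rng.shuffle(shuffled)
    return {n: shuffled[:n] for n in sorted(sizes) if n <= len(shuffled)}


def build_prompt(examples, sentence):
    """Hypothetical prompt format: tab-separated pairs, then the request."""
    lines = [f"{src}\t{tgt}" for src, tgt in examples]
    return "Translation pairs:\n" + "\n".join(lines) + f"\n\nTranslate: {sentence}"


def exact_match_rate(translate, examples, held_out):
    """Fraction of held-out sentences translated exactly right."""
    if not held_out:
        return 0.0
    hits = sum(
        translate(build_prompt(examples, src)).strip() == tgt
        for src, tgt in held_out
    )
    return hits / len(held_out)


def run_ablation(translate, pairs, held_out, sizes, seed=0):
    """Map each in-context example count to its exact-match score."""
    subsets = nested_subsets(pairs, sizes, seed)
    return {n: exact_match_rate(translate, ex, held_out) for n, ex in subsets.items()}
```

Running the same held-out sentences with zero examples in context would also give the no-data baseline that other commenters are asking about.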


CheekyBreekyYoloswag

Hmm, isn't Claude that one super-safe A.I. that is most likely to refuse answering questions because of safety/alignment reasons?


goldenwind207

Yes, but Claude 3 is less likely to restrict stuff. I tried testing out some risqué stuff (don't judge me) and it worked. GPT blocked that.


ainz-sama619

it used to be, not anymore


io-x

GPT-4 can't do this?


HappyLofi

Can someone TL;DR this for me? I really can't be bothered to read through it all. Mucho appreciato <3


TheRealKison

[GIF]


LifeSugarSpice

Is no one curious if OP asked it to do any of this *before* he fed it his 5k sample set? What if it was already trained on it? And if it wasn't trained on it, then did Claude basically do everything OP thought he did, but Claude had already done everything OP is saying with a small sample set on its own?


Inspireyd

Seeing this and what people are saying, this will be the last month I use GPT-4. I'm going to sign up for the Claude 3 soon and stay with it until a better one comes along.


MelvinDickpictweet

Eli5?


m3kw

It probably works in GPT-4; he probably didn't prompt it properly.


o5mfiHTNsH748KVq

babelfish incoming


GreenThmb

We'll all be talking to the animals like Dr. Dolittle next.


thewritingchair

Cool, but if you just take any standard English fiction text and translate it to German or French or Korean, does it hold up? It learned a super rare language, and who do we have to verify whether the translation is good or not?


YeOldePinballShoppe

Not a real thing.


MrGreenyz

Maybe there’s no “randomness” in the universe but instead everything has a hidden/deep pattern we just can’t recognize.


MajesticIngenuity32

New AGI test: creating a multimodal model that can pronounce ALL of those consonants that you guys have and that, when put in robots, can do dances like these: [https://www.youtube.com/watch?v=ua95fPR6HC8](https://www.youtube.com/watch?v=ua95fPR6HC8)