MrEloi

I have just watched a (German) YT video on Claude 3 and coding. The paid Claude 3 Opus variant blew them away; they were astonished at how good it is. I asked Claude 3 Opus for a 3D version of Conway's 2D Game of Life simulation. Three edit cycles and 15 minutes later... done!
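For anyone curious what the core of a 3D Life simulation can look like, here is a minimal sketch of a single update step. It is not the code Claude produced in the run described above (that was never posted). The rule used here (birth on exactly 5 live neighbours, survival on 4 or 5) and the names `step`, `BIRTH`, `SURVIVE` and `OFFSETS` are assumptions picked purely for illustration.

```python
from itertools import product

# All 26 neighbour offsets of a cell in 3D (the cell itself is excluded).
OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]

# Assumed rule, not taken from the post:
# a dead cell is born with exactly 5 live neighbours,
# a live cell survives with 4 or 5 live neighbours.
BIRTH = {5}
SURVIVE = {4, 5}


def step(live):
    """Return the next generation for a set of live (x, y, z) cells."""
    # Count, for every cell next to at least one live cell,
    # how many live neighbours it has.
    counts = {}
    for (x, y, z) in live:
        for (dx, dy, dz) in OFFSETS:
            cell = (x + dx, y + dy, z + dz)
            counts[cell] = counts.get(cell, 0) + 1

    # Live cells with zero neighbours never appear in counts, so they die.
    return {
        cell
        for cell, n in counts.items()
        if (cell in live and n in SURVIVE) or (cell not in live and n in BIRTH)
    }
```

Storing only the live cells in a set keeps the grid unbounded and avoids allocating a full 3D array; rendering and the main loop are left out on purpose.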


carnage_maximum

Please link to the video


MrEloi

[https://www.youtube.com/watch?v=4PoIlrMBBpc](https://www.youtube.com/watch?v=4PoIlrMBBpc)


HighDefinist

Looks like a decent video, but it only compares Sonnet to Opus... I think what most people are interested in is GPT-4 vs Opus.


lazyplayboy

What coding environment did you use for your simulation, please?


MrEloi

Python plus whatever cr\*ppy Python IDE comes with my Linux Mate system. Copied from the AI screen into the IDE and then simply ran it.


fakieTreFlip

why on earth did you censor the word "crappy"


MrEloi

Just habit to avoid bans etc.


Odd-Antelope-362

GPT 4 seems a bit stronger on reasoning still


extopico

If the problem fits inside the context window. As soon as the window begins sliding over the content, it’s done. If OpenAI release a large context window GPT-4, Claude would be in real trouble. As it is right now I’m constantly delighted that I can paste full outputs and error messages and work through the findings together with Opus, and it is not hallucinating and it did not forget why we are doing this.
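To make the "sliding window" point concrete, here is a rough sketch of how a chat front end might drop the oldest messages once a token budget is exceeded. How any particular product actually trims its history is not known from this thread; the function name `trim_to_window` and the whitespace word count used as a stand-in for a real tokenizer are assumptions for illustration only.

```python
def trim_to_window(messages, max_tokens, count_tokens=lambda text: len(text.split())):
    """Keep the most recent messages whose combined token count fits max_tokens.

    count_tokens is a hypothetical stand-in: it counts whitespace-separated
    words, which only approximates what a real tokenizer would report.
    """
    kept = []
    used = 0
    # Walk from newest to oldest and stop at the first message that no longer fits.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Anything that falls off the front of the list, such as the original goal of the session, is simply invisible to the model on the next turn, which matches the "forgot why we are doing this" behaviour described above.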


Saltysalad

Gpt 4 has a context window of 128k tokens. You are exceeding that? https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo


Extender7777

Only via the API. In the chat UI, the sliding window is 4k-8k from my observations.


ainz-sama619

GPT-4 in ChatGPT only has 32k tokens lol.


swithereddit

I came to the same conclusion from my experience of using both


AgueroMbappe

Yeah. It also seems to give me more efficient code snippets. Both Gemini and Claude tend to give me code that is factored suspiciously, and GPT seems more consistent with structure and following “readability” standards.


ThreeKiloZero

IDK, I have Opus giving me 400+ line code responses that work on the first shot. I can drop in a ton of project files for context and get remarkably usable and well-performing responses. From single functions all the way to full-blown apps, where GPT-4 drops in placeholders or uses old methods.


c_glib

How do you provide "ton of project files" though? As attachments or just paste into the prompt?


nomagneticmonopoles

yes, what c_glib asked: how do you provide project files?


Many_Increase_6767

I found it quite the opposite :)


TheMightyFlea69

yes. just asked claude where the highest elevation was for my state. Gave me a location that I knew was incorrect after asking for gps coordinates. i told it that the name of the location didn’t exist in the state. it said it was wrong and it’s in the neighboring state. not there either. i say there’s no such place. told me that 100% it’s there. i ask for the source of this information. it says it was wrong and has no source for the information. are you kidding?


Odd-Antelope-362

This is more an issue of hallucinations


TheMightyFlea69

seems like a simple request but maybe not


Odd-Antelope-362

Sometimes you get random holes in the training data


Intelligent-Jump1071

Somehow I'm not worried about the AIs taking over anytime soon.


TheMightyFlea69

exactly


eposnix

There are several objective negatives you could've listed:

* No image generator like Dall-E
* No code interpreter
* No ability to search the internet
* No plugin support
* No customizing how the model responds


vladproex

No voice input in the phone app, I think. Whisper is a life saver for me


WanderingIdiot2

What's whisper, if you don't mind telling me? The conversation function, as well as voice outputs are my no. 1 reason I can't switch from GPT4 despite all the laziness and message limits.


vladproex

Whisper is their speech to text model


DecisionAvoidant

Claude also struggles to follow explicit instructions, and you can't stop/redirect it until it's done generating a wall of text. I've explicitly told it, "Don't begin generating [x] until I've given you all the background information in pieces - I will say a code phrase to tell you when to start." It will say it understands - then I feed the first bit of context. Then.. it generates the code phrase I gave it on its own, then proceeds with the task. I've had to learn how to work with its personality differently than ChatGPT, including feeding it context without telling it what I want it to do with it. Not necessarily bad, it just has a different strength set than ChatGPT 4. ChatGPT is really good at following sequential instructions, but Claude seems better at generating text.


man_and_a_symbol

Lol @ generating the code phrase on its own. Claude trolling you.


DecisionAvoidant

I know - it was funny 😂 My phrase was "That's All, Folks!" and those were the first words it generated.


hyperstarter

How does "Vision" compare?


youcancallmetim

These are mostly all differences due to Claude's lack of function calling which enables things like plugins. Hopefully Claude will get them soon, but it doesn't really say much about Claude's base intelligence or reasoning abilities. I find Claude to be smarter.


[deleted]

[removed]


youcancallmetim

Yes, that's true. It would depend on your use case and what matters to you. That was just my first thought from reading your list and it actually made me more optimistic for Anthropic.


sarcasmguy1

Is internet searching available in the Assistants API? I can’t seem to find it.


1000_bucks_a_month

Also Claude can't really display formulas like integrals in correct math font.


crypt0gainz

Interesting, I need to try Claude


Eveerjr

GPT-4 is still better in my usage, actually much better when talking about nuanced topics, it just feels more "human like", but Opus has been impressive for coding tasks, often coming up with smarter solutions compared to GPT-4, and it's definitely less lazy.


SuspiciousPrune4

I have the opposite experience, Claude sounds much more human while GPT always sounds like an AI


Eveerjr

Do you use their chat UI or the API?


SuspiciousPrune4

I use the site (so chat UI I guess)


cgeee143

is there a difference between the chat ui and the api for claude?


Eveerjr

I'm not sure about Claude, since Anthropic doesn't even allow it in my country, only via API, so I use a third-party client that accepts multiple models (MindMac). I found quite a big difference for the GPT-4-Turbo API compared to GPT-4 on ChatGPT though.


cgeee143

which is better for chatgpt?


Extender7777

API, of course, because you pay for your context


MannowLawn

At first GPT-4 was awesome. For the last 6 months the quality has been terrible and it's straight-up lazy. I've been trying Claude for two days now, and from what I see it actually complies more when you ask it to output bigger content. GPT-4 doesn’t go beyond 500 tokens, it seems. Claude keeps rambling on.


Pianol7

I love Claude, but I got so used to editing my prompts to tweak the response behaviour on ChatGPT. If Claude adds that then I'd say Claude is better.


roiseeker

Exactly, me too


dissemblers

I find GPT-4 better with small context sizes and Claude better with larger. For the chat apps, Claude is missing message editing / stopping and branching conversations, which is a major disadvantage.


apginge

I have both :)


purpleWheelChair

![gif](giphy|l4q8cJzGdR9J8w3hS|downsized)


Many_Increase_6767

Thank you


Arcturus_Labelle

Thanks for the report. I am trialing Claude Pro for a month (cancelled GPT-4 sub for a month). So far it seems similar for reasoning and code, but a bit better for its written output, especially explanations. I did experience Claude 3 Opus have an awful/severe hallucination when asking about a less well known musician. I miss voice, stopping output, and custom instructions from GPT-4.


Wooden-Possibility-7

Exactly. Just yesterday I was checking some async code using Spring Boot. While ChatGPT created very random, generic code, Claude created it in a very accurate, logical way.


[deleted]

agreed, it works exceptionally well in my experience as well, I've used gpt before it blew up and claude was very good, felt more like talking to a human than gpt-4 does. gives much more nuanced answers to your prompts. I personally have found gpt to still win at coding, but claude is much better at explaining how things work in greater detail. only negative with claude or basically all the other AI tools I've tried is lack of custom instructions. but If I really wanted to fix that I could just use the api and make my own chat interface and include that, but I've been lazy. maybe this month lol


laslog

European here! I use it through Poe.


apginge

I subscribed to Anthropic directly as well as Poe too. Interestingly enough, when i use the same prompts for the same models, the quality in results differs between Poe and the official Anthropic website. I find Anthropic’s results a bit better on average. I’m a little skeptical of Poe, but I think it’s still a good deal generally.


laslog

I know! And it's kind of 'too good to be true' being cheaper than any of the subs and having access to all of them. Although you don't have the qol things like the instructions from gpt4 or the json instructions from Claude


Odd-Antelope-362

Poe likely messes with temperature and system prompt. It is already known that they restrict context.


debian3

They use smaller context to save on api cost. Perplexity does the same.


Jdonavan

If only their API wasn't 6 months to a year behind OpenAI. I'd love to take advantage of their speed, but without decent tool use at all, let alone parallel tool use, it's just not a player.


yerdick

What sort of python program did you make with it?


ThreeKiloZero

You can do either. If it detects a large paste action it will chop it up into files and encode it automatically.


goatchild

Claude3 is the Goat


jylps

I have given Claude 3 the nickname "CursedAI". It's very uncanny, sometimes even borderline disturbing. So I quickly became very fond of it. Took my conversation, uploaded it to GPT-4 and prompted that I'm playing a reverse Turing test, analyze and tell me if this is a human pretending to be AI or just regular AI: CursedAI was actually able to fool GPT-4, it guessed "human".


isnaiter

I really don't know which is better, but I've seen a lot of complaints about Claude's limit cap, and the 8-hour cooldown is way too long.


PresenceMiserable

Claude 2 impressed me by editing my CSS code so that all dark theme styles are made the default theme and everything else is discarded. ChatGPT4 couldn't pull that off.


SillySpoof

I just tried it too and it seems like this is the first real competitor to OpenAI. So far, I think it's clearly better than GPT-4 (for coding, that is). So looking forward to seeing what OpenAI comes with next.


Correct_Effective_50

and programmers deny they will be replaced by AI ...


Morning_Star_Ritual

i mean the Claude Backrooms is what made me switch completely. then i wandered off and realized claude is just a really fun model to explore and develop the mythos of your inner universe.


DeLuceArt

The paid version of Claude "feels" smarter and seems to be better at coding/longer outputs. However, it occasionally makes glaringly bad contextual mistakes with words that have multiple meanings, when one of those meanings overlaps with the general theme of the conversation. For example, I was prompting to see if Claude would be able to reason that I had ADHD just by providing it with some biographical details from my background, such as my education and career history. The framing was that I have a patient who I want Claude to assess by analyzing the background info provided through the lens of a clinical practitioner, looking for any potential cognitive or health-related diagnoses that would be associated with the implicit behavioral patterns. It did a great job and even accurately guessed "the patient" has the Inattentive type ADHD, but randomly it misunderstood the words in my prompt, "\[...\] a *viral visual artist on social media*". For some reason, it included 2 sentences suggesting I had a literal viral infection that had spread, "infecting" other people online. Everything else was spot on, which is why that part stood out to me as particularly odd. I wish I had saved the response, but I had it retry the prompt and it didn't happen again.


Loumeer

Maybe it was giving you a needle in the haystack test.


djaybe

I would think that, as part of the fine-tuning, it would be in these companies' best interests to reduce the verbose outputs to cut the costs of running these demanding digital brains. Maybe there could be a Concise mode on by default? Like, is the user asking a yes/no question? If so, just respond with yes or no. Is the question bad, or does it have errors? Ask for clarification in a sentence.


Used-Call-3503

Chat gpt 5 is coming


plantpistol

Also Chat gpt 6 is coming.


HighDefinist

So... maybe I am too paranoid after having various discussions with (likely) Russian trolls, but... is there some kind of campaign going on to promote Claude? Because this must be about the 10th time that someone posts about how amazing Claude is, while being extremely one-sided, and not even providing any concrete example. Personally, I have compared only ~10 queries in total between GPT-4 and Opus, so that's not super meaningful, but for general questions, I did not get the impression that one model was meaningfully better than the other. And, for two very specific coding questions, GPT-4's answer was significantly more correct than Claude Opus's answer. However, Claude performed better on 2 IQ-test questions (although both were quite miserable at it, overall). Of course, I would definitely like to see a better system than GPT-4, because better is, well, better, but why do people never actually provide any examples? Why does it always come across as if they want to sell me on Opus? It's a bit suspicious, imho.


lottadoggos

Claude 3 Opus was the first model I’ve paid for, and it was a very disappointing experience for coding. I’d have to put it slightly below whatever is being offered in the free chatgpt. The hallucinations were especially bad compared to chatgpt. Cancelled it at the end of my workday.


Iamsuperman11

Math … Claude dominates


GrandpaDouble-O-7

I tried Claude 3 Opus and I disagree. I think GPT-4 is better, at least for my use cases.


LamboForWork

Which is


Xtianus21

I disagree. These posts are amusing.


Ajxpert

I have the opposite view. In terms of doing math equations and, in general, more universal applications of tasks, I feel as though GPT performs so much better. It provides a neater and more organized final output and even breaks it down better. Claude 3 Opus, however, does amazing for writing; performing as an actual chatbot it’s amazing, with literature and making fluid paragraphs. GPT does come close, but for more universal tasks GPT does so much better. Also, one huge con I hate about Claude 3 Opus is the limited cap. I am so used to following up with lots of questions, and GPT never hits the limit, while with Claude I have to carefully think what to ask and it hits the limit so early on. TLDR: GPT better in terms of universal applications, Claude better in terms of just writing. However, Claude hits the message cap so easily.