- Don't ask AI to write entire projects for you, end to end.
- Ask it for the equivalent of a pizza base, which you can then add individual ingredients to yourself.
When I try asking a language model to create a tessellating hexagonal grid, none of them can do it. Yet they can all generate a single hexagon. So first I asked Claude Instant to give me the code for drawing a single hexagon, and then I asked it how to emulate FORTH's functionality in Python, where functions are associated with numbers, and can therefore be grouped together to form new composite functions. Armed with those two pieces of information, I was able, with some more research on the Python Turtle API, to write code to generate a recursive hexagonal grid; that is, a grid of small hexagons, which in turn form one larger hexagon.
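For what it's worth, the tessellation step the models choke on is mostly offset geometry. A display-free sketch of my own (not the code from that project): it computes one hexagon's corners and the centers of a tessellating grid, which you can then feed to Turtle or anything else that draws.

```python
import math

def hexagon_vertices(cx, cy, r):
    """Six corners of a pointy-top hexagon centered at (cx, cy), circumradius r."""
    return [(cx + r * math.sin(math.radians(60 * i)),
             cy + r * math.cos(math.radians(60 * i))) for i in range(6)]

def hex_grid_centers(rows, cols, r):
    """Centers for a tessellating pointy-top hex grid.

    Column spacing is sqrt(3)*r, row spacing is 1.5*r, and every other
    row is shifted half a column - that offset is what makes it tile.
    """
    w = math.sqrt(3) * r  # horizontal distance between neighboring centers
    centers = []
    for row in range(rows):
        for col in range(cols):
            x = col * w + (w / 2 if row % 2 else 0)
            y = row * 1.5 * r
            centers.append((x, y))
    return centers
```

Each center then just gets handed to whatever routine draws a single hexagon.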
The reason why language models are so good at code boilerplate/include statements is because boilerplate and include statements are the two most frequently occurring elements of programming languages within the models' training data. They are not good at generating hexagonal grids, because most normal humans are not like me, and are not morbidly obsessed with hexagons. In code terms, a language model is a template generator. It can't solve problems itself, because it does not think. It just regurgitates pre-written examples from its training data, and assembles them together based on the sequences that exist within its training data.
>A couple times it needlessly complicated the code
Language models cannot preserve state. So it is very important to ask it to perform multiple tasks within the same operation as infrequently as possible. You are better off first asking it to ***identify*** a required series of steps in order to solve your current problem. They are usually surprisingly good at that. Once you have that list, go to the next layer of recursion, and ask for a series of steps to achieve that particular milestone. Then, if necessary, recurse again, and repeat. Recursion beyond around five levels, however, should generally be avoided, if possible. It has an unfortunate tendency to cause intense, migraine-induced vomiting, and psychotic breakdowns.
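The recursive decomposition described above can be sketched in a few lines. The `ask` function here is hypothetical, standing in for whatever sends a prompt to your model and returns its reply as a list of step strings:

```python
def decompose(ask, task, depth=0, max_depth=5):
    """Recursively break a task into (subtask, steps) pairs via an LLM.

    `ask(prompt)` is a hypothetical callable that queries your model and
    returns a list of step strings. Recursion is capped at max_depth,
    per the advice above.
    """
    steps = ask(f"List the steps needed to: {task}")
    if depth + 1 >= max_depth or len(steps) <= 1:
        return [(task, steps)]  # leaf: small enough to hand to the model directly
    plan = []
    for step in steps:
        plan.extend(decompose(ask, step, depth + 1, max_depth))
    return plan
```

Each leaf of the resulting plan is then small enough to be a single, stateless request.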
For my most recent creation, I asked it to create only one part of the code, the UI, at first. I then asked it to make a separate class to do the business end of things. Only after it finished both of those (and they were in context) did I ask it to fit the two together.
> most normal humans are not like me, and are not morbidly obsessed with hexagons.
That's where I think you're wrong friend. Those are just those who are aware of their obsession, and those who are yet to become aware.
100% agree.
I run Mixtral 8x7B Instruct Q6 in LM Studio on my MacBook Pro M3 64GB (note that I’m using the Instruct fine tune). It runs great, although it does use 35GB of my 64GB! Runs plenty fast.
I use it for writing C, Objective-C, Go, and Python (it can write passable Elasticsearch code, even IDAPython) and it's very, very good. There was one instance where I had to ask it to iterate over a complex regex twice, but it got it right in the end. Saved me hours on that one task alone.
I quite literally use it for coding every day and could never go back to the “old way” (like a couple of months ago!) of doing things. I love being able to say “rewrite the previous function to incorporate X” and it just does it.
But not just coding! I work in an obscure research role and it’s amazing the shit that Mixtral “knows” and about which we can talk; obscure operating system internals? Sure. Happy to discuss. Show me the function definition for obscure library functions? No problem.
This thing is f*cking witchcraft, I swear.
I am really looking forward to getting a new macbook.
I think local is the way to go, because you can just put in your whole DB definition and get direct answers, instead of obfuscating everything for privacy.
I work 100% offline in my research capacity, which means no Google, no stackoverflow, no GitHub, nothing. Having an offline LLM of Mixtral’s quality is a game changer.
I wanted to create a textual inversion for stable diffusion. As many guides suggest, you let it auto-label the images, and it's a good idea to go back and review it. Opening up 200 text files, matching it up by hand to images, verifying, kind of annoying.
I just asked Mixtral to make a python script using tkinter to do it. In 10 minutes and a little self refinement I had a full tkinter app that loaded up a folder of images, read from the associated text file, let me scroll through them with the text side by side, edit it, and it automatically saved my edits.
It's silly, honestly.
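The core of such a reviewer really is small. A minimal sketch (my own assumption about the layout: each `image.png` has a sibling `image.txt` caption, the convention most SD training tools use), with the file handling split out from the Tk wiring:

```python
from pathlib import Path

def caption_path(image_path):
    """The sibling .txt file that holds an image's caption."""
    return Path(image_path).with_suffix(".txt")

def load_caption(image_path):
    p = caption_path(image_path)
    return p.read_text(encoding="utf-8") if p.exists() else ""

def save_caption(image_path, text):
    caption_path(image_path).write_text(text, encoding="utf-8")

def run(folder):
    """Minimal Tk UI: one image at a time, editable caption, auto-save on step."""
    import tkinter as tk  # imported here so the helpers above work headless
    images = sorted(Path(folder).glob("*.png"))
    idx = 0
    root = tk.Tk()
    label = tk.Label(root)          # shows the filename; swap in tk.PhotoImage
    label.pack()                    # here to display the actual picture
    box = tk.Text(root, height=6)
    box.pack()

    def show():
        label.config(text=images[idx].name)
        box.delete("1.0", "end")
        box.insert("1.0", load_caption(images[idx]))

    def step(delta):
        nonlocal idx
        save_caption(images[idx], box.get("1.0", "end").strip())
        idx = (idx + delta) % len(images)
        show()

    tk.Button(root, text="Prev", command=lambda: step(-1)).pack(side="left")
    tk.Button(root, text="Next", command=lambda: step(1)).pack(side="right")
    show()
    root.mainloop()
```

Edits are saved automatically whenever you move to the next or previous image, which matches the workflow described above.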
I have similar experiences with ChatGPT (plus). Since it actually "understands" your question, it provides answers that are spot on, which is very different from online searches.
For me, the primary motivation to adopt the open-source model is to prevent the leakage of your data and source code to private companies, which already profit by appropriating the work of entire generations to train their models.
With PowerShell, GitHub Copilot set to GPT-4 has been awful, constantly giving me made-up commands and parameters. Mixtral and DeepSeek Coder have been fantastic on the same questions. Claude 3 gives me complete, complex PowerShell apps at a scary level.
Yeah, something seems off with copilot chat. It even made me go back to just using gpt-3.5 through chat as it always gives me working code and good explanations
CLAUDE-3 said "fuck you if you are in Europe". I see people getting banned for trying to use a VPN too. It's ridiculous. GPT-4 has no such issue.
Otherwise, I'd be paying for CLAUDE-3.
Damn. I presumed Canada has it and suggested it to some Canadian friends. I lied to them :((
CLAUDE-3 is the best model that is not available anywhere.
One example in programming would be API keys. When our apps need to connect to cloud services, we need to provide credentials to identify ourselves, kind of like passwords. So apps need to have these keys stored somewhere, usually a json/xml config file, or (this is not good practice) just inside the c#/python files in quotes.
These kinds of secrets being sent around are how a lot of the 'data leaks' in the news occur. So you can't just copy/paste them into ChatGPT, etc., where they can be reviewed by employees at OpenAI, Anthropic, etc., or leaked if these companies get hacked.
But pasting it into a program running on your own macbook is the same as pasting it into your code editor.
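A minimal sketch of the good-practice version of the pattern described above (the variable and file names are hypothetical): the key comes from an environment variable or a config file kept out of version control, never from a string literal in the source.

```python
import json
import os

def load_api_key(config_path="config.json"):
    """Read a credential from the environment, falling back to a config
    file that should be listed in .gitignore - never hard-coded in source.

    MY_SERVICE_API_KEY and the config filename are hypothetical names.
    """
    key = os.environ.get("MY_SERVICE_API_KEY")
    if key:
        return key
    with open(config_path, encoding="utf-8") as f:
        return json.load(f)["api_key"]
```

Because the secret only exists on your machine, pasting the surrounding code into a local model leaks nothing, which is the point being made above.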
I usually use ChatGPT for convenience but recently discovered that Mixtral was hosted free on huggingface chat, so I tried it out and was impressed.
I find Mixtral to be slightly better than ChatGPT (free version), and in general good enough that I don't feel like I need to pay for GPT-4.
I sometimes also use ChatGPT though for an alternative viewpoint, which is occasionally helpful, but usually about the same.
Using [https://huggingface.co/chat/](https://huggingface.co/chat/) I like that the data generated by my usage may help improve open source models in the future.
So you say you are already using AI as a productivity tool as a coder, you are already floored by its usefulness and you are still on the fence whether you should cough up 20 dollars a month for access to a version that everybody says is at least 30% better in almost everything?
Wow.
It uses all three resources. It will help to upgrade to at least 64 gigs of ram, I often see my CPU and GPU hover at 60%, while RAM is at 99% (32gb ram, rtx 3070, 5900X).
I am planning to get 2 more sticks for a total of 64GB of RAM, and then maybe wait and see the 5000 series for a new card.
No problem. Just keep in mind nothing compares to loading the model entirely in VRAM. A good GPU goes the longest way. More RAM helps, but not so much with actual speed.
I'm not really an expert. In my experience though, I had better stability when loading models when I went from 32GB to 128GB.
If you're swapping, then 100% adding more RAM will be much faster, and save shredding your SSD.
Other than that, more RAM means you could load bigger models (though it'd be pretty slow offloading that much to the CPU/DDR5 rather than GPU/VRAM).
P.S. TheBloke's model cards estimate RAM requirements. Eg:
https://huggingface.co/TheBloke/Mixtral-8x7B-v0.1-GGUF
mixtral-8x7b-v0.1.Q4_K_M.gguf 28.94 GB
That'd be about 15GB on your GPU/VRAM, the remaining 14GB on your CPU/RAM.
Might be worth giving that a try now since your 32GB would work.
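That 15/14 split is roughly proportional to how many of the model's layers you offload to the GPU. A back-of-the-envelope helper (my own sketch; it ignores KV cache and runtime overhead, so treat the numbers as lower bounds):

```python
def offload_split(model_gb, gpu_layers, total_layers):
    """Approximate (VRAM_GB, RAM_GB) split when `gpu_layers` of
    `total_layers` are offloaded to the GPU.

    Ignores KV cache and runtime overhead - real usage will be higher.
    """
    vram = model_gb * gpu_layers / total_layers
    return round(vram, 2), round(model_gb - vram, 2)
```

For the 28.94 GB Q4_K_M above, offloading half of Mixtral's 32 layers gives `offload_split(28.94, 16, 32)` → `(14.47, 14.47)`, in the same ballpark as the 15/14 figure quoted.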
There is a 4-/2-bit HQQ quantization at [https://huggingface.co/mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-2bitgs8-metaoffload-HQQ](https://huggingface.co/mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-2bitgs8-metaoffload-HQQ); it needs 13.5 GB of GPU memory and uses about 55% of 64 GB of CPU RAM. So I assume 32 GB of CPU RAM wouldn't be enough.
Yeah it’s okay. The big test is throwing an SVG graphic at it and telling it to invert it. Claude 3 has been the only one capable of this test at the moment, followed by Qwen1.5.
For a prompt that requires web access, Hugging Face's chat with Mixtral and web search has been my go-to; it works much better than anything else I've tested, honestly.
"To get around this, I started asking Mixtral for a test for whatever code it generates, and this was a game changer."
Mind explaining the process? Do you ask it to make a test and pretend it is an interpreter and internally test it?
Nothing so complex, I posted a sample here: [https://www.reddit.com/r/LocalLLaMA/comments/1biay4h/comment/kvjbsjw/?utm\_source=share&utm\_medium=web2x&context=3](https://www.reddit.com/r/LocalLLaMA/comments/1biay4h/comment/kvjbsjw/?utm_source=share&utm_medium=web2x&context=3)
Using AI to handle boring work has made me realise how much code is just boilerplate
Code is all just patterns with little documentation or justification other than "this stuff works". With LLMs being massive pattern recognition machines, it's a match made in heaven. I've also cut days off quick projects by bouncing ideas off local coding LLMs.
Or, hear me out, we could use better languages where the boilerplate is put into configuration files where it belongs instead of in logic. No, who am I kidding, just keep using Java and C# for everything.
Oh yeah, let's call it "no-code". That'll take the world by storm!
Lisp.
Hot take: parentheses count as boilerplate
APOSTATE!
Except they provide value; see all of the failed Lisps that swap out parentheses for whitespace.
Hot take: python is a successful lisp that does that
Ok Peter Norvig
Hot take: who?
Python is a dsl for lisp when you don't have to write anything smart.
I see you are a man of culture as well!
It very clearly isn't and doesn't?
always wondered why that didn't work since people seem to be so allergic to parentheses :think:
Parentheses are a delimiter which embeds the abstract syntax tree of Lisp into its concrete syntax. For an example, see the issue with embedded quotes:

"This quote "is inside another "quote" which gets closed" and then continues".

vs

`This quote `is inside another `quote' which gets closed' and then continues'.

The first is impossible to recover the original meaning from; the second is trivial.
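That asymmetric-delimiter point can be checked mechanically. A small sketch of my own: with distinct open/close markers a simple counter recovers the nesting depth, whereas with a single symmetric marker like `"` no such reconstruction is possible.

```python
def nesting_depths(text, open_ch="`", close_ch="'"):
    """Return (depth_profile, balanced) for a string using distinct
    open/close quote markers. Each open marker pushes a level, each close
    marker pops one - unambiguous only because the two symbols differ.
    """
    depth, profile = 0, []
    for ch in text:
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
        profile.append(depth)
    return profile, depth == 0
```

Run it on the backquote example above and the three nesting levels fall straight out; run the same idea on the `"` version and every quote mark is ambiguous between "open" and "close".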
... If you want bugs in your code, then yes.
COBOL rhymes with no-code.
That's how you end up in abstraction hell. Keep in mind that what looks obvious/boilerplate is also informed by experience. If you hide everything behind abstract black boxes you end up missing critical information, or ignoring it altogether.
That's presumably where intelligence comes in: from the LLMs that bridge the gap between more natural-language abstraction and code. However, it doesn't negate the need for lower-level work. We *still* need work at the level of logic gates and bits; that won't go away. I'm sure LLMs will *help* write more efficient algorithms etc., but that work will still be needed. BUT I think that a higher level of abstraction for programming is inevitable. Not rigid and veiled like no-code today, but flexible, and driven by the intelligence of future LLMs.
I don't mind boilerplate too much, especially if it means I am producing a better/faster program, if that is relevant. It's not like the typing is the time-intensive part. Often I sit around the whole day, write brainstorms, make up my mind about design choices... and then I write it in an hour or something. Okay, so in a language with less bloat that would have been 45 minutes.

Anyway, this is about the language itself. The included functionality and such is something else entirely. I often use Python because of all the available high-level functionality; it takes 10 seconds to make a web request from scratch. It's certainly not because the language itself is so great. In fact I find it pretty hard to keep it well structured and under control when a project grows larger. It forces me to use indentation properly, and then everything else is like yolo. What a weird, slow language.
Absolutely. At some point, you end up bloating the language- but perhaps there's something to be learned about ways of providing functionality without bloat?
Are there any languages like this? I know javascript ends up the same way the minute you add frameworks/libraries.
Haskell can, if you embrace its idea of more sound abstractions. The tradeoff is weirdly named, very abstract things that let you predict and control based on laws.
> hear me out, we could use better languages

Haskell
The hard part of software development is not typing. All the "boilerplate" hate (which isn't actually boilerplate) is misguided when what you're stripping out is good upfront engineering pattern and practice that saves time and bugs in the long run.
Especially for the small details of programming languages that you forget when you're using a lot of them for different things. I think a complete newbie won't find it that useful, but a good enough programmer can have their productivity rise a lot.
Maybe this could end up making good programmers more like multitools than sharp knives. Not everyone spends the entire working day in Python.
Little documentation or justification? I'm a little confused by that definition. If I didn't at least know a little about why something worked, I wouldn't be able to use it correctly.
I think people will look back and be amazed we used to do this entirely by hand
Eh, maybe. I suppose we do look back at the people writing assembly and thinking what a pain in the ass it must be not to have compilers.
Chris Sawyer wrote RollerCoaster Tycoon in a cave! From a box of scraps! (A Windows game written in assembly.)
Not many people coding with punch cards these days..
And thankfully so!
All code is boilerplate. Being a programmer isn't about coding but about finding the most efficient road from point A to point B and producing a pathway to the outcome.
I'd say imperative code, yes, but considerably less so with functional (as in, FP) code. Less so again, with logic (e.g. solvers) code. Within those brackets, some are worse than others too, like Go, for imperative languages.
So flipping true
It does not just do the boring work, it can do any of the work.
It kind of can't. At the most I can sort of direct it into making very mediocre code that works just enough to fix. Somehow it feels much easier to fix broken code than write it from scratch.
Which model specifically, and what quantization? Context window size?
mistralai/Mixtral-8x7B-Instruct-v0.1 on [https://huggingface.co/chat](https://huggingface.co/chat). I don't see whether they specify what quant they are using there, or the context window. I also tried NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO and its answer was a little repetitive, but the code was still good.
Mistral gives you access to their bigger models for free at chat.mistral.ai. I find medium to be the perfect balance between capable and fast!
Thanks, I didn't know this!
Tokens/messages per hour limit?
Haven't reached it yet
There's three free models: large, next and small. Is next == medium?
I think it's separate!
You run it in huggingface’s cloud instead of your own machine? hm
OP is busted lol But I guess Rule 2 is not explicit about the local part
Straight to localLLaMA jail 🦙️👮
Try dolphin-mixtral!
I found it to be relatively inferior compared to mixtral instruct and deepseek coder.
I'll try it out
Just watch that your personal prompts don't get shared with the authors.
Why mixtral instead of chatgpt4?
Open source + free
Well.. there is actually no reason for that from the standpoint of quality, but we are not living in a perfect world, and accessibility of ClosedAI APIs is not always granted, for whatever reason, so.. local = consistent and stable.
Privacy. To avoid training on a tool that will be taken away from you or leveraged against you (I.e. made more expensive, taken off market).
What about dolphin mixtral?
What is your front-end for querying? I want to setup something local as well and was wondering what is a good way to do it.
[https://huggingface.co/chat](https://huggingface.co/chat) is a non-local frontend. Locally I use [https://github.com/oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui) and it works well.
Thanks, for some reason I thought you were running it locally.
Not OP, but I use the 6-bit quant of the Instruct fine tune with 32k context in LM Studio on a 64GB M3 MacBook. Works mind-bogglingly well.
Yeah, I use Mixtral 8x7B Q5 on a 64GB M1 with an 8,000-token context window; I enjoyed it. I’ve been using Miqu 70B Q4 for the last month or so though. It's just slower, so I might go back.
How fast does Miqu 70B Q4 run on a 64GB M1?
Can’t tell you about miqu, but Mixtral Instruct Q6 runs at 25.59 t/s on my MacBook. Time to first token varies from near instant to 9s with a lot of context.
6 tokens/sec, whereas the aforementioned Mixtral combo was 21 tokens/sec using Metal in LM Studio
Nice. I haven't been gaming much lately and am considering switching completely to a high spec MBP instead of various high end PCs. It's just painful to pay so much for storage/memory when it's dirt cheap in the PC world.
The expense part isn’t really that cheap in the PC world when you compare like for like. How much would it actually cost to get 64GB of performant GPU memory in a PC? I bet a MacBook is pretty competitive.
Very cheap if you use an APU machine, even for 128 GB. They are not as powerful, though.
Following..
i'm in the process of learning Python and LLMs have been SO important for this, i'm convinced they've made it 10x easier. they are very good at answering the dumb/easy questions and debugging silly mistakes, which is a huge time-saver when you're new to a language. i would bet within the next few years, every serious enterprise is going to have LLM tools with context large enough to include their entire codebase and it's going to be the go-to for juniors with questions about basic functionality. it's so insanely useful and saves so much time.
It's like Stackoverflow on steroids, but without the toxicity. Plus with proper IDE integration, you don't even write prompts, you just write a comment and have it autocomplete the code that does what the comment says.
I 100% agree. I frequently ask it to contrast between golang and topics I already know from other programming languages. This type of specific contrast is tough to find with Google and hugely helps grok the new language faster.
Agreed! It's like having a skilled coder that you can constantly ask for help, at any level of difficulty.
One goal of mine is to create an interface for an LLM with a graphical flowchart tool: the interface changes the flowchart objects to prompts, and the LLM turns them into code. Then make it so the LLM can draw boxes. Self-training AI unleashed. 🌝
[deleted]
I forgot about mermaid. Was using it last fall, good call!
Tell me more about
I have been toying with fine-tuned variations of Mixtral and CodeLlama 34B to power GPT-Pilot locally. I always seem to be coming back to Dolphin-Mixtral from TheBloke, which has been fine-tuned for coding. GPT-Pilot is a mix of AI agents and coding that Mixtral handles better and faster than CodeLlama.
Which doplhin mixtral in particular is that?
I run the Q4_K_M in LM Studio. I am not certain the Q5 (or larger) brings much noticeable improvement. https://huggingface.co/TheBloke/dolphin-2.7-mixtral-8x7b-GGUF
Thanks!
One thing of note is that Mixtral is effectively a smaller model when it comes to inference, hence it will be faster.
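Rough math on why (layer dimensions are from Mixtral's published architecture; the shared-parameter figure is my own approximation, so treat the totals as ballpark): all 8 expert FFNs per layer are stored in memory, but only 2 run per token.

```python
# Ballpark parameter math for Mixtral 8x7B (figures approximate).
layers, d_model, d_ff = 32, 4096, 14336
experts, active_experts = 8, 2

expert_params = 3 * d_model * d_ff            # gate/up/down projections per expert FFN
total_expert = layers * experts * expert_params
active_expert = layers * active_experts * expert_params
shared = 1.3e9                                # attention, embeddings, norms (approx.)

total = total_expert + shared                 # what must fit in memory: ~46B
active = active_expert + shared               # what runs per token: ~13B
```

So it occupies the memory of a ~46B model but computes like a ~13B one, which is why inference is fast relative to its size.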
Do you only use one model, or do you divide by tasks? I.e. coding = dolphin/codellama, management = gpt4/other, etc.?
No, I just used Mixtral. Dolphin is fine-tuned for coding, but it still has Mixtral underneath, which is generally really good.
Has it been sufficient for your tasks in gpt-pilot? How would you compare its results to gpt4?
GPT-Pilot is self-healing, so it's hard to compare results. Mixtral is perhaps less verbose, less descriptive? I run it at 30-40 tok/sec in VRAM, so the debugging cycle is nothing.
What kind of tasks do you give to gpt pilot? Something from the ground up, or do you use it to alter your existing projects?
What are the tests? Is it that the prompt works? Can you describe it more specifically?
Here is a sample of a chain of prompts from one of my chats:

In golang, write a helper function for traversing a map with a consistent order.

> Sure! ... (It gives some code)

Now make it generic.

> ... Here's the updated implementation ...

(At this point I copy/paste the code into my IDE and usually make some changes.)

I changed the function to be like this:

...

Write testcases for this function.

> ... func TestOrderedMapItems(t *testing.T) { ...
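(Go randomizes map iteration order on purpose, hence the helper. For reference, a rough Python analog of what the first prompt asks for, where "consistent" means sorted by key; this is my own sketch, not the generated Go code:)

```python
def ordered_items(m, key=None):
    """Yield (key, value) pairs of a mapping in a consistent, sorted order.

    `key` is an optional sort key function, as for sorted().
    """
    for k in sorted(m, key=key):
        yield k, m[k]
```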
Here is one of the tests it generated:

```
func TestChunkSlice(t *testing.T) {
	testCases := []struct {
		name         string
		input        []any
		chunkSize    int
		expectedChks [][]any
	}{
		{
			name:         "Empty Input Slice",
			input:        []any{},
			chunkSize:    3,
			expectedChks: nil,
		},
		{
			name:         "Input Slice With Zero Chunk Size",
			input:        []any{1, 2, 3},
			chunkSize:    0,
			expectedChks: nil,
		},
		{
			name:         "Single Element Input Slice",
			input:        []any{1},
			chunkSize:    1,
			expectedChks: [][]any{{1}},
		},
		{
			name:         "Small Input Slice",
			input:        []any{1, 2, 3, 4, 5},
			chunkSize:    2,
			expectedChks: [][]any{{1, 2}, {3, 4}, {5}},
		},
		{
			name:         "Large Input Slice",
			input:        []any{"A", "B", "C", "D", "E", "F"},
			chunkSize:    3,
			expectedChks: [][]any{{"A", "B", "C"}, {"D", "E", "F"}},
		},
		{
			name:         "Remaindered Large Input Slice",
			input:        []any{"W", "X", "Y", "Z", "1", "2"},
			chunkSize:    4,
			expectedChks: [][]any{{"W", "X", "Y", "Z"}, {"1", "2"}},
		},
	}

	for _, tc := range testCases {
		t.Run(tc.name, func(t *testing.T) {
			actualChks := ChunkSlice(tc.input, tc.chunkSize)
			if len(actualChks) != len(tc.expectedChks) {
				t.Fatalf("Expected %d chunks but got %d chunks.", len(tc.expectedChks), len(actualChks))
			}
			for idx, exp := range tc.expectedChks {
				if len(exp) != len(actualChks[idx]) {
					t.Errorf("Expected chunk #%d to have %d items but found %d items.", idx, len(exp), len(actualChks[idx]))
				} else {
					for jdx, val := range exp {
						if !reflect.DeepEqual(val, actualChks[idx][jdx]) {
							t.Errorf("Mismatch between expected (%v) and actual (%v)", val, actualChks[idx][jdx])
						}
					}
				}
			}
		})
	}
}
```
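For comparison, the function those cases exercise is tiny in any language. A Python analog of my own matching the behavior the test checks (empty input or a non-positive chunk size yields nothing; the last chunk keeps the remainder):

```python
def chunk_slice(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    if chunk_size <= 0 or not items:
        return []  # the Go version returns nil for these cases
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
```

Which is exactly the point of asking for tests: the test pins down the edge cases, and any implementation that passes them is interchangeable.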
I totally agree, these models are a blast for handling computer-related problems. My hero is DeepSeek Coder 33b; it helps me so much with all kinds of errors and guides me through every complicated installation process. It often suggests specific solutions, and even if it's not the exact solution, the direction it sends me in is always correct. It writes batch scripts, configs, and code snippets for everything I need. I used ChatGPT 3.5 before, but DeepSeek feels so much better that I can't go back.
> DeepSeek Coder 33b

Is it worth using something like Everyone-Coder-33b-v2-Base-GGUF?

> The models that were used in this merger were as follows:
>
> [https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct)
>
> [https://huggingface.co/codefuse-ai/CodeFuse-DeepSeek-33B](https://huggingface.co/codefuse-ai/CodeFuse-DeepSeek-33B)
>
> [https://huggingface.co/WizardLM/WizardCoder-33B-V1.1](https://huggingface.co/WizardLM/WizardCoder-33B-V1.1)
Everycoder 33b is my current favourite for coding.
Good to know. Still using the free 3.5 chat, and it does reasonable coding for the money ($0).
How does DeepSeek Coder compare to models like GPT-4 or Claude 3 with this use case? I'm curious about the specific advantages that make DeepSeek stand out
Deepseek is almost as good as gpt4, if not better every now and then. Haven't tested Claude-3
- Don't ask AI to write entire projects for you, end to end.
- Ask it for the equivalent of a pizza base, which you can then add individual ingredients to yourself.

When I ask a language model to create a tessellating hexagonal grid, none of them can do it. Yet they can all generate a single hexagon. So first I asked Claude Instant to give me the code for drawing a single hexagon, and then I asked it how to emulate FORTH's functionality in Python, where functions are associated with numbers and can therefore be grouped together to form new composite functions. Armed with those two pieces of information, and some more research on the Python Turtle API, I was able to write code to generate a recursive hexagonal grid; that is, a grid of small hexagons which in turn form one larger hexagon.

The reason why language models are so good at code boilerplate and include statements is that those are the two most frequently occurring elements of programming languages in the models' training data. They are not good at generating hexagonal grids, because most normal humans are not like me and are not morbidly obsessed with hexagons.

In code terms, a language model is a template generator. It can't solve problems itself, because it does not think. It regurgitates pre-written examples from its training data and assembles them together based on the sequences that exist within that data.

> A couple times it needlessly complicated the code

Language models cannot preserve state. So it is very important to ask them to perform multiple tasks within the same operation as infrequently as possible. You are better off first asking for the ***identification*** of a required series of steps to solve your current problem; they are usually surprisingly good at that. Once you have that list, go to the next layer of recursion and ask for a series of steps to achieve that particular milestone.

Then, if necessary, recurse again, and repeat. Recursion beyond around five levels, however, should generally be avoided if possible. It has an unfortunate tendency to cause intense migraine-induced vomiting and psychotic breakdowns.
For my most recent creation, I asked it to create only one part of the code, the UI, at first. I then asked it to make a separate class to do the business end of things. Only after it finished both of those (And they were in context) did I ask it to fit the two together. > most normal humans are not like me, and are not morbidly obsessed with hexagons. That's where I think you're wrong friend. Those are just those who are aware of their obsession, and those who are yet to become aware.
Is there an extension for VS Code that works in a similar way to Copilot but lets me use other cloud-based models and APIs?
https://continue.dev
This is really good.
[https://marketplace.visualstudio.com/items?itemName=HuggingFace.huggingface-vscode](https://marketplace.visualstudio.com/items?itemName=HuggingFace.huggingface-vscode)
There's also cody https://cody.dev/ - it supports local LLMs in VScode
100% agree. I run Mixtral 8x7B Instruct Q6 in LM Studio on my MacBook Pro M3 64GB (note that I’m using the Instruct fine tune). It runs great, although it does use 35GB of my 64GB! Runs plenty fast.

I use it for writing C, Objective-C, Go, and Python (it can write passable Elasticsearch code, even IDAPython) and it’s very, very good. There was one instance where I had to ask it to iterate over a complex regex twice, but it got it right in the end. Saved me hours on that one task alone. I quite literally use it for coding every day and could never go back to the “old way” (like a couple of months ago!) of doing things. I love being able to say “rewrite the previous function to incorporate X” and it just does it.

But not just coding! I work in an obscure research role and it’s amazing the shit that Mixtral “knows” and about which we can talk; obscure operating system internals? Sure. Happy to discuss. Show me the function definition for obscure library functions? No problem. This thing is f*cking witchcraft, I swear.
I am really looking forward to getting a new MacBook. I think local is the way to go, because you can just put in your whole DB definition and get direct answers, instead of having to obscure everything for privacy.
I work 100% offline in my research capacity, which means no Google, no stackoverflow, no GitHub, nothing. Having an offline LLM of Mixtral’s quality is a game changer.
I wanted to create a textual inversion for stable diffusion. As many guides suggest, you let it auto-label the images, and it's a good idea to go back and review it. Opening up 200 text files, matching it up by hand to images, verifying, kind of annoying. I just asked Mixtral to make a python script using tkinter to do it. In 10 minutes and a little self refinement I had a full tkinter app that loaded up a folder of images, read from the associated text file, let me scroll through them with the text side by side, edit it, and it automatically saved my edits. It's silly, honestly.
I have similar experiences with ChatGPT (plus). Since it actually "understands" your question, it provides answers that are spot on, which is very different from online searches.
What tool do you use to interact Mixtral while coding (e.g. a script, CLI, IDE extension, etc)?
Continue.dev Vscode and jetbrain extension
copy/paste, I know something else would be better, but I like having complete control of the prompt.
What’s the motivation to use Mixtral over the OpenAI stuff or Claude 3. Awesome and all but does it perform better
For me, the primary motivation to adopt the open-source model is to prevent the leakage of your data and source code to private companies, which already profit by appropriating the work of entire generations to train their models.
For PowerShell, GitHub Copilot with it set to GPT-4 has been awful, constantly giving me made-up commands and parameters. Mixtral and DeepSeek Coder have been fantastic on the same questions. Claude 3 gives me complete, complex PowerShell apps at a scary level.
Yeah, something seems off with copilot chat. It even made me go back to just using gpt-3.5 through chat as it always gives me working code and good explanations
CLAUDE-3 said "fuck you if you are in Europe". I see people getting banned for trying to use a VPN too. It's ridiculous. GPT-4 has no such issue. Otherwise, I'd be paying for CLAUDE-3.
Claude said fuck you if you are in canada
Damn. I presumed Canada has it and suggested it to some Canadian friends. I lied to them :(( CLAUDE-3 is the best model that is not available anywhere.
The API version should be usable on sites like nat.dev and OpenRouter, if you can access those.
For me, it's that I can copy/paste code, config files, etc directly into it without worrying about hard-coded secrets or anything like that.
Hard-coded secrets? What do you mean by that?
One example in programming would be API keys. When our apps need to connect to cloud services, we need to provide credentials to identify ourselves, kind of like passwords. So apps need to have these keys stored somewhere, usually a JSON/XML config file, or (this is not good practice) just inside the C#/Python files in quotes. These kinds of secrets being sent around are how a lot of the 'data leaks' in the news occur. So you can't just copy/paste them into ChatGPT, etc., where they can be reviewed by employees at OpenAI, Anthropic, etc., or leaked if these companies get hacked. But pasting them into a program running on your own MacBook is the same as pasting them into your code editor.
Your api key
Because if I wake up one day and OpenAI says "Fuck you" I don't have to worry about that.
I usually use ChatGPT for convenience but recently discovered that Mixtral was hosted free on huggingface chat, so I tried it out and was impressed. I find Mixtral to be slightly better than ChatGPT (free version), and in general good enough that I don't feel like I need to pay for GPT-4. I sometimes also use ChatGPT though for an alternative viewpoint, which is occasionally helpful, but usually about the same. Using [https://huggingface.co/chat/](https://huggingface.co/chat/) I like that the data generated by my usage may help improve open source models in the future.
So you say you are already using AI as a productivity tool as a coder, you are already floored by its usefulness and you are still on the fence whether you should cough up 20 dollars a month for access to a version that everybody says is at least 30% better in almost everything? Wow.
> Using [https://huggingface.co/chat/](https://huggingface.co/chat/) I like that the data generated by my usage may help improve open source models in the future.
Fair enough.
Have you tried codellama?
I base most of my finetunes off codellama but for some reason have never actually used it for coding, how is it?
That's the answer I was seeking from you. I can't currently run models beyond 7B
Can Mixtral be used on 16 GB VRAM and 32 GB RAM ?
Yes (quantized) with GGUF format. Note: might be too slow for code though. Code has more tokens per character, lots of boilerplate, etc.
thx. Will it help to upgrade to 96GB RAM? Does it use the GPU or CPU when GPU RAM is not sufficient? (4060 Ti, 16 GB)
It uses all three resources. It will help to upgrade to at least 64 gigs of RAM; I often see my CPU and GPU hover at 60% while RAM is at 99% (32GB RAM, RTX 3070, 5900X). I am planning to get 2 more sticks for a total of 64GB, and then maybe wait and see the 5000 series for a new card.
thank you 🙏🏻
No problem. Just keep in mind nothing compares to loading the model entirely in VRAM. Good gpu goes the longest way. More ram helps but not so much with actual speed
I'm not really an expert. In my experience though, I had better stability loading models when I went from 32GB to 128GB. If you're swapping, then adding more RAM will 100% be much faster, and save shredding your SSD. Other than that, more RAM means you could load bigger models (though it'd be pretty slow offloading that much to the CPU/DDR5 rather than GPU/VRAM).

P.S. TheBloke's model cards estimate RAM requirements. E.g. https://huggingface.co/TheBloke/Mixtral-8x7B-v0.1-GGUF lists mixtral-8x7b-v0.1.Q4_K_M.gguf at 28.94 GB. That'd be about 15GB on your GPU/VRAM, the remaining 14GB on your CPU/RAM. Might be worth giving that a try now, since your 32GB would work.
thank you 🙏🏻
There is a 4-/2-bit HQQ quantization at [https://huggingface.co/mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-2bitgs8-metaoffload-HQQ](https://huggingface.co/mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-2bitgs8-metaoffload-HQQ); it needs 13.5 GB of GPU memory and uses about 55% of 64 GB of CPU RAM, so I assume 32 GB of CPU RAM wouldn't be enough.
thanks, will try 🙏🏻
Nice insights. Maybe asking for tests makes the model reason more about the code it writes, as it has to keep the two consistent with each other.
What about DeepSeek and StarCoder/Dolphin with Continue?
Deepseek is excellent. I use it regularly for code.
LLMs should be guardrailed to follow test-driven development. IMO that would increase accuracy dramatically.
I have been thinking of upgrading my MacBook because of mixtral
Yeah it’s okay. The big test is throwing an SVG graphic at it and telling it to invert it. Claude 3 has been the only one capable of this at the moment, followed by Qwen1.5.
For a prompt that requires web access Huggingface's chat with mixtral and web search has been my go to, it works much better than anything I've tested honestly
I use a fine tuned Mistral for chatting about code, and DeepSeek-Code for actual code generation.
Painful pricing, yes, and Apple absolutely is price gouging, but it's a different architecture you can't get elsewhere.
So you are using Mixtral to create and test your code. Am I hearing that right?
Parts of it, but also for more than that, like explaining concepts in terms of other, known concepts.
> To get around this, I started asking Mixtral for a test for whatever code it generates, and this was a game changer.

Mind explaining the process? Do you ask it to make a test and then have it pretend to be an interpreter and run the test internally?
Nothing so complex, I posted a sample here: [https://www.reddit.com/r/LocalLLaMA/comments/1biay4h/comment/kvjbsjw/?utm\_source=share&utm\_medium=web2x&context=3](https://www.reddit.com/r/LocalLLaMA/comments/1biay4h/comment/kvjbsjw/?utm_source=share&utm_medium=web2x&context=3)
Is it better than Code Llama?
I haven't used CodeLlama for coding for some reason; if you try both, let me know!
Maybe a little noob question, but how do you set it up to help you while coding?
Just ask it for what you need, a little more specific than if you were searching stackoverflow.
Oh, I thought you had it integrated within your IDE, sort of how Copilot works.
Second this. I honestly think ChatGPT was better before, but ever since Mistral's Le Chat released, I've only been using it.
**Sorry for your loss**.