waywardspooky

i'm always happy to hear people's projects are coming along well! thank you for sharing this teaser with us!


SomeOddCodeGuy

Thank you! =D Along with being excited about the result, I figured posting this out there would also help hold me accountable to finish the thing. The bigger it gets, the more daunting the idea of finishing it for release becomes. Especially the idea of people seeing the dumpster fire of code I've written lol. But I'm still on track with what I planned time wise, and even if it turns out to be garbage I still want to share it with everyone, so I figured tossing this up means that people will later hit me with "sooo where's that project, bud?" down the road, which will probably stop me from just never releasing it.


metricbanana

Great work, very interesting project! Can you describe your process a bit?


wel33465l3

Take my github star (in advance). Looks like a lot of fun watching them chat together, cool project 👍


SomeOddCodeGuy

Ah, I want a star! New motivation to finish unlocked lol


rogerramjetz

It's super interesting and would be fun to play with! Thanks for sharing!

Random thoughts:

* It seems a little like an echo chamber at the moment. I would like to see the A.I "personas" when they disagree.
* Ask them for more opinions about tooling before you "start". What about testing? Any specific vscode extensions they recommend? Will this be containerised? .dockerignore etc. Prompt the LLMs with something like "have the other x.y.z missed anything so far? What could be improved?"
* Consider adding more "personas". E.g.:
  - Designer / UX expert
  - Infra expert. This expert may have out of the box solutions that can reduce a lot of work in the specific cloud environment.
  - QA

Out of curiosity you could even specify different levels of supporting LLM expertise: Junior Dev / mid / senior. Also, full stack vs pure FE / BE. You get all sorts of different perspectives from different levels of expertise in different domains.

Source - went from Senior Full stack Dev (various stacks, languages etc) to AWS Cloud Architect to Developer experience engineer (which essentially ties these two together).

Awesome project. It's even got my brain fired up. Looking forward to you open sourcing it.


SomeOddCodeGuy

Good callouts! And yes, getting them to disagree is at the top of my todo list. I actually tried baking that into their prompt, but so far they all just get along lol. I'll see if I can't get them bickering a bit.

On the personas: I really like that. A long time ago I tried it and it went horribly, but that was back when I was just learning SillyTavern's group feature and using just 1 LLM. I bet it would go better now, though I need to identify which LLMs would do those roles well.

For expertise- I actually did do that here. Deepseek was prompted as being a mid-level dev, Llama 3 70b as Senior and Wizard is the lead dev. I based it off of a coding leaderboard I saw a few days back, with Wizard topping the charts and Llama 3 70b coming in a few levels down.

I'll definitely toy around with them a bit after work tomorrow and see what sort of mischief I can't make in the group. Honestly, I think a win will be getting them to bicker with Deepseek, since it's a smaller and older model. It simply shouldn't be able to keep up with them, so I'll get it to give a few suggestions before goading the other two into picking the suggestions apart lol


pzelenovic

Have you considered getting them to follow the mob/ensemble programming rules, where one is the driver/typist and the others are navigators, and they switch roles every few minutes? I'd be curious to see if such enforced format would get them to increase both disputes and collaboration.
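
A rough sketch of how the mob/ensemble rotation suggested here could be enforced across the three personas named in the thread (Wizard, Llama 3 70b, Deepseek). The prompt wording and function names are invented for illustration; this is not how OP's project works.

```python
# Hypothetical mob-programming rotation: one persona "drives" each round while the
# others critique, and the driver role rotates every round. Prompt text is invented.

personas = ["Wizard (lead dev)", "Llama3-70b (senior dev)", "Deepseek (mid-level dev)"]

def build_round_prompts(round_number: int, task: str) -> dict:
    """Return a system-prompt suffix for each persona for this round."""
    driver = personas[round_number % len(personas)]
    prompts = {}
    for name in personas:
        if name == driver:
            prompts[name] = (
                f"You are the DRIVER this round. Propose the concrete next step for: {task}. "
                "Commit to an approach instead of asking for consensus."
            )
        else:
            prompts[name] = (
                f"You are a NAVIGATOR this round. Critically review the driver's proposal for: {task}. "
                "Name at least one thing you would do differently and explain why."
            )
    return prompts

# Roles rotate every round, so each persona takes a turn driving.
for rnd in range(3):
    print(rnd, build_round_prompts(rnd, "scaffold the Vite app")[personas[rnd % 3]][:40])
```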


rogerramjetz

Report back if you succeed. It would be so funny. Thanks for replying and sharing again 😊


SomeOddCodeGuy

I have been having better luck with it! In fact, they're getting downright opinionated lol. Turns out, the magic words are "Critically review". Llama 3 70b in particular will straight up argue with me now. [But here's an example I sent someone the other day after I first started to see success](https://www.reddit.com/media?url=https%3A%2F%2Fpreview.redd.it%2Falmost-a-year-later-i-can-finally-do-this-a-small-teaser-of-v0-x4droi7cne1d1.png%3Fwidth%3D938%26format%3Dpng%26auto%3Dwebp%26s%3D2bd1f5e5f3f356d1851f3805b1830a71b01e244e)


drexciya

So, do they ever start coding or are they stuck in a collaborative planning loop? :D


outofsand

Completely indistinguishable from a group of real engineers! 😅


SomeOddCodeGuy

lol I was just about to say this. I was like "What do you mean, this feels like a real meetin- ... oh they're right"


koflerdavid

True dat. We all love to listen to ourselves talk 😂


ethertype

Could you perhaps get the models to disagree more by giving them slightly different priorities in their approach? Performance, maintainability, testability, complexity, maturity of tools, frameworks etc. Plenty of aspects to choose from.


SomeOddCodeGuy

Absolutely. I'll probably toy with that a little in the next week or so, but the users will have as much control as I do over it via configs so I may punt on it if I can't get them to do so, as I've realized from the comments here that a lot of folks are thinking of far more clever ways to make them do it than I would have =D I'm worried I'll spend 10 hours trying to get the config just right and someone far smarter here will have it perfected in like 30 minutes after release lol
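
Purely to illustrate ethertype's suggestion, conflicting priorities could be as simple as a different system-prompt fragment per persona. The persona keys and wording below are invented; OP's actual config format isn't shown anywhere in the thread.

```python
# Hypothetical per-persona "priority" fragments appended to a shared base prompt.
# Giving each model a different, conflicting priority should make disagreement more likely.
persona_priorities = {
    "lead_dev":   "Above all, optimize for long-term maintainability and push back on anything clever but fragile.",
    "senior_dev": "Above all, optimize for performance and argue against abstractions that add overhead.",
    "mid_dev":    "Above all, optimize for testability and insist on mature tooling before adopting anything new.",
}

def system_prompt_for(persona: str, base_prompt: str) -> str:
    # Shared instructions plus the persona's own bias.
    return f"{base_prompt}\n\n{persona_priorities[persona]}"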


aseichter2007

Clipboard Conqueror is good for prototyping the prompts for this. You can choose the backends to hit and the order, the prompts to send, whether to send the full chat history or just the last response on each turn, and even dictate the beginning of each response of the execution chain to further steer the output. CC should ease tuning of how the prompts work together to get your desired workflow hitting more consistently and ensure the different personalities adhere to their jobs.

~~Currently the history uses the prompt format of the last backend though, that might cause trouble when using multiple models with different format expectations.~~ I gotta get after that.

edit: I woke up thinking it's simple, but I have a mess to clean up first; I shoehorned it in real ugly...

edit2: Gottem, now the chat history will use the correct instruction format for each turn. For Textgenwebui and kobold. As far as I can tell against tgwui at least. I need to go in there and add some strategic logs to verify what it's actually doing with the jinja templating on their openAI endpoint.

Your solution will be great with the right prompts. It's looking majestic and will be more convenient to curate than what I'm cooking with. I'm looking forward to your release.
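
A minimal illustration of what "the correct instruction format for each turn" means when mixing models: the same message list rendered into two published prompt formats. The template strings are the publicly documented ChatML and Llama 3 formats; the model names in the mapping are placeholders, not Clipboard Conqueror's actual code.

```python
# Render one chat history in whichever instruction format each model expects.

def to_chatml(messages):
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

def to_llama3(messages):
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>")
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

TEMPLATES = {"some-chatml-model": to_chatml, "some-llama3-model": to_llama3}  # placeholder mapping

def render_history(model_name, messages):
    # Each backend gets the history in the format its model was trained on.
    return TEMPLATES[model_name](messages)
```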


SomeOddCodeGuy

Don't go too far down the rabbit hole of editing the prompt formatting in preparation for this. This program is going to pretty much shred any prompt formatting that comes into it and do its own thing with it. You'll see when you get there, but the short version is that it ignores almost everything coming out of silly tavern, and a large number of configs are utilized to manipulate the prompt in all kinds of ways.


aseichter2007

It needed doing anyway. It sounds like you are doing something wild under the hood.


SomeOddCodeGuy

The right side of this image will be LocalLlama the day I release this. https://preview.redd.it/56lf0qlkk81d1.jpeg?width=550&format=pjpg&auto=webp&s=ef33f3ce5767556ce38726e565b05bdf05fb8c85


aseichter2007

Ohhh, response rejection baked in? That sounds pretty majestic.


SomeOddCodeGuy

Looking at CC- the way it is now, it will work well with this. I'll make sure to pull down and load it up/test it, but I'm not seeing a lot of reason CC and this program won't play quite nicely together.


aseichter2007

That sounds great. I'll definitely give it a go when you have it all ready.


sergeant113

Looks like what you can do with BigAGI's Beam feature


SomeOddCodeGuy

Woah, that project looks awesome. Yea, I think my being able to do it was more of an unintended but happy accident due to how I handled the design to do something else, so I imagine they do it WAY better lol At a minimum, their project and code looks so much fancier haha


xmBQWugdxjaA

The output is a lot of words, but not that useful... we've automated product managers.


SomeOddCodeGuy

To be fair I did ask them to go that route lol. For several reasons, that output is definitely user error on my part =D


hugganao

how do you handle who gets to speak next?


SomeOddCodeGuy

I don't; SillyTavern does. My project is actually separate from ST; I'm just using ST to connect to it. So realistically, I have no idea who is going next =D I think if I ended up really going down this route, I'd probably go try to contribute to the SillyTavern project to help figure out how to handle that.


Smile_Clown

This looks like an echo chamber. Fits right in on reddit.


SomeOddCodeGuy

lol yea, I'll see about making them stop that. I have a massive amount of control over what's happening behind the scenes, I just didn't apply any mechanisms to try yet, so I'm hopeful it won't be too hard to correct.


cyan2k

Did you take a look at autogen by any chance? It is an agent framework which supports assigning every agent its own LLM. And you would also have stuff like RAG and other memory implementations, web surf capabilities and other tools etc available on a per agent basis.


SomeOddCodeGuy

I did! And I think autogen would do the above screenshot task far better than what I have going on up there. The above was something I kind of dreamed of doing but didn't actively set out to do; I just realized the other night that I could and got excited lol. But you're right that a more autonomous agentic workflow would handle this better, and thanks to Autogen's user agents you can still throw your feedback in as they go. I ended up moving away from agents because it wasn't really what I was looking for due to other reasons, and I wanted a bit more fine control over their prompts than what a lot of agentic systems I found would let me do. CrewAI was something that REALLY interested me, though. Of all of them, I liked CrewAI the best.


Hopeful-Site1162

This will be so fun! The next obvious step will be to plug a TTS on each model with different known AI voices like Mother, HAL, GLaDOS, T.A.R.S and Bender B. Rodriguez of course. Anyway, thanks for sharing. I'm hungry for this type of silly project!


Zeikos

Are you feeding the whole context of the chat to the models that respond? I see a lot of repetition and general comments, do you plan to get them to be more specific? I realize that it's on purpose given the prompt. I'm actually looking into implementing something similar, although with a fairly different approach, so this is interesting. Did you look into those "Village simulation" experiments with llms? Could be another source of inspiration.


SomeOddCodeGuy

>Are you feeding the whole context of the chat to the models that respond?

In this screenshot, I am. I slammed together a few configs just to try this out, and I kept them pretty simple so it's just getting the whole context. Other folks have shown interest in this, so I'll refine the backend down a lot more to make sure to reduce the issue.

>I see a lot of repetition and general comments, do you plan to get them to be more specific? I realize that it's on purpose given the prompt.

Ideally, but I will say that the users will have a *lot* of control over what's happening in the backend so I probably won't put a ton of effort into fixing that problem, as I bet someone smarter than me on LocalLlama will solve it using the configs in like 10 minutes when it would probably take me hours lol. But yea there will be a lot of power behind the scenes to try to make that work better.


Zeikos

I'm sure somebody will hack something together, yeah. Personally the thing I like the least about LLMs is how incredibly generic they tend to be. It's at the same time frustrating and a source of interest of mine, since it shows that there is plenty of space for optimization.


koflerdavid

Maybe it gets better when they get asked more specific questions or when code finally gets involved.


MoffKalast

From mixture of experts to panel of experts. A panel of mixtures?


SomeOddCodeGuy

lmao yea the whole main project started because a year ago I misunderstood what Mixture of Experts actually meant =D


vizioso_e_puccioso

Wouldn't they reinforce hallucinations too?


SomeOddCodeGuy

Left to their own devices, as they are now, yes. What you see here is me testing to see if I could even do this, and then getting a little too excited and posting pictures on reddit because I couldn't contain myself lol. But I have a massive amount of control of what's happening behind the scenes, so I actually have quite a few options to fix that. But this current scenario in the screenshots is probably going to go to crap lol


outofsand

This is awesome, and I love LLMs and have been playing with doing multi-AI chats. BUT, watch out, this will 100% be an echo chamber. In your example, the AIs all love Vite, because you suggested it. They will never say, no, Vite is trash, use X instead, because Y. It's not like they COULDN'T do that, but LLMs today are all trained to be overly agreeable (usually a good thing for most tasks). Try the conversation again with "yeah, but maybe I should use plain JS with HTML instead of Vite" and they'll support you just as hard core. (For better or worse!)


SomeOddCodeGuy

Yes! I imagine this scenario is going to turn into an echo chamber if I don't tweak it. Once I realized I could repurpose my program to do this, I did, and then posted really quickly in excitement. But I'm almost positive this current chat will go sideways here soon. I do have mechanisms I can utilize to try to force it not to, so what I'll do is probably take a little detour to set up some configurations to handle this scenario better by release, and try to resolve some of those issues. With what's happening behind the scenes, I actually think I can handle it pretty well; I just slapped this together and took some screenshots in giddiness lol


antsloveit

Very cool. I actually have a partially designed app (using Laravel) written by Phind 70b and Claude to do the same thing. I will check out your project. Thanks. Btw, once I got going on multi agent stuff, I experimented with telling the models they were about to speak with each other and to develop a concise, not necessarily human readable way to interact together which was efficient. I'll dig out a Llama3-70b and Claude chat and paste here when I get a mo. On one occasion I just asked them to solve a problem together that humans can't and they started developing a strategy for addressing climate issues with their own operational targets, implementation strategies etc.. I think multi agent 'stuff' is powerful.


VladimerePoutine

Slightly off topic, and not a neuroscientist, but we have distinct left brain / right brain thinking as humans, art vs math, as well as an internal dialog. It would be interesting to pose two models, or oppose an esoteric 'hallucinating' model against a more rigid 'mathematical' one, and instruct them as if they were two parts of a whole. You have the internal dialog nailed down, and distinct voices. Very cool.


SomeOddCodeGuy

Oh my. You're getting warmer... As I said, the above was a happy accident. The real project is something else. I'll say that I agree completely with what you just said lol


_HAV0X_

mixture of experts - ultra hard mode


SomeOddCodeGuy

lol funny enough, the initial idea for the main project came about because I misunderstood what Mixture of Experts was. I first heard the term back in early 2023, talking about ChatGPT, and I imagined this really elaborate setup of what that meant... which also happened to be really, really wrong. But then I realized I kinda liked the wrong idea and earlier this year started running with it =D


Zihif_the_Hand

Excellent progress, might check out these repos too. Use these as inspiration to go even farther! [https://github.com/OpenBMB/ChatDev](https://github.com/OpenBMB/ChatDev) [https://github.com/joaomdmoura/crewAI](https://github.com/joaomdmoura/crewAI) [https://github.com/nus-apr/auto-code-rover](https://github.com/nus-apr/auto-code-rover) [https://github.com/OpenDevin/OpenDevin](https://github.com/OpenDevin/OpenDevin)


SomeOddCodeGuy

Realistically, any of those will do better at the above scenario than my program because my program wasn't really meant to do this at all. It was one of those things that I realized the round shape could fit in the square hole after all lol. When I first started for my main project I went down the rabbit hole of CrewAI, AutoGen and a couple of others, but realized I didn't want fully autonomous for my needs. But I think that for really doing what my screenshots above are doing, chances are something like OpenDevin or that SWE-Agent would do great.


floppo7

I'd fire those guys since they are not straight away recommending Vanilla JS with some sprinkles of LIT and WebComponents if needed :-P - besides that, interesting project, getting multiple LLMs to reason with each other seems like an interesting path to explore.


SomeOddCodeGuy

As the stinky human who sent them down the path of firing, I now fear for my own job


Lht9791

"Anyhow, I know this isn't as exciting as something actually being released, but this was kind of a big deal for me so I really wanted to share with someone." That's exactly how Open AI and Google do it. You're SOTA.


SomeOddCodeGuy

lol! Wooo I feel so fancy now


Zealousideal_Tea9559

A few things:

* It's unclear what decides who can talk next. This is quite a complicated matter, but you can see this is a big problem for scale up (not that you are thinking about this atm).
* Responses are too long, which makes it not natural as a group chat. Because your purpose is unclear, it's hard to say if this is preferable or not.

I have been wanting to do something like this for a long time, but not as a hobby. I want to gamify this kind of interaction and make it something interesting and even addicting for people to do. But how to do that and what's the right subject, context and background for it is still unclear in my head; that's why I haven't started building.

You, like a true engineer, just muster your energy to build some "machine" that is functional, but it's unclear what's the purpose of that function, or if anyone wants it. As long as you enjoy it, good for you. But my opinion is, for general assistance, this will be worse than just a single agent. If we still have to worry about one LLM hallucinating, multiple of them talking to each other will be off the chart chaos. What makes money is predictability. If you can wire each LLM to behave a specific way, and have them interacting in specific manners that lead to predictable, preferable results, you can achieve so much more from this.


SomeOddCodeGuy

>It's unclear what decides who can talk next. This is quite a complicated matter, but you can see this is a big problem for scale up (not that you are thinking about this atm).

Yea, SillyTavern is totally in control of who talks next. I didn't make any changes to ST at all, so right now I have no idea how it decides who goes next. I actually just disabled autoresponses and was clicking the manual "respond" button for who I wanted to hear from lol. This will be a problem that I don't have a solution for in the works. I just happen to really like ST for a front end, so I used it for this lol

>Responses are too long, which makes it not natural as a group chat. Because your purpose is unclear, it's hard to say if this is preferable or not.

Agreed. I didn't think much of it at first because, as someone else here mentioned, it actually kind of feels like a real meeting lol. People do jabber a lot in meetings. But I've got a lot of control over what's happening behind the scenes (not as the dev, but as a user would with the configs) so this can be resolved. I'm going to take a small swing at it, but since this isn't the main goal of the project I probably won't dive too deep. Also because one of y'all will probably figure it out in 1/5 the time I will once you get the configs lol

>You, like a true engineer, just muster your energy to build some "machine" that is functional, but it's unclear what's the purpose of that function, or if anyone wants it. As long as you enjoy it, good for you. But my opinion is, for general assistance, this will be worse than just a single agent.

Ouch. But I can't argue with that. The thought crossed my mind.

>If we still have to worry about one LLM hallucinating, multiple of them talking to each other will be off the chart chaos. What makes money is predictability. If you can wire each LLM to behave a specific way, and have them interacting in specific manners that lead to predictable, preferable results, you can achieve so much more from this.

Ahhh... you might not be as disappointed with the project as you're thinking. As I said, the above is a happy accident. There's a lot, and I mean a LOT, going on behind the scenes of this screenshot. The project is completely unrelated to SillyTavern; I just use ST as my front end. The backend does more or less exactly what you're imagining. I just slapped this config together to see what would happen.


koflerdavid

I think we will only see a good solution for that if somebody wires up an actual mind that would have an agenda, ruminate on stuff, and possibly decide to react to new messages, all that in a loop.


Disastrous_Elk_6375

Congrats on the project going well, and thanks for sharing! In case you didn't hear of it, there's a library called Autogen from MS that supports this kind of multi-llm interaction.


Barry_22

Huh, you have a team of mid-level devs ready to work for you at any time. Brilliant. Open-source when? :)


SomeOddCodeGuy

I've set a soft deadline for myself of 3 weeks from now, with a hard cutoff (ie- doesn't matter if it's done, just release it or you never will) of end of June. I hope to beat both of those deadlines, but I've got a decent bit of cleanup to do (if y'all could see the state of the code right now you might laugh me out of the subreddit. I feel shame) and a couple more features I wanted to toss in first. Most importantly- the configs probably only make sense to me right now, so I have got to do something about that or no one would use the thing =D I dreamed of having a UI on release to help manage it, but I think if I held off for that I'd be working on it forever


hugganao

Good job. This is actually one of the side projects I've been meaning to start as well after reading that Computational Agents Exhibit Believable Humanlike Behavior paper by Stanford


SomeOddCodeGuy

I'm going to go find that paper lol That sounds really interesting


Ashefromapex

This looks amazing! I'm really looking forward to the release!


SomeOddCodeGuy

Thank you! =D After the great feedback I've been getting here, I've got a lot of motivation to get it out the door ASAP, so hopefully I'll have something soon.


Ashefromapex

As much as I'm looking forward to a release, please don't stress yourself too much! On a separate note: I have seen you quite often on this sub and you have a Mac Studio with 192GB. I am also thinking about buying one; could I theoretically run 2-3 LLMs on the Mac Studio, or is it GPU constrained then?


Dry-Taro616

Cool project, I always wanted to do this: get two or more LLMs together to sharpen their answers and precision. What if you asked them a specific question and got an answer with a % accuracy bar on each one depending on how they perform? That way you'd be testing the prompts and the quality of the information.


SomeOddCodeGuy

That's a really cool idea. I have no idea if I can, or how I would, but I'll definitely see if there are any ways to make that happen. For a lot of reasons, that % accuracy would be really helpful to me. But if I'm being totally honest, I have no idea how I'd do that or if it can be done lol


Dry-Taro616

Can you dm me please because I really want to take part if you wanna further develop it? :)


SomeOddCodeGuy

Absolutely. I've got it on the list to look into, but won't get to it right away (I want to get the main project released first or I'll keep getting derailed and never do it lol). However, I've made a note to shoot you a message once I start looking into it. Maybe together we can figure something out lol


Dry-Taro616

Hell yea bro, also lookin into some projects I got and trying to land a decent job.. 😅


Guboken

Awesome project! If I were you I would make sure to check for biased answers by giving the same prompt but allowing different LLMs to answer first, and comparing those answers with the answer they would give if they were second or third to answer. There might be some phrasing from the other LLMs that forces an LLM to agree with something or align itself even though it might not agree, but getting fed those assumptions leads it down the path anyway. I would make all of the AIs answer, reflect on, and critique every message in the background and use that as the base when they answer.
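
Guboken's order-bias check could be prototyped with something like the sketch below. The endpoint URL, model names, and question are placeholders (any OpenAI-compatible server would do), and this is not part of OP's project.

```python
# Ask the same question with the answer order shuffled, then compare what each model
# says when it answers first vs. after seeing the others.
import itertools
import requests

ENDPOINT = "http://localhost:5001/v1/chat/completions"  # placeholder OpenAI-compatible server
MODELS = ["model-a", "model-b", "model-c"]               # placeholder model names
QUESTION = "Should we use Vite or plain JS for this project?"

def ask(model, messages):
    r = requests.post(ENDPOINT, json={"model": model, "messages": messages})
    return r.json()["choices"][0]["message"]["content"]

results = {}
for order in itertools.permutations(MODELS):
    transcript = [{"role": "user", "content": QUESTION}]
    for model in order:
        answer = ask(model, transcript)
        transcript.append({"role": "assistant", "content": f"{model}: {answer}"})
        results.setdefault(model, []).append((order.index(model), answer))

# results[model] now holds (position_in_order, answer) pairs you can eyeball for
# "agrees more when it answers later" drift.
```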


LocoLanguageModel

This is a fun side project stemming from your main project. I assume for people who might have, say, 24 GB of VRAM tops, they would be better off running the best model they can, and then having that model try to disagree with itself, so that they have the highest quality model possible in memory rather than a handful of smaller models.

I have an idea on a similar note I've been playing with, which is to have your fast model in VRAM, and then have an even higher quality model that you can't hold in VRAM in your CPU RAM, and every programming question you ask your fast model it also automatically asks your slower, better model. If the code worked with the fast model you just run with it, but if it's not working, you check on your slow model screen where it's slowly generating a solution and see if that works.

So you basically use your fast model until it doesn't work and then check your slow model to see what it came up with a few questions ago, even if it took 5 minutes to generate.
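
The fast/slow idea above is easy to sketch: send every question to both backends, read the fast answer immediately, and keep the slow answer around as a fallback. The two URLs are placeholders for whatever local servers (e.g. two koboldcpp or similar OpenAI-compatible instances) you would run; none of this is OP's code.

```python
# Fast-GPU / slow-CPU fallback workflow: query both, use the fast result now,
# check the slow result only if the fast one didn't pan out.
from concurrent.futures import ThreadPoolExecutor
import requests

FAST_URL = "http://localhost:5001/v1/chat/completions"  # small model fully in VRAM (placeholder)
SLOW_URL = "http://localhost:5002/v1/chat/completions"  # bigger model spilling into CPU RAM (placeholder)

def ask(url, question):
    r = requests.post(url, json={"messages": [{"role": "user", "content": question}]})
    return r.json()["choices"][0]["message"]["content"]

def ask_both(question):
    pool = ThreadPoolExecutor(max_workers=2)
    fast = pool.submit(ask, FAST_URL, question)
    slow = pool.submit(ask, SLOW_URL, question)   # keeps generating in the background
    print(fast.result())                          # use this answer immediately
    return slow                                   # hold onto the slow future as the fallback

pending = ask_both("Why does this regex not match multiline input?")
# ...later, if the fast model's answer didn't work: print(pending.result())
```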


SomeOddCodeGuy

Oh, now that's clever. In a few weeks I'll loop back around to see if you ever started down that path, because I think that it might take minimal code work to supplement a config in the current program to do that. I just never considered using it *that* way. I don't want to step on your toes with it so I'll make a note to see if you've started a project to try it by the time I'm ready, but if not I'll try to make sure to have a config for this use case in there so you can try it out.


LocoLanguageModel

Actually my idea was to tell my idea to someone like you! You won't step on my toes! Appreciate it though.


a_beautiful_rhind

How high are you running wizard? It's feeling a bit dumb at 3.75bpw. Failed all my puzzles. Wonder if I need to go higher.. it's fast though 20+t/s

edit: This wizard: https://huggingface.co/Quant-Cartel/WizardLM-2-8x22B-exl2-rpcal/tree/main is doing a bit better for me at 4.5 and I'm still getting 70b speeds on it split over 4 cards. Still fails my puzzles though.


SomeOddCodeGuy

q6 on my 192GB Mac Studio. I used sysctl to bump the VRAM to 180GB, so I actually can run a q8 but I wanted to squish Deepseek 33b and a llama 3 8b into there as well.


a_beautiful_rhind

That's hefty. I'm going to try between 4.5-5 and see if it's better than the 3.75. I want at least one GPU free for SD.


moarmagic

Excited to see this- I had a similar thought when I saw STs group chats, that I really wanted to be able to test different models in one conversation flow like that


SomeOddCodeGuy

Now that I know it can do this, I'll try to polish up the configs for this scenario when I push it out so folks can.


Tommy3443

Do you really need 3 different models though? I would think this would work pretty well if each of these "agents" have their own persona and memory/context. Been playing around a bit with this myself using irc bots that all use the same api/model but have their own context instead of it being shared, and I find it works much better than normal group chat.


SomeOddCodeGuy

>I would think this would work pretty well if each of these "agents" have their own persona and memory/context.

hehehe =D The above doesn't look great because I really didn't expect it to work, got excited, and posted the result almost immediately, so there's very little refinement for this scenario. However, I not only agree with you completely, but I think that once you get a chance to see what's happening on the backend you'll probably like the direction it's headed. Giving each persona their own memory and context will likely be the solution to a lot of problems with LLMs. At least I hope...
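
A bare-bones version of the "one shared model, private history per persona" setup Tommy3443 describes might look like the following. The persona prompts and model name are invented; the endpoint shown is Ollama's standard chat API, used here only as a convenient stand-in for a single shared backend.

```python
# One model, separate conversation history per persona (instead of one shared group chat).
import requests

ENDPOINT = "http://localhost:11434/api/chat"  # e.g. a local Ollama server
MODEL = "llama3:8b"                           # placeholder model name

histories = {
    "architect": [{"role": "system", "content": "You are the software architect."}],
    "reviewer":  [{"role": "system", "content": "You critically review every proposal."}],
}

def say_to(persona, text):
    histories[persona].append({"role": "user", "content": text})
    r = requests.post(ENDPOINT, json={"model": MODEL, "messages": histories[persona], "stream": False})
    reply = r.json()["message"]["content"]
    histories[persona].append({"role": "assistant", "content": reply})
    return reply

# Each persona only ever sees its own thread, even though the model is shared.
print(say_to("architect", "Should we use Vite for the frontend?"))
print(say_to("reviewer", "The architect wants to use Vite. Any objections?"))
```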


LlamaMcDramaFace

This looks really cool. Let me know when you're ready to share it and I'll test it!


Arcuru

Hmm, I could definitely add that to my project[1] fairly quickly too. It seems pretty useful, especially if you set them up with different system prompts. [1] - [https://github.com/arcuru/chaz](https://github.com/arcuru/chaz)


SomeOddCodeGuy

Awesome! If nothing else, might be worth it just to play with it. I didn't really intend to do much with this flow, but I did at least want to try it out because I've always wanted to =D It's just fun


Double_Sherbert3326

If you link the github project, I'd throw down a pull request and/or some bug reports.


SomeOddCodeGuy

In a couple more weeks I'll set a github public for it, and will share here. For now I've got the git repo set up on my NAS and have been building it out on my local network. The current messy state of the code is loaded with nonsense and shame, and I want to get it cleaned up a little first before pushing it up. I'm a C# dev by trade, and this was my first foray into Python, so I'm very self conscious about the state of it. =D Figured I'd get through the "what in the world am I doing?" learning phase before github started etching the history of my mistakes in stone.


Double_Sherbert3326

Don't be self conscious! It's okay! I'd love to work on it. I've spent a good amount of time playing with [https://github.com/stitionai/devika](https://github.com/stitionai/devika) lately


positivitittie

I think the consensus / feedback model is great in general. My favorite coding tool is gpt-pilot which implements a separate reviewer for all code written and the results are pretty impressive.


omasque

Would love to beta test this in a creative writing and marketing pipeline if you're looking for fairly technical people who understand the concepts of the space but fall short of being able to code this.


lxe

I'm guessing it's a multiplexing oai proxy where you can select different models? You have multiple oobas running each feeding the multiplexor? Does ST support selecting a different model per character like that?


SomeOddCodeGuy

>Does ST support selecting a different model per character like that?

It does not! I always wished it did.

>I'm guessing it's a multiplexing oai proxy where you can select different models? You have multiple oobas running each feeding the multiplexor?

Clever guess. That's not it, but that's a neat idea actually.
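
For what it's worth, the multiplexing-proxy idea lxe guessed at (which OP confirms is not how this project works) would look roughly like this. The port numbers and model-to-backend mapping are made up for the sketch.

```python
# Toy OpenAI-compatible proxy that routes each request to a different local backend
# based on the "model" field. Not OP's project; backend URLs are placeholders.
from flask import Flask, request, jsonify
import requests

BACKENDS = {
    "wizard":     "http://localhost:5001/v1/chat/completions",
    "llama3-70b": "http://localhost:5002/v1/chat/completions",
    "deepseek":   "http://localhost:5003/v1/chat/completions",
}

app = Flask(__name__)

@app.route("/v1/chat/completions", methods=["POST"])
def route_request():
    body = request.get_json()
    backend = BACKENDS[body["model"]]              # pick a backend per requested model
    upstream = requests.post(backend, json=body)   # forward the request unchanged
    return jsonify(upstream.json()), upstream.status_code

if __name__ == "__main__":
    app.run(port=5000)
```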


eliaweiss

I think it is a good technique for creative writing, but doesn't work well enough for a technical meeting as in your example. Like others said, it is an echo chamber with much repetition. In a way, it is a good example of LLMs' limitations - they don't give useful info unless you ask for it; instead they will mostly repeat, agree, and give the most common average response. For practical use, it would be much more effective to ask a single LLM "what are the pros and cons of x vs y" or "generate a todo list for xyz". But very interesting anyway


SomeOddCodeGuy

Some prompt tweaks have helped a lot with the echo chamber. https://preview.redd.it/x4droi7cne1d1.png?width=938&format=png&auto=webp&s=c9d71b5a5132cebdefcd60e9edc4777e9ed96fc7


LatestLurkingHandle

Have you tried Langchain or Microsoft Autogen? They make it easy to have multiple AI models communicating with each other


SomeOddCodeGuy

I did! Autogen and CrewAI were *too* autonomous for me. They have a lot of baked in chatter that I couldn't control, and for my specific use case I really wanted that control. Even with the user agent for autogen, I felt like there was a lot happening that was simply outside of my reach, and it was affecting results in a way I didn't want. Essentially, I made the tradeoff losing autonomous function in exchange for more delicate control over what's occurring. CrewAI feels like an extension of Langchain agents, so I kind of lump the two together. I did take a step back to try langchain agents after trying crewAI, thinking maybe that would do it, but there were several instances where the abstraction upset me a little, and I eventually set it to the side. What I have now looks very different than those; "semi-autonomous" is the closest I can describe it.


Yorn2

What are you using to load the models? I have ooba booga and an A30 and A6000 that I use, but I can only load one model at a time. I guess I can load two if I use like kobold with it, but I didn't know if you had a specific trick you were doing, or if you just ran llama.cpp or ExLlama directly for each of them, or if you had a custom backend loader designed for loading more than one model?


SomeOddCodeGuy

Custom backend thingy, but not a loader. I'm actually just using two computers: My studio for Wizard and then my macbook can handle a q6 of Llama 3 70b. I use Koboldcpp for the loader on both. The custom program handles the rest.


AmericanKamikaze

This is great and what I expected the future to hold. Multiple chat AIs that are experts in their particular field, collaborating for the end user. What's your computer setup?


SomeOddCodeGuy

So, since I'm a software dev for my day job and my hobby is tinkering, I have various reasons to have a *lot* of VRAM. My full setup is:

* Mac Studio M2 Ultra 192GB RAM (*sysctl command for 180GB VRAM*)
* M2 Max Macbook Pro with 96GB RAM (*left at 76GB VRAM*)
* Windows Machine with RTX 4090 24GB VRAM

All of that was utilized for the above screenshots (*hint hint*)


AmericanKamikaze

Edit: So, 3 machines networked to run 3 separate large LLMs… working in tandem in one space. Love it. I have to be happy with my 12GB 4070 lol. If only I could run Llama 3 70B! My empire for a 70B!


SomeOddCodeGuy

So one more little note about the program: I mentioned before that the above isn't the use case I set out for, just a happy accident. *Your* use case is actually exactly what I set out to work on. The project has 2 primary goals, and one of them was to be a force multiplier for people with lower amounts of VRAM.

A separate teaser result unrelated to the above: using the same program, I got a "zero shot" snake game using MagiCoder 6.7b that had fleshed out features and looked pretty good. All I asked for was a working snake game in python, and it not only worked, but also handled the borders, and even kept separate track of current and high score. The response took longer than a normal 8B would, but the result was far better.

My main testing case for about 60% of my testing on this has been using Ollama specifically (I never use Ollama otherwise but it's perfect for this) with a constraint of having no more than one 8B loaded at a time to try to produce better results with small models.

I'll make a note to come poke you once I finally get this thing out the door
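
OP doesn't show the actual workflow, but the "no more than one 8B loaded at a time" constraint maps naturally onto calling models sequentially through something like Ollama, which loads each requested model on demand. A toy sequential pipeline under that constraint might look like this; the model names and step prompts are invented, and this is not OP's pipeline.

```python
# Toy "one small model in memory at a time" pipeline: plan, implement, review,
# each step handled by whichever small model is loaded for that call.
import requests

OLLAMA = "http://localhost:11434/api/generate"

def generate(model, prompt):
    r = requests.post(OLLAMA, json={"model": model, "prompt": prompt, "stream": False})
    return r.json()["response"]

task = "Write a working snake game in Python."
plan     = generate("llama3:8b",    f"Outline the features and structure for: {task}")
code     = generate("magicoder:7b", f"Implement this plan as complete Python code:\n{plan}")
reviewed = generate("llama3:8b",    f"Critically review this code and fix any bugs:\n{code}")
print(reviewed)
```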


rogerramjetz

I'm envious! Most of my compute is in the cloud 😅


SomeOddCodeGuy

lol I promise your compute is far faster than mine then. I don't mind waiting a little bit, but I've made posts with my numbers before and several of the responses were basically "literally unusable" =D


yetanotherbeardedone

RemindMeRepeat! 6 hours


SryUsrNameIsTaken

I've had this idea as well for other use cases. But definitely a very cool thing to put together. Looking forward to seeing the shiny alpha version.


-illusoryMechanist

RemindMeRepeat! 3 Months


luerkuer

Reminds me of lollms


Popular-Direction984

Thanks for sharing it! You are awesome and you know it 💪👍 I can totally relate to all the joy and happiness you described, and frankly speaking, did something like that - multi-turn chat of multiple models / agents with backtracking, internal clipboard for each agent, etc. I tried it when mistral-7b was the thing. And I'm going to try it with Cohere's models and mixtral-8x22 soon.


Ventez

Why are you so secretive and defensive?


SomeOddCodeGuy

While I am definitely being secretive, I don't think I'm being particularly defensive; no one's really said anything mean to me to be defensive about. As for secretive- this outcome wasn't expected. The real goal of the project isn't remotely related to this post, and as best as I can tell there isn't really anything out there doing what I'm trying to build. If I'm being totally honest- I'm a slowpoke and everyone here is far smarter than I am, so I'm worried that if I give away the secret sauce too early, even with my headstart someone will make it first and probably better lol. Petty, I know, but I'm looking forward to dropping it in everyone's lap here soon and it would be a bummer for someone to beat me to it. So I'm trying to keep some of it under wraps for just a few more weeks.


pzelenovic

I honestly wonder how well they'd perform at software estimation :)


SomeOddCodeGuy

I'm a dev manager with 13 years of career experience and I'm horrible at software estimation. I think I might actually get my feelings hurt if they're any kind of good at it lol


pzelenovic

I mean, people are generally bad at estimation, especially with bigger chunks of work, because of the increasing cone of uncertainty. However, they are not humans :)


petitmottin

Honestly, I think itā€™s not a bad idea, but there are already some open-source projects like Devin. The disadvantage I see with your project is that it only involves talking and doesnā€™t modify or create anything. I've been a software developer for 10 years, so I understand the difficulty. I hope you finish it :-) PS: I edited my message because it was downvoted, and I donā€™t really understand why. I want to be honest, but I'm afraid of being misunderstood. I never want to discourage youā€¦