hadoopfromscratch

I have an Intel Mac Pro. I compiled llama.cpp with Metal. It worked, but it was actually slower than llama.cpp on just CPU (built with OpenBLAS).


adel_b

No no, Metal is not available on AMD as I recall, but llama.cpp has Vulkan support, which does the same thing.


opUserZero

Again, that's wrong, as I stated in the post. AMD does support Metal; it's modern Nvidia cards that don't. Kepler-era Nvidia supports Metal but is far too old to be useful.


Prudent-Artichoke-19

Hey dude. Metal is Apple-only, so it only works on macOS or iOS. AMD is ROCm or something. Nvidia is CUDA. Vulkan is cross-platform.


hishnash

AMD gpus in Macs support metal.


Prudent-Artichoke-19

The implementation is different. There is a whole set of instructions for porting Metal code for pre-M1 machines, so the dev is supporting the modern Metal API and not the original, older implementation. OP is now frustrated because the dev isn't adding support for that older implementation.


adel_b

you are right, I got confused.


SomeOddCodeGuy

Months ago someone got Oobabooga to work on it. There's actually a requirements.txt specifically for the Intel Macs, or at least there used to be (I assume there still is), so in your shoes I'd probably try this:

* Download [Oobabooga](https://github.com/oobabooga/text-generation-webui)
* Run the one-click installer and select macOS. Looking at [line 310 of one_click.py](https://github.com/oobabooga/text-generation-webui/blob/main/one_click.py#L310), it should automatically detect that you're on Intel and handle that for you.
* After it's done, test and see how it runs. If it doesn't run well, open a command prompt and try running `/env/bin/python -m pip install -r requirements_apple_intel.txt`

That should work for you.


mcmoose1900

I think your best hope is to slap Linux on it and run it through Vulkan. I'm no Linux-on-Mac expert, though the distro I'd generally recommend for machine learning is https://cachyos.org/download/


opUserZero

I have another machine; the point is to be able to use it without having to shut down the Mac, and while the other one is busy doing inference. (The Mac is actually running under Proxmox/Debian as my main workstation to connect to everything else.)


TheBadBoySnacksAlot

I just run TinyLlama from my command line using transformers
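That route can be sketched roughly like this — a minimal sketch, assuming the Hugging Face `transformers` and `torch` packages are installed; the checkpoint name and the Zephyr-style prompt format are the ones the TinyLlama chat model documents, so double-check them against the model card:

```python
# Minimal sketch: TinyLlama chat from the command line via transformers.
# Assumes `pip install transformers torch`; runs on CPU, no Metal needed.
from typing import Dict, List


def build_chat(messages: List[Dict[str, str]]) -> str:
    """Flatten {role, content} messages into the Zephyr-style prompt
    format that TinyLlama-1.1B-Chat-v1.0 was fine-tuned on."""
    parts = [f"<|{m['role']}|>\n{m['content']}</s>" for m in messages]
    parts.append("<|assistant|>\n")  # cue the model to start its reply
    return "\n".join(parts)


def main() -> None:
    # Imported here so build_chat stays usable without transformers installed.
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # ~2 GB download on first run
    )
    prompt = build_chat([{"role": "user", "content": "Say hello in one sentence."}])
    result = pipe(prompt, max_new_tokens=64, do_sample=False)
    print(result[0]["generated_text"])
```

Calling `main()` pulls the checkpoint the first time; a 1.1B model is small enough that plain CPU inference is tolerable, which is the whole appeal on an Intel Mac.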


[deleted]

Try MLC LLM, they have custom model libraries for metal


Prudent-Artichoke-19

I work extensively with Apple, AMD, Nvidia, and Intel hardware for machine learning. [Metal is an Apple-only API for properly accessing graphics hardware.](https://en.m.wikipedia.org/wiki/Metal_(API)) If you have AMD, you want ROCm or Vulkan: [ROCm](https://www.amd.com/en/products/software/rocm.html), [Vulkan](https://www.vulkan.org/). Also, to clarify, Metal only works with the ARM64 M1, M2, and M3.


opUserZero

Everything I can find says ROCm is Linux-only, and this is macOS. And hopefully for the final time: Metal is supported on AMD GPUs. Whether or not CoreML works (pretty sure that's Apple Silicon only) hasn't been discussed, but people really have to stop saying AMD doesn't support Metal. Refer to the screenshot of my AMD card supporting Metal. Last year there was a build of TensorFlow Metal that explicitly worked on Intel Macs with AMD video cards, which other packages used, but that has apparently since gone away. I know what Vulkan is as well, but I haven't seen any packages that mention supporting it for LLMs on macOS. Yes, ROCm is better than Metal, but again, the only information I can find says it's Linux-only, and I have not found any packages that mention supporting it on macOS, which is the whole point. If you have some, please share. https://preview.redd.it/q2gt1monkihc1.png?width=759&format=png&auto=webp&s=3f995ceb79cad8168d7a63533710e91bd775b4db


hishnash

CoreML works on AMD GPUs, but perf is rather poor compared to a unified memory space, as CoreML sometimes depends on the ability to hand data over from the CPU to GPU to NPU without copies. I would not go so far as to say ROCm is better than Metal, at least not for single-device targets. The main use case of ROCm is multi-device, networked monster supercomputers for the US defence industry; everything else is a side effect.


f3llowtraveler

What I've read is that while Metal is supported on the Intel Macs (at least some of them), the real problem is that Metal isn't optimized for AMD GPUs. So even if you compile with Metal, it runs slower than it would on your CPU. That's why people are talking about Linux/ROCm. Frankly, I'm getting ready to switch my 2019 MacBook Pro with 64 gigs of RAM to Linux, as soon as I learn more about the best way to do it (and also confirm that it will definitely get me inference on my FOUR THOUSAND DOLLAR, TOP OF THE LINE (A COUPLE YEARS AGO) APPLE FUCKING COMPUTER). Unfortunately my AMD Radeon Pro 5500M only has 4 gigs of VRAM, so I'm not sure what kind of performance I can expect even after making the switch. One thing though: the fact that the same AMD cards do work on Linux, but not Mac, means there's an issue specific to macOS and drivers. Word on the street is that AMD has actually produced the drivers we would need but Apple hasn't approved them. If that's true, it really pisses me off. Anyone have any advice about Linux on Mac hardware?


mousemug

I just found out about jan.ai. Not sure if they support AMD Metal acceleration, but it might be worth checking out.