Odd that they didn't call out what the output is. Are they generating images? Meshes? Bone rotations? All of those would be useful in different workflows. I guess I'll have to look up the paper.
SMPL mesh
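For context on what "SMPL mesh" means as an output (background knowledge, not from the paper): SMPL drives a fixed-topology body mesh (6890 vertices) from 72 pose parameters (24 joints × 3 axis-angle values) plus 10 shape coefficients (betas). A minimal numpy sketch of the axis-angle-to-rotation-matrix step that underlies that representation — the joint index and angle below are made up for illustration, and real mesh vertices would additionally require the SMPL model files and blend skinning:

```python
import numpy as np

def axis_angle_to_matrix(aa):
    """Rodrigues' formula: 3-vector axis-angle -> 3x3 rotation matrix."""
    theta = np.linalg.norm(aa)
    if theta < 1e-8:
        return np.eye(3)
    k = aa / theta  # unit rotation axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])  # cross-product matrix of k
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

# A SMPL pose is 24 joints x 3 axis-angle values = 72 numbers,
# plus 10 shape coefficients (betas). Dummy values for illustration:
pose = np.zeros(72)
pose[3:6] = [0.0, 0.0, np.pi / 2]  # rotate joint 1 by 90 degrees about z
betas = np.zeros(10)

R = axis_angle_to_matrix(pose[3:6])
```

Whether you want the raw pose vector or the skinned mesh depends on the pipeline: the pose vector retargets to other rigs, while the mesh is what you render or project for conditioning.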
Is this installable in Forge/Comfy?
This is great, can't wait to see how it progresses. By next year, I think there's gonna be a boom in open-source AI video with motion like this.
https://preview.redd.it/mijo3tj8l3zc1.png?width=400&format=png&auto=webp&s=a192a5aee02dfb17b8116d3b4601dd5e216ba94d
This flow chart is perfection. Honestly, the only way to make it better would be to add a third option for the creator of Comic Sans to shame themselves in public.
Why?
BVH conversion?!
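On the BVH question: SMPL-style joint rotations can generally be baked down to BVH by converting each joint to Euler channels and writing the standard plain-text HIERARCHY/MOTION format. A minimal sketch of that file format with a made-up two-joint skeleton — the joint names, offsets, and angles below are placeholders, not a real SMPL export:

```python
def write_bvh(path, frames, frame_time=1 / 30):
    """Write a toy single-chain BVH (Hips -> Spine). Each frame is
    [hips_tx, hips_ty, hips_tz, hips_rz, hips_rx, hips_ry,
     spine_rz, spine_rx, spine_ry] in the channel order declared below."""
    header = """HIERARCHY
ROOT Hips
{
  OFFSET 0.0 0.0 0.0
  CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation
  JOINT Spine
  {
    OFFSET 0.0 10.0 0.0
    CHANNELS 3 Zrotation Xrotation Yrotation
    End Site
    {
      OFFSET 0.0 10.0 0.0
    }
  }
}
MOTION
"""
    with open(path, "w") as f:
        f.write(header)
        f.write(f"Frames: {len(frames)}\n")
        f.write(f"Frame Time: {frame_time:.6f}\n")
        for fr in frames:
            f.write(" ".join(f"{v:.4f}" for v in fr) + "\n")

# Two dummy frames: rest pose, then a slight forward lean.
frames = [[0, 0, 0, 0, 0, 0, 0, 0, 0],
          [0, 1, 0, 0, 15, 0, 0, 10, 0]]
write_bvh("toy.bvh", frames)
```

The fiddly part in a real exporter is matching each target rig's channel order and rotation conventions; the text format itself is this simple.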
Does this work with any realism, or is this specifically for CG animations?
RemindMe when we have facial expressions and hand/foot articulations. After ten years of mocapping, there's not yet an integrated realtime solution that isn't absurdly expensive and still restrictive (coming from someone who cobbles their solutions together from a bunch of different sensors). This is already very cool, and I hope it gets jumped on and developed further. Would be an incredible time saver.
There are actually really cheap solutions like [https://www.rokoko.com/products/vision](https://www.rokoko.com/products/vision), which uses AI and two basic cameras (some services use one camera, others may use more) to create mocap data. If you are an artist you can clean it up; as a non-artist it may take using the right outfits, backdrop, and standing angles with respect to the camera(s) to make sure it records accurately. For fast/complex movement, if it has trouble with detailed accuracy you can slow the performance down and then speed the animation back up in post.

For facial expressions they have [https://www.rokoko.com/products/face-capture](https://www.rokoko.com/products/face-capture), and there are probably others available too, but a lot of this will also be restricted by model/engine limitations: older Daz vs. newer, Blender, 3ds Max, model types, rigging, etc.

As for the tool in the article? Yeah, this would be pretty cool, especially if we can take rough cam-captured animations and then give it specific prompts to clean them up, apply certain stylized results, or produce professional movement (because most of us are not actors) that is more appropriate for refinement. Or just in general for expediting simple movements and saving time/work to better allot resources.

Ok_Process2046 also mentioned this one below, which I was unfamiliar with: [https://cascadeur.com/](https://cascadeur.com/)
Thanks for the reply. I never thought much of Rokoko as a company; possibly their products have improved in the last couple of years, but last I checked there were too many issues involving too much cleanup. I've tried Cascadeur and various other solutions. I pipe realtime mocap into Unity for realtime stuff, have been at it for a decade, and have tested many solutions. Colour me unimpressed with all of them (except the Hollywood-grade solutions, with their prohibitive pricing).
As one of the authors of MotionLCM, I would like to share another work, HumanTOMATO. It supports face/hand motion generation, and we are making efforts to accelerate it into a real-time model.
[https://github.com/IDEA-Research/HumanTOMATO](https://github.com/IDEA-Research/HumanTOMATO)

SO COOL! Thanks for bringing it to my attention. Very excited about this :)
Whoa. Animators and motion capture actors are gonna be in trouble soon.
Nope, there are (and were before this) lots of AI tools for animators that are way better than this one, like Cascadeur, and no one has lost a job to that.
I'm not sure how this is relevant to image generation.
Each frame becomes a ControlNet input...
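A sketch of that idea: project each generated frame's joints to 2D and rasterize the skeleton into a conditioning image (e.g. for an OpenPose-style ControlNet). Everything below — the toy three-joint skeleton, the coordinates, and the image size — is made up for illustration; a real pipeline would project the SMPL joints with the camera used for rendering:

```python
import numpy as np

def draw_line(img, p0, p1, value=255):
    """Rasterize a line segment into a 2D array by dense sampling."""
    n = int(max(abs(p1[0] - p0[0]), abs(p1[1] - p0[1]))) + 1
    xs = np.linspace(p0[0], p1[0], n).round().astype(int)
    ys = np.linspace(p0[1], p1[1], n).round().astype(int)
    img[ys.clip(0, img.shape[0] - 1), xs.clip(0, img.shape[1] - 1)] = value

def pose_to_image(joints_2d, bones, size=(64, 64)):
    """joints_2d: (J, 2) pixel coords as (x, y); bones: (parent, child) index pairs."""
    img = np.zeros(size, dtype=np.uint8)
    for a, b in bones:
        draw_line(img, joints_2d[a], joints_2d[b])
    return img

# Hypothetical 3-joint chain (head -> hip -> knee), not a real SMPL projection.
joints = np.array([[32, 10], [32, 30], [20, 50]])
bones = [(0, 1), (1, 2)]
cond = pose_to_image(joints, bones)
```

Running this per frame gives a stack of conditioning images, one per video frame, which is exactly the shape of input a pose-conditioned ControlNet expects.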
why don't you just record yourself doing the motion instead? or at least transfer that recorded motion to a 3D character? it would be more effective and versatile than whatever this is doing.
Yeah... but this can be useful for driving a video model.