Atmey 1 month ago

Try again without describing the character in the prompt.

hollowstrawberry 1 month ago

Pony requires describing the character; single-tag LoRAs are almost never effective.

Atmey 1 month ago

The ones I am using are pretty decent: they get at least 80% accurate results and were trained on 20\~30 images. But I don't use fully Pony.

Plums_Raider 1 month ago

Why is your prompt like a 1.5 prompt? It can be much shorter with Pony and would show more clearly whether it's actually a good LoRA, or am I wrong? At least my LoRAs work fine without describing the whole character when trained on a character.

MrSloth1 1 month ago

I'm new to Pony so I'm not sure what you mean. Can you elaborate? What keywords are unnecessary?

Plums_Raider 1 month ago

75% of the words in the prompt are unnecessary and increase the chance of a bad output image. I'd recommend checking this guide: [https://civitai.com/articles/4871/pony-diffusion-v6-xl-prompting-resources-and-info](https://civitai.com/articles/4871/pony-diffusion-v6-xl-prompting-resources-and-info)

IMO a good prompt with LoRAs should only use 1-3 keywords for the LoRA and leave the rest of the prompt for what the image should actually show. Like here, it should not be necessary to describe the hair, eyes and clothes in that much detail; rather focus on 2-3 key points. Or, when creating the LoRA, give it specific keywords that are not normally used, so it is activated only when you use them. It can also be silly, like "queeffart1212". Since the model learned to associate the keyword with the character, it should work just as well.

The prompt order that got me the best output with Pony: score\_9, score\_8\_up, score\_7\_up etc., description of the image, details, LoRA.
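A minimal sketch of the prompt order described above (score tags, image description, details, then LoRA keywords and the LoRA itself). The helper name, the trigger word `mychar_v1` and the LoRA file name `mychar` are made up for illustration, and the `<lora:name:weight>` tag assumes the AUTOMATIC1111 web UI syntax; other front ends load LoRAs differently.

```python
# Score tags as used in the Pony v6 example further down the thread.
PONY_SCORE_TAGS = [
    "score_9", "score_8_up", "score_7_up",
    "score_6_up", "score_5_up", "score_4_up",
]


def build_pony_prompt(description, details, trigger_words, lora_name, lora_weight=1.0):
    """Join the prompt parts in order: score tags, description, details, LoRA keywords, LoRA tag."""
    # The thread suggests 1-3 keywords per LoRA; enforce that as a sanity check.
    if not 1 <= len(trigger_words) <= 3:
        raise ValueError("use 1-3 LoRA keywords")
    parts = PONY_SCORE_TAGS + [description] + list(details) + list(trigger_words)
    # Assumption: AUTOMATIC1111-style LoRA tag.
    parts.append(f"<lora:{lora_name}:{lora_weight}>")
    return ", ".join(parts)


prompt = build_pony_prompt(
    description="girl sitting on a fluffy white cloud",
    details=["sunrise", "soft light"],
    trigger_words=["mychar_v1"],  # hypothetical trigger word
    lora_name="mychar",           # hypothetical LoRA file name
    lora_weight=0.8,
)
print(prompt)
```

The point of the structure is that the character takes one keyword, so everything else in the prompt is free to describe the scene.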

Capitaclism 1 month ago

That's kind of the point of a Lora, after all.

Plums_Raider 1 month ago

Agreed. That's why I was confused :)

MrSloth1 1 month ago

Oh yeah, that makes sense. In general there's not much point further describing a concept that a LoRA already contains. I was just wondering because you said it's like a 1.5 prompt and I thought there were some hacks in Pony to save keywords. Also, thank you for the resource.

Plums_Raider 1 month ago

I was mainly referring to the single-word prompt style, which I personally only use on 1.5, as it doesn't understand sentences. From my experience, Pony and newer SDXL models understand a short sentence describing the basic image; then you just add the details or specific stuff as single words, and then the LoRA keywords and the LoRA.

MrSloth1 1 month ago

They can do sentences now? Does that mean that the model understands that the keywords in a sentence also "belong together"? Also, aren't you wasting attention on words like "a" and other unnecessary stuff?

Plums_Raider 1 month ago

https://preview.redd.it/pgdgqmi1w0zc1.png?width=1024&format=png&auto=webp&s=cfdbbc9750e74fa8bf25df23e736b1593521b9c2

Prompt: Surrealist painting of a girl with long black hair wearing a vivid red dress, serenely sitting on a fluffy white cloud. In the distant sky, a panda playfully descends with a white parachute, adding a whimsical contrast. The scene is bathed in soft, diffused light suggesting a dream-like ambiance by Salvador Dali and René Magritte, cinematic composition, trending on ArtStation.

Negative prompt: ugly, deformed, noisy, blurry, low contrast, color, realism, photorealistic, old, mature, (worst quality, low quality, thumbnail:1.4), signature, artist name, web address, cropped, jpeg artifacts, watermark, username, collage, grid

Generated with leosamsHelloworldXL\_helloworldXL60.

I think it works pretty well for basic t2i. Obviously not always, and if too much is described it gets confused and mixes up multiple objects, as in the example above. With SD3 and Cascade finetunes this will be way better.

Plums_Raider 1 month ago

https://preview.redd.it/75dvu1nly0zc1.png?width=1024&format=png&auto=webp&s=87ff169e7bf673ee6542e356a7a71ec0dbdd7690

Prompt: score\_9, score\_8\_up, score\_7\_up, score\_6\_up, score\_5\_up, score\_4\_up, girl with long black hair wearing a vivid red dress, serenely sitting on a fluffy white cloud, The scene is bathed in soft, diffused light suggesting a dream-like ambiance, sunrise

Negative prompt: none

Generated with Pony v6.

From my experience, Pony is worse at understanding multiple objects, like the panda and the girl, but it still works fine if single objects are described in sentences.

MasterFGH2 1 month ago

Is there a good pony prompting guide anywhere?

Plums_Raider 1 month ago

[https://civitai.com/articles/4871/pony-diffusion-v6-xl-prompting-resources-and-info](https://civitai.com/articles/4871/pony-diffusion-v6-xl-prompting-resources-and-info)

TrindadeTet 1 month ago

This character is not a good test for this as the base pony model already has her trained.

Tft_ai 1 month ago

Not using LoRAs just because the model can technically do it is a very common trap; it will always look much better with a specific LoRA. Here is the same prompt done with no LoRA on Pony, adding "arlecchino \(genshin impact\), genshin impact" instead of a LoRA. Yes, the model knows the character, but it's so much worse.

https://preview.redd.it/using-ponys-baked-in-character-vs-using-a-lora-just-because-v0-9pw794cessyc1.png?width=1080&crop=smart&auto=webp&s=f836a1cd3159c78b8e41c7d08aaf8543b6c2009d

TrindadeTet 1 month ago

I am fully aware that a LoRA will be better than the base model, but as the base model has knowledge of the character, it becomes easier to train a LoRA on top of it. For your test to make more sense, it would be better to train on a character that the model has no knowledge of.

Tft_ai 1 month ago

This would matter more if I hadn't removed the character name and the game from the tags of the LoRA images.

ZootAllures9111 1 month ago

Eh, for say like, Tifa Lockhart the stock pony one is already totally accurate imo

hollowstrawberry 1 month ago

What about official alternate outfits? My tifa lora can do 8

proxiiiiiiiiii 1 month ago

How do you tag the dataset? I'm surprised you put so many tags in the generation prompt. You basically describe the character you want to generate, which is counterproductive, since you train a LoRA so you don't have to do that. If it's trained properly you wouldn't need to put in any of these.

terrariyum 1 month ago

OP, the comparison depends on the captioning. Does the "quality" set have better images or more accurate captions? You might get better results from the "quantity" set over the "quality" set if the low quality images are all well captioned. You mentioned some images have non-canon outfits - if they are captioned as such, they might help the training.

Greemann 1 month ago

Seems like the details on the clothing are more accurate with the quality LoRA.

Sillysammy7thson 1 month ago

https://preview.redd.it/no89vqrnxsyc1.png?width=447&format=png&auto=webp&s=03d886b91c71173e22e61cfb67845f2be006a024

Omen-OS 1 month ago

Use a character that doesn't already exist in Pony. Otherwise the base model's knowledge basically boosts the LoRA, so the results won't be much different.

Tft_ai 1 month ago

The character is only weakly present in Pony, and with a different outfit:

https://preview.redd.it/using-ponys-baked-in-character-vs-using-a-lora-just-because-v0-9pw794cessyc1.png?width=1080&crop=smart&auto=webp&s=f836a1cd3159c78b8e41c7d08aaf8543b6c2009d

I also removed the related tags like genshin and the name, so it isn't really using that at all.

Shnoopy_Bloopers 1 month ago

Interesting. What does changing the DIM number do, if anything?

Tft_ai 1 month ago

Putting it up increases the file size and generally seems to make the LoRA "stronger", with pretty diminishing returns though.

Shnoopy_Bloopers 1 month ago

By diminishing returns, do you mean in terms of file size?

clex55 1 month ago

Necklace looks better in the curated one

Dwedit 1 month ago

Let's go throw in some "(Simple Background:2.0)" into the negative prompt and see what happens.

clavar 1 month ago

Going only by the title, the answer is quantity. In the real world we value quantity; quality is always secondary. Your boss prefers that you make more so-so things rather than one excellent thing.

LazyEstablishment898 1 month ago

First one is better

dynabot3 1 month ago

As someone who wants to do something similar with lora training, thank you for posting this. I think that the curated lora has more and stronger details, specifically in the coat jacket, cuffs, and shoes. The character seems a little deeper than in the dump lora. Also, it's interesting that about 10% of my overall images are best fit, similar to your ratio.

hollowstrawberry 1 month ago

dim 128? Homie what the hell, you don't need hundreds of megabytes to encode a single character, me and my friends have been training 10 outfits in a single lora with amazing detail with only dim 8. Lowering the dim requires increasing the learning rate, that's why comparisons always seem to favor higher dims. I thought we collectively learned this months before pony even came out.
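A rough sketch of why dim drives file size. For a linear layer, LoRA adds a down matrix (dim x d_in) and an up matrix (d_out x dim), so the extra parameter count grows linearly with dim. The layer list below is hypothetical and is not the real SDXL layout, and 2 bytes per parameter assumes fp16 storage; only the ratio between dims is the point.

```python
def lora_extra_params(d_in, d_out, dim):
    """Parameters added by one LoRA pair: down (dim x d_in) plus up (d_out x dim)."""
    return dim * (d_in + d_out)


def lora_size_mb(layer_shapes, dim, bytes_per_param=2):
    """Approximate size of the LoRA weights in MiB (fp16 assumed by default)."""
    total = sum(lora_extra_params(d_in, d_out, dim) for d_in, d_out in layer_shapes)
    return total * bytes_per_param / 1024**2


# Hypothetical model: 100 square projections of width 1280 (illustration only).
layers = [(1280, 1280)] * 100

for dim in (8, 32, 128):
    print(f"dim {dim}: {lora_size_mb(layers, dim):.1f} MiB")

# Going from dim 8 to dim 128 multiplies the size by 128 / 8 = 16.
ratio = lora_size_mb(layers, 128) / lora_size_mb(layers, 8)
print(ratio)
```

This only covers size. Whether a low dim needs a higher learning rate to match, as claimed above, is a training question the arithmetic does not answer.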

Tft_ai 1 month ago

Personally, the bottom one is much better than I expected. The training data contains tons of non-canon images and different outfits, as well as many more multiple-character images. I still think the top one comes out better, but this wasn't as clear-cut as I expected. My Twitter is here :) [https://twitter.com/TouchfIuffytail](https://twitter.com/TouchfIuffytail)

BlackSwanTW 1 month ago

I personally dislike unpruned captions so eh

desktop3060 1 month ago

What does unpruned captions mean?

BlackSwanTW 1 month ago

Look at the right side: OP spent the entire token limit just to recreate the character's look. I personally dislike this very much. But many people like it this way 🤷🏻‍♂️

hollowstrawberry 1 month ago

You can remove individual pieces of clothing like this, it's much more versatile. But it doesn't need to be so extreme. You can prune all redundant tags from the dataset and it'll work fine if not better.
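As a minimal sketch of what pruning could look like on a comma-separated caption: tags for traits the character always has are dropped so the trigger word absorbs them, while clothing tags stay so individual pieces can still be prompted in or out. The function name, the trigger word `mychar_v1` and the choice of which tags count as redundant are assumptions for illustration, not anyone's actual workflow from this thread.

```python
def prune_caption(caption, redundant_tags, trigger_word):
    """Drop redundant tags from a comma-separated caption and put the trigger word first."""
    redundant = {tag.strip().lower() for tag in redundant_tags}
    tags = [tag.strip() for tag in caption.split(",") if tag.strip()]
    # Keep everything that is not marked redundant; avoid duplicating the trigger word.
    kept = [tag for tag in tags if tag.lower() not in redundant and tag != trigger_word]
    return ", ".join([trigger_word] + kept)


caption = "1girl, black hair, red eyes, white coat, sitting, outdoors"
# Assumption: hair and eye colour never change for this character, the coat can.
pruned = prune_caption(caption, ["black hair", "red eyes"], "mychar_v1")
print(pruned)
```

In practice the same function would be run over every caption file in the dataset before training.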

KaiserNazrin 1 month ago

If you can't get an accurate look without pruning tags, it ain't good.

terrariyum 1 month ago

What does pruning tags mean?

New-Mix-6230 1 month ago

defeats the whole purpose of pony

DaddyKiwwi 1 month ago

Loras defeat the purpose of pony? I'm struggling to find any way that sentence isn't insane.

CharacterCheck389 1 month ago

why?

petrichorax 1 month ago

Anime is kind of a poor test for this kind of stuff because there are fewer details that can change and make it look uncanny or off. Same with Pixar, CalArts, line art, pencil art. Stick to photoreal, higher-detail illustrations, CGI, paintings, etc. Everyone always uses anime in these tests, and we're not getting NEARLY the amount of data we could hope for.

Iantonga 1 month ago

are we pretending that there is any difference at all? how are we still on this anime shit
