/g/ - Technology






You're Not Alone Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>106995676

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://civitai.com/models/1790792?modelVersionId=2298660
https://gumgum10.github.io/gumgum.github.io/
https://huggingface.co/neta-art/Neta-Lumina

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
File: 00039-2972526174.png (1.93 MB, 1248x1824)
>>
File: 1735142585245396.mp4 (3.76 MB, 564x576)
PARTY!!!
>>
>>107001464
hot
>>
>>107001552
*heavy trap bass beat starts playing*
>>
File: 00047-612365413.png (2.02 MB, 1248x1824)
>>107001555
chisato is literally 10/10 wife material and mogs the rest of the girls.
>>
Local LLMs for generating creative, non-repeating prompts? Is it doable at all? /lmg/ ignores me.
>>
File: 00056-3989700339.png (2.42 MB, 1536x1536)
>>
posting this last trick or treat lain, she came out nicely, now back to anime backlog watching
>>
>>107001586
i'm not sure what you mean by "not repeating prompts" but you can probably set up ollama and find some way to call it
>>
>>107001586
it is entirely doable.
I don't like doing that because I actually want full control over what I gen, and I'm not bored enough to want a 'surprise me' button for random gens.
I do currently use one to automatically sort my prompt tags, but it's really not needed at all, I just use it to tidy up

You have two choices really:
- llama-cpp-python (will start/close/be used within your comfyui workflow)
- an external OpenAI-compatible LLM server (llama.cpp, ollama, lmstudio, etc...)

Now the real problem is that LLMs are WAY MORE EXPENSIVE to run compared to diffusion models. Models in the ~7b dense range require 8GB vram to run at non-retarded quants (q8). With LLMs you could theoretically go down to Q4, some imprecision shouldn't matter much here. If using MoE models, you can get away with running bigger models by offloading the expert layers to CPU while keeping the cache and the rest of the layers on GPU. Since these are one-off gens, you can keep the context to a minimum (1000 tokens should be more than enough).

if you want something really MINIMAL (500M params):
z-tipo (what I currently use to sort); it requires you to manually install the cuda version of llama-cpp-python and it lives within comfy.
there are multiple nodes in comfy, just search for llama-cpp-python, ollama or OpenAI/OAI-compatible
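
For the external route, the call is just the standard OpenAI chat-completions shape. A minimal sketch, assuming a llama-server/ollama/lmstudio instance is already listening on localhost:8080 (the system prompt and tag list are made up):

import json
import urllib.request

def augment_prompt(tags: str, url: str = "http://localhost:8080/v1/chat/completions") -> str:
    # ask the local model to rewrite a tag list into a varied prompt
    payload = {
        "model": "local",  # most local servers ignore or loosely match this field
        "messages": [
            {"role": "system", "content": "You expand danbooru tag lists into varied image prompts. Output tags only."},
            {"role": "user", "content": tags},
        ],
        "temperature": 1.0,  # higher = more variety, lower = more obedient
        "max_tokens": 200,   # one-off gens, keep the context tiny
    }
    req = urllib.request.Request(url, data=json.dumps(payload).encode(),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

print(augment_prompt("1girl, witch hat, halloween, night, suburb"))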
>>
>>107001586
You can cover 90% of prompts with smart use of wildcards.
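
All a wildcard engine really boils down to is the following. A toy sketch, assuming plain text files with one option per line (file names and the example prompt are made up):

import random
import re
from pathlib import Path

def expand(prompt: str, wildcard_dir: str = "wildcards", seed: int | None = None) -> str:
    # replace every __name__ token with a random line from wildcards/name.txt
    rng = random.Random(seed)
    def pick(match: re.Match) -> str:
        options = Path(wildcard_dir, match.group(1) + ".txt").read_text().splitlines()
        return rng.choice([o for o in options if o.strip()])
    return re.sub(r"__(\w+)__", pick, prompt)

# e.g. wildcards/outfit.txt holds "top with v cut", "sundress", "hoodie", ...
print(expand("1girl, __outfit__, __artist__, outdoors"))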
>>
File: file.png (359 KB, 1962x1012)
>>107001699
yeah or just use impact wildcards, this is my current setup
you can see how the normal prompt comes out and the augmented prompt.
SADLY tipo creates a trash augmented prompt. I just randomize artists really
>>
File: 00064-2779316287.png (2.36 MB, 1824x1248)
>>
>>107001652
It means that it does not output "top with v cut" every time when I ask for a girl's outfit.

>>107001687
I have a setup, the question is more about models themselves and techniques to get a good prompt. I have tried abliterated 4b and 8b of qwen and they did not follow my prompts enough or were very uncreative.

>>107001699
Yes, but you have to get wildcards first. It takes too much effort for me.
>>
>>107001734
small models are garbage sadly.
You could try nemo instruct, or a recent gemma abliterated.
If you're asking about prompting techniques, then you'll have to play around with samplers: the more randomness you want, the higher the temperature. There are some samplers that help keep the model coherent at high temp (but I forgot the name, I usually use llms for work at low temp), I'd suggest you ask chatgpt or lmg for this.
For prompting itself, it usually works better if you give the chatbot a list to choose from (but at that point it would be the same as using wildcard substitution), and the prompting technique GREATLY varies between models, so there's no general way to do it
>>
Oops didn't see the new thread
>>107000663
>just try it out yourself
I have and it's pretty shit sadly. 1girl already works just fine on sdxl. Boomer prompts start giving shit anatomy and body horrors pretty quickly. Prompting for text doesn't work beyond 1-2 words it seems. I think every time someone shills a model they should be required to present a complex gen that cannot be done with illust at a fraction of time and VRAM with metadata attached. I'm gonna assume from now on that "uuuh skill issue just gen yourself" people are all LLMs or paid indians.
>>
>>107001716
That's slick. Mine isn't as elaborate, I might have to change my setup.
>>
>>107001586
Find cool image you like, i2prompt it and then use that
https://github.com/1038lab/ComfyUI-QwenVL
>>
File: 1745319983509499.jpg (891 KB, 1336x2008)
>>107001759
show me what this prompt looks like with base ilu at this res on the first pass
https://files.catbox.moe/atrr5z.png
>at a fraction of time and VRAM
when was this claim made?
>>
File: 1753393023324645.jpg (851 KB, 1336x2008)
>>
File: 00065-1589373952.png (2.26 MB, 1248x1824)
>>
Are there any AI that can edit video game and anime characters into being naked? Gemini doesn't allow that
>>
>>107001586
>/lmg/ ignores me.
There must be a reason
>>
>>107001804
>abstract, ghost, fireplace
Wow you're really going out of your way to demonstrate complex composition, character interactivity and anatomy with your gen
I retract my previous statement, shills seem to be just retarded
>>
File: 00078-225239598.png (1.79 MB, 1792x1024)
https://youtu.be/Dh56pv7gESM
>>
File: 1748446280861521.jpg (735 KB, 1336x2008)
>>
>>107002059
Where's the 1girl?
>>
File: input.jpg (181 KB, 1349x2185)
>>107001841
Qwen Image Edit 2509 with clothing remover lora
lora: https://limewire.com/d/AvpLO#Gd7AyXiz1r
result (nsfw): https://files.catbox.moe/na96bw.png
>>
>>107001841
It does, you just have to write it in the most verbose way that makes it think you're doing something artsy
>>
>>107001842
They have miku in op, I'm not surprised.
>>
>>107002093
i want to cum inside 2b
>>
>>107001819
bowsette a shit
>>
>>
>>107002180
sylvanas a shit
>>
File: chfp8_a_00008_.png (1.38 MB, 1024x1024)
tell me about neta lumina. I see it being pushed hard now. it's only a 2b parameter model, isn't it? why use it over wan or qwen?
>>
>>107002199
it knows artists, unlike wan or qwen, which only know a very limited set of styles
>>
File: 251025-165641-wan5s_00001.mp4 (2.57 MB, 1088x1600)
>>107001819
>>
>>107002226
ahegao lora a shit
>>
>>107002093
now try on realistic
>>
>>107002211
so does sd1.5, but because it's so small it can't compete now. is neta yume lumina's quality still good compared to the big ones?
>>
CeFurkan is back shilling
>>
Ran took everything from me.
>>
>>107002255
I've been posting some in this thread and the last. Compared to sdxl (illu/noob) it doesn't need upscalers or detailers. Granted, gens take way longer, but I'm generating at the resolution you see, and I don't mind waiting since it manages to oneshot most of it. You can additionally use NL, which helps a lot in posing the girl in the composition you want.
>>
anons what are your guys gen times on qwen image with and without 4/8 step lora?
>>
File: 1758656043186527.png (1.36 MB, 832x1248)
>>107002082
hiding in the cabin
>>107002199
qwen is large and wan is a middling image model
>>107002255
it uses a 16ch vae if thats what you mean
>>
>>107002293
and last one. wish one of these stupid anime thots would come trick or treating me IRL. SAD.
>>
>>107002300
>qwen is large
as in qwen is not preferred because it's too big?
>>
>>107002324
maybe he meant to say he's poor. but qwen's problem is not its size (it can still fit in 16gb with some offloading at Q8, or completely at 24gb). The results are almost always GOOD, meaning you don't need to re-roll your gens as much, but even fully fitting in a GPU, genning is slower (due to genning at a high 1.3MP size), and it's slopped and has bad style knowledge/no artists
>>
>>107002199
>wan or qwen
Both need LoRAs to do anything even resembling kino.
>>
how do I speed up wan 2.2 i2v
>>
File: n9t2asb11ywf1.png (174 KB, 640x640)
absolute legend?

https://files.catbox.moe/2dyn9a.mp4
>>
>>107002295
A lot or not much.
>>
>>107002452
bruh that isn't helpful. some numbers would help
>>
>>107002448
lost
>>
>>107002456
How many cuda cores you have?
>>
>>107002472
10,752
>>
File: 1741658881115885.jpg (738 KB, 1336x2008)
>>
>>107002434
Use 2.5 instead
>>
>>107002474
Should be pretty quick then.
Think of it as a ballpark: if an action takes under 10 minutes it's still usable.
In the past (and still today) renders can take 8 hours per frame.
With AI slop that gets condensed.
>>
>>107002487
I don't think anyone waits 10 minutes for an image
>>
>>107002496
I don't think you have ever been employed or done graphics for a client.
>>
>>107002496
Maybe English is a problem for you. Is it?
>>
>>107002503
you're talking to a motion designer. literally no one waits 10 mins for imagen. you're joking
>>
>>107002509
English is a problem.
>>
>>107002508
yes I'm German, English isn't my first language.
>>
>>107002448
based turk working hard
>>
>>107002448
Not bad anon...not bad...but BEHOLD! MY GOONJITSU!
https://files.catbox.moe/vqq4u0.mp4
>>
>>107002509
If you are such a professional you should already know...
>>
How to make Chroma good?
>>
>>107002541
it's funnier seeing him suck dicks desu, this is a bit too much and well, it's literally a woman with his head.
>>
>>107002549
delete chroma, download qwen + the analogcore lora and some insta thot loras for 1girls and you're done, way better realism than whatever chroma shits out
>>
>>107002554
>this is what qwenfags believe
Advise him of that again when your model actually becomes non-shit.
>>
File: 1739217727555737.png (512 KB, 875x355)
the cartoon character in the red shirt is very fat and holds up a sign at the beach saying "tomorrow i'll gen 1girls", while Sonic the Hedgehog looks at him
>>
>>107002579
Why not take a full pic of robotnik?
>>
>>107002612
just to test if it still works, seems fine even with a cropped image.
>>
>>107002627
Yeah it did a good job that's true.
>>
File: 1734447632356783.png (814 KB, 792x1320)
the pink hair anime girl is sitting at a table in a walmart staff room, smoking a cigarette while sitting at a white table. the walmart logo is on the wall.
>>
how do i make a comfyui tagger workflow with multiple images to txt for training a lora?
>>
File: 1740344971354841.png (796 KB, 792x1320)
>>107002643
>>
> Some nodes require a newer version of ComfyUI (current: 0.3.66). Please update to use all nodes.
> Requires ComfyUI 0.3.63:
> c46c74c1-cfc4-41eb-81a8-9c6701737ef6
qwen edit, wtf
>>
>>107002656
Cum ui has gone from being a nice little javascript/python shit for images to literal malware.
A year ago it was still okay.
>>
File: WAN_00009_.png (1.9 MB, 1080x1352)
wan >>>>>>>>>>>>>>>>>> qwen
>>
File: 1731419830992770.png (652 KB, 944x1104)
the videogame girl is sitting at a computer and typing in a cave near a fire, on the back of the white CRT monitor is the text "LDG". keep her in the same polygon style.

why does a cave have power? it's a videogame cave.
>>
File: 1742610876525001.png (657 KB, 944x1104)
>>107002697
>>
>>107002697
solar powered PC
>>
>>107002093
Prompt nodes don't have image inputs links, correct?
>>
The only way to train a wan 2.2 lora is with cloud, isn't it? Aren't you locked out of your computer for like a week with a 5090?
>>
>>107002758
>locked out of your computer for like a week with a 5090

>xhe spent multiple thousand $ on a single pc component but doesn't have an old gpu or money to buy a 70$ 1070
>>
>>107002758
There's a guy who trained a wan 2.2 lora with ~250p clips on high and ~400p clips plus ~700p images on low, 3-second clips, with pretty good results. Shouldn't take that long.
>>
ran is not satisfied with his discord
users need blogposting
i will post images and make ran seethe
>>
>Tsukuyomi
>>
>>107002780
I'm sure the results are fine, but it takes so long to train, doesn't it?
It's my work pc.
>>
>wan2.2_i2v_A14b_high_noise_lora_rank64_lightx2v_4step_1022
got really fast movement with 3.0 strength
>>
>>107002830
link for lora
>>
>>107002842
bruh literally just type wan2.2_i2v_A14b_high_noise_lora_rank64_lightx2v_4step_1022 in google
>>
>>107002830
>>107002842

Man what the fuck are you retards doing with that insane low quality flashing with your light lora setups

New HIGH:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/LoRAs/Wan22_Lightx2v/Wan_2_2_I2V_A14B_HIGH_lightx2v_MoE_distill_lora_rank_64_bf16.safetensors

Old LOW:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/LoRAs/Wan22-Lightning/old/Wan2.2-Lightning_I2V-A14B-4steps-lora_LOW_fp16.safetensors

4 steps, cfg 1, unipc
>>
>>107002853
what about t2v?
>>
File: 1743770310436836.png (951 KB, 960x1088)
the white character is wearing a white tshirt and blue shorts sitting at a computer desk holding a green lightsaber in his messy bedroom. A large STAR WARS sign is in the background and various STAR WARS merchandise. the image is black and white. keep their facial expression the same.
>>
File: lightx2v_1022_1.25str.mp4 (745 KB, 448x600)
>>
File: lightx2v_1022_2.0str.mp4 (849 KB, 448x600)
>>
File: lightx2v_MoE_1.25str.mp4 (736 KB, 448x600)
>>
File: lightx2v_MoE_2.0str.mp4 (814 KB, 448x600)
ultimately it's just RNG
>>
one more please?
>>
File: 1753483774945303.png (1.3 MB, 1024x1024)
the man is sitting at a poker table in a casino, keep his expression the same.

was just a cropped headshot of kaiji. zawa zawa...
>>
File: 1747537332271143.jpg (53 KB, 500x479)
>>107002245
https://litter.catbox.moe/k2q4xggehhc770ms.png
>>
>>107002968
>>107002958
>>107002951
>>107002949
These are great! Would be cool to see more.
>>
reasons not to use lightx2v?
>>
It's out.

https://civitai.com/models/1901521/v7-base
>>
>>107003075
It's noticeably shit when comparing to not using it
>>
>>107003101
you had my hopes up with that image.
>>
>>107003101
kek, dalle-mini is so sovlful even after all those years
>>
File: 1732994965175552.png (464 KB, 609x679)
>ctrl-f Pony
>0 results
What do we think? I couldn't care less about weeb slop, or steven universe and furry faggotry. How's the realism? I doubt it can ever stand up to Chroma
>>
File: WAN_00046_.png (2.35 MB, 1080x1352)
>>
>>107003146
just two more finetunes
>>
File: 1759517492477396.png (1.2 MB, 1128x920)
>>107002295
qwen image edit, 8 steps, 1 megapixel images, rtx 3090
first gen: 156 secs
second gen, same image and prompt: 49 secs
change image: 91 secs
change prompt: 62 secs
disable 8 step lora, 20 steps: 95 secs
>>
>>107002643
now this is autism
>>
can you train qwen loras with 16gb? does it take 12 hours per lora? do the loras come out even remotely well with what i assume is 40 block swaps?
>>
File: chroma multi concept lora.jpg (1.67 MB, 3936x1264)
Looks like multiple concepts in Chroma loras work after all. You just need to crank up the early-training lr without frying the whole thing. I think Chroma/flux are super sensitive to gradients that pull in opposite directions, so you need to let each concept reserve its own space during early training. Otherwise it will just overwrite everything and you'll get a generalized mess with combined concepts.
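Not my actual config, but "high lr early, then back off" can be expressed as a custom schedule along these lines (all numbers purely illustrative):

import math

def lr_at(step: int, total: int, base_lr: float = 1e-4,
          boost: float = 4.0, boost_frac: float = 0.1) -> float:
    # boosted early phase so each concept can claim its own space
    boost_steps = int(total * boost_frac)
    if step < boost_steps:
        return base_lr * boost
    # then cosine decay from base_lr over the remaining steps
    t = (step - boost_steps) / max(1, total - boost_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * t))

for s in (0, 100, 500, 999):
    print(s, lr_at(s, total=1000))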
>>
>>107003203
>can you train qwen loras with 16gb?
yes

>does it take 12 hours per lora?
depends on how you train it

>do the loras come out even remotely well with what i assume is 40 block swaps?
I don't know if lora quality has anything to do with block swap
>>
>>107003044
could've taken an even lower quality picture
>>
would loras for flux work for chroma or would i need to retrain them?
>>
>>107003333
Some of them work.
>>
When will /ldg/ wake up and see Qwen is bad?
>>
>>107003418
when it's proven...?
>>
> wan t2v
> 3d animation of...
works

> 3d blender animation of ...
50/50

> ... 3d animation
does not work

why the fuck
>>
>>107002805
https://huggingface.co/quarterturn/wan2.2-14b-i2v-city-the-animation
5 1/2 days on a 4090D 48GB for 101 15-second 640x360 clips, which was the biggest I could use without OOMing.
>>
>>107003427
blender isn't and "of" is the trigger
>>
>>107003418
things that qwen does great: text editing, prompt adherence, structural correctness

things that qwen does bad: image editing, speed, style variation, realism
>>
Is Wan really the only good video model we got?
>>
>>107003473
yes
>>
File: 1736785332101770.jpg (252 KB, 1500x1302)
https://huggingface.co/purplesmartai/pony-v7-base/tree/main
THIS IS IT, LOCAL IS SAVED
>>
>>107003485
I hate to say it but unironically it's the best model we've gotten for realism and for combining out-of-left-field concepts.
>>
>>107003496
>it's the best model we've gotten for realism
prove it, show some examples kek
>>
>>107003473
hunyuan is good, but not as good as wan
if you were here at the time people went nuts for it, it raised the bar so much plus did NSFW
>>
>>107003501
>if you were here at the time people went nuts for it, it raised the bar
true, the bar was raised higher between Mochi and HunyuanVideo compared to HunyuanVideo and Wan 2.1
>>
>>107001750
Thanks, will try.
>>
>>107003461
>image editing
literally the best
>speed
yeah
>style variation, realism
loras
>>
>>107003521
if you need loras to make your model good then your model was never good to start with.
>>
Does anyone here know about super-resolution models?

I want to train a model with my own dataset, because my dataset shares the same colours, patterns and style, but it has low resolution images, so I want to upscale them as faithfully as possible.

Please somebody help me
>>
>>107003532
I'm completely fine with that
>>
>>107003521
>tells it to edit out a part in the photo
>whole image changes
yeah truly the best
>>
File: geeeeg.png (72 KB, 1369x473)
>>107003485
>
>>
>>107003532
>if you need loras to make your model good then your model was never good to start.
this
>>
File: Pony V7 base.jpg (305 KB, 1024x1536)
>>107003496
>I hate to say it but unironically it's the best model we've gotten for realism and combining out of left field concepts into
totally
>>
>>107003586
local is saved!
>>
>>107003586
put this in the next OP
>>
On the frame interpolator what does 'clear cache after n frames' do? How high or low do I want to try to set this?
>>
These V7 pics look like we went back 2 years in time.
>>
>>107003586
I want to fuck a grass girl so bad bros...
>>
>>107003622
pony v6 is miles better, and he did this shit 2 years ago
>>
>>107003586
I haven't tried it yet but this seems to occur for very short prompts because the model was trained with long and detailed ones
>>
File: 1737630597983915.png (43 KB, 753x375)
i dont think i can ever go back to prompting like this bros
>>
>>107003253
Cool. I don't get it
>>
File: WanVid_00008.webm (939 KB, 720x960)
dude on the left is striking out, feel bad for him
>>
>>107003851
dude on the right wishes he was home watching youtube
>>
File: image_00016_.jpg (398 KB, 984x1264)
>>107003851
great lora
>>
>>107003496
no its synth slopped, but somehow in a far more retarded way than flux/qwen
>>
>if you have an unhooked image loader with a different image from the hooked-up one, the unhooked loader still counts and fucks up the gen

Thanks, open source.
>>
how come v7 was open sourced?
thoughts on krea video?
thoughts on new ltx lora?
>>
>>107003970
light x2v lora*
>>
>>107003787
oh yeah I'll do style_cluster_1610, my favourite!
>>
hey there faggots, sick of transparent promotion campaigns for shit-tier models? that's because you're browsing 4chan instead of trying THIS shit-tier model!
you can tell it's bad because I made it and even I won't post any gens from it, but I'll be unironically fucked if I can't manufacture some hype and at least get some downloads!
>>
File: 1656210369404.jpg (10 KB, 250x250)
pony 7 is even worst than sdxl 3
>>
>>107004033
>worst than
Please enable flags on all boards, chink-moot.
>>
>>107004024
im actually downloading because at the end of the day, it doesnt hurt to try really
>>
File: 1636758330040.jpg (6 KB, 200x202)
>>107004079
>spitting on chinks, when you use their tools every day
lmaoooooooooooooooooooooooooooo
>>
>>107003970
>>107003984
>krea video
no GGUFs
>light x2v lora
suffers from ghosting and lip flapping
>>
>>107004137
sdxl isn't chink, illustrious isn't chink but you might be
>>
>>107004148
redeem
https://huggingface.co/6chan/krea-realtime-video-fp8/tree/main
>>
File: 1747808295679103.png (146 KB, 500x338)
>masturbate to horses
>pour tens of thousands of dollars into horse porn generator
>the horse porn is subpar
>>
>>107003851
Is that the same Sabrina lora from weeks ago?
>>
File: 251025-222606-wan5s_00001.mp4 (3.59 MB, 1168x1488)
>>107003914
what model?
>>
>>107003044
I wish qwen edit didn't have that sameboob syndrome where it always estimates the same shape, size and look.
Same for bodies, genitals and so on.
>>
>>107004033
The worst thing is that v7 could have saved SD3.5 the same way v6 saved SDXL, if he had trained on Medium. Small size, faster training, but all the benefits of 3.5M, namely the 16ch VAE, T5XXL, and native 1.4MP out of the box. We could have had v7 as early as the first quarter of 2025, and today we'd be swimming in loras and merges.
>>
>>107003787
>style_cluster
I don't even blame the model for being shit, I blame the dev for thinking this was somehow necessary
what a disgrace
>>
>>107003333
they're convertible at least
https://github.com/EnragedAntelope/Flux-ChromaLoraConversion
>>
>>107004179
>fp8
>goof
pick one
>>
>>107004179
Workflow?
>>
File: ComfyUI_00002.webm (3.69 MB, 960x960)
>>
>>107004353
kek
>>
>>107004353
I cammed
>>
>>107004353
More
>>
>>107004214
chroma 2k
>>
thought picrel was AI at first and /ldg/ had breached containment into social media apps
>>
>>107004296
>I don't even blame the model for being shit, I blame the dev for thinking this was somehow necessary
no one asked, so I can only conclude he thinks hiding artist names is somehow "safer"
which is sad and retarded
>>
What's the next big hope after the huge successes of Chroma, PonyV7 and Neta Lumina?
>>
>>107004489
chroma and leto are good doe
>>
>>107003586
At this point why not just partner up and go all in on Chroma? They "sponsored" Chroma, but a full blown partnership would be better. Pony v7.1 is Chroma, then a tune of that is Pony v7.5
>>
>>107004353
My gens with this image aren't as creative or as safe for work
>>
>>107003253
This is big. Never seen anyone combine concepts with a LoRA (which is one of the main perks of API models, but that means local just caught up). Could you write a rentry with your findings?
>>
>>107004196
A new I2V dropped yesterday on Civitai
>>
>>107003473
This just came out https://meituan-longcat.github.io/LongCat-Video/
Very promising; too bad comfyui is just a shill-and-scammer framework now. If it does nsfw it could probably kill wan 2.2
Gotta wait till someone implements it somewhere.
>>
>>107004603
im going to cum inside your ass
>>
>>107004426
omg its migu
>>
File: naiandnow.jpg (3.02 MB, 3106x2177)
went back and tried 1.5 again to compare vs illustrious based.
1.5 mixes still do decently actually
>>
>>107002549
Know how to use plain English to describe what you want.
>>
>>107004603
So to implement this in comfyui we would need a non-jew autist to create nodes for it?
>>
>>107004629
"girl with only 2 arms and 2 legs"
>>
>>107004707
why not implement it in neoforge or sdcpp instead? why does it always need to be cumfart?
>>
File: ugfzopvmbov21.gif (2.97 MB, 480x360)
>>107004714
>>
>>107004738
I mean I personally don't care. As long as it's usable for VRAMlets like myself.
>>
>>107004748
then aggressively fud comfyui so devs make other options vramlet friendly
>>
>>107003253
>>107004585
I second this anon. There's a huge lack of chroma training resources. I've been mostly flying by the seat of my pants trying to experiment with training.
>>
>>107004714
Git gud at prompting.
>>
>>107004768
I'd use sdcpp but it doesn't do ram offloading in proper fashion. This is very annoying.
Somewhat strange that llama.cpp is apparently its main influence.
>>
>>107004804
ask devs to contribute. hell, ask the nunchaku devs to make a sdcpp implementation. nobody does shit unless they know it's what people want
>>
are there any realistic models that can do really realistic star trek-style aliens or orcs and goblins? i find photoreal shits itself when trying to do anything fantasy. the most fantastical thing i can do is a giant human
>>
>new wan ditto model works nicely for style transfer
>Still super inconsistent across individual videos
Wish there were a way to sort of lock in character consistency. But maybe I can just turn down the model noise.
>>
Is there a node in ComfyUI that can duplicate another node with the exact same settings? For example, I’d like to have two KSamplers, and whenever I change the settings on the main one, the other automatically updates to match.
>>
>>107004842
make all the widgets inputs and run the variable-node spaghetti to both.
>>
>>107004707
Basically, we are slaves to the whims of Kijai, who is now on comfy's payroll as well.
Comfy themselves haven't implemented shit for several months now.
It's a shame because the model seems amazing at motion, prompt following and actual action/prompt sequences. Better than wan 2.2, going by the examples.
Not to mention it has both long generations and even multi minute generations.
Just look at this
https://meituan-longcat.github.io/LongCat-Video/assets/videos/interaction/2-1.mp4
https://meituan-longcat.github.io/LongCat-Video/assets/videos/interaction/2-2.mp4
And a fucking 2 min video with amazing coherence https://meituan-longcat.github.io/LongCat-Video/assets/videos/long/3-4.mp4
>>
>>107004842
You can use a get/set node and just change the value from that node instead of on the ksampler
>>
>>107004897
But what does Kijai actually do?
Looking at my nodes I don't think I use any of his, but maybe I use branches?
GGUF and MultiGPU are the main ones I use
>>
>>107004714
tsar truthnuke
>>
>tfw you made an excellent image and the wan is extremely coherent with the pose

Feelsgoodman.
>>
>>107004818
fuck off trani kys
>>
just how do I tell wan to not move the camera?
>>
>>107004989
fixed camera
>>
>>107004999
doesn't work
>>
>>107004989
luck of the draw. Try a different prompt, add stuff that goes out of frame to the description so it still shows. I don't think it's wan's fault, it's the light lora's fault, in my case at least
>>
>>107005005
add camera movement to the negatives
>>
>>107004585
>>107004776
I can write down stuff later for sure. Hll anon used LION to create a huge multiple-concept lora and only trained the Text Encoder too, so I think there's lots of undocumented stuff that works really well.
>>
>>107005031
doesn't work
>>
>>107004818
No one cares about your wrapper trani
>>
>>107004989
>>107005044
At this point you deserve to never get what you want.
You fucking retard can't even bother to learn the very basics of prompting with wan.
I know what it is but I hope nobody else spoon feeds your jeet ass.
>>
>>107005046
I think he's drunk again >>107004925
>>
>>107004618
Now make them do literally anything other than standing.
>>
>>107004989
I have trouble telling wan to do anything with the camera at all other than zoom or close up
>>
>>107005069
yeah, make them crouch, point at viewer and laugh
>>
>>107005067
>t. no gen
>>
>>107005067
either A
>someone told you and you're just a faggot that gatekeeps
or B
>you're full of shit

There's no third option
>>
>>107004975
he wraps the main implementations that live in diffusers, but comfy has a vendetta about making diffusers as abrasive to use as possible so people use his slower implementations
>>
>>107005122
Ok I believe you but I'm gonna be honest I don't know half of those words
I just wanna take clothes off women in funny ways.
Is that too much to ask?
>>
>>107003945
what
>>
>>107004707
It came out 10 hours ago, come on dude, this is ridiculous.
>>
if I gen a 10s (161 frames) video on wan, is there a way to prompt it to do one thing then another without the second taking over immediately?
"she types on a computer for 3 seconds, then she gets up and walks away"
>>
Also, pov looks fucking great https://nitter.net/Meituan_LongCat/status/1982083998852763838
>>
File: 1759106424259375.jpg (479 KB, 1536x1536)
Behold the power of Pony 7...
>>
>>107004897
This Is So Funny To Say About Open Source hahaha, Nigga It's On Github ROFL It's Literally Got A Readme With Instructions hahaha This Nigga Cannot Read And Is On /g/ lmao
>>
you forgot to capitalize g and lmao
>>
>>107004986
Ran is annoyed again. Many such cases.
>>
>>107005147
Use a first frame last frame workflow.
Use different images in the two image loader nodes. Only use one frame with the unused one unhooked.
>>
>>107005239
sovl... sd1.6...
>>
>>107005293
Nigger I'm not gonna run their inference code myself, I'm too lazy for that, what would be the use case of ComfyJew-I if everyone just did that?
>>
>>107005339
Boo Hoo Nigga, I Wanna Gen But I Don't Know How hahaha Nigga Boo Hoo
>>
dunno why schizo is so anti anistudio. I've been asking for an exe since 2022 and finally someone is working on it. fuck python
>>
ani hours are the best
>>
i believe in ani
>>
>>107005355
You Niggas Need A Pampers hahaha
>>
>>107005369
>>107005384
Brap
>>
>julien
>>
File: 1754997267162139.mp4 (1.22 MB, 720x896)
>>107002688
>>
>>107005421
FUCK OFF RANFAGGOT
>>
>>107004603
I see it's a dense model. Realistically, how long would it take to gen 2 min videos on a 3090?
>>
>>107005408
Can I have one too?
>>
>>107004603
uncanny af
>>
>>107005427
they are sisters what are you doing
>>
No surprise comfy claimed that trani "has a ton of issues"
He's spiraling
>>
File: svi.jpg (43 KB, 1437x204)
Wansisters, we're about to eat good once more

https://github.com/vita-epfl/Stable-Video-Infinity/commit/34e4c505a0d77d59a738a08c161fe7d11dff8fc5
>>
>Ran took my shota collection.
>>
Who the f is Ran
Who the f is trani
>>
>>107005610
t ran i
they're the same person as far as i know
>>
>>107004842
double click on the input and connect it to both ksamplers
messy but it's cumfart ui, get used to it
>>
>>107005122
> diffusers
> his slower implementations
>>
Bros.. I just gooned to a 480p test gen because it was so good..
>>
File: 1759399874666.jpg (1.52 MB, 2024x2424)
>>
>>107005067
> bother to learn the very basics of prompting with wan
there is no good guide
>>
>>107005653
post it
>>
>>107005680
No, I will post the finished part later.
Now I will go sleep like a baby.
>>
>>107005702
i will rape you like a niggerbaby
>>
>>107005610
Oh just more names to add to the filter
>>
>>107005653
> gooned
retard
>>
I'm having some trouble. Recently I tested the freeware version of a very expensive piece of local software that upscales videos from low quality up to 1080p and 4k.

The freeware gave about 3 files' worth of use; I was surprised it could restore some old episodes of 90's sitcoms.

I looked at workflows for ComfyUI and tried to adapt one for my old episodes, doing the same as this bullshit 300USD licensed software.

The only trouble is my workflow fucks up and runs out of RAM when I'm around 13%-17%.

So far I'm using nodes that take the whole video file and run it through the workflow.

My question is, should I...

>split each video into ten pieces to make them go through my workflow,

or

>split the original video frame by frame and get another node to cycle through a massive batch of frames?

And that's assuming I've built the workflow right and I'm not fucking up on my end. Since it's something I've been able to do on my computer with private paid software, there must be a way to figure out how to do it my way with ComfyUI. That program literally used upscalers from the internet made by other people, so I reckon it's a codemonkey packaging other people's stuff into his own app; I hope I can recreate it on my own.
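
Splitting into chunks is the usual fix, since most video-loader nodes decode every frame into RAM at once. A sketch of the chunk/upscale/rejoin route with ffmpeg (paths and the 60s segment length are arbitrary examples):

import subprocess

src = "episode.mp4"

# 1) cut into ~60s segments on keyframes (stream copy, fast, no re-encode)
subprocess.run(["ffmpeg", "-i", src, "-c", "copy", "-f", "segment",
                "-segment_time", "60", "-reset_timestamps", "1",
                "chunk_%03d.mp4"], check=True)

# 2) ...run each chunk_XXX.mp4 through the upscale workflow separately...

# 3) rejoin the upscaled chunks with ffmpeg's concat demuxer
with open("list.txt", "w") as f:
    for i in range(10):  # however many chunks step 1 produced
        f.write(f"file 'upscaled_chunk_{i:03d}.mp4'\n")
subprocess.run(["ffmpeg", "-f", "concat", "-safe", "0", "-i", "list.txt",
                "-c", "copy", "joined.mp4"], check=True)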
>>
Kill AI bros.
Behead AI bros.
Roundhouse kick an AI bro's head off his shoulders.
Slam dunk an iPad baby into the trashcan.
Crucify manipulative AI scammers and grifters.
Hammer a stake into an AI gooner's heart while they are sleeping.
>>
>>107005840
i will rape your twink ass
>>
>>107005847
It's funny that you say that because I plan on installing Linux mint today.
>>
What if nu pony is actually really good but we can't see it because we all suck at prompting
>>
>>107005866
what?
>>
>>107005866
Linux Mint is probably the most approachable distro in terms of matching Windows' usability but even then it's a clusterfuck of issues.
It's a-okay but goddamn do I hate linux already. Endless stream of dependencies etc.
>>
File: 20251025_123651.jpg (402 KB, 1527x1113)
I have been experimenting with Chroma1-HD-Flash as part of a larger workflow. I have this issue where if "elf" or especially "pointy ears" is in the prompt, it /always/ sticks these crappy earrings in. Always the same style of earring. Even if I leave jewelry, earrings, everything out of the prompt, they still appear. If I img2img an existing image that has no earrings, it will insert them. I tried to partially mitigate it by adding "stud earrings", hoping they would at least not hang, so they'd be easier to remove manually. But all this did, mostly, was add stud earrings *and* hanging ones. Adding earrings to the negative and raising the cfg helps somewhat, but they still appear about 1/3 of the time, and of course that massively slows down chroma flash, defeating the purpose of it.
This must be bad tagging, right? The training images had earrings that were not mentioned in the caption, so they slip in undesired.
>>
File: 108469 - SoyBooru.jpg (248 KB, 816x1024)
>>
>>107005795
you can try to find out what models that app uses and google or ask chatgpt how to run them
>>
File: file.png (1.78 MB, 1280x1536)
>style_cluster_1610, score_9, rating_safe, cowboy shot of iwakura lain wearing a sexy halloween witch dress with a witch hat, holding a hallowen basket in one hand and putting her other hand behind her head. She has a mischevious evil grin looking at the viewer. She's standing in front of a door, behind her a faintly lit road in a suburb. The point of view is from inside the house facing the door and the girl. The atmospherie is eerie and supernatural
>default settings from the official workflow
BROS this is FUCKING GARBAGE, fucking ponyV7 I CANT FUCKING BELIEVE I DOWNLOADED THIS GARBAGE
>>
File: file.png (1.87 MB, 1280x1536)
>>107006018
2nd try with another seed. might be irredeemable, unless I'm prompting wrong
>>
File: file.png (2.75 MB, 1280x1536)
>>107006036
3rd attempt.
Also, a correction for the 1st attempt: I used 'full body shot' instead of 'cowboy shot'
anyway, garbage all around.
>>
>>107006018
>style_cluster_1610, score_9, rating_safe
I thought this was the stuff everyone hated about Pony... he kept it anyway!?
>>
>>107006076
>>107006036
>>107006018
Skill issue.
>>
>>107006018
>>107006036
>>107006076
sovl
>>
>>107005656
cute
>>
>>107005795
the most annoyingly formatted post on this site fuck off
>>
File: file.png (23 KB, 376x232)
>>107006138
>>107006139
Amazing if organic
>>
I think the problem might be with the style cluster? the default one was for pony fuckers I guess but on the model card in HF I see no mention at all of where these fucking styles are.
but the first mistake I made was this:

>When referring to characters use pattern: <species> <gender> <name> from <source>
>For example "Anthro bunny female Lola Bunny from Space Jam".
something that no other model has required before lol, I'll try changing some of the prompt around too.
>>
Seedream is cool but it is so completely constrained by your prompt that it quickly becomes boring. There was far more variation between same-prompt gens in Dall-E 3 than there is in Seedream 4.

Local is still king, I think.
>>
File: file.png (112 KB, 877x753)
>>107006181
gem
>>
File: file.png (1.94 MB, 1280x1536)
>style_cluster_1610, score_9, rating_safe, human girl Iwakura Lain from Serial Experiments Lain. She is wearing a sexy halloween witch dress with a witch hat, holding a pumpkin hallowen basket in one hand and putting her other hand behind her head. She has a mischevious evil grin looking at the viewer. She's standing in front of the viewer's house's door, behind her a faintly lit road in a suburb. Cowboy shot. The atmosphere is eerie and supernatural
nailed the character this time, and adjusted some of the prompt to make it simpler to understand where she is. Also I added the word pumpkin for the next gen. Tbh it looks a bit undercooked, I'll try adding more steps, maybe that'll fix it
>>
File: file.png (1.95 MB, 1280x1536)
>>107006278
2nd gen, same default steps, now specifying the pumpkin. I'll try the first seed's image without the pumpkin and with double the steps
>>
>>107005962
Elf styles are so drearily conventional that it's really hard to fight against the model, and that's not just with Chroma. But yeah I'm trying right now with Chroma1-HD-Flash to see if I can do it, and I can't lol. Gonna keep trying though
>>
>>107006278
Did he overtrain the model, choose the wrong parameters, or is Auraflow just that shit no matter what you do?
>>
File: file.png (2.39 MB, 1280x1536)
>>107006278
40 steps instead of 20 of this. Better, but it ultimately still looks like fucking garbage in the details (eyes/hands). Maybe this needs even more steps? Trying 60 now
>>
>>107006018
>fucking ponyV7 I CANT FUCKING BELIEVE I DOWNLOADED THIS GARBAGE
I thought that was only for generating horses
>>
File: file.png (2.54 MB, 1280x1536)
>>107006340
60 steps, not much difference.
Might test 30 steps, but for now I'm gonna test CFG changes.

40 steps 4.5 cfg next (default cfg was 3.5)
>>
is it possible to make funny videos in wan, or is that out of the model's purview?
>>
File: file.png (2.35 MB, 1280x1536)
>>107006340
this is 40 steps at 4.5 CFG,
hands are decisively better, so are the eyes
>>
File: file.png (1.68 MB, 1280x1536)
>>107006278
and this is 20 steps 3.5 CFG lmao bros what the fuck
>>
nodes are kinda shit when it comes to videos. where is a UI that has sequencers and timelines? is that too much for techbros to handle? all this node kikery is a waste of my fucking time
>>
>>107006392
Ideally we would have an interface like blender, where we have nodes and timeline/sequencers. would be fucking kino actually.
>>
what base of chroma is everybody using to train on?
>>
>>107006384
*4.5 CFG
anyway I'm done testing for now. It isn't half bad desu, I'm sure I'm fucking up the prompting in some way, but for now I can't be bothered to look at civitai's example gens to see how people are getting the good ones.
I actually just checked the official examples, and they're all 40 steps 3.48cfg.
I don't understand why the comfy workflow comes with 20 steps, gens are fucking undercooked.
4.5 CFG looked better to me than 3.5, would require a bit more testing.
I still don't see a way to look up the style clusters, so if anyone could point me in the right direction I would be grateful
>>
>>107006384
I have found that the model is extremely sensitive to literally everything.

Try CFG as high as 6. Try samplers like dpmpp_2m_sde_gpu, or euler_cfg_pp (with low CFG). You can get dramatically different styles and vibes.

Another thing I noticed: natural language prompt gives strong western / digital art style. Danbooru tag prompt gives a decent default anime style. Clearly the training data wasn't uniformly captioned in both styles.
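
If anyone wants to script the usual CFG x sampler matrix instead of clicking through it: ComfyUI's /prompt HTTP endpoint takes a workflow saved with "Save (API Format)". A sketch, assuming the default server address and that the KSampler's node id in your export is "3" (check your own JSON, yours may differ):

import json
import urllib.request

with open("workflow_api.json") as f:
    workflow = json.load(f)

for sampler in ["euler", "euler_cfg_pp", "dpmpp_2m_sde_gpu"]:
    for cfg in [3.5, 4.5, 6.0]:
        # patch the KSampler inputs, then queue the gen
        workflow["3"]["inputs"]["sampler_name"] = sampler
        workflow["3"]["inputs"]["cfg"] = cfg
        req = urllib.request.Request(
            "http://127.0.0.1:8188/prompt",
            data=json.dumps({"prompt": workflow}).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)  # outputs land in ComfyUI's output dir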
>>
>>107006392
AniStudio will have that soon according to the dev
>>
>>107006468
>>107006468
>>107006468
>>107006468
>>107006468
>>
>>107006460
I'll maybe wait for another kind anon to do the usual MATRIX of CFG x SAMPLERS.
I thought pony only worked with NL, that's what the official images use. I'll try a round with booru prompting, but later.
>>
>>107005067
retard
>>
File: 1742592388306549.jpg (117 KB, 850x668)
>>107005962
try adding 'frieren' to negatives
>>
trani is a demented faggot that comes here to shill his toy project UI that no one uses and spreads FUD about comfy. ran is a faggot that posts obese women here occasionally and is trani's boogeyman
>>
>>107005962

how can the model always give elf ears earrings when the ears aren't even pierced?
>>
>>107002180
i wanna play this skyrim mod
>>
File: QwenEdit_00183_.png (1.01 MB, 1104x944)
>>107002099
Do you mean Miku Hatsune?
*ducks*



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.