/g/ - Technology

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>106975747

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://civitai.com/models/1790792?modelVersionId=2298660
https://neta-lumina-style.tz03.xyz/
https://huggingface.co/neta-art/Neta-Lumina

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
>>106978583
I hope it keeps motion with multiple frames, otherwise I don't see the point
>>
is it even worth putting anything in the negative prompt with chroma hd flash
>>
Where do i get the new lightx2v?
>>
anyone have nsfw vibevoice examples?
>>
>>106978624
https://huggingface.co/lightx2v/Wan2.2-Distill-Loras/tree/main
>>
>>106978608
don't you use a cfg of 1 with that?
then no
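(for the anons asking why: with vanilla classifier-free guidance the negative prompt only enters through the cfg combine, so at cfg 1 it cancels out completely. a minimal numpy sketch of the textbook formula, not ComfyUI's actual code:)
[code]
import numpy as np

def cfg_combine(eps_uncond, eps_cond, cfg):
    # textbook classifier-free guidance
    return eps_uncond + cfg * (eps_cond - eps_uncond)

eps_uncond = np.array([0.1, 0.2])  # prediction conditioned on the negative prompt
eps_cond = np.array([0.3, 0.1])    # prediction conditioned on the positive prompt
print(cfg_combine(eps_uncond, eps_cond, 1.0))  # == eps_cond, negative fully ignored
[/code]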
>>
>>106978608
>>106978662
you can use negative prompts with cfg 1 if you go for NAG, dunno if there's a Chroma NAG though
>>
>>106978641
thanks
>>
>>106978629
you can make it moan and do (very gacha) kisses, though nothing works reliably. you basically need to give it porn as a reference
>>
>>106978693
ok thank you anon, I will check then
>>
Blessed thread of frenship
>>
>>106978722
bye bye microsoft safety researcher
>>
Which one do you use?
Also, does it matter?
>>
>>106978652
>>106978781
Didn't see we were already in a new thread. I have some thoughts on how it could work using native nodes apart from some helper nodes. In any case comfyanon is probably going to add nodes for it within days.
>>
File: 00033-2286827173.png (2.55 MB, 1152x1440)
>>
>>106978775
Do we even use clip vision? It was breaking my gens on the first day I got 2.2 and I've left it off
>>
File: 1731503335572031.png (163 KB, 285x281)
when will API keks be able to do this, again?
https://files.catbox.moe/0bv1ex.mp4
>>
Wansisters, we long video now (maybe), settings and findings

>wan 2.1 i2v 14b Q6 gguf
>10 seconds (165 frames at frame_rate 16)
>euler a, beta, 1 cfg
>lightx2v_I2V_480p_cfg_step_distill_rank256 and svi-shot (https://huggingface.co/vita-video-gen/svi-model/tree/main/version-1.0) loras
>no context nodes, no last frame, no tricks
>prompt: portrait video of a cute asian woman waving at the camera then she pulls out her camera phone to mockingly take a photo
>it did repeat the waving at the end, will try with longer prompt
>>
>>106978843
This cured my porn addiction.
>>
>>106978589
> SVI will iteratively generate a video clip for each prompt within the prompt stream, which uses the last frames of the previous generation as the conditions.
> SVI-Film supports end-to-end long filming controlled with a storyline-based prompt stream. We use five motion frames and replace the padding frame with zero for the image latent.
but I thought i2v wan could only work with 1 input image
need to read the i2v paper too
>>
>>106979047
>but I thought i2v wan could only work with 1 input image
This is why I'm also scratching my head; wouldn't it then just be treated as a new video using lastframe? The only thing I can think of is either taking the last 5 latents and combining them somehow for the next sampler, or using wanAnimate nodes in a slightly unintended way.

promising but as always shit presentation and almost zero instruction on how to actually get it working correctly within comfyui.
>>
>>106978926
workflow where
>>
>>106979086
the thing is the wanimagetovideo node's output latent has no initial frame; it's the conditioning that uses it, according to the code
>>
>>106979151
Just use any of your wan 2.1 workflows, download svi-shot from https://huggingface.co/vita-video-gen/svi-model/tree/main/version-1 into your loras folder, set your frames, that's literally it
>>
>>106978926
so you didn't use a new prompt, it's one prompt for the full gen
>>
File: ComfyUI_23136_.jpg (1.98 MB, 2304x2304)
>>
>>106979193
What? I'm still in the middle of testing.
>>
File: anchorman-anchor-man.gif (205 KB, 220x165)
>>106979177
pic related
will try
>>
>>106979181
Yes but it will 100% always miss the nipple with the mouth.
>>
>>106979216
>will try
No you wont
>>
>>106979177
but it will oom after so many frames... they say it requires frames from previous? Anyway I'm currently building a test workflow using imagetovideo for the first sampler then the wanAnimate node, and I will select the last 5 frames from range and feed them into continue motion. But again, wan treats each gen as a new video; it has no context.
>>
From what I understand, this SVI lora thing doesn't make long videos possible magically, it just stops the last frame you reuse before the next video part from drifting too much from the original one, including brightness, colors etc.
So your original "long video" workflows will work the same, with fewer errors from each extracted last frame, that's it.
If it's just that, it's great, but it doesn't solve the motion issues from reusing the last frame.
Ah and apparently this is only for wan 2.1, they're working on 2.2... 5B. Why?
"PS: Wan 2.2-5B-SVI is coming."
>>
File: 00045-2117482319.png (2.48 MB, 1152x1440)
>>
>>106979255
lol so it's literally garbage
>>
>>106979255
And the lora to do that multiple stitched videos is :
"SVI-Film": this has been trained for multiscene generation in mind, so it can be used for the last frame feeding a new video use case.
>>
>>106979275
No, it's pretty cool to not have color and brightness issues, it's a major pain in the ass.
But it'll be unusable for most people unless they go back to wan2.1 or use a shitty 5B model.
At least until the team makes a wan 2.2 14B version lora, which they didn't promise.

I hope they don't do a nunchaku and just disappear or do random models after the 5B one.
>>
File: multigpunodes.jpg (27 KB, 628x337)
>>106979229
Depends on your hardware I guess? I can load up 245 frames on 16gb vram. I was having ram issues before with regular unet loader, you could try the multi gpu nodes

>picrel

>>106979255
Yeah, it seems to be repeating the motion but it's only slightly better than the context nodes.
>>
File: 1752418828157052.jpg (861 KB, 1416x2128)
>>
>>106979306
>I hope they don't do a nunchaku and just disappear or do random models after the 5B one.
then it'll be another promising tech dead in the water because they only used it on irrelevant models
many such cases
>>
>>106979317
>>106979255
oh well then just use the context window durr, it won't oom with that, and it should work already on wan2.2 if it's a lora, as almost every wan2.1 lora just works with 2.2
>>
>>106979340
It's a lora specifically finetuned to solve errors for wan2.1, so not sure it would work with 2.2 at all.
I didn't check sliding windows yet, but as far as I know you cannot prompt every n frames with it, so it's kind of useless for long gens if you want to have multiple things happening over time.
>>
>>106979255
and where did they mention that? Or are you just talking out your ass? I don't see them providing a workflow...
>>
File: ayysian lady.mp4 (3.08 MB, 480x640)
So this is like the 6th attempt. While I'm impressed that it holds quality, consistency and movement, it didn't follow the end part of the prompt. I'm sure there's something we're missing but overall, I like it. I'm out, think there's another anon doing tests somewhere too.

>portrait video of a cute asian woman waving at the camera, she then pulls out her phone to mockingly take a photo, next the phone then flashes, she then puts away the phone and finally she walks out of shot at the end
>>
File: file.png (44 KB, 2086x439)
>>106979367
https://github.com/vita-epfl/Stable-Video-Infinity
>>
>>106979357
>but as far as I know you cannot prompt every n frames with it, so it's kind of useless for long gens if you want to have multiple things happening over time.
but they literally claim the opposite, which is why anons are left wondering if something needs to be changed for it to work as they intended/advertised.
>>
File: 1755112089695132.jpg (64 KB, 719x688)
>>106979384
>5b
>>
>>106979394
I was talking about the context window node.
>>
>>106979381
how do you set that up? couldn't figure it out for the life of me.
>>
is there a word token limit in noobai pos prompt? wasn't there some limit back in the day or am i remembering it wrong? seems like there isn't one now
>>
>>106979384
>5b
kek
>>
>>106979411
See >>106979177
>>
god chroma sucks so bad at doing anything POV beyond muh sex
>>
>>106979322
Based edgekinoposter, catbox?
>>
>>106979427
>god chroma sucks so bad at doing anything
you could've stopped there lol
>>
>>106979425
I'm not convinced it would be that easy. Do you understand what unlimited video with a 20 minute test with no drift means in terms of wan? You're not gonna be doing 20 minutes of video on your 16GB vram card. Unless it truly is just using the last frame as input for each 5 seconds of video, but that doesn't make sense given pic related. Why would they go out of their way to make so many versions?
>>
>>106979456
>>106979425
like wan has no concept of a previous gen, so i don't understand what they fucking mean. fucking why u no explain properly god damn it.
>>
>>106979478
>>106979255
>>
>>106979432
https://files.catbox.moe/p44x6s.png
>You are an assistant designed to generate anime images based on textual prompts. <Prompt Start>
>@Ash Thorp, @Amu Aoi, @82 PIGEON, @Chris Bachalo, @Carpet Crawler Ccrawler Art, traditional media, painting \(medium\), canvas \(medium\), scan artifacts, magazine scan, scan, artbook, production art, novel illustration, horror \(theme\), non-web source, original, commission, dark,
>abstract, abstract background,
>outdoors, landscape, scenery, castle, battlement, palace, fortress, sky, cloud,

>You are an assistant designed to generate anime images based on textual prompts. <Prompt Start>
>@Avogado6, @Katsuya Terada, ai-generated, stable diffusion, midjourney, ai-assisted, 3d, render, blender \(render\), sepia, obese, fat, overweight, digital media, toon, cartoonized, western comics,
>bad quality, worst quality, worst detail, sketch, censor, transparent background, 1girl, solo, solo focus, english text, magazine cover, photograph \(medium\), 1other,
>>
svi does not work in comfy natively plug and play, retards; it samples 5 frames in sections while I2V just gives it 1 frame, so someone will have to implement it for comfy
>>
>>106979456
kek I never said anything about being able to do unlimited/20 minutes, either way, I don't care, feel free to try it out yourself
>>
here is SVI
>>
>>106979381
nothing for 2.1 t2v?
>>
>>106979525
incredible
>>
>>106979525
Wow, it's shit.
>>
>>106979512
>SVI will iteratively generate a video clip for each prompt within the prompt stream, which uses the last frames of the previous generation as the conditions.

image + video 1 + svi lora -> last image
(last image = first image) + video 2 + svi lora -> last image 2
etc
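(in loop form, roughly; a sketch of the chaining idea with a hypothetical gen_clip() standing in for one i2v sampler pass with the svi lora, NOT the actual SVI code:)
[code]
# gen_clip(first_frame, prompt) is hypothetical: one i2v pass with the
# svi lora that returns its output frames
def chain(first_frame, prompt_stream):
    frames = [first_frame]
    for prompt in prompt_stream:
        clip = gen_clip(frames[-1], prompt)  # condition on the previous last frame
        frames.extend(clip[1:])              # drop the duplicated seam frame
    return frames
[/code]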
>>
>>106979525
>time to rape
>>
>>106979525
i genned a few with his film one to try it out but it's p wonky. not enough time to try without the self forcing lora.
>>
>>106979571
>>106979552
this is 2 + 2 steps with light, cba to wait for testing
>>
>>106979581
oh, yea, it somewhat works with 2.2 btw, same with other 2.1 loras, use at a higher weight on high noise, don't use on low noise
>>
File: 1741642244886373.jpg (864 KB, 1416x2128)
>>
why does chroma so easily drift between normal realism and SD 1.5 2.5D hyperslop "realism"
>>
File: ComfyUI_01233_.png (698 KB, 1040x1000)
>>106979398
>>
>>106979585
so about 4 - 5 strength in high then? Thanks for testing it on 2.2, did you compare outputs with and without the lora? It really needs to be tested with longer and more complex videos.
>>
>>106979585
tried 1.0 in high, got endless loop of same motion ignoring prompt. will try higher now
>>
>>106979604
because it's fucking shit, stop talking about it. qwen is arguably better if you know what you're doing. the only problem is moral fags reporting nsfw loras on civitai, or just a lack of them. Chroma is the same but only because people lost interest.
>>
>>106979525
this is so fucking bad lmao
>>
>>106979628
>>106979581
And have you tried stringing together gens before? this is night and day better, and it's not even implemented right yet
>>
>>106979623
>qwen is arguable better if you know what you're doing
seed variety is utter horseshit
>>
>>106979636
this, qwen needs loras or every image looks exactly the same, and then you gotta train a fucking lora for any little thing, so qwen is useless 'cept for super specific shit
>>
File: file.png (185 KB, 302x460)
>>106979529
where did it go
>>
>aitoolkit FINALLY got offloading for wan 2.2
about time
>>
>>106979636
>seed variety is utter horseshit
Probably because qwen has much stronger prompt adherence? As in it gives you what you prompt no matter the seed?
>>
>>106979664
qwen is overcooked and so is super locked across seeds with no variety
>>
>>106979664
Nope. You can give it a minimal prompt, and despite there being many ways to depict it, it will rigidly do the same exact thing.
>>
>>106979664
I think that's because of the flow architecture, flux also has that issue
>>
>>106979664
alibaba's models are just overtrained with synthetic shit, this company is overrated as fuck, the llm community is making fun of them because they're too obvious they're gaming the benchmarks
>>
>>106979672
flux was also a bit overcooked, that was a large part of what chroma fixed, it destroyed the lack of variety, of course it also destroyed the aesthetic training but I would rather have the more flexible model
>>
>>106979676
>>106979669
fair enough
>>
>>106979671
not 100% on this but this feels like it may be a consequence of doing aesthetic reinforcement with a smaller, limited dataset, and overcooking it that way
>>
>>106979652
https://desuarchive.org/g/thread/106978567/#106979529
you should install https://github.com/TuxedoTako/4chan-xt
>>
>>106979678
>it also destroyed the aesthetic training
if only it only destroyed that, but unfortunately it also destroyed good anatomy and good details
>>
>>106979676
The benchmarks are worthless and everyone is gaming them anyway, but the chinese companies are the main ones actually releasing new top models for end users and especially trying new things
>>
>>106979691
blame BFL for intentionally doing super destructive distillation with the intent of making it very hard to finetune
>>
>>106979691
it's really not that bad there considering the crazy nsfw shit it can do, and again, unlike flux / qwen it was not aesthetic trained to hell and back, even illustrious trained the fuck out of their model on that
>>
>>106979690
i have x, he deleted the image too quickly
>>
the main thing is people need to stop using shitty ass T5, and use gemma instead. T5 is the real thing holding models back, it's too complex and models cannot fit well to the entire English language like that, there is a reason why even fuck-huge LLMs are all retarded
>>
>>106979633
Exactly, these retards continuously shit on everything that releases. While not perfect, the demos are pretty impressive, especially compared to all of the previous tricks we had to do before it. With these new loras and context nodes, I honestly don't miss daisy-chaining frankenstein workflows for the sake of a mild color change.

>>106979623
Chroma is fine...if you know what you're doing.
>>
>>106979712
>While not perfect, the demos are pretty impressive
we live in the sora 2 era, I can't pretend that's impressive when the API fags are eating so good
>>
here is what im mean btw
https://civitai.com/models/1790792/netayume-lumina-neta-luminalumina-image-20

this will be the proper illustrious 2, trained on actual gemma 2 instead of shitty T5. It's super undertrained though
>>
>>106979721
fuck off. have you actually used sora? it's shit
>>
>>106979721
even sora 2 does not have unlimited length video gen (if you had enough vram) though
>>
>>106979721
>API fags are eating so good
Deboonked already sweaty: >>106978843
>>
File: 1760543941364156.jpg (780 KB, 1416x2128)
>>106979652
i didn't think it was very good after posting
>>
>>106979658
Does this mean we can train on potato pcs yet?
>>
>>106979729
>have you actually used sora?
I did lol >>>/wsg/6007532
>>
>>106979737
>2boy, talking
whooooooa buddy
>>
>>106979721
Ma'am, this is a local ai thread.
>>
>>106979664
That’s not a good thing
>>
>>106979743
>the 1girl community is making fun of 2characters
KEEEK
>>
I've been out of the loop for about 2 months

I just use wan 2.2 with the lightx2v lora, which I just updated. Did any other secret sauce come out? When does 2.5 hit?
>>
>>106979754
new light loras are better, use with https://github.com/VraethrDalkr/ComfyUI-TripleKSampler for best results, wan2.1 got an unlimited length gen lora that kind of sorta works on 2.2 high, there are like 3 character replacement models I cba to list that came out this / last week...
>>
>>106979754
>Did any other secret sauce come out
no

>When does 2.5 hit
it's out and not shared
>>
>>106979754
2.5 is already available through the ComfyUI API, which is included as part of the download linked in the OP
>>
>>106979765
>character replacement models
what is that?
>>
lol
>>
>>106979772
replacing a character in a video with another character you give images of
>>
>>106979721
>we live in the sora 2 era, I can't pretend that's impressive when the API fags are eating so good
>Prompt meal prep video
>guardrail kicks in
>knives are dangerous
>anon bad
>account banned
fuck off
>>
>>106979774
ah ok
>>
like this
https://files.catbox.moe/5exhaa.mp4
>>
File: 1743327215540344.mp4 (1017 KB, 832x480)
>>
https://files.catbox.moe/ffjiwq.mp4
>>
https://files.catbox.moe/hnmys7.mp4
>>
>>106979669
but the entire point of qwen was to be able to edit shitty SDXL nsfw gens, such as fixing anatomy or changing pose/position without changing the scene or character. Unfortunately nsfw loras for edit models = bad and the dickheads report them as fast as they are produced. They should just ban all nsfw content then and be fucking done with it, because anyone with a clue can take any image and edit it the manual way using any decent SDXL based model. But they won't because their website will just die.
>>
>>106979830
not all nsfw loras are bad. the new qwen edit remove clothes one is very effective desu

https://limewire.com/d/AvpLO#Gd7AyXiz1r
>>
ai pornography has fried my brain
>>
File: 1738271039016658.jpg (745 KB, 1416x2128)
>>
>>106979845
utter kino
>>
>>106978567
this collage fucking blows, kys
>>
>>106979766
>>106979767
Do we have a name for the cycle where a new AI firm releases a local model to get their name out, then they pull the rug? Is Hunyuan going to be on the cloud too? Will the next upgrade be some new model I've never heard of, then the cycle repeats?
>>
>>106979729
>>106979731
>>106979733
>>106979744
>>106979777
y cum u fall for it
>>
>>106979845
catbox for that one too?
>>
File: 1743332505865939.png (754 KB, 1360x768)
replace the blonde girl in the pink shirt on the right, with the anime girl in image2.
>>
oh, and for the netayume model this is an early aesthetic tune for it https://civitai.com/models/1974130/reakaakas-enhancer-lumina-2
>>
>>106979830
>20b bloated base models so bad they can only serve as a cleanup for superior 3b sdxl
grim state of local, clean it up chinakeks
>>
>>106979926
kek, this
>>
>no lumina support in onetrainer
:(
>>
>>106979926
NetaYume Lumina is the real next gen model, it's already close to illustrious with a fraction of the training and its prompt understanding is as good as qwen
>>
>>106979935
note, it is trained on 2 sets of captions, tags and natural language, switching between them each epoch
>>
>>106979937
>its prompt understanding is as good as qwen
it's not quite that good at understanding prompts, no. you'll notice when you prompt multiple characters or just a lot of stuff.
>>
>>106979912
https://files.catbox.moe/tap3oo.png
>>106979937
>it's already close to illustrious
it surpasses it desu, i can't think of anything ilu does better other than the fact that the community has had more time with it
>>
Qwen is the best local base model by far, but it’s too expensive to train. But on the other hand look at the absolute disaster that is Chroma, where if he just spent the $200k on 5 epochs of Qwen it would be way better than the fluxenstein abomination we got
>>
>>106979976
no, if he spent that on lumina 2 we would have had a better model, I said from day one any model trained on T5 was a disaster. The only way to make a model good with T5 is to overcook the fuck out of it
>>
>>106979976
>Qwen is the best local base model by far
and that's sad, this is a plastic factory
>>
>>106979981
and I am praying the next wan, if it ever releases, switches to qwen or gemma or some new form of clip instead of using T5
>>
>>106979937
>its prompt understanding is as good as qwen
absolutely delusional
>>
>>106979963
>https://files.catbox.moe/tap3oo.png
based
>>
>>106979981
Lumina 2 is nowhere near as good as qwen, sorry
>>
>>106979994
enjoy your one image per prompt / style of character per prompt unless you train a lora for any little thing you want
>>
>>106979994
and if you mean aesthetics-wise, it's not aesthetic trained like qwen is (overly). Here is a lora though that is halfway there: https://civitai.com/models/1974130?modelVersionId=2309365
>>
File: 1752440880766810.png (1.02 MB, 1360x768)
replace the pink hair anime girl waving hello with the anime girl in image2. replace the text on the left saying "I need to make 1girls" with "I need to make more 2B's".

edit of a previous text edit (was testing)

I love how you can emulate fonts/styles too, you can dupe a font that you couldn't find a .ttf for without an issue.
>>
>>106979963
>that many artists
jesus christ
>>
Is there honestly a API that does better genning than local?
>>
>>106980024
Midjourney
>>
>>106980024
NovelAI V4.5
>>
>>106980028
novelai is better and does nsfw if you wana pay pig it
>>
Holy bait, batman
>>
>>106980024
Every single one. Midjourney, Seedream, Sora, NovelAI. Local is behind in all fields
>>
>>
File: wan_22_01912_.webm (754 KB, 480x624)
>>106979086
>>106979047
yes, seems like regular i2v accepts multiple frames, but it still denoises masked, and the masks apply not per frame but per 4-frame chunks or something
the video is 16 (12 masked) first and last frames as copies of one input image
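(that 4-frame chunking lines up with Wan's causal VAE, which keeps frame 0 and compresses each following group of 4 pixel frames into 1 latent frame, hence the usual 4n+1 frame counts; quick arithmetic check:)
[code]
def latent_frames(pixel_frames):
    # Wan's causal VAE: frame 0 stays, every following group of 4
    # pixel frames becomes 1 latent frame (the "4n+1" rule)
    assert (pixel_frames - 1) % 4 == 0
    return (pixel_frames - 1) // 4 + 1

print(latent_frames(81))   # 21
print(latent_frames(165))  # 42
[/code]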
>>
>>106980036
it's a harsh truth but it's true yeah
>>
File: wan_22_01913_.webm (1.05 MB, 480x624)
>>106980042
the first 4 and last 4 frames are copies
>>
>>106980020
you only ever use one artist?
>>
APIs can't do degenerate porn so they automatically lose to Chroma
>>
>>106980064
>muh coom
the only cope of localkeks
>>
>>106980064
Please understand anon. ALL FIELDS. ALL. FIELDS.
>>
>>106980067
>muh cat at mcdonalds
>>
>>106980064
NovelAI can.
>>
>>106980075
how about hitler ads >>>/wsg/6006269
>>
>>106980061
only a handful, not this many though. trying it currently and it does seem netayume benefits from throwing a million artists at the board
>>
File: 1745981783844906.png (1018 KB, 1360x768)
>>106980019
ssr teto has arrived
>>
>>106979963
>prompt weights
>plasma noise snakeoil
>scheduler randomizer
>clip in loras when the TE is fucking gemma
the only interesting thing really is the low CFG (but I guess it goes with the euler pp cfg sampler)
all in all, a bad workflow
just curious, which impact wildcard file are you using? or are you rocking your own?
>>
any process to deslop qwen edit skin slopping
>>
>>106980078
It can't do realism for shit though, so it's irredeemably shit compared to Chroma.
>>
>>106980086
how about coom hitler ads?
concession accepted.
>>
>>106980146
>coom hitler ads?
can wan 2.2 do that? prove it
>>
>>106980113
their whole thing is anime which it is the best at
>>
File: 1738876224753971.png (997 KB, 1360x768)
>>106980091
okay, now it's better.
>>
>>106980113
Chroma can’t do realism either, only blurry meltyslop
>>
>>106980155
You'll have to pay me a sub fee.

https://www.youtube.com/watch?v=2d6A_l8c_x8
>>
>>106980166
>Chroma can’t do realism
skill issue
>>
File: wan_22_01914_.webm (1.24 MB, 480x624)
>>106980053
first 8 frames only
>>
>>106980176
Correct, the baker has a skill issue where the model somehow got worse over time. Perhaps try contacting him about it
>>
File: the sovl is gone.png (1.22 MB, 1080x906)
>>106980166
>Chroma can’t do realism either, only blurry meltyslop
facts, and the skin texture ain't what it used to be
>>
File: 1754416542288547.png (3.25 MB, 3082x1866)
>>106980020
>>106980087
i used to do the same if not more with ilu
>>106980099
wf schizo but outputs kino. it's the same wf i use with noobvpred so much is superfluous and i don't care to clean it up. plasma isn't snakeoil desu. but same wf, picrel is what i see anyway
i'm using my own wildcards
>>
it looks like illustrious is also trying to train a lumina 2 model https://www.illustrious-xl.ai/model/19 fuck their stardust system though
>>
>>106980205
it's beautiful, and some people have the nerve to say that we don't need artist tags, we definitely do
>>
>>106980205
>stable diffusion and midjourney in negatives
huh, guess I should do that
>>
>>106980205
>hiding the spaghetti
FOR SHAME
>>
>>106980205
>he transformed spagghetiUi into a regular Ui
HOW???
>>
>>106980231
>being this new
do you really just look at the ugly spaghetti when gening?
>>
>>106980234
yes? :(
>>
>>106980231
https://github.com/chrisgoringe/cg-controller
>>
>>106980249
based, thanks anon
>>
File: 1747705298616472.png (1.27 MB, 1080x834)
https://civitai.com/models/1134895/2000s-analog-core
Qwen Image is saved!
>>
>>106980266
Welcome back, kino boringreality LoRA.
>>
>>106979916
I wonder how long it took him and with what hardware.
>>
>>106980292
it's on the page
>10 million images. Training was conducted over a period of 3 weeks on 8× NVIDIA B200 GPUs.
>>
>>106980295
so about $30k rounding up a bit for storage
>>
>>106980295
only 3 weeks? impressive, took that furry fag 6 months to finish chroma's finetune with 5 million images
>>
>>106980295
>>106980309
What? The description says "Trained with total ~7k images." and no indication of hardware or time. He does mention some settings though "rank 16 alpha 16, adawm, constant lr 0.0001"
>>
>>106979916
this lora is fucking garbage tho
>>
>>106980313
chroma is 50% bigger (2x as much compute) and again, T5 is fucking trash, also flux was distilled which is another level of fuckery
>>
>>106980315
? ah, I was talking about the actual model, not the lora, I didn't notice
>>
>>106980316
well it's merely an aesthetic tune so
>>
>>106980316
no its not?
>>
>>106980340
it is, crushes details and slops the output. if you don't think so you need to get your fucking eyes checked
>>
>>106980343
show me comparisons
>>
>>
File: tiled.mp4 (3.7 MB, 960x624)
>>106979086
>>106979047
yes, i'm a retard
multiple-image input was here all the time

81 frames from the first gen + 76 (-5 starting) from the second gen
left is both vids with svi_film lora at 1.0

i see no difference
>>
>>106980266
>correct number of strings and pegs
This is impossible for chromakeks btw
>>
>>106980442
if only Qwen Image wasn't so slopped it would be an incredible model
>>
File: 1731307781456788.jpg (790 KB, 1416x2128)
>>
>>106980447
now this is the good stuff
>>
>>106980447
>broken strings
>sovl style

>>106980266
>good strings details
>slopped base model

when will we get both at the same time?
>>
>>106980470
the problem is that getting perfect details like that requires cooking a model until it becomes slopped, that is how it works. It's all about balancing it instead of a light aesthetic tune
>>
>>106980487
>instead of
instead with
qwen gets those small details right so well because they threw a shit ton of alibaba's compute at it until it pretty much fit to images to the point where a prompt just gives that single image
>>
how long til qwen edit but it handles nsfw concepts well
>>
Wansisters, do you think this could work on wan? Wonder if we can use this to set it to a higher cfg to get the movement but still keep sampler cfg to 1. I cant test, not at home https://github.com/Extraltodeus/Skimmed_CFG

>A powerful anti-burn allowing much higher CFG scales for latent diffusion models (for ComfyUI)
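(for reference, a toy numpy illustration of the general anti-burn idea, i.e. capping how far the guided prediction can overshoot the conditional one; NOT Extraltodeus's actual algorithm, just the flavor of trick the readme describes:)
[code]
import numpy as np

def clamped_cfg(eps_uncond, eps_cond, cfg, limit=1.0):
    # normal CFG first, then clamp the overshoot past the conditional
    # prediction so a huge cfg can't burn the image as hard
    eps = eps_uncond + cfg * (eps_cond - eps_uncond)
    overshoot = eps - eps_cond
    return eps_cond + np.clip(overshoot, -limit, limit)
[/code]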
>>
>>106980515
I already saw WFs with that before, it wasn't very good compared to light lora and did not work well with it
>>
>>106980515
Skimming, thresholding, etc. are aesthetic snakeoil.
>>
>>106980515
I'm using NAG to get better prompt understanding on Wan personally (that way I still keep cfg 1 and get that 2x speed increase over cfg > 1)
https://github.com/ChenDarYen/ComfyUI-NAG
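(the gist, as I understand the NAG paper: extrapolate between the positive and negative attention outputs, then renormalize so it stays stable even at cfg 1. a torch sketch using the default-ish values anons quote (11 / 2.5 / 0.25), not the node's actual code:)
[code]
import torch

def nag(z_pos, z_neg, scale=11.0, tau=2.5, alpha=0.25):
    # extrapolate away from the negative-prompt attention output
    z = z_pos + scale * (z_pos - z_neg)
    # normalize: cap the norm ratio against the positive branch
    ratio = z.norm(dim=-1, keepdim=True) / z_pos.norm(dim=-1, keepdim=True)
    z = z * torch.clamp(ratio, max=tau) / ratio
    # blend back toward the positive output for stability
    return alpha * z + (1 - alpha) * z_pos
[/code]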
>>
>>106980515
>literal snake oil
even worse than ''''plasma''' latents
>>
>>106980399
you can do 1girl, shaking ass 161 frames natively
>>
>>106980529
> ComfyUI-NAG
yet another snake oil
>>
>>106980549
wrong, nag actually is great
>>
>>106980529
Yea nag is pretty good, that's in all of my 2.1 workflows
>>
File: 1747390921813263.png (426 KB, 1080x413)
>>106980549
>another snake oil
*Extremely Loud Incorrect Buzzer*
>>
>>106980546
yes, but 561 frames?
what if you need 81 frames of promptA, then 41 frames of promptB, then 201 frames of promptC?

>>106980557
[citation needed]
>>
>>106980559
works great with 2.2 as well, just plug a different nag in for each model. wan2.2's default negatives make a big difference; speaking, talking, moving mouth and the same in chinese are 100% required or characters will always fucking talk
>>
>>106980560
nag for qwen when?
>>
Reminder that comfyui logs your prompts
>>
>>106980588
[citation needed]
>>
>>106980567
so you just paste the default chinese negatives in there? why do you need a separate nag for each model, and do you have any other tips? it always seemed like voodoo to me.
>>
>>106980578
>nag for qwen when?
I'm also surprised it hasn't been implemented at all
>>
>>106980594
he's probably using the lightning loras so he's at cfg 1 (and therefore can't use negative prompts unless he activates NAG)
>>
>>106980562
>[citation needed]
Nta but see >>106978926 I have nag in it, notice there's no slow mo

>>106980567
Interesting, I'll have to give it a test later when I'm home again
>>
>>106980589
>he doesn't know
>>
>>106980597
last I read up on it he was working on making it work for flux nunchaku
>>
File: bruh.png (134 KB, 640x526)
>>106980610
>he was working on making it work for flux nunchaku
>>
>>106980559
>>106980560
>>106980606
i had nag too but then turned it off and found that my gens are better without
>>
>>106980606
yea, always double up your negatives in chinese, I found that works way better; having it just in chinese or english only works half the time
>>
>>106980515
>>106980529
might need to tweak depending on what lightx version and strength are used. And yes, it does follow the prompt much better obviously.

I read about these settings here.
https://civitai.com/models/1889070/camera-tilt-down-undershot

don't use NAG, it's fucking excrement.
>>
>>106980623
the only negative is very slightly slower gens; it only increases quality when you have a negative prompt with stuff like blurry, low res and so on in it. not using NAG if using CFG 1 is just retarded
>>
>>106980635
>not using NAG if using CFG 1 is just retarded
this
>>
File: kek.png (36 KB, 835x275)
didn't know ComfyUI has a channel
https://www.youtube.com/watch?v=JIBba5zZ38k
>>
>>106980647
>literal head of onions in the bottom right
kek
>>
File: 1758717862697768.jpg (58 KB, 500x500)
>don't use NAG, it's fucking excrement
>not using NAG if using CFG 1 is just retarded
>>
>>106980661
it's easily testable, nag only improves quality if using CFG 1
>>
>>106980661
for kontext dev it was really effective, I'm not sure for Wan though, I didn't see much difference
>>
>>106980661
>CFG 1 is just retarded
using that is also fucking retarded because without cfg it won't follow complex prompts. but keep being a fucking idiot like most people here.
>>
>Totally organic anti-Chroma posting and Qwen shilling
>>
>>106980678
have you ever heard of light lora? that is the entire point of it, you must be new as fuck
>>
>>106980678
NAG is relevant nowadays because we all use lightning loras (which put the cfg back at 1)
>>
>>106980656
he's done some wild animatediff vids back in the day, even had a motion lora training tutorial for https://github.com/kijai/ComfyUI-ADMotionDirector. could think of it like training a wan video lora but instead it's 1 video and it's for animatediff lol

>>106980656
kek
>>
File: wan22_ext_00003_.webm (2.77 MB, 480x624)
wan2.1 with stable video infinity

so it could be:
lightx2v
vae
some work under the hood required
yet another snake oil from china
>>
>>106980686
yeah i have
>>
>>106980706
huh? besides the low quality it seems to be doing what is advertised
>>
>>106980707
what value do you put in the skimmed cfg node?
>>
>>106980707
>5 strength, 4 cfg
jesus, your shit must be burnt as fuck, also the old 2.1 light lora sucks now compared to the newest one, especially for 2d animations

Use the latest one at 1.2 weight, 2 + 2 or 3 + 3 steps, also use the triple k sampler so you actually use correct time steps for 2.2
>>
>>106980722
>skimmed cfg node
i don't use that for wan.
>>
>>106980723
https://github.com/VraethrDalkr/ComfyUI-TripleKSampler
>>
>>106980711
no it doesn't
compare frames from the middle of both videos
you can't make long vids with such quality drop
>>
>>106980706
>motion snap
this is worse than just using the last frame to gen a new video and stitching
>>
>>106980688
What are the settings? 11, 0.25, 2.5?
>>
>>106980744
works in vace with 8-15 frames tho
>>
Wow! What an amazing upscaler! I totally must check this out!
>>
Oh I'm gonna get real sloppy later with

>https://github.com/stduhpf/ComfyUI-WanMoeKSampler
>https://github.com/ChenDarYen/ComfyUI-NAG
>https://huggingface.co/Kijai/WanVideo_comfy/tree/main/Lightx2v
>https://github.com/Extraltodeus/Skimmed_CFG
>https://huggingface.co/vita-video-gen/svi-model/tree/main/version-1.0
>>
>>106980723
>>106980627
i don't care or give a fuck, you can see examples of videos using those settings by the person who created this lora. Also, on the new light lora I've read mixed opinions and i'm not going to waste my fucking time until someone else figures out the best settings. I tweaked the settings based on what that guy said and i was more than happy because it follows prompt. Otherwise i'd not even bother using the light loras because again 1 CFG is fucking shit and i don't care what you or anyone has to say about it.
>>
>>106980758
>no smoothmix
>>
>>106980758
>SlopMaxxxing
based!
>>
$17m must be enough for hiring one or two shills from india
>>
File: 1748232789041082.png (103 KB, 564x966)
>>106980747
for wan I went for those values
>>
>>106980765
That's what https://huggingface.co/Phr00t/WAN2.2-14B-Rapid-AllInOne is for when I wanna get reeeeally sloppy, oh yeah
>>
>>106980758
don't use skimmed cfg, it's useless
>>
>>106980758
also it looks like kijai does not have the latest light lora, use this https://civitai.com/models/1585622
>>
>>106980777
for wan it is yeah, but i keep it for when i really wanna force shit and then process in controlnet with another sampler.
>>
>>106980765
>no gguf
>>
>>106978641
>https://huggingface.co/lightx2v/Wan2.2-Distill-Loras/tree/main
>lora key not loaded:
what's their fucking problem? why can't they make their shit compatible with ComfyUi? it's always the same shit with them
>>
>>106980828
just wait for kijai
>>
>>106980828
https://huggingface.co/lightx2v/Wan2.2-Distill-Loras/discussions/6#68f91663a3f9e27c0ee46820
KJBoss says you can ignore those warnings
>>
File: ComfyUI_temp_xoulu_00001_.jpg (1.39 MB, 1600x1280)
>>
File: 00034-1810046425.png (1.76 MB, 1792x1024)
>>
>>106980828
it works fine, it just has extra keys that comfy does not use for modulation
>>
>>106980892
oh yeah bro so SOUL, the grain is so AUTHENTIC AND ANALOG bro, can't wait for you to post your other gens oh yeah damn THAT treeline looks so natural and REAL omg bro I think you hit the jackpot with your gen techniques, I love chroma so much bro POST MORE
>>
>>106980892
>noise, the model
>>
what's hunyuan good for
>>
>>106980899
>>106980902
Post your realistic car in forest gen
>>
>>106980892
where's the girl?
>>
they deleted smash cut lora
>>
>>106980913
why would I dedicate my GPU time to gen garbage? fucking retard kys with your shitty gens
>>
>>106980916
jej
>>
>>106980923
I accept your concession
>>
>>106980916
>smash cut lora
what's that?
>>
>>106980931
do you seriously believe that garbage gen is good? get your eyes checked, I pity you
>>
File: 1749450930214532.mp4 (3.64 MB, 624x848)
>>106978641
looks like an improvement over the previous I2V lightning loras
>>
smoothmix with new light loras when
>>
>>106980951
create your own smoothbrainmix, just merge a general NSFW lora + the new lightning. Or are you this incapable?
>>
>>106980936
See >>106980931
>>
>>106980914
bound and gagged in the trunk
>>
>>106980951
https://civitai.com/models/1995784?modelVersionId=2323420
it's like a finetune of wan?
>>
>>106980971
no it's a jeetmix
>>
File: 00037-4224779138.png (2.11 MB, 1792x1024)
>>
>>106980024
Restricting the user.
>>
File: AnimateDiff_00001-1.mp4 (3.18 MB, 314x412)
>insert tesla quote
>>
File: 1759370439494010.mp4 (3.66 MB, 928x576)
>>106980945
>Hatsune Miku appears on screen from the left and shakes hands with the blue-haired anime girl
you know what's sad is that Wan is able to add a new character to the scene while keeping the same artistic style, whereas Qwen Image Edit can't
>>
>>106980994
butiful
>>
File: file.png (2.43 MB, 1328x1328)
>>
>>106981005
keeek
>>
>>106981016
>>106981016
>>
File: 00047-2044473451.png (1.95 MB, 1792x1024)
>>
>>106981017
lmao bro, this is pathetic
>>
>>106980527
>Skimming snakeoil
it's really good for composition on hard-to-achieve poses, since it can give a somewhat good outline at something like 60 CFG. then you can use the model as normal in a second-stage sampler: send the burnt image in as a depth map to controlnet and use a fresh empty latent so as not to transfer any colour. Saves time and frustration with messing about with prompts, and it can also produce interesting results at lower strength.

Another set of nodes i use does something slightly different, pre_cfg_comfy_nodes_for_comfyui, and there are also cfg_pp versions of samplers that work at low cfg, typically 1 cfg. So all is not necessarily lost, don't be so quick to dismiss things you have no fucking clue about.
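(the two-stage trick from the first paragraph as a sketch; stage(), depth_map() and empty_latent() are hypothetical stand-ins for a sampler pass, a depth preprocessor and a blank latent, none of this is real ComfyUI API:)
[code]
# hypothetical helpers, shown only to illustrate the pipeline order
burnt = stage(prompt, cfg=60, skimmed=True)      # burnt but well-composed
control = depth_map(burnt)                       # burnt images still depth-map fine
final = stage(prompt, cfg=5, controlnet=control,
              latent=empty_latent())             # fresh latent, no burnt colour carried over
[/code]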
>>
File: 00058-1971371768.png (1.54 MB, 1792x1024)
>>
File: 00057-1971371767.png (2.28 MB, 1792x1024)
this is for the guy who, i assume, came from here and wanted 300 of these. here's the first.
>>
File: radiance.png (3.42 MB, 864x1488)
>>
File: 00071-351438899.png (1.88 MB, 1792x1024)
adios, goodbye



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.