/g/ - Technology

Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106727118

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
hunyuan image 3.0 on comfyui (fast generation with comfy cloud) when?
>>
File: l1k0tiem.webm (509 KB, 748x914)
>>106730007
blessed thread of friendly-zoned friend-zoned frens ;3
>>
>>106729994
yeah it's not good either like I said, I think it's arguably at least better resolved than the Hunyuan 3.0 image though which had grainy bullshit lines everywhere. And again 2048x2048 instead of 1024x1024, with better text.
>>
1000 more years of sdxl
>>
>>106730007
/r/ing the omelette woman getting chopped up or flipped with a spatula or something
>>
how's the pose where the girl is doing the split with her ass up in the air?
>>
>>106730053
SDXL is all this general needs, all you guys post is 1girl slop so you don't need anything else.
>>
File: 1742186297792786.mp4 (2.93 MB, 720x480)
wake me when 8gb can do good videos
>>
>>106730079
>pircel
what's the source?
>>
>>106730069
spread legs, top-down bottom-up?
>>
>>106730074
Um, are you even looking at the OP image? That's clearly 1wizard,1naziloli,1gothlolita,1fairy,2fatmen,1omelettedufrumage,1negress. Retard.
>>
>>106730088
https://x.com/kirkinator_sol
>>
>>106730079
wan2.2 can do THIS?
>>
by the way, anon whos making the omelette girls, whats your prompts? I wanna try em in various models.

>also itll get me closer to recreating that vid of the pancake girl getting syrup jizzed on
>>
>>106730102
no idea, it's a cryptobro account so they probably have a rack of H100s powered with fissile material
>>
File: optimizedSERENEBEAN.gif (3.88 MB, 271x480)
>>106730079
why is it SO long ;_;
my sides hahahah
>>106730074
>1girl
YUSS!
>>
>>106730079
and they say ai cant be kino
>>
File: ChromaDC-2K_00036_.jpg (796 KB, 1408x1952)
>>
Is 1k imgs overkill for an XL style LoRA?
>>
>>106730079
I recognized Hulk Hogan... how terminally online do you have to be to recognize the rest? Like teenage girl tier? lol
>>
>>106730094
I guess yes, lol
>>
>>106730138
it's maybe overkill but it definitely won't hurt as long as the images are varied and good quality
>>
>>106730172
>>106730094
"jack o pose"
>>
File: ChromaDC-2K_00038_.jpg (848 KB, 1408x1952)
>>
File: 1734551871173993.mp4 (521 KB, 832x560)
>>106730145
Teenage girls are on the streets protesting fascism now, not sitting on the internet like us
>>
File: file.png (27 KB, 925x133)
incredibly based
>>
>>106730231
i love him
>>
>>106730231
>>106730231
I really like my speculative dev! By the way, what are his thoughts on SDXL? People will never rent GPUs in ComfyCloud if there's a lightweight model, it doesn't make sense to keep supporting it.
>>
>>106730231
Imagine your model being so bad that it's not even worth bankrolling it lol.
>>
>>106730208
cool crown
>>
File: image(3).png (1.72 MB, 1024x1024)
I post some more outputs of Hunyuan Image 3.0. They are all terribly underwhelming for 80b, but maybe I'm doing it wrong. The prompts are all very simple. They take 15-20 minutes each with RTX 6000 and 192GB RAM. SDPA because flash attention isn't working properly.

Generated with the stock gradio UI and system prompt:
>You are an advanced AI text-to-image generation system. Given a detailed text prompt, your task is to create a high-quality, visually compelling image that accurately represents the described scene, characters, or objects. Pay careful attention to style, color, lighting, perspective, and any specific instructions provided.

Prompt for this one is
>Please provide a map of lower Manhattan
>>
File: image(6).png (2.01 MB, 1024x1024)
>>106730288
>Please provide an illustration showing the interior of an Island 3-type O'Neill cylinder. The perspective is from near the central axis looking outwards along the length of the volume. Cities, towns, farms, forest and other biomes cover the interior surface not taken up by the long window panels. The reflective mirrors are partially visible outside through the windows. The apparent time of day is the golden hour. The illustration style is a detailed goache painting with high contrast.
>>
File: 00539-3103169474.jpg (966 KB, 2048x2688)
Taking the LLM approach to image gen is fucking retarded and I said this during the SD3 days. The problem is in most cases they make these models inefficient to put you into the API ecosystem while providing nothing on par with modern day tools that ironically are built by the community.
SD would have never taken off if it was inaccessible by mid tier hardware on the consumer level.
>>
>>106730288
>not even a map
lol
>>
File: omelette.webm (3.72 MB, 1088x832)
>>106730057
>>106730079
here's your 8gb video
>>
>>106730293
oops wrong one
This is
>Please show me a silk-screen print of a space marine firing their weapon upwards in triumph. Only two inks are used. The print is on white paper.
>>
File: image(4).png (2.05 MB, 1024x1024)
>>106730293
>>
>>106730138
I've done 1000 image generalist detailer loras before, not so much style ones. That said it won't hurt if your images show a good variety of different things in the particular style you're going for, the more the better really.
>>
>>106730288
no yeah it's not good, i dunno how it even got approval for release, they could have just iterated on Hunyuan Image 2.1 moreso and probably have gotten something better
>>
>>106730334
They are doing the same short sighted shit that killed SD and in turn damaged flux because the retard devs took the same philosophy. I'm pretty sure BFL is underwater financially because there is zero reason to pay for their service.
>>
>>106730319
>generalist detailer loras
whats the training / dataset look like for those if you dont mind my asking
i used to use that detailer lora for 1.5 but never since so im moreso curious than anything
>>
>>106730301
I presume this would require learning several complex functions which are beyond the basic knowledge of a humble swarmUI user such as myself...
>>
>>106730345
nope, just download the Pinokio app, it is basically plug and play.
>>
>max queue size in comfy is 100
lol
>>
>>106730365
until it fails to download packages like huggingface and gets stuck in a download loop lmao
>>
File: ComfyUI_07166_.png (2.06 MB, 1152x1152)
>>106729758
So this is the power of a 4o slop dataset... At this point I'm not surprised there isn't always a catch to a Chink model we got for free, and 80B really was too good to be true unfortunately.
>>
>>106730374
maybe max batch size to add to the queue, the queue itself doesnt have a limit
>>
>>106730365
I installed that months ago and got stuck somewhere along the process of getting it to work.
But i'll try it again, thanks!
>>
File: elf-bride-unicorn-cartoon.jpg (1.14 MB, 2344x1336)
>>106730019
Well I am just saying that based on what they claim. All I know is that when I do latent upscales with 0.6+ denoising at 4 megapixel with Illustrious, I get a lot less decoherence than I do with, say, Pony or another SDXL model. I've also done some experimenting with base 1.5 megapixel generations and it works fine.
>>
>>106730231
he's not wrong, that's really a useless model
>>
anyone try local music generation? quickly searching there is one called YuE and one called ACE-Step
>>
>>106730343
IDK, I think Kontext Max and Flux Pro Ultra (the one that does like 4K outputs by default) probably get decent use API wise by different third party inference providers
>>
File: 09432548574345.jpg (376 KB, 1380x1427)
The frustrating bit about Hunyuan Image 3.0 is that the whole point of upping the parameter count is to capture more specifics about the world, pop culture, etc. Look how they post-trained it. It's just sad how benchmarks kill creativity.
>>
File: kekked333.jpg (89 KB, 906x592)
>>106730376
just reload and be patient
the files are in the 10s of GB it takes a few hours but it does work :p


>WELCOME 2 THA SALTY SPITO0N HOW AUTiSTiC AREE YA????
>>
>>106730440
>The frustrating bit about Hunyuan Image 3.0 is that the whole point of upping the parameter count is to capture more specifics about the world, pop culture, etc. Look how they post-trained it. It's just sad how benchmarks kill creativity.
wait they used SRPO and it still looks like slop? it's over..
>>
>>106730418
YuE is semi good, quality very comparable to early Suno. ACE Step has the best instruments and is the fastest, but it needs an update for proper voice, which is hit or miss, mostly miss (they are currently cooking version 1.5 to address this). Songbloom is the best for voice quality, but right now its variety sucks, it does not support instructions, and they changed the license to non commercial. None of the models are a match for Udio or ElevenLabs for music.
>>
>>106730458
thanks ill try ace, wanted to do some background music anyways
>>
File: 1747491181383949.png (1.04 MB, 824x1264)
kek, qwen edit v2 (2509) can use openpose/depth as image references (for image2, etc).

replace the man in the blue police uniform with Miku Hatsune using the same pose as image2.

it didnt change the face but this might be even more hilarious, lmao
>>
File: 1729315784313444.png (200 KB, 1024x1561)
>>106730493
and the original openpose for image2 from an aio aux preprocessor -> dwpreprocessor node:
>>
File: 1754395378965256.png (1.06 MB, 824x1264)
>>106730498
there. second iteration did it. Good job Miku.
>>
>>106730493
>>106730498
>>106730504
kek
>>
>>106730345
>>106730383
btw for this video i used wan 5b and 15 steps. took a little more than 4 minutes for 1 second
>>
File: 1736929077521056.jpg (514 KB, 1024x1504)
What is happening when you give negative weight to positive prompt tokens?
>>
>>106730529
>What is happening when you give negative weight to positive prompt tokens?
You go to hell before you die!
>>
File: 1732780749614033.png (67 KB, 1813x1024)
from this...
>>
File: 1745914622071272.png (700 KB, 848x1224)
>>106730555
to this

replace the anime girl in image1 with the anime girl in image1 that is pointing, using the same pose as image2. the background is white.
>>
>>106730563
prompt seems redundant but I had double 2bs until I said replace.
>>
File: 1734307971474595.png (355 KB, 1024x1024)
>>106730563
>>
>>106730555
>>106730563
Impressive. Very nice.
>>
>>106730529
i think it's the same as if they're positively weighted in the negative prompt?
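
for what it's worth, here is a toy sketch of why the two might differ. it assumes one common weighting scheme where a token's embedding is scaled by its offset from the empty-prompt embedding; whether your UI does exactly this is an assumption, and the vectors are made-up numbers, not real encoder output.

```python
# Sketch only: assumes weighting is applied as an offset from the
# empty-prompt embedding. The vectors below are hypothetical toy values.

def apply_weight(token_emb, empty_emb, weight):
    """Return empty + weight * (token - empty), element by element."""
    return [e + weight * (t - e) for t, e in zip(token_emb, empty_emb)]

empty = [0.0, 1.0]   # hypothetical embedding of the empty prompt
token = [2.0, 3.0]   # hypothetical embedding of some token

same = apply_weight(token, empty, 1.0)      # weight 1 leaves the token unchanged
neutral = apply_weight(token, empty, 0.0)   # weight 0 collapses to the empty prompt
flipped = apply_weight(token, empty, -1.0)  # negative weight mirrors through empty
```

under that assumption a negative weight pushes the conditioning itself to a point on the far side of the empty prompt, while a negative prompt replaces the unconditional branch that CFG steers away from, so the two would not be equivalent.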
>>
>>106730583
nop
>>
File: ChromaDC-4K_00041_.jpg (975 KB, 1632x1848)
>>106730440
All these reward hacks should be loras or alternative models.
>>
File: image(7).png (1.79 MB, 1024x1024)
>>106730288
>Portrait illustration of a young woman. Media: quill pen with india ink. Highly calligraphic line work.

It still has the old problem of mistaking the medium requested with the content. Also we're back to spider web / crosshatching for all fine detail.
>>
>>106730440
It is amazing how few people in this space understood why 1.5 became so dominant. Seed gacha is the price to pay for having a model with genuine creativity, which is lacking even in modern SDXL descendants and many other new models.
>>
>>106730440
>The final refinement is achieved through the application of SRPO
Why would it be a refinement step? Isn't the whole point of using SRPO to be able to use it in training over GRPO? Unless you trained extensively with it, there's no way SRPO can get rid of any slop that came before it.
>>
why did the hype for radial att die? is it worthless for wan2.2?
>>
File: ComfyUI_07171_.png (1.9 MB, 1152x1152)
>>106729921
The issue is that their smaller model that they're scaling sucks (which I suspect is because they're using synthetic data). If you stack more layers on top of a good model then that doesn't happen.
>>
>>106730643
CFG peaked with CFG
Attention peaked with sage attention.
It literally cannot get better.

Source? My dad works at stable diffusion.
>>
tell your dad to put the busty asian sluts back into the dataset for SD4
>>
File: WanVideo2_1_T2V_00214.mp4 (1.38 MB, 1872x1088)
Sure are a lot of no through signs on this street.
>>
I can't face swap using Qwen Edit Image 2509 (+ lightning lora) without first superimposing the face on the target area using an image editor. "Put image1 face on image2 face", or any other prompt, doesn't work. I guess they censored it. Also, with the image editor trick, I then fail to turn her face around so it looks realistic, like "tilt her head to the right" doesn't fucking work.
They trained it to generate e-commerce images, didn't they?
>>
File: 1738995891855565.png (1.74 MB, 1440x1120)
>>106730609
absolutely awful.
comfy should not support this model, nobody should.
>>
>>106730678
I said stable diffusion, not stability AI.
>>
File: pip.jpg (1.81 MB, 1664x2432)
>>
File: 00058-695067368.png (3.08 MB, 1344x1728)
>>
>>106730691
>80b for this
>>
>>106730609
>>106730691
looks like the propaganda blasts louder tonight as punishment. very very shamefur.
>>
File: 1739100289132757.png (1.1 MB, 864x1208)
the blonde man holding the glass in image1 is shaking hands with the anime girl in image2. keep his facial appearance the same. keep the background the same.

changed both their poses and combined it with just 2 nodes. qwen edit v2 is neat stuff.
>>
>>106730007
I havent kept up with local genning for a while, there's so much new shit... Got a workflow from a Chroma image on civit and I've not seen the majority of these nodes before.
>>
File: ComfyUI_18655.png (2.9 MB, 1152x1728)
>>106730294
The problem is LLMs can't be creative or controlled enough to come up with a scene in their "mind" and turn that into a usable prompt (it's always too busy and unfocused). Using vision-enabled LLMs works infinitely better though because they use that image as context to build out a prompt. I think this is most evident with Kontext/Qwen edit models.
>>
File: 1757986634925895.png (1 MB, 864x1208)
>>106730744
>>
File: file.png (18 KB, 1208x386)
>>106730374
>>
>>106730458
>None of the models are a match for Udio or ElevenLabs for music.
That's also my experience, udio is really good (didn't test elevenlabs).
>>
>They'll be calling me promplet for the last time
dios mio https://civitai.com/images/91532907
>>
File: 1743559860765947.png (1.05 MB, 1360x768)
>>106730764
>>
>>106730710
very cute
>>
File: 1755881485280594.png (25 KB, 596x122)
>>106730779
>>
File: 1729056010550420.png (1.81 MB, 1440x1120)
qwen is fucking weird sometimes
>>
File: 1728890437273406.png (1.05 MB, 1360x768)
>>106730783
very nice to meet you
>>
>>106730294
Okay I need more of that trap on the left
>>
>>106730686
It works when I ask to "put image2 next to image1", but after generating 8 images, I can see the model has 0 creativity. They are almost identical.
>>
File: 1755048711664409.png (1.04 MB, 1360x768)
>>106730795
>>
File: ChromaDC-2K_00047_.jpg (813 KB, 1632x1848)
>>
any news of wan2.5 being local yet
>>
>>106730838
No and it never will be. I can't believe people are still so delulu over this. "Maybe" means "No" in China.
>>
>>106730838
wan is their video equivalent to qwen-max which they also never open sourced
>>
File: 1734776298511440.png (1.11 MB, 1088x952)
the man in image1 is shaking hands with the man in image2. keep both men's facial appearance the same. keep the background the same.

if you say to keep their appearance the same it generally works well. good one, mr fors
>>
File: 1756055072437960.png (1.09 MB, 1088x952)
>>106730853
this one is basically 1:1 with the source.
>>
>>106730845
so it really is over for local video gens. oh well. this gives people plenty of time to make finetunes/loras and optimize the hell out of like SDXL.
>>
>>106730853
jesus so we've finally reached the peak of a.i meme capabilities huh? we can finally just generate ANYONE/ANY character shaking hands with our buddy spitler. Awesome.
>now make hitler shaking hands with hitler
>>
File: 00060-1192150331.png (3.32 MB, 1824x1248)
>>
>>106730864
>>106730853
The obvious use case here is to edit people into candid photos with Jefferey Epstein.
>>
>>106730873
also you can swap any outfit onto any character with a cropped cutout of the outfit. tons of possibilities. it's like a swiss army knife for an AI model.
>>
>>106730294
>Taking the LLM approach to image gen is fucking retarded and I said this during the SD3 days.

GPT Image disproves this. The problem will always be two things with local models, and these two things only. Censorship and data.

Another anon said it. The best local model we ever got as a base was 1.4 and 1.5. Why? Pretrained on a crap ton of data, and mostly uncensored.

After the lawsuits started rolling in, we got SD 2.0 and it's been downhill from there while companies like OpenAI and Google can shield themselves with the best lawyers.
>>
>>106730881
yeah, and normies are too stupid to know the difference, im surprised they dont use AI to lie when they lie all the time on the news.
>>
does chroma flash always give artifacts? i'm using heun 8 steps and at 1024 it looks like the image was jpegified
>>
File: WanVideo2_1_T2V_00215.mp4 (3.17 MB, 1872x1088)
>>
File: 1730929355101994.png (939 KB, 824x1264)
replace the man with the blue police uniform in image1 with the man with the moustache in image2.

image2: random hitler pic

kek, gave me DAS POLICE
>>
>>106730294
repurposing (newer, architecturally superior to T5 without even necessarily having as many params) LLMs as straightforward text encoders seems to still be a decent approach though. The fact that e.g. Gemma 3 has a context length of 8192 vs the 512 of T5-XXL is sometimes noticeable on NetaYume for example.
>>
>>106730913
I think this video single handedly proves that you should avoid RIFE like the plague.
>>
>>
>>106730917
(samefag) TLDR a well trained 2B - 6B model attached to the most modern text encoder available that strikes the best possible balance between size and performance itself is basically the "ideal" thing.
>>
Excuse me, how do I block >>106730932 this user?
>>
>learning how to use flux for the first time
>combined a 2d style lora with a photoreal prompt into a subject that'd be normally impossible in xl
i get it. i completely fucking get the hype now. its like a dopamine rush i haven't felt since like 2023. hooooooly.
>>
>>106730932
RIFE has no impact on quality.
>>
>>106730950
>learning how to use flux
Just type in the thing you want to see nibba
>>
>>106730686
And I can't seem to transfer style from one image to another. Simply asking to turn an image into a drawing can sometimes work (although it's shit at anime style), but this is definitely not a use case. I think that my final verdict is that this model is garbage. Thanks China.
>>
god i wish comfyui had nodes that would let you loop through a series of prompts or something
>>
>>106730991
>Just type in the thing you want to see nibba
coming from tagslop/mixture of natural language, no it wasn't that easy.
>>
stagnant thread of stagnant tech
>>
File: ComfyUI_03544_.mp4 (928 KB, 1144x912)
>>
I'm just glad we're past the stage where anything less than glazing chroma would result in an autistic freakout 12 hour spree of posting warped asian feet.
>>
moving thread of advancing tech
>>
File: 1750902125954352.png (879 KB, 1088x952)
the man is giving a speech at the Reichstag in Germany. He is pointing at a large video screen behind him says "RTX 5090: $4000", and has a picture of image2 behind the text. Keep the man's expression and facial appearance the same. The image is black and white.
>>
>>106731061
pictured: jensen huang
>>
>>106731061
Hi, I'm new.
What are you using that allows prompting like that?
>>
File: 1746659894613185.png (1.17 MB, 1088x952)
>>106731061
>>
>>106731081
Nano banana diffused via API nodes.
>>
>>106731061
>>106731072
i like where this is going. continue.
>>
>>106731081
thats qwen image edit v2 (2509). can use multiple image nodes, thats just with a person and a gpu as reference (2 nodes).
>>
File: 1747062395677552.png (1.08 MB, 1088x952)
the man is giving a speech at the Reichstag in Germany. He is holding an Nvidia GPU in his hand with the appearance of the GPU in image2. Keep the man's expression and facial appearance the same. The image is black and white. At the bottom of the image is subtitles that say "Jensen is a tyrant for these prices!"

pretty cool how versatile qwen edit is.
>>
>>106731106
its sad it doesn't make images match in quality and style, as it just looks like a shitty photoshop job
>>
File: 1729438394462714.png (1.28 MB, 1088x952)
the man is giving a speech at the Reichstag in Germany. He is holding an Nvidia GPU high in the air with his hand which has the appearance of the GPU in image2. Keep the man's expression and facial appearance the same. The image is black and white. At the bottom of the image is subtitles that say "No more Jensen holding VRAM hostage!"

he has a point
>>
>>106731087
thanks fren

>>106731094
this better not be a ruse
>>
File: 1731797052967450.png (1.18 MB, 1176x888)
>>106731123
nah, it's fun. you can do basically anything and manipulate images to do almost anything.
>>
File: 1743244626010425.png (1.27 MB, 1088x952)
>>106731119
>>
>>106730997
I finally got something, with >>106730529 as a reference image.
>Change the style of image1 to an (anime style:1.25) similar to image2. Thoroughly change anything, don't keep details, I want an anime style. Keep the pose of the woman in image1.
No neg prompt. It failed with >>106730563
>>
File: 00108-3454568152.png (2.97 MB, 1824x1248)
>>106729065
>>106729967
could i get catbox and model name/links for these two stocking gens. They look way too good.
>>
File: ComfyUI_temp_ndrdv_00002_.png (1.73 MB, 1280x1280)
>>
File: 179.jpg (1.21 MB, 1600x1952)
>>106731005
There are various nodes for dynamic prompting, but that can only get so elaborate depending on what you want to do.
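
If nodes don't cut it, another route is to script the loop from outside. A minimal sketch, assuming a workflow exported in API format, a CLIPTextEncode-style text node, and a local server on the default port; the node id "6", the one-node template, and the helper names are hypothetical placeholders, not anything from a specific workflow.

```python
import copy
import json
import urllib.request


def build_payloads(workflow, node_id, prompts):
    """Return one API payload per prompt, each with the text node overwritten."""
    payloads = []
    for text in prompts:
        wf = copy.deepcopy(workflow)          # never mutate the template
        wf[node_id]["inputs"]["text"] = text  # assumes a CLIPTextEncode-style node
        payloads.append({"prompt": wf})
    return payloads


def queue_all(payloads, url="http://127.0.0.1:8188/prompt"):
    """POST each payload to a running ComfyUI server (the URL is an assumption)."""
    for payload in payloads:
        req = urllib.request.Request(
            url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req).close()


# Hypothetical one-node template; a real one comes from exporting your workflow.
template = {"6": {"class_type": "CLIPTextEncode", "inputs": {"text": "", "clip": ["4", 1]}}}
payloads = build_payloads(template, "6", ["a red fox", "a blue heron"])
# queue_all(payloads)  # uncomment with a server running
```

Building the payloads separately from sending them means you can inspect what will be queued before anything hits the server.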
>>
>>106730917
Are the neta anime models good yet?
>>
Can I get an amen if you'd rather see Mayli than plastic sluts in bikinis?
>>
>>106731183
this post explains how i managed the realism >>106729199
and sure (the face in the 2nd quote is custom thoughhowever but you might be able to achieve it with one of the civitai stocking loras) https://files.catbox.moe/wjh50f.png
>>
File: 1737939237308727.png (788 KB, 1360x768)
the man is at a press conference with a large video screen with an image of image2 on it. He is pointing to the screen. Below the GPU is the text "Only $8000, you dumb fucks". keep his appearance, pose, expression, and facial structure the same.
>>
File: still not enough vram.png (98 KB, 1137x483)
>>106730301
>>106730079
>>106730345
just buy more vram anon
>>
File: ComfyUI_07174_.png (1.81 MB, 1152x1152)
>>106730899
>does chroma flash always give artifacts? i'm using heun 8 steps and at 1024 it looks like the image was jpegified

No?
>>106730657
>>106730380
These are all made with Chroma Flash anon. Likely one of your settings is wrong.
>>
File: ChromaDC-2K_00056_.jpg (664 KB, 1632x1848)
Simon Bisley getting there
>>
File: ComfyUI_temp_qvidr_00001_.png (2.19 MB, 1152x1728)
>>
plebbitors are claiming you can legit train character/person loras in flux at 512x and get good results, chat is this true or are they r/tarded?
>>
File: 1727965967276350.png (833 KB, 1024x1024)
neat, I can make super quick awoos now.
>>
>>106731266
what settings are you using? i've been using the rank 32 lora from here https://huggingface.co/silveroxides/Chroma-LoRAs/tree/main/HD-Flash at 1.00 strength with heun/beta 8steps. artifacts don't seem to go away in either the fp8 or full hd model
>>
File: 1750502353560332.png (906 KB, 1024x1024)
>>106731296
Change her appearance to look like Miku Hatsune, and give her teal colored eyes.
>>
im trying out ace_step but its fucked up. like most of the time the song cant even keep a beat
>>
File: 1736647040460219.jpg (242 KB, 1824x1248)
>>106731296
>>106731306
The future is now
>>
>>106731324
it's such a neat tool, doing all these edits manually with inpainting or with controlnets would take so long, those are still good tools but this makes things way faster, and you can use this with wan, qwen/flux, noob/illustrious, whatever.

take gens from those and manipulate it with the edit model.
>>
>>106731306
I'll say it again. I loathe how you call her Miku Hatsune and not Hatsune Miku. I think it actually gives me autism.
>>
File: 1737088749418597.png (872 KB, 928x1120)
the man is holding an open pizza box with one hand and is holding a pizza slice with the other hand. keep his appearance the same.
>>
File: 1752925681276275.jpg (114 KB, 1536x1536)
>>106731330
It requires (un)comfy UI though, or does it?
>>
>>106731343
once you have a good workflow it's very comfy, you just gen. save + export the workflow and it will always be there even if something screws up.
>>
thanks, do you have a link to reality stone upscaler he used?
>>
>>106731339
could you gen him holding up a snickers bar?
>>
>>106731380
sorry that's a scaler i was given a while ago. dunno if he'd want me sharing it (if he did he'd probably post it to civitai himself kek)
>>
File: bunny2.webm (305 KB, 872x1006)
>>106731027
bruh ;_;
>>
File: ComfyUI_07178_.png (1.72 MB, 1152x1152)
>>106731029
I one shot these now with Flash anon kun. I'm not even trying as hard with feet pics anymore. They are so easy, and they come out flawless 99% of the time.
>>
File: 1735183073564256.png (789 KB, 928x1120)
the man is holding a TV with image2 displayed on it. keep his expression the same.

what a shame
>>
>flawless
>she has two left feet
>>
File: ChromaDC-2K_00059_.jpg (942 KB, 1408x2064)
>>
File: 1754440767121452.png (760 KB, 928x1120)
>>106731382
the man is holding a candy bar in his right hand with the appearance of image2 on it. keep his expression the same.

just needed to grab a snickers bar as an image reference for the other node and...it more or less works
>>
>>106731410
I assume it was subtle ironic humor.
>>
>>106731387
wtf i can't find it anywhere. please share it. just litter catbox the upscaler. Didn't even know there were private upscalers people kept to themselves.
>>
File: 00037-1149459993.jpg (77 KB, 450x450)
>>106730007
>posting my Chroma gens in the inclusive collage.
this lightweight model has so much potential...
>>
File: ComfyUI_07181_.png (1.72 MB, 1152x1152)
>>106731410
Rare case, easy fix
>>
File: 1756649930109896.png (736 KB, 928x1120)
>>106731432
the man is holding a chocolate candy bar in his right hand with the appearance of image2 on the wrapper. keep his expression the same.

almost, the image source is kinda potato so it's gonna be a bit off.
>>
File: 00048-981546305.jpg (67 KB, 450x375)
No tag bleeding
Different character body type handling
Handles various body types properly
>>
>>106731432
this is all i ever wanted. thank you.

>>106731447
eh alright fuck it why not. in fairness i use it with like every single one of my sdxl based realism images so i may as well.
https://files.catbox.moe/2nolmf.pth
>>
File: 1751453819410162.mp4 (2.89 MB, 816x1280)
>>106731471
Me on the right
>>
File: 00064-2846957914.jpg (67 KB, 450x450)
Inclusive model. Qwen or WAN only makes rigid body types aesthetically. But Chroma has more variety.
>>
File: 1729779632422501.png (756 KB, 928x1120)
okay this is as good as it will get with the low fidelity of the original image.

qwen edit is pretty neat though. just use image2 or image3 instead of image stitch jank.
>>
>>106731484
>arch linux
>>
File: 00003-1289225036.png (2.62 MB, 1824x1248)
>>106731482
thanks a thousand anon. The upscaler and fp16 vae seem to have made a difference between this gen and my older gens. :) really appreciate it.
>>
File: 1743727117830.png (2.87 MB, 1200x1600)
>>106731286
This is from a 512px dataset (pre-cropped). The only downsides are generally softer output and a more unreliable likeness at distance. It can absolutely get the job done though.
>>
Feeder fetish for women is like meh, I get it. But feeder fetish for men makes no sense to me.
>>
After a day of testing I believe Forge Neo is worse than Forge old.
>>
>>106731569
after a minute of thinking I believe your post is unhelpful
>>
Day 36 of no good fast chroma solution that doesn't fuck up my images
>>
File: 1000024082.png (201 KB, 450x321)
>>106731549
Now?
>>
>>106731534
Awesome, have fun.
>>
File: WanVideo2_1_T2V_00216.mp4 (676 KB, 1872x1088)
The new 2.2 LoRA really does have a tendency to produce over saturated marvel slop.
>>
>>106731613
try the 2.1 lora for high at 3 strength, and 2.2 lora for low at 1 strength. idk why but 2.2 high can mess with motion and results a bit.
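
For anyone wondering what the strength number does, here is a rough sketch. It assumes the usual LoRA formulation, where a low-rank update `up @ down` is scaled by the strength and added onto the base weight. The helper names and the tiny matrices are made up for illustration; this is not how ComfyUI or the Wan loaders implement it.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, up, down, strength=1.0):
    """Return weight + strength * (up @ down). Hypothetical helper, illustration only."""
    delta = matmul(up, down)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 layer with a rank-1 LoRA (values are arbitrary).
weight = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]    # shape (out, rank)
down = [[1.0, 1.0]]    # shape (rank, in)

at_strength_1 = apply_lora(weight, up, down, strength=1.0)
at_strength_3 = apply_lora(weight, up, down, strength=3.0)
print(at_strength_1)
print(at_strength_3)
```

At strength 3 the same update is simply three times larger. Whether that is why the 2.1 lora behaves better on high noise is a guess.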
>>
File: ComfyUI_07190_.png (1.76 MB, 1152x1152)
>>
>>106731635
oh, nm there is a new lora out. havent tried it yet.
>>
>>106731578
>fav extension for custom resolutions broke
>has weird behavior when hitting generate after switching a model
>worse performance it seems
where is the benefit
>>
Where the hell are the essential Chroma loras? Why do none exist? There's like nothing. The model has been finished for months now. Who the hell is even using it?
>>
>>106731646
some are saying t2v works fine with i2v, i'll give them a try when they are done dling.

https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-250928
>>
>>106731662
>nazi spoted
>>
>>106731662
you can just look on civit
>>
What's the best Wan 2.2 LoRA for general-purpose nsfw stuff like taking off clothes/etc.?
>>
>>106731680
They're all crappy.
>>
>>106731662
>Where the hell are the essential Chroma loras?
they don't exist
>Why do none exist?
one of the chroma team members had a repo full of good experimental loras but he removed them and likes to chimp out at anyone who reposts or uses his stuff
>Who the hell is even using it?
a handful of people on civitai and this thread. most people aren't going to bother with it for a variety of reasons
>>
>>106731682
>general-purpose
The only general purpose wan LoRAs are the lightning ones, everything else is quite specific
>>
File: 1744875412766165.png (998 KB, 1360x768)
give the blonde girl in the dress teal color twintails like Miku Hatsune, in a low poly style.

neat. source is parasite eve 1 on ps.
>>
>>106731662
>essential Chroma loras
Specify or STFU
>>
>>106731704
>good experimental loras but he removed them and likes to chimp out at anyone who reposts or uses his stuff
What, why did he remove them?
>>
>>106731724
because the people working on chroma are self-sabotaging retards. see >>106728978 in the last thread
>>
>>106731724
please join the discord to find out ;)
>>
>>106731662
>Who the hell is even using it?
Anyone who wants to generate realistic (as in non-plastic) people, particularly for NSFW

For anime / hentai etc, Illustrious remains the only real choice

The question you should ask yourself, who the hell is using Qwen (not Edit), or even Flux these days, they have zero usecase with their plastic censored output
>>
>>106731722
Basic shit like NSFW positions, body sliders, races, Camera positions. You know, essential loras that every model has?
>>
File: WanVideo2_1_T2V_00217.mp4 (569 KB, 1872x1088)
>Heh, oops, sorry, must have been that Chroma gen I posted.
>>
>>106731724
>What why did he remove them?
Probably because they were made against ancient versions of Chroma and don't work well with the final releases
>>
when will imagegen/videogen figure out MoE?
>>
>>106731766
cope, they worked just fine
>>
when will we figure out how to make SDXL into a video model?
>>
>>106731751
Really, show me the Qwen loras for these subjects

Qwen has been out longer than Chroma and has had a Civitai section much longer than Chroma
>>
File: ComfyUI_07203_.png (2.26 MB, 1152x1152)
>>106731300
You want to use this
https://huggingface.co/lodestones/Chroma1-Flash/tree/main
Here's catbox
https://files.catbox.moe/xnc4fe.png
>>
>>106731788
Everyone uses Qwen Edit, and you don't need those kind of loras for that.
>>
https://github.com/comfyanonymous/ComfyUI/issues/10068
>>
>>106731789
gotta love how this guy always posts the template workflow. absolutely nothing different about it from the one comfy provides.
>>
>>106731789
So if I wanted this as an FP8 would it be better to just quantize the full Flash .safetensor versus using a lora?
>>
>>106731801
kek, backtracking directly, you're so full of shit

and of course you need loras for Qwen Edit since it's massively censored just like Qwen
>>
>>106731782
seed variation animations homie
>>
File: ComfyUI_07204_.png (2.61 MB, 1152x1152)
>>
>>106731811
comfy is right. HunyuanImage-3.0 is meme shit. still, if he wants comfyUI to not get a bad rep, he should just do it anyway to please the 5 people that will use it.

swallow your pride comfy
>>
>>106731811
Hate to say it, but the shithead Comfy is correct for once, there's zero reason to add Hunyuan 3.0 unless there's some insane quantization magic that makes 1bit work well with this model
>>
File: 1749388313161788.jpg (2.2 MB, 3456x3456)
>be me
>using illustrious in SwarmUI like a pleb
>inpaint twice
>image now has yellow tint
what am I doing wrong here
>>
>>106731830
>>106731832
You don't see this retarded mindset of "waaaa model too big for my gayman GPU, so it's useless!!111!" in the LLM scene (thankfully).

Some non-poorfag people on /lmg/ have even come around agreeing to the fact that models larger than 200b can, in fact, be "local" as long as you are not poor.
>>
i bought the bank
>>
>>106731858
wouldn't you only need enough space for a 13b since hunyuan 3.0 is an moe? the rest could get offloaded
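
Back-of-envelope numbers for the above, as a sketch. It assumes the 80b total and 13b active figures quoted in this thread and counts weights only (no activations, text encoder or VAE). `weight_gb` is a made-up helper name.

```python
def weight_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight size in GB (10^9 bytes) for a given precision."""
    return params_billion * bytes_per_param


TOTAL_B = 80.0   # total parameters, as quoted in the thread
ACTIVE_B = 13.0  # active parameters, as quoted in the thread

for name, bytes_per_param in (("bf16", 2.0), ("fp8", 1.0), ("4-bit", 0.5)):
    total = weight_gb(TOTAL_B, bytes_per_param)
    active = weight_gb(ACTIVE_B, bytes_per_param)
    print(f"{name}: total {total:.0f} GB, active {active:.1f} GB")
```

So the active slice would fit on one card on paper. But which experts fire can change per token, so the full set still has to sit somewhere (system RAM or disk) and get swapped in, and that swapping is the likely source of any offloading slowdown. Treat this as rough arithmetic, not a benchmark.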
>>
>>106731858
since you love /lmg/ so much that you have to post about them in literally every thread, you can go BACK to sucking their cocks in THEIR thread you cum guzzling dickbrain.
>>
>>106731814
Yes. This Flash version is very special, it's based on an experiment Lodestone made while training HD so it fixes fine details. Results in less limb horror, higher quality images, much closer convergence than regular Chroma HD. Only cost so far is the style variety is not as large, and prompt following especially with multiple subjects is not fully on par with it.

Here's another workflow for 2k images that merges HD(v50) with the Flash delta for better prompt following performance and full leverage of Chroma Flash detail fixes. 2k images are faster than regular Chroma HD, so there's no downside to using that over HD.

https://files.catbox.moe/pg1c1o.png
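
A minimal sketch of what "merging a delta" means, assuming the Flash delta is a per-tensor difference that gets added onto the HD weights. That is an assumption, not a description of the linked workflow, which may use ComfyUI merge nodes instead. `merge_delta` and the toy dicts are hypothetical; real checkpoints hold tensors, not lists.

```python
def merge_delta(base, delta, strength=1.0):
    """Return a new state dict with base[key] + strength * delta[key] where a delta exists."""
    merged = {}
    for key, weights in base.items():
        if key not in delta:
            merged[key] = list(weights)  # no delta for this tensor, keep base as-is
            continue
        if len(weights) != len(delta[key]):
            raise ValueError(f"shape mismatch for {key}")
        merged[key] = [w + strength * d for w, d in zip(weights, delta[key])]
    return merged


# Toy "checkpoints": flat lists stand in for weight tensors.
hd = {"layer.a": [1.0, 2.0], "layer.b": [3.0]}
flash_delta = {"layer.a": [0.5, -1.0]}

merged = merge_delta(hd, flash_delta, strength=1.0)
print(merged)
```

If it works like this, saving the merged result once avoids redoing the addition on every load.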
>>
God I cannot stand the furry dick sucking.
>>
File: 1728119902657081.mp4 (642 KB, 480x672)
hmmm

https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-250928

interesting results, it doesnt seem to have the slow motion that i2v 2.2 lora has, despite being a t2v lora.
>>
>>106731811
do people use ltx video, sky reels or hidream? There are tons of dead models that got no traction or community support. Wonder if there's going to be "lost media" when it comes to AI model files in the future.
>>
File: 1737996163830207.mp4 (711 KB, 480x672)
>>106731898
*this was updated recently, they say they are making an i2v update too but for now it seems ok?

1 strength for high and low.
>>
>>106731858
RTX 6000 Pro 96gb vram can't even run this image model without massive offloading slowing things down

LLM is less taxing overall, you can even run it on pure cpu solutions, for image / video generation, it's GPU or bust
>>
>>106731811
if it was at least good, but why support something that is huge AND shit
>>
>>106731901
>do people use ltx video, sky reels or hidream
no, no one uses those. hidream requiring quad clips is what killed it. if people ditched hidream, then they absolutely will ditch hun image 3 after a week, no question.
>>
>>106731915
>1x rtx pro 6000
he said non poor fags anon
>>
>>106731891
Is there a difference between the full HD-Flash checkpoint there and merging the delta weights into HD? Sorry in advance if you already answered that and I'm just not understanding.
>>
File: ComfyUI_39083_.jpg (1.67 MB, 1352x1976)
>>106729359
to answer my own question...
turns out the change in my old gens wasnt caused by pytorch at all!
the differences between pytorch versions are actually very small in most cases, so the original comparison image must have been very cherry picked

of course the actual cause was comfyui updates, and rolling back to november 2024 did the trick

>>106731848
could be a bad vae
>>
File: 1728604594000983.mp4 (906 KB, 672x480)
another test: the man gets up and runs very fast out of his office.

he's a bit confused, don't mind him
>>
>>106731811
I hate the argument of "you have this much money, why can't you do everything we want no matter how stupid it is". ComfyUI has no obligation to support every model, especially if it barely qualifies as local for the average user.

This can actually have negative consequences for Comfy. If Comfy supports it, retards that don't understand the insane requirements will think they can run it, get angry at how slow it is and blame Comfy for not optimizing it.
>>
>>106731933
>three legs
>broken leg
>watermelon
>>
>>106731945
>the chair turns into a CRT TV with a random guy behind it
>>
>>106731945
i2v is still more fun than animate
>>
File: 1756394537910897.mp4 (1.64 MB, 672x480)
another test: wan 2.2 i2v high into wan 2.1 i2v lora

plus wan 2.2 i2v low into wan 2.1 i2v lora

1 str for all:
>>
>>106731930
Yes, in particular, prompt following, and what I perceive as sloppiness in the regular HD. For instance, when you're prompting for two subjects, like the one I linked, in regular Flash HD it tends to give you 3 or 4 because it's not as strong at prompt following. That is fixed with the mixed version I gave, with the caveat that to see the amateur photo look and not slop, from my tests you have to prompt it at least at 2k or close enough to it.

Here's another example where I put prompt following to the test (this time in 1152 res):
https://files.catbox.moe/nkn901.png
Regular Flash version falls apart there, and I tested other ones. Regular Chroma HD does all of them fine, so that's the reason for the merge.
Note you'd want to save the checkpoint instead of loading the two separate checkpoints each time.
>>
>>106731953
it was the image that described my current mood the best
>>
File: 1751333383290207.mp4 (704 KB, 672x480)
>>106731976
test 2: the man drinks a bottle of beer and smiles.

oddly enough, using both loras sorta works? does 2.1 fix the speed issue? idk.
>>
File: 1732723151598856.png (110 KB, 1359x567)
>>106731999
and idk why but using both seems to work.
>>
>>106731811
>NOOO YOU HAVE TO IMPLEMENT EVERY USELESS MODELS EVER
lol, I hate Comfy but he's 100% right there, who's gonna run a slopped 80b model?
>>
>>106731811
who the fuck has 100k in GPUs to run an 80b model
>>
>>106731811
Okay, I kind of agree. But what other pressing project is currently taking up so much of the team's time that they can't hack together a solution to run it? Most of the time support is available within hours. Is there something about this model that makes it hard for them to implement it on comfy?
My instinct tells me it's not for the stated reasons. I believe they are technically incapable of implementing it without serious backend changes.
>>
File: 1732201424407180.mp4 (1.52 MB, 480x672)
>>106732012
the asian girl with large breasts jumps up and down.

need more testing to see if this combo is effective or worse than 2.1 high + 2.2 low.
>>
>>106731267
Raw text2img? No LoRA? That's insane.
>>
File: image(8).png (1.37 MB, 1408x640)
>>106731811
Not true, I generated like 10 images so far and haven't given up yet!

>High contrast vector illustration of a young woman with magenta hair and blue goggles riding a hoverboard across a cratered desert surface. She is wearing a vest, cargo pants, and sandals. Various gadgets are looped at her waist. She is riding towards the viewer, seen slightly from below and at an angle.
>>
>>106732058
looks like ass, Qwen Image can definitely do something similar
>>
File: 1743378630325859.mp4 (936 KB, 672x480)
the man in the blue shirt jumps onto the airplane behind him that is taking off high into the sky.

still not sure if combining 2.2 and 2.1 is better or worse :/
>>
>>106731750
Qwen-->Wan looks a hundred times better than any shit chroma can do. Like holy shit get some eyes already and notice the man woman hybrids (unless you are into that, then I guess continue on?)
>>
My dream would be that the Qwen team get their shit together to unfuck the base Qwen-Image just like they tried to do with Qwen Edit, preferably by no longer fine-tuning the model on synthetic slop during post-training
>>
File: 1738849052155203.mp4 (975 KB, 672x480)
>>106732076
now we're talking.
>>
>>106732090
hell yeah
>>
>>106731811
Lol we are getting to the point now they are gonna pick and choose what open models they are gonna implement but spreads wide open for any bullshit API. Yeah comfy is cucked.
>>
File: 1757757904051397.mp4 (1.19 MB, 672x480)
the man in the blue shirt points at the airplane behind him, and it explodes into fire and smoke.
>>
>>106732090
>>106732109
i prefer this over the original cut of dark knight rises
>>
>>106732090
It's crazy how far local video models have come...
>>
File: 1735415917349450.png (1.08 MB, 832x1248)
>>106728954
>Powerpuff girls
could be a good idea, theres already an illust one on civitai that looks not terrible though
>>
File: 1730238158774784.mp4 (886 KB, 672x480)
the man in the blue shirt is having a fist fight with a muscular man wearing a black tank top and black ski mask.
>>
File: congress.webm (3.87 MB, 832x1248)
>>106731283
>>
>>106732076
>>106732090
>>106732109
>was generating slop part of your plan?
>>
>>106732147
of course!
>>
File: 1729679387423340.png (1.53 MB, 832x1248)
>>
>>106732147
The slop rises!
>>
>>106732077
>Qwen-->Wan looks hundred times better than any shit chroma can do
Pajeet thinking slopped plastic looks good, go away
>>
>>106732166
lol, lmao even. Enjoy the blurry trannies I guess
>>
File: 1758449161288921.mp4 (843 KB, 672x480)
>>106732143
this time, i2v lora 2.1 for high noise at 3 str, 2.2 low lora 1 str.

yeah, the old fix still works better overall. now he's even fighting!
>>
>>106732144
Impressive
>>
>>106732037
lora, not good enough for sharing
>>
File: 46942135.jpg (39 KB, 460x460)
>nooo my millionaire dev!!! Don't bully him
>>
>>106731283
qwen?
>>
File: 1752224382488516.png (1.2 MB, 1360x768)
>>
>>106732211
I like these, how do you make them
>>
File: 1758659681377214.mp4 (783 KB, 672x480)
the man in the blue shirt holds up a sign saying "you're a big guy", and a muscular man wearing a black tank top and black ski mask punches him, causing him to fall to the floor.

not bad.
>>
What's happening to Comfy is normal. He's got the UI monopoly, so if he wants to stay on top he'll have to keep pioneering and make Comfy work with literally every single new meme model that drops.
That's why he's successful and Ani is a loser.
>>
File: 1748788527342884.png (1.12 MB, 1360x768)
>>106732215
it's a lora i just finished training on stills from the camp lazlo cartoon
>>
File: 1749180419946279.mp4 (729 KB, 672x480)
>>106732223
>>
File: 1732165525564021.png (3.18 MB, 2312x1304)
>>106732237
>>
>>106731662
Essential Chroma LoRAs as in what exactly anon? Chroma in its current state is a base model (what comes before IL/Noob style finetune), so you'd not find any anime specific LoRAs that are really good. For realism unless you are going for a style it needs no LoRA.
>>
>>106731662
>he's still not understanding that Chroma is ass and no one want to deal with this shit
lol, some anons are really slow I swear
>>
>>106731398
>calling that flawless
lmao
>>
>>106731398
>flawless
>2 left foot
>>
File: 1752675977421576.png (1.98 MB, 1344x1728)
>>
File: comf0.jpg (1.51 MB, 1536x2560)
>>
File: 1734358207355715.png (3.52 MB, 1416x2120)
hmm...
>>
>>106732235
everyone loses. cool
>>
>>106732263
I like Chroma, and it's the best base model overall, but again it is a base model, which means it's not going to excel at any specific category, just be ok/good at pretty much everything.

So for really good results in NSFW, styles etc, like with all base models, you need to further finetune, typically with a lora since it's easiest.
>>
File: 1744680927821441.png (1.23 MB, 832x1248)
i cant figure out a good lineart processor for these thicker lines
>>
File: 1736360515205251.mp4 (909 KB, 832x480)
default wan size, pretty good result:
>>
>>106732360
>>106732360
>>106732360
>>106732360
>>106732360
>>
>>106732349
Yes. this.
>>
>>106732026
it is a non-trivial architecture, does comfy support LLMs? No. This is an LLM, not a DiT model. Even the guys at llama.cpp are going to take their sweet time:
https://github.com/ggml-org/llama.cpp/issues/16303
and we're also still waiting for them to implement qwen3max.
So yeah, it's going to be a lot of effort for... a shit model.
As for comfy's stance, he's a retard, I would've just said that the implementation would be put on low prio and in 2 weeks when everyone has forgotten about it let it rot in the backlog.
>>
>>106730042
>>106730123
>>106730267
>>106730446
>>106731391
fuck off tranny freak
>>
>>106732208
that's fine and all but you're the only one here that knows said millionaire dev's cock tastes like


