/g/ - Technology

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107683139

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2485296
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
Blessed thread of frenship
>>
>>107687569
thanks for the bread anon
>>
>>107687569
Thank you for baking this thread

>>107687582
Thank you for blessing this thread
>>
File: 00133-2501127316.png (2.78 MB, 1824x1248)
>>
, anon
>>
comfy should be dragged out on the street and shot
>>
File: Z-image turbo.png (2.42 MB, 1536x864)
>>
>>107687600
i love that interior, stylish af
>>
>>107687609
yeah, I love the Frutiger Aero style :D
https://www.youtube.com/watch?v=Cz2YCRmDOFk
>>
Looks like Qwen v2 will be a reasoning model (so an autoregressive model?)
https://xcancel.com/cherry_cc12/status/2004741644810383684#m
https://xcancel.com/cherry_cc12/status/2004162177083846982#m
>>
File: 12124124.jpg (457 KB, 1664x2432)
mirrors can be tricky
>>
What happened to Z image base and LTX2? They were supposed to release by Christmas. Do we still think Z is king? How has LoRA training come along? Very well, shit, or so-so?
>>
>>107687595
Extremely nice 2D to 3D conversion.
>>
>>107687646
No one knows, yes still king, and very well
>>
>>107687663
Can it train on anime well or is SDXL still best for that.
>>
File: TWO MORE WEEKS.png (248 KB, 669x373)
>>107687646
>What happens to Z image base
its inference code PR got merged 4 days ago. for the new version of Qwen Image Edit, the PR got merged 2 weeks before they released the model, make of that what you will
https://github.com/huggingface/diffusers/pull/12857
>>
>>107687595
she's cute but the smile is a bit creepy
>>
>>107687600
>>107687609
Reminds me of that game where you're stranded in another water planet and you have to build a base.
>>
File: 1754996956910711.jpg (1.57 MB, 1248x1824)
>>
>>107687707
cool gen
>>
>>107687646
Z-base is the next Pony v7. People hype it up as only 'two more weeks' away and then when it finally releases in July 2026 it will look outdated.
It has been over a month since Turbo released, they stated that base was to release 'that weekend'.
>>
>>107687691
>>107687653
trying to get a side body view. don't remember it being this annoying to get a proper side view from sdxl.
>>
>>107687738
>they stated that base was to release 'that weekend'.
you forgot that they said "I guess", so it wasn't 100% sure... yeah I know I'm coping a bit but still
>>
>>107687646
more like next Christmas
>>
File: Z-image turbo.png (2.09 MB, 1536x864)
>>
>Z image base was supposed to drop on christmas
>they didnt release it

goddamnit
>>
File: 1743966188386279.png (1.55 MB, 1496x1120)
add the character on the right in image2 over the water, in the sky.

got some ART here, anons.
>>
>>107687669
It's superior with grabbing onto styles but you give up general booru tags. Best to recaption with NLP. But IMO still better despite that drawback.
>>
>>107687764
kek
>>
File: 1766299783879979.png (1.61 MB, 1496x1120)
>>107687764
okay, now gigachad is intact.
>>
>>107687646
>They were supposed to release by Christmas.
>>107687763
>supposed to drop on christmas
This was a rumor and never actually stated by the devs. I know they keep saying "soon" and because of this I will kill the next chinaman I see but they did not actually say the birth of christ.
>>
>>107687569
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
is there a reason these are in the OP? they are off topic and encourage flame wars
>>
déjà vu
>>
>>107687669
fwiw ive left XL behind entirely
>>
File: 1759725713543857.png (1.52 MB, 1496x1120)
>>107687783
kek, and a solo act:

I will make new gens, just testing 2511 edit with 2509 8 step lora, seems good.
>>
>>107687669
nobody serious is bothering until a noob fine-tune is ready
>>
File: Z-image turbo.png (2.07 MB, 1536x864)
https://www.youtube.com/watch?v=8mNwEqTlRoM
>>
>>107687827
Didn't they say they'd make an anime finetune themselves? Would be good to make an nsfw finetune from that.
>>
>>107687835
no nsfw. still have to wait
>>
File: 1743355416120299.png (1.73 MB, 1584x1056)
the old man is holding a sign saying "LDG". Hatsune Miku is standing to his left.
>>
>>107687801
if you actually cared, you would go back to sdg
>>
>>107687847
why don't you since all you do is shit up the general with your drama?
>>
>>107687847
don't interact with the schizo, no one loves him and he feeds on (You)s
>>
>>107687835
>didn't they say
they said a lot of things since z-turbo, and delivered on none of them. what will happen is radio silence regarding this 'anime finetune' and months will pass with people assuming that they're still working on it only for it to never release.
"why bother finetuning it if they're doing it for us first??"
expect nothing to happen as everyone sits around waiting for a mystical finetune that isn't even being developed.
>>
>>107687827
for serious finetunes sure but loras are fantastic in the meantime
>>
>>107687867
sorry, meant for >>107687855
>>
>>107687904
loras have always been a cope, and they overfit and don't have much variation on zit
>>
File: 1757393853998216.png (2.41 MB, 1536x864)
Charlie Brown with hair is so uncanny lol
>>
>>107687919
mmmmokay anonie
>>
>>107687919
>loras have always been a cope
truth nuke
>>
>>107687891
Yeah I can see the issue with that.
>>
File: 1749520029576406.png (1.64 MB, 1584x1056)
the old man is wearing a white t-shirt with Hatsune Miku on it, and is holding two green glowsticks.

this one turned out good, I think the 2509 8 step lora is more consistent than the 4 step 2511 one.
>>
remember nunchaku?
me neither
>>
File: zzzz_00411_.png (2.12 MB, 1536x1024)
bombardino crocodilo hehe
>>
Everything less than drawing by hand is a cope btw
>>
Everything is a cope btw
>>
>>107687948
https://github.com/nunchaku-tech/nunchaku/releases/tag/v1.1.0
it has implemented Z-image turbo recently but the quality is so ass, I guess small models don't like quantizations, like LLMs
>>107687956
if AI was cope and irrelevant the artists wouldn't piss and shit themselves over it
>>
>loras have always been a cope
said by someone whos never trained
>>
>>107687968
is nunchaku for vramlets mainly? never used it myself
>>
>activate 2 character loras
>they bleed into a mess of blended features
>prompt 2 characters the model knows
>they work perfectly fine
loras are outdated copium, technology that hasn't improved since 2019.
>>
File: 1748067148365188.jpg (3.27 MB, 8192x2487)
>>107687976
it's a 4bit quant, but better than Q4 so it's definitely for vramlets who want a bit of quality
>>
Issue of the skill desu
>>
>>107687999
another truth nuke, nothing will replace having the style/character directly trained in the model
>>
>>107688005
this 100%
>>
>>107688018
3dpd brownoid boomersloppers don't understand this because they settle for plastic crap. realism is literally a single style and they STILL can't get that right
>>
Has anyone made a Danbooru tag LoRA for Z Image yet?
>>
File: 00185-4123510393.png (2.97 MB, 1824x1248)
>>
>>107688005
anon is often apt to blame the free models as if theyd pay for it at all. very sad
>>
>loras are just as good as finetunes!
>loras work on distilled models (flux dev, flux 2)
EPIC LIFEHACK DISCOVERED! why doesn't lodestone and noob just train loras instead? they're just as good!
>>
Who is anon quoting? No one said that ITT
>>
>ZiT loras are better than XL
>WHAAAAATT??? LORAS ARE NOT AS GOOD AS FINETUNES HURRDURR
holy retardation
>>
>>107687595
>>107688029
reminds me of the good old days of 2022 when we could only post close-ups of 1girl because that's the only thing SD1.5 could do correctly kek
>>
where base
>>
File: zzzz_00428_.png (2.08 MB, 1536x1024)
>>
File: 00195-2273322063.png (2.83 MB, 1536x1536)
>>
>>107688087
>where base
Still in China obviously :(
>>
File: Z-image turbo.png (2.3 MB, 1536x864)
>>
File: zzzz_00434_.png (2.79 MB, 1536x1024)
>>
>>107688158
pretty cool
>>
File: small_output.mp4 (2.14 MB, 1280x1280)
>>107665458
>>107679533
>>107679033
I decided to use this to test some different settings. I used a deepthroat LoRA with the wan 2.2 FLF template.

son = sage attention on
soff = sage attention off
lon = lightning LoRA on
loff = lightning LoRA off

Everything else stayed the same. I changed the steps/cfg to the comfy recommended settings when changing the lightning LoRA.

The gen time is at the end of the filename. You can see the difference in quality. The lightning LoRA looks like it gives a smoother zoom in effect on the background, but turning it off makes the foreground action a lot more intense.

There's no telling if these differences will stay consistent across various seeds. I used to do thousands of gens of different SD settings back in the day and when you think you see a pattern, it could be totally a byproduct of that particular seed. More testing is needed.

I will say that my anecdotal experience is that characters rotating/manipulating objects is much worse with the lightning LoRA on. It is much more likely to just morph the object around until it gets to a stable position.
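
For anyone who wants to repeat this across more seeds, here is a rough sketch of how the 2x2 grid could be enumerated. The son/soff/lon/loff labels are from the post above; the seed values and the filename scheme are made up for illustration, and nothing here talks to ComfyUI itself.

```python
from itertools import product

# Labels taken from the post: son/soff = sage attention on/off,
# lon/loff = lightning LoRA on/off.
SAGE = {"son": True, "soff": False}
LIGHTNING = {"lon": True, "loff": False}


def build_runs(seeds):
    """Return one run description per (seed, sage, lightning) combination."""
    runs = []
    for seed, (s_tag, sage), (l_tag, lora) in product(
        seeds, SAGE.items(), LIGHTNING.items()
    ):
        runs.append({
            "seed": seed,
            "sage_attention": sage,
            "lightning_lora": lora,
            # Hypothetical filename scheme; the gen time would be
            # appended once the run has finished.
            "name": f"seed{seed}_{s_tag}_{l_tag}",
        })
    return runs


if __name__ == "__main__":
    # Example seeds only; use whatever seeds you actually want to compare.
    for run in build_runs(seeds=[1, 2, 3]):
        print(run["name"])
```

Running all four combinations on every seed is what lets you tell a real effect of the setting from a quirk of one seed. Steps/cfg would still need to be switched by hand when toggling the lightning LoRA, as described above.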
>>
>>107688088
this but unironically
>>
>>107688165
>I will say that my anecdotal experience is that characters rotating/manipulating objects is much worse with the lightning LoRA on. It is much more likely to just morph the object around until it gets to a stable position.
obviously, distillation hurts the model's quality and those loras apply distillation to make it faster. they've improved a lot since the SDXL turbo days though, and there are even better techniques than loras that are begging to be tried, we'll see how they fare in the future
>>
>30 days since z-turbo was released
not looking good, chinakeks
>>
>>107688165


KEK. Thank you for the breakdown, I personally like the one with sage on and the lightning lora on to be honest. Appreciate you!
>>
>>107688194

I think they know the eventual goontunes that will be made will collapse society as we know it. That's a heavy burden.
>>
File: 1749227857094628.png (2.04 MB, 1536x864)
>>107688088
>>
they started training the hentai finetune and realized how much money they could make with saas
>>
>>107688233
I won't mind if they keep the finetune API, we just need the base model, we can figure out the rest by ourselves
>>
>>107688198
That's the worst one, imho. He doesn't even turn the gun around, it just morphs to a different position. From what I can tell, everything off yields the best results, as expected.
>>
>>
>>107688194
Only 30? Feels like it's been twelve months
>>
>>107688278
ai makes lesbians look hot again
>>
File: 1760010665056784.png (1.91 MB, 1136x1464)
the anime girl in image2 is outside the door in image1. keep the man's expression and face the same.
>>
they're training more cunny which is the reason for the delay
>>
File: z-image_01392_.png (1.57 MB, 960x1440)
>>
>>107688394
Have you had any luck replacing race of someone? e.g. westoid to jap and vice versa
>>
>>107688403
Sorry only Bane, Miku, Drive, and Floyd
>>
File: 1752125695466029.png (2.46 MB, 1536x864)
>>
File: 1762052041565712.png (1.96 MB, 1136x1464)
>>107688394
the anime girl in image2 has her arm around the man in image1. keep the man's expression and face the same.

aww.
>>
>>107688452
>gooseling as emotionless as ever
qwen is so fucking accurate desu
>>
File: 1739712520261508.png (1.03 MB, 1368x760)
>>107688403
oh, absolutely.

netflix rush hour. "make the asian man black."
>>
>>107688467
it just changed the skin color, he still has the face of an asian man
>>
>>107687595
model?
>>
File: 1736404274634657.png (1.74 MB, 1136x1464)
>>107688480
ok here is an example with gosling:
>>
>>107688491
holy shit it would never change race and face for me. guess i downloaded the cucked qwen. 2509 or 2511?
>>
File: 1742635626883538.png (1.93 MB, 1136x1464)
>>107688491
chinese rice farmer:
>>107688502
2511, but with the 8 step 2509 lightning lora, works well imo
>>
>>
>>107688165
kek
>>
File: nunif.jpg (436 KB, 1419x1076)
>>107686065
Download Nunif
Run install.bat
Run update.bat
https://github.com/Westlake-AGI-Lab/Distill-Any-Depth
scroll to pretrained models, download the 97mb one and the largest one and shove them into the nunif-windows\nunif\iw3\pretrained_models\hub\checkpoints
Load up iw3-gui.bat (takes ages to load btw, so just wait like 5 minutes for the window to appear after the cmd box opens and closes)
3D strength 1.5 to 4.0 (higher is better 3D depth but causes more artifacts around edges of foreground to background objects which looks like shit, I usually keep on 2)
Convergence 0.5
Depth Model (Distill any L (slower better quality) or Distill any S (worse quality but way fucking faster))
Edge Fix 2
Full TB = More horizontal resolution
F/Full SBS = More vertical resolution (though a 4k file becomes 8k, so if you want to lose half resolution and keep 4K (so 2k 3D), do Half SBS or Half TB).
If you have an Nvidia graphics card newer than the 1080 Ti series, then do fp16.
Depth Batch size = Depends on graphics card (chatgpt it)
Worker threads = Depends on graphics card (chatgpt it)
Those 2 settings basically will be how fast your graphics card processes the image/video.
There, you can convert movies/videos and images to 3D for VR or a 3D TV.
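
The "4k becomes 8k" part of the Full vs Half choice is just frame arithmetic, sketched below. This is generic stereo layout math and not iw3's own code; the function and layout names are made up for illustration.

```python
def stereo_output_size(width, height, layout):
    """Return (frame_w, frame_h, eye_w, eye_h) for a stereo 3D layout.

    width/height are the source frame size. The layout names are
    hypothetical labels, not iw3 option strings.
    """
    if layout == "full_sbs":
        # Two full-size views next to each other: frame width doubles.
        return (width * 2, height, width, height)
    if layout == "full_tb":
        # Two full-size views stacked: frame height doubles.
        return (width, height * 2, width, height)
    if layout == "half_sbs":
        # Frame size unchanged, each eye is squeezed to half width.
        return (width, height, width // 2, height)
    if layout == "half_tb":
        # Frame size unchanged, each eye is squeezed to half height.
        return (width, height, width, height // 2)
    raise ValueError(f"unknown layout: {layout}")


if __name__ == "__main__":
    for layout in ("full_sbs", "full_tb", "half_sbs", "half_tb"):
        print(layout, stereo_output_size(3840, 2160, layout))
```

So the Full layouts keep every pixel per eye at the cost of a frame twice as big, while the Half layouts keep the file at the source size and give each eye half the pixels.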
>>
File: 1764481473568025.png (2.02 MB, 1160x1440)
>>
>>107688572
not even close, QiE is so bad at keeping the original style
>>
File: 1764821873639112.png (1.48 MB, 1720x968)
>>107688579
I didn't prompt to keep the style the same I just said miku. you can keep pixel art style if you prompt for it for example.
>>
>>107688565
Oh for images as well, you can definitely do higher 3D strength and play around with foreground scale. Read the readme to see what they do
https://github.com/nagadomi/nunif/blob/master/iw3/README.md
>>
>>107688607
do it for this >>107688572
>>
File: ComfyUI_00002_.png (3.18 MB, 1456x1536)
Wish I could get face swap to work (some python shit is fucking up nodes) so I don't have to deal with every anime-real model having full asian faces.
>>
File: Z-image turbo.png (1.98 MB, 1536x864)
>>107688572
Shit why did it go for XII, it was so close lol
>>
bf16 vs fp32 lora, is there a major difference or not so much?
>>
File: Z-image turbo.png (2.32 MB, 1536x864)
>>
File: Z-image turbo.png (2.56 MB, 1536x864)
https://www.youtube.com/watch?v=s4FnAOg6N5c
>>
did the mods kill tranfag got yet or do I have to wait?

*edit: schizo rentries are still there so I take that as a no. cya*
>>
File: 1744300007496944.png (1.42 MB, 1840x912)
I dont watch this show but apparently some guy said he was gay.
>>
File: 1741036087539006.png (1.41 MB, 1840x912)
>>107688761
also bf16 qwen edit 2511 lora seems good, they say you can do 8 steps instead of 4, outputs are good:
>>
can wan 2.2 do cartoon inbetweening well?
>>
Should've put more trust into Newbie but nooooooo we HAVE to wait on this magical Z Image Base that doesn't exist, wait for this censored Anime finetune of the non-existent Z Image Base, and wait for someone to (actually) uncensor it without adding in their own schizo rules like ... artist clustering. Remember that shit?
>>
File: 1757127558258881.png (1.14 MB, 1712x976)
>>
>>107688825
>artist clustering
that was some insane idea indeed
>>
File: 1754932763910413.png (1.04 MB, 1712x976)
>>
File: WanVideo2_2_I2V_00086.mp4 (1.05 MB, 480x832)
>>
File: WanVideo2_2_I2V_00090.mp4 (3.55 MB, 480x832)
>>
File: WanVideo2_2_I2V_00083.mp4 (1.46 MB, 480x832)
>>
>>107688746
he's going to kill the threads if this keeps up
>>
>>107689053
the bowl...
>>
Seems like multiple people in this thread are having mother issues.
>>
File: full-0061.jpg (914 KB, 1464x2149)
>>
File: 1754796464036548.png (1.55 MB, 1088x1536)
the anime girl is wearing a white crop top and denim shorts, and white adidas sneakers.
>>
>>107689086
Hair and hands are fucked.
Some more steps would be nice.
>>
File: 1750694175300673.png (1.38 MB, 1312x1272)
change the text from "IT'S TIME" to "OH SHIT". The man is pointing a gun at his head.

actually more funny given it's not him holding the gun desu
>>
File: 1756795748874914.png (1.99 MB, 1080x1544)
give the girl wearing green on the left a green bikini.

not really necessary for stellar blade but still a valid test case.
>>
>>107689195
remove all clothes
ctrl enter
>>
>>107689240
hey that's good and it fixed the awful baked shadows on her tummy
>>
File: ComfyUI_00181_.png (1.03 MB, 832x1216)
>>
>>107689251
using 2511 qwen edit, but 2511 lightning lora at 8 steps, the huggingface discussion said it works well at 8 too (devs were replying)
>>
File: 1740452654475887.png (1.79 MB, 1080x1544)
the girls are wearing a red bikini and red santa hat. there are christmas presents on the floor in front of them.

pretty good, 8 steps is the way I think, 4 is fine but 8 is better detail/results
>>
>the same meme on repeat for an entire year
i hate this place so much
>>
>>107689274
yea 8 steps is good. with 4 its too easy to notice background and depth of field looking undersampled
>>
File: 1748325724054048.png (1.74 MB, 1080x1544)
>>107689274
business suits with a short skirt and black heels:
>>
>>107688002
thank you for making a comparison using qwen image between bf16, Q8_0 GGUF, Q4_K_M GGUF, and Nunchaku FP4 quants

if it's not too much effort i'd be interested in knowing more stats like cosine similarity and how similar certain tensors are between quants

nunchaku looks better than i expected for a 4 bit quant. this is also a great reminder that Q8_0 is good enough
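
A toy version of the cosine similarity check asked for here, as a sketch only. It runs on random Gaussian "weights", not real Qwen Image tensors. The 8-bit path follows the Q8_0 idea (one scale per block of 32, values rounded to int8 range). The 4-bit path is just the same scheme with fewer levels and is NOT Q4_K_M or Nunchaku's method.

```python
import math
import random

BLOCK = 32  # Q8_0 stores one scale per block of 32 weights


def quantize_dequantize(values, bits):
    """Symmetric block-wise round trip: scale by max|x|, round, scale back."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit, 7 for 4-bit
    out = []
    for start in range(0, len(values), BLOCK):
        block = values[start:start + BLOCK]
        amax = max(abs(v) for v in block)
        scale = amax / qmax if amax > 0 else 1.0
        out.extend(round(v / scale) * scale for v in block)
    return out


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in tensor: assumption, real weights are not Gaussian noise.
    weights = [rng.gauss(0.0, 0.02) for _ in range(4096)]
    for bits in (8, 4):
        restored = quantize_dequantize(weights, bits)
        sim = cosine_similarity(weights, restored)
        print(f"{bits}-bit: cosine similarity = {sim:.6f}")
```

For a real comparison you would load the same tensor from the bf16 file and from each dequantized quant, then run the similarity per tensor instead of on synthetic data.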
>>
File: 1766161235065044.jpg (307 KB, 1024x1024)
I keep seeing people using the full qwen edit model, which is like 40gb.
Isn't that overkill? Can't I just use a Q6 quant instead?
Also I've been generating vids of girls twerking and getting facial blasted for 4 days.
>>
>>107689292
I have 16GB and I use Q8 which is like 20GB, even if you cant load the main model fully into memory it still works, it will just load some into RAM.
>>
File: 1756981882945972.png (2.15 MB, 1136x1472)
replace the text "Cyberpunk" with "Cyber Miku". Replace the man with the pistol with Hatsune Miku holding a green leek vegetable instead of a pistol.

not bad
>>
>>107689292
>I keep seeing people using the full qwen edit model, which is like 40gb.
>Isn't that overkill? Can't I just use a Q6 quant instead?
Q8_0 is 99.97% similar to fp16/bf16 so yeah it's good enough, but technically not perfect. Q6_K hurts a lot more and if you can run the Q6 you can probably run the Q8
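
Rough size math behind "if you can run the Q6 you can probably run the Q8", as a sketch. The bits-per-weight figures come from the GGUF block layouts. The 20B parameter count is an assumption, picked to match the "like 40gb" bf16 figure quoted above. Real GGUF files mix tensor types, so actual sizes will differ a bit.

```python
# Bits per weight for some GGUF block formats:
#   Q8_0: 32 int8 values + one fp16 scale = 34 bytes per 32 weights
#   Q6_K: 210 bytes per 256 weights
#   Q4_K: 144 bytes per 256 weights
BITS_PER_WEIGHT = {
    "bf16": 16.0,
    "Q8_0": 34 * 8 / 32,
    "Q6_K": 210 * 8 / 256,
    "Q4_K": 144 * 8 / 256,
}


def estimate_size_gb(n_params, fmt):
    """Estimate file size in GB (10^9 bytes) if every weight used fmt."""
    return n_params * BITS_PER_WEIGHT[fmt] / 8 / 1e9


if __name__ == "__main__":
    n_params = 20e9  # assumption: roughly 20B parameters
    for fmt in BITS_PER_WEIGHT:
        print(f"{fmt}: ~{estimate_size_gb(n_params, fmt):.1f} GB")
```

Under those assumptions the gap between Q6_K and Q8_0 is only a few GB, which is small next to the drop from bf16 to either of them.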


