/g/ - Technology


Thread archived.
You cannot reply anymore.




Thank You Mr. Xi Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107338862

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>Z
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
https://huggingface.co/Comfy-Org/z_image_turbo

>WanX
https://rentry.org/wan22ldgguide
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
I'm waiting for something terrible to happen, you'll see, I can feel it coming
>>
File: ComfyUI_08875_.png (1.81 MB, 1152x1152)
>>107339853
>A solitary figure traversing infinite megastructure corridors with towering brutalist architecture, cybernetic threats lurking in vast darkness, in the mangaka style of Tsutomu Nihei with heavy blacks, architectural precision, oppressive scale, minimalist character design, and post-apocalyptic sci-fi atmosphere, high resolution, black and white manga with extreme perspective.
>>
File: 1751606146329496.png (352 KB, 684x507)
>>107339868
>>
Those threads are going too fast lol...
>>
these threads are flyin off the shelves
>>
File: ComfyUI_08876_.png (2.67 MB, 1152x1152)
>A disturbing scene of spirals manifesting in human bodies and architecture, characters with wide horrified eyes and detailed grotesque transformations, in the mangaka style of Junji Ito with meticulous line work, unsettling body horror, psychological dread, and surreal cosmic horror elements, high resolution, black and white horror manga illustration.
>>
File: file.png (992 KB, 2442x876)
>153% activity
we are reaching critical masses
>>
File: ComfyUI_08884_.png (2.18 MB, 1152x1152)
>A masterless samurai in weathered robes executing a precise sword strike, raindrops frozen mid-air around him, in the mangaka style of Takehiko Inoue with photorealistic ink wash techniques, dynamic brush strokes, philosophical depth, and stunning anatomical accuracy, high resolution, as if from a historical martial arts manga masterpiece.
>>
File: 1733472244275678.png (1.21 MB, 1024x1024)
Hatsune Miku wearing black latex lingerie and holding a whip

made a realistic migu
>>
File: 1756312045759976.png (397 KB, 814x1043)
>Z Image (I assume the whole family of models up until now)
>314K H800 GPU hours
>628K $
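The two figures quoted above imply a per-hour price. A quick sketch of that arithmetic, assuming the dollar figure is simply GPU hours multiplied by one flat hourly rate (the post does not say how it was derived, and the variable names are just for illustration):

```python
# Figures as quoted in the post above.
gpu_hours = 314_000        # H800 GPU hours
total_cost_usd = 628_000   # quoted cost in dollars

# Assumption: cost = hours * flat hourly rate, nothing else included.
implied_rate = total_cost_usd / gpu_hours
print(f"implied rate: ${implied_rate:.2f} per H800 GPU hour")
```

Under that assumption the quoted numbers work out to a flat two dollars per GPU hour.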
>>
File: er_sde_z_image__00007_.png (2.11 MB, 1024x1024)
>>
something extremely niggerlicious will happen and we'll all be disappointed, it's the only possible outcome experience has taught me
>>
>>107339907
>flux and linux shills
>>
File: ComfyUI_08878_.png (2.1 MB, 1152x1152)
>Two masterless samurai locked in an intense duel on a rain-soaked wooden bridge at dawn, one in dark weathered robes executing a downward strike while the other in lighter hakama counters with an upward slash, their blades meeting in a shower of sparks, muscles tensed and faces showing fierce determination, raindrops and mist swirling around them, fallen autumn leaves scattered on wet planks, in the mangaka style of Takehiko Inoue with photorealistic ink wash techniques, dynamic brush strokes capturing motion blur, intricate fabric textures, stunning anatomical accuracy in their stances, dramatic chiaroscuro lighting, philosophical intensity in their eyes, high resolution, as if from a pivotal historical martial arts manga battle scene.
>>
File: 1763461405819920.png (3.27 MB, 1920x1440)
>>
File: ComfyUI_08879_.png (1.62 MB, 1152x1152)
>A solitary monk in a pristine white woolen habit kneeling in prayer within a stark stone monastery cell, hands clasped reverently holding a wooden crucifix, a single shaft of divine light streaming through a high narrow window illuminating his contemplative face and the folds of his robe, a simple wooden table with a skull and open scripture in deep shadow beside him, in the Spanish Baroque style of Francisco de Zurbarán with dramatic tenebrism, sculptural rendering of fabric with crisp angular folds, muted earth tones of ochre and umber against deep blacks, spiritual solemnity and meditative stillness, meticulous realism in textures of wool and weathered wood, high resolution, oil on canvas with monastic austerity and mystical devotion.
>>
File: 1759263257868995.png (990 KB, 1024x1024)
Hatsune Miku in polygon form like a playstation 1 game

cute!
>>
>5th time crashing today after updating comfy
it's a fun model but I'm so done with this garbage ui
>>
File: ComfyUI_08880_.png (2.18 MB, 1152x1152)
>A man engulfed in eternal flames stumbling through a frozen post-apocalyptic wasteland, his charred regenerating flesh perpetually burning yet never consumed, carrying an unconscious companion through knee-deep snow past the skeletal remains of civilization, desperate survivors watching from makeshift shelters built from rusted vehicles and human bones, in the mangaka style of Tatsuki Fujimoto with relentless bleak nihilism, scratchy expressive linework depicting constant agony, extreme tonal whiplash between profound suffering and absurdist dark comedy, religious symbolism corrupted by cannibalism and survival horror, wide cinematic panels of desolate snowscapes contrasted with claustrophobic violence, existential questions about identity and revenge visualized through body horror, high resolution, black and white manga with arthouse experimental pacing, shocking brutality that asks if living itself is worth the pain, and deeply human moments of connection amid incomprehensible cruelty.
>>
so what resolution are we supposed to be using?
>>
>it can do underwear without having a fit
the thing i'm finding most mind blowing about all of this.. i'm rendering perfect text, on a 6b. I'm not filling my entire 16gb card just for ((safety alignment)), it's actually doing it. For every roll.
>>
File: 1762298527144178.png (353 KB, 860x872)
lora when
lora when
lora when
lora when
lora when
>>
File: 1741326395869657.png (1.7 MB, 1024x1024)
Donald Trump boxing Joe Biden while both are standing on top of a T-Rex dinosaur that is standing on a shark, in the jungle.
>>
>>107339990
when the base model comes out because training on distilled is fucking dumb
>>
File: ComfyUI_08881_.png (1.98 MB, 1152x1152)
>A weary surgeon with precisely detailed facial features and deep-set eyes sitting alone in a shadowy hospital office, single desk lamp illuminating case files, rain visible through accurately rendered window frames with German architecture beyond, in the mangaka style of Naoki Urasawa with photorealistic linework using minimal hatching, clean economical pen strokes, micro-expressions of moral exhaustion, cinematic low-angle composition, sparse strategic screen tones only for ambient shadow, no manga stylization, Western comic clarity, Hitchcockian visual tension through negative space, high resolution, black and white with film noir atmosphere.

Very nice default art styles, however they mostly don't resemble at all the style of the mangaka I'm asking for.
>>
File: ComfyUI_temp_ahcuj_00020_.png (2.75 MB, 1216x2112)
>>107339990
Hopefully diffusion-pipe dev is working atm, he's the fastest one of all
>>
gguf version when?
>>
File: 1760696449344136.jpg (453 KB, 2048x1902)
>>107340006
>>
>>107340006
if you need a gguf for zimage you need a better gpu
>>
>>107340014
so was this done in Z or are you epically trolling with some random greta tardburger image?
>>
File: ComfyUI_08882_.png (1.54 MB, 1152x1152)
>A graceful peasant woman in her prime with porcelain skin and delicate features cradling a lamb in her arms, standing barefoot in a sun-dappled pastoral meadow with wildflowers, her simple rustic dress rendered in exquisite fabric folds, gazing with innocent tenderness at the viewer, soft afternoon light illuminating her face with an ethereal glow, in the academic realism style of William-Adolphe Bouguereau with flawless technique and idealized beauty, impossibly smooth skin texture achieved through invisible brushstrokes, perfectly anatomical proportions with graceful classical poses, luminous flesh tones with subtle pink and cream gradations, meticulous attention to textile draping and natural elements, sentimental romanticism with technical virtuosity, soft atmospheric perspective in the background, high resolution, oil on canvas with porcelain-like finish and timeless neoclassical elegance.

Also it straight up doesn't listen to some prompts that are supposed to be stylized.
>>
>>107339980
2048x2048 is your absolute limit, stay below that and you're fine
>>
>>107339927
hot. it seems to need some finetuning for the whip tho.
>>
File: ComfyUI_08883_.png (1.78 MB, 1152x1152)
>>107340023
But other times it does a good job.

>A elegant society woman in a black silk evening gown with dramatic décolletage, standing in a opulent gilded interior with her body turned three-quarters while her face gazes confidently toward the viewer, one pale arm resting gracefully on a marble mantelpiece, candlelight and lamplight creating complex illumination, in the painterly realist style of John Singer Sargent with virtuosic loose brushwork, bold confident strokes that appear effortless yet capture exact form, mastery of fabric rendering with luminous black silk catching light, sophisticated limited palette with rich darks and luminous flesh tones, bravura technique with visible energetic brushstrokes, psychological depth and aristocratic presence, dramatic chiaroscuro with multiple light sources, high resolution, oil on canvas with Gilded Age glamour, spontaneous yet controlled execution, and capturing personality through gesture and gaze.
>>
File: 1758822772663608.png (1.22 MB, 1024x1024)
Donald Trump in a business suit on a movie poster. The film title at the top is "Minority Deport". Include several computers and computer screens in the image. Include the Dreamworks logo at the bottom, and film credits in the style of a movie poster.
>>
File: zimg_0054.png (1.24 MB, 832x1216)
>>
>>107340021
it is a reaction image (an image used in lieu of text, emoticons or emojis to express one's physical reaction or emotional state) responding to the guy asking for a gguf version of z-image, which is already extremely light weight.
>>
File: ComfyUI_07442_.png (2.87 MB, 2048x1280)
we are soooo back bros!
>>
>>107339991
got dam
>>
File: 1738502248283332.png (2.18 MB, 1280x1280)
>>
>2girl asian, yellow fever
>wi so bak buh wooos
>>
>>107340019
laptop 4070. It's not my fault if Jewsen refuses to add more vram. we have to deal with this Jewish market…
>>
File: 454512454545.jpg (354 KB, 2138x1514)
>>
File: 1735168200222114.png (1.08 MB, 1024x1024)
A 4 panel comic, in the first panel a chibi Hatsune Miku is saying "what's new today?" while beside a computer. In the second panel is the text "Flux 2". In the third panel a chibi Hatsune Miku is saying "isn't that a lot of VRAM?". In the fourth panel, Miku's computer explodes into fire and smoke.
>>
>>107340080
offload to the ram dude >>107338854
>>
File: z-turbo_00064_.png (1.74 MB, 1152x1536)
>>
>>107340006
my 3090 doesn't even need one
>>
>>107340109
brother don't you care about your social credit? This one's at least a docking of six gorillion credits. You're gonna get EXECUTED!
>>
File: 78845587454.jpg (412 KB, 2138x1504)
>>
File: 1762185461426738.png (1.16 MB, 1024x1024)
An indian man typing at a computer in an indian call center. He is saying "i'll tell them i'm a white girl from America! Yes saar!"

kek
>>
File: ZImg_00153_.png (2.38 MB, 1440x1152)
not too safe
>>
File: z-i-t.jpg (138 KB, 1024x1024)
>>
File: 1734896776373144.jpg (839 KB, 1079x772)
what
>>
>>107340103
wow. thanks bro. we are so back
>>
>>107340149
>he didn't pull
>>
>>107340149
no worries bro. comfyui is perfect
>>
File: ComfyUI_19183_.png (1.27 MB, 1024x1024)
>>107340149
Update Comfy
>>
>>107340149
update_stable your comfyui
>>
File: ComfyUI_07447_.png (2.79 MB, 1280x2048)
Z-Image Turbo is the best shit ever. the Flux crew must be on suicide watch right now lmao
>>
>said he updated last thread
>tell him to update
lol
>>
File: 544545554564.jpg (768 KB, 2138x1514)
>>
File: 1742191209190249.png (2.36 MB, 1280x1280)
kek
>>
File: 222351515121.jpg (756 KB, 2138x1516)
>>
File: Z-Image-Turbo_00003_.png (2.14 MB, 1920x1088)
>>
Some of you guys are getting a weird noise in some of your images. What could be causing that?
>>
File: ZImg_00004_.png (2.38 MB, 1440x1152)
>>
>>107340205
woah homeslice why do they both have trump's skin texture from this image? >>107340176
>>
>>107340214
s o v l
>>
Why did you admit to being a pedophile in the last thread?
>>
>haven't updated my comfy in a year since the last model I used was illustrious
Hmm, how painful will the pulling be...
>>
>>107340064
this but unironically
>>
File: 1747229440409607.png (2.81 MB, 1152x1536)
>>107340169
>>
File: zimg_0058.png (1.36 MB, 832x1216)
>>107340214
possibly trying to use a CFG at all.
>>
File: 1732800079528291.png (1.14 MB, 1024x1024)
>>
>>107340232
It will be extremely painful (For you)

>>107340241
coool, try this with 1536x1536, his face looks a bit mexican at that low res.
>>
>>107340214
>Some of you guys are getting a weird noise in some of your images. What could be causing that?
I think it's because of a too high resolution, it looks fine at 1 megapixel
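For reference, the pixel counts behind the resolutions mentioned in this thread can be checked with a few lines of Python. This is only arithmetic; the helper name is made up for illustration, and where exactly the noise starts to appear is not something the thread pins down.

```python
def megapixels(width, height):
    """Return the pixel count of an image in millions of pixels."""
    return width * height / 1_000_000

# Resolutions taken from posts in this thread.
for w, h in [(1024, 1024), (1152, 1152), (1536, 1536), (2048, 2048)]:
    print(f"{w}x{h}: {megapixels(w, h):.2f} MP")
```

1024x1024 is the "1 megapixel" case; 2048x2048 is four times that many pixels.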
>>
File: 4545454541211.jpg (399 KB, 2138x1515)
>>
>>107340214
BFLniggers saw they are being executed in real time so they are trying to FUD unsuccessfully
>>
File: 1748161384559656.jpg (2.66 MB, 2048x2048)
>>
>>107340258
Impressive, very nice, now let's see an asian 1girl in that image in the style of dali for a real benchmark.
>>
File: selfie.png (997 KB, 1024x1024)
literally me rn
>>
>>107340279
>laptop
>ittodler
grim
>>
File: 455464521451.jpg (323 KB, 2138x1516)
>>
why does the comfy example have a negative prompt if the cfg is 1?
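On the cfg question: in the usual classifier-free guidance formula the negative (unconditional) prediction cancels out when the scale is exactly 1. A minimal sketch with plain lists standing in for model outputs; `cfg_combine` and the numbers are illustrative, not code from ComfyUI, and how a given UI handles the negative prompt internally at cfg 1 is not stated in this thread.

```python
def cfg_combine(cond, uncond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

cond = [0.5, -1.0, 2.0]          # prediction for the positive prompt
negative_a = [0.0, 0.0, 0.0]     # one possible negative-prompt prediction
negative_b = [9.0, -3.0, 4.0]    # a very different one

# At scale 1 both negatives give back the positive prediction unchanged.
print(cfg_combine(cond, negative_a, 1.0))
print(cfg_combine(cond, negative_b, 1.0))

# At scale > 1 the negative prompt does change the result.
print(cfg_combine(cond, negative_b, 3.0))
```

So with this formula, a negative prompt wired in at cfg 1 has no effect on the output.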
>>
File: ComfyUI_128223_.png (1021 KB, 1024x1024)
>>107340238
>>
anyone on rocm tested z-image?
>>
>>107340309
because he's retarded :(
>>
File: 1750277272969689.png (2.81 MB, 1536x1536)
>>107340248
tried 2048, file too big
>>
Jesus fuck imagine booru finetunes of this model
>>
File: 45454545124.jpg (463 KB, 2138x1518)
>>
File: 1750683494683505.png (531 KB, 640x640)
https://www.reddit.com/r/StableDiffusion/comments/1p7nxrw/basically_uncesored_z_turbo/
It knows a lot of celebrities damn
>>
>>107340338
Z is way closer to actual ghibli aesthetics.
>>
>>107340334
welp at least it gets close!

>>107340336
what i'm really imagining is the nirvana i'll feel nuking my 200gb worth of loras across several models in favor of a fraction of that for z-image.
>>
>>107340342
no porn tho
>>
>>107340334
>file too big
the absolute state of /g/
>>
File: 595448848454.jpg (829 KB, 2138x1741)
>>
>>107340023
>Also it straight up doesn't listen to some prompts that are supposed to be stylized
put the style in the beginning of the prompt
>>
File: 1652362253530.png (750 KB, 900x675)
poor flux 2. raped in a single day after years of work
>>
>>107340342
Great, now they'll try and ban this model
>>
File: Flux2_00030_.png (1.95 MB, 1536x864)
>>
File: ZImg_00012_.png (1.81 MB, 1152x1440)
>>107340367
oh no
anyway
>>
why is ModelSamplingAuraFlow bypassed. is it needed? since this is a distill model, should cfg always remain 1?
>>
File: ComfyUI_07464_.png (3.6 MB, 2048x1280)
>>107340367
insane timing for the Z Turbo China Chads.
>>
File: file.png (1.45 MB, 1920x1080)
it's a good model saar
>>
File: 4545451245452.jpg (671 KB, 2138x1590)
>>107340346
Hmm, I guess I'll test other anime and see if it knows more about it in general.
>>
File: 1762251321960744.png (1.09 MB, 1024x1024)
A box of assorted donuts in a donut shop, on a table.
>>
i cant wait for all the reddit posts that used knowledge from here to be reposted back here in a week with a dumbass comment like "4chan is dead reddit is where i get all my ai news from"
>>
ohh YES once zimage gets a NSFW finetune we will be EATING
>>
>>107340410
well wait no longer >>107340395
>>
File: file.png (1.23 MB, 1482x732)
The seed variance is really bad.
There's only like two and a half different images here.
Notably it will never generate a girl looking to the right with the prompt "girl".
>>
File: 1735494696452438.png (1.25 MB, 1280x720)
it doesn't know Teto I'm sad :(
>>
>>107340240
what's the name of the one on the left?
>>
>>107340309
>>107340318
Once again, why would anyone ever trust Comfy and his ilk to produce good workflows they're all terrible.

That one was probably authored by the ComfyUI Wiki guy who knows nothing about the program he documents.
>>
only gay and retarded things are allowed to happen in this world, there's a catch with Z-Image and it's gonna be huge
>>
>>107340395
>1920x1080
>>
File: iz.png (1.39 MB, 1024x1024)
>>107340391
>>
File: 1744821328431946.jpg (2.7 MB, 2048x2048)
>>
File: 545457486454.jpg (888 KB, 2138x1578)
>>
File: judgemental slut.jpg (11 KB, 131x281)
>>107340418
i feel like a retard, how do i prompt a fucking side view?

>>107340431
kek i love this image so much let me join in

>>107340434
GOD he is literally ME
>>
File: ComfyUI_23483_.png (3.98 MB, 1280x2048)
>>
File: 1738376669542143.jpg (2.45 MB, 2048x2048)
>>
>>107340430
>there's a catch with Z-Image
the base model will be complete ass. their reinforcement learning is doing all of the heavy lifting. you won't be able to just finetune the base and get nearly as good results
>>
>>107340445
flux2s is better on this one unfortunately
i appreciate these comparisons anon keep posting
>>
>>107340418
My only gripe with z is that it suffers from same face unless you properly specify.
>>
File: 1744703573238208.png (2.85 MB, 1152x1536)
>>
File: 454545124545.jpg (565 KB, 2138x1515)
Z really seemed to struggle with this one despite different seeds.
>>
File: ComfyUI_07473_.png (3.16 MB, 2048x1280)
>>
File: 1746185744938384.jpg (2.29 MB, 2048x2048)
>>
>>107340485
OY VEY THAT'S NOT VERY KOSHER
>>
File: ZImg_00019_.png (1.57 MB, 1152x1440)
so that's where 1girls come from
>>
File: ComfyUI_23484_.png (3.59 MB, 2048x1280)
>>107340431
>>
How much longer before we get an image model like Sora, where we can submit multiple reference images and the model can combine them all into a single image? Like multiple character references and a background reference.
>>
File: deDL_zi_00035_.png (2.62 MB, 2048x1216)
>dj2xk
it knows

>>107340279
dont put the pizza box on your bed. thats gross
but also, can I have a slice
>>
>>107340497
3 months ago
>>
>>107340495
Tag yourselves, i'm the chad in the back with the sunglasses (I have autism which makes eye contact difficult so i wear sunglasses indoors).
>>
Wait the z image full non distilled model is also 6B? Isn't that a bit low?
>>
>>107340497
Qwen does that retard
>>
File: 1738983738092716.png (1.42 MB, 1024x1024)
>>
>>107340430
The catch is that someone will realize it's "unsafe" and will try to take it down.
>>
>>107340514
not at all, the distilled model is already this good at 6b
>>
File: z-turbo_00099_.png (2.17 MB, 1024x1536)
>>107340523
>>
File: space_saving_stack.jpg (101 KB, 1024x1024)
this almost worked too
>>
>>107340168
fucking kek
>>
>>107340523
worse, comfy comes censored for local
>>
what resolutions are you fellas using for 16:9 portrait and landscape?
>>
>>107340485
OOOOOOH MY FAUCI IS THAT EDGINESS ON MY HUGBOARD?????
>>
File: 544545848744.jpg (676 KB, 2138x1951)
>>107340484
This one it did much better.
>>
>>107340540
1920x1088
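The 1088 instead of 1080 is presumably there because 1088 is divisible by 16 and 1080 is not. A small sketch of that rounding, assuming the model wants both sides to be multiples of 16; that requirement is an assumption here, not something confirmed in the thread, and `snap_resolution` is a hypothetical helper.

```python
def snap_resolution(width, height, multiple=16):
    """Round each side up to the nearest multiple (assumed to be 16)."""
    def round_up(x):
        # Ceiling division, then scale back to pixels.
        return -(-x // multiple) * multiple
    return round_up(width), round_up(height)

print(snap_resolution(1920, 1080))  # landscape 16:9
print(snap_resolution(1080, 1920))  # portrait 9:16
```

Starting from plain 1920x1080, this gives the 1920x1088 quoted above, and 1088x1920 for portrait.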
>>
Can you give images to z image as examples LLM style like in nano banana pro?
Not edit them, but use them in a conversation?
>>
Z-Image Turbo is CUNNY approved~
>>
Just woke up. Did something good finally happen for local?
>>
>>107340520
No it doesn't, moron.
>>
File: 1738450420596185.png (1.17 MB, 1024x1024)
its a pretty view...

a large and very high bridge into the water at night, with a sign at the top of it saying "troon ender" in bright neon text.
>>
File: ComfyUI_07443_.png (2.39 MB, 1280x2048)
>>107340569
>Did something good finally happen for local?
yes, local is saved!
>>
>>107340526
I feel like it's low compared to the competition, but then again if it's good enough it's good enough.
>>
>>107340569
>finally
it happens every 12 or so months desu
>>
File: Zimage.png (554 KB, 381x764)
I wonder what causes this slightly orangy patchy weirdness of realistic style that from what I remember began all the way back with realism tunes of Juggernaut XL
>>
>>107340569
yeah, Z-Image BTFO all current image models by a landslide, and it's fucking fast too. This is HUGE.
>>
File: 1741363494217623.png (1.03 MB, 1024x1024)
>>107340572
oops, forgot to edit my previous prompt. still worked though.
>>
File: 454545456456.jpg (472 KB, 2138x1979)
>>
>>107340514
>Isn't that a bit low?
No it isn't, it fits comfortably on a 16GB GPU and is exactly right. Bloatmodels are the cancer killing local
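A back-of-the-envelope check of the "fits on a 16GB GPU" claim, counting weights only. This sketch assumes 6 billion parameters stored at a fixed number of bytes each and ignores the text encoder, VAE, and activations, so real usage will be higher; the function name is made up for illustration.

```python
def weight_footprint_gib(num_params, bytes_per_param):
    """Size of the raw weights in GiB for a given storage precision."""
    return num_params * bytes_per_param / 1024**3

params = 6e9  # parameter count as stated in the thread
for name, nbytes in [("fp32", 4), ("bf16/fp16", 2), ("8-bit", 1)]:
    print(f"{name}: {weight_footprint_gib(params, nbytes):.2f} GiB")
```

At 2 bytes per parameter the weights alone come to a bit over 11 GiB, which leaves some headroom on a 16 GB card under these assumptions.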
>>
>>107340514
z image was made with the express purpose of proving you can create SOTA level models without doing some 80B MoE slop
>>
>>107340600
>>107340588
>>107340579
Okay. I'm downloading it. It better rock my world or I'll sue you.
>>
>>107340456
hnnngg
>>
>>107340588
There's been nothing great since XL dropped 3 years ago, at least for image gen
>>
>>107340621
*smashes your cock with a rock*
>>
File: ComfyUI_temp_trrch_00001_.png (2.67 MB, 1088x1600)
Tried a booru tag prompt from civit and the weight bled into the pic lol
>>
File: 5544545643.jpg (943 KB, 2138x2260)
>>
>>107340621
https://huggingface.co/Comfy-Org/z_image_turbo/tree/main
make sure to run update_comfyui_stable.bat. if you were on nightly like me, you need to replace 'stable' with 'nightly' in the bat

enjoy
>>
>>107340633
that's z image? I thought it didn't know anime IPs
>>
>>107340593
>tunes of
*tunes like
>>
lodestones NEEDS to make a finetune of this
>>
File: er_sde_z_image__00017_.png (1.18 MB, 1024x1024)
>>
>>107340646
Sorry he blew all the money on trying to fix flux
>>
alright people time for a reality check
z-image blows.
it's basically SDXL2, it's retarded, has prompt bleed, and cannot blend characters and styles.
try naruto/bleach/ghibli/dragonball style + hatsune miku, then remove the reference to miku.
given that it's basically Lumina + Distillation (Lightxv or equivalent), it's no wonder. this is a cope model that does nothing new.
If, somehow, updates smooth out the wrinkles then great. as it stands, it's technologically already irrelevant and is being pushed purely by copers with shit tier hardware.
"oh look how good it looks for 1girl" yeah man that's what it was made for, it can't do anything else.
the edit model should marginally improve things but this flood of posts is seemingly from one dude from how much repetition there is (or is that just the distillation at play).
Flux2 sucks too, but it's a technological leap (10 references, already has really strong edit, JSON prompts).
Gimme a flux2 acceleration over this dogshit model thanks.
>>
File: zimg_0066.png (988 KB, 832x1216)
china truly does not give a damn
>>
been out a day and this happens, so zimage is basically flux 2 but smaller and without the embedded cult of safetyism?
>>
File: Z-Image-Turbo_00009_.png (1.9 MB, 1280x1280)
>>
>>107340625
imagine if that were true damn id kill myself
>>
File: 45544512544.jpg (611 KB, 2138x1888)
>>
>>107340663
it's flux 1 but less slop and basically uncensored
>>
File: Flux2 v Zimage 1.jpg (692 KB, 2200x1600)
Tired from wage slaving, wakes up and find that /ldg/ exploded and Flux2 mogged already. What the hell.
>>
>>107340655
>try naruto/bleach/ghibli/dragonball style + hatsune miku, then remove the reference to miku.
Why didn't you attach your example? Odd.
>>
>>107340663
it's flux dev but with an apache 2.0 licence, 2x smaller, with a better skin texture, knows more characters/celebrities IP and has better understanding of nudity, basically we won lol
>>
>>107340664
IIIIIIIIIIIIIIIIIIIII NEEEED TO KNOOOOOOOOW NOW KNOOW NOW
>>
>>107340661
And that is why China will win the AI race. Meanwhile Black Forest Labs spent half their resources locking down their model to be "safe" and corpo approved for normies.
>>
>>107340655
nogen mald
>>
>>107340655
nobody gives a shit about flux 2. it's a trap-laden spitebake just like flux 1, a complete money-trap for copetuners to dump thousands of dollars into only to run into collapsing anatomy every 5 epochs because it's fundamentally anti-local. even qwen image proved that a single blowjob lora outperformed all $200k worth of chroma. flux is a dead end
>>
>>107340676
The flux is better tho?
>>
>>107340673
It's not uncensored. It can't do genitalia or NSFW content. A finetune can easily fix that though.
>>
Will this be what brings hardware prices down? AI models becoming better while also being more efficient? What is the point of building a super computer in your home if you can gen stuff this good for less.

Also, why don’t we have chink ai text generators like this, too. Something that rivals Claude/open ai but more efficient and completely unfiltered.
>>
>>107340667
Yeah there's been a couple DOA bloatmax models that people dropped as soon as the novelty wore off, if this doesn't take off it's XL until the heat death of the universe
>>
>>107340655
flux 2 stance around anything nsfw makes it irrelevant for me
>>
>>107340700
um.. wrong
>>
>>107340700
I said basically uncensored, it's pretty evident there's no porn in the dataset, but there's certainly some nudes
>>
>>107340655
here i just shat this one out for you pal. it's yours. take it.
>>
>>107340701
>Will this be what brings hardware prices down?
no. people still pay for api nodes
>>
>>107340709
post example of a man fucking a woman.
>>
File: ComfyUI_temp_trrch_00005_.png (1.57 MB, 1088x1600)
>>107340641
IPs yes, but not artists.
>>
>>107340719
blue board, you said it can't do nsfw or genitals, but it can
>>
File: ComfyUI_temp_trrch_00006_.png (2.84 MB, 1088x1600)
>it skews towards photorealism so much that prompting anime characters without specifying artstyle gives me cosplay chicks
neat
>>
>>107340731
Catbox it
>>
is there a surefire way to get rid of the grain when it gets really bad you know what i mean
>>
File: 56565234154.jpg (635 KB, 2138x2114)
>>107340663
Flux.2 has slightly better knowledge of artists and mangaka, but this is a distilled model and they're still cooking, so we'll see when we get base. When the edit model drops it should not be much of an issue though. Z-Image is better out of the box at realism though and probably knows more characters and celebs due to Flux.2 censorship. It also knows what a vagina and nipples look like as opposed to Flux.2. This is way better than Flux.2 overall.
>>
>>107340749
do it yourself nogen faggit
>>
>>107340700
nsfw content works, but the genitals are just weird. not bad for an alibaba model
>>
Z-Image won't replace Illustrious unless it gets a danbooru finetune.
>>
believe it or don't, try it or don't
it's chinese astroturfing in full effect
on the positive, at least we get more free models.

yes flux2 sucks too, but it has technology that we don't see in SDXL2. it actually does new stuff.
what the fuck is z-image even bringing to the table it's awful. this is basically a photographic realism tune of SDXL.
what is this insane cope that it's good
chinese astroturfing or poors celebrating a model they can actually run.
I like tons of Chinese models, Qwen isn't one of them, Wan is. Their LLMs are decent.
again, test for yourself:
Hatsune Miku + basically any anime style
curiously this same problem existed in SD3, Qwen, Flux1 and actually, it's been a problem in flow matched stuff for ages. It's just ultra pronounced in china models because they distill down western stuff.
>>
>>107340762
you format your posts like a faggot.
>>
>>107340759
But we need something new on the realism front, XL sucks
>>
>>107340689
No one even asked the team to be so safety obsessed, it's BFL doing that to themselves.
Even nano banana pro has a less censored dataset (because they understand that anatomy comes from nsfw/nudity), and only filter on the output.
If BFL did that, only filtering post hoc through their official API, no one would have cared if they released like Chinese models.
Every time I read a gigantic wall of text about man days wasted on safety to release the model weights (instead of relying on filtering after), it makes me die inside a little.
>>
>>107340768
realism is gay
>>
>>107340767
45s
>>
>>107340762
gay
>>
File: heun_z_image__00004_.png (1.32 MB, 1024x1024)
>>107340762
coping so hard
>>
your realistic goth girl, sir.
>>
Can someone explain to me the difference between WanAnimate and VACE?
>>
>>107340683
>>107340673
>>107340753
looks like I know what to download this weekend, thanks anons
>>
>>107339954
St. Bruno, pray for us.
>>
>>107340762
id rather you start schizoposting about the government or some shit again instead of this
>>
>>107340775
You're mom gay
>>
>>107340762
>flux2
>it actually does new stuff.
correct, managing to bloatmaxx the model to the size where its no longer even possible to make a lora on any local hardware for an incremental improvement over qwen image while looking worse than a 6b model out of the box and taking multiple minutes to gen a single image no matter the gpu or settings
>>
>>107340759
Yes but it being small, there is now actually a chance we can finetune it unlike the other bloated models like flux and qwen (I say even though I really like qwen)
>>
File: 2125520135151.jpg (638 KB, 2138x2174)
Alright, last comparison for now.
>>
>>107340771
>BFL doing that to themselves
it's just the way researchers view this stuff, it's truly cultural
western ones are a whole lot more prude than eastern ones, that's pretty much it
>>
File: 1753726541271394.png (266 KB, 627x729)
>>107340655
>>107340762
>>
File: heun_z_image__00005_.png (1.08 MB, 1024x1024)
>>107340762
>>
>>107340768
On civitai there's a ton of Illustrious checkpoints, and the way it plays out is each checkpoint basically has its own style. There's realism style checkpoints too like cyberillustrious_v40, and so they're basically blending realism with tag data from danbooru.
>>
So did alibaba's z-image just BTFO alibaba's own qwen? the next qwen edit needs to be a massive leap for it to be worth it
>>
>>107340815
lol'd
>>
File: 1757102233103064.png (64 KB, 1594x574)
was confused so asked perplexity
>>
hello, ben from comfyorg here

please stop making fun of our important partner, black forest labs. our investors need a good response so we can keep providing the finest API nodes for you. zimage may be impressive but we need to give bfl the flowers they deserve. sign up for a comfy cloud account and try flux2 pro!

thanks for your continued support for comfy cloud, the best in local api
>>
>>107340794
KEK
>>
>>107340759
If it's actually popular it will get a booru finetune, because the base model is way better at understanding anatomy.
And it's not so huge no one can afford one.
>>
>>107340817
cyberillustrious has horrendous prompt adherence
>>
shameless bot behaviour
this site fucking sucks now
>>
>>107340823
>vram optimized 16gb
fake news, uses about as much as SDXL but at a significantly higher resolution. that turbo is efficient as fuck.
>>
File: Untitled.png (104 KB, 1660x1046)
>>107340841
it sounds like a tough balancing act
>>
>>107340817
I've yet to see a realism Illustrious model that doesn't suck, they all have obvious issues much like pony realism attempts
>>
I don't feel as bad for Yume as I did Pixart. Maybe it's because I've grown, maybe it's because circumstances are different. Maybe I'm a huge faggot. I don't know.
>>
>>107340862
what the fuck does "feel like illustrious" mean?
>>
>>107340430
This is the catch: >>107340418
Gens barely change from one seed to the other.
>>
>>107340867
>Maybe I'm a huge faggot
I can confirm this
>>
File: Flux2 v Zimage 2.jpg (672 KB, 2200x1600)
Hmm... Flux2 can handle very fine details like fairy dust. It's a good thing that local chads can use both.
>>
>>107340418
>>107340872
This only applies to faces. Nothing else.
>>
>>107340870
"feel like danbooru", most likely. either your characters are obeying your booru tags or they're not obeying your booru tags.
>>
>>107340877
>fp8-AAAAAAAACK
>>
>>107340843
Nigger
>>
>>107340418
>The seed variance is really bad.
do you remember how much worse qwen was? LOL
>>
>>107340872
no dark images
no style mixing
has regular tag bleed
seems to ignore half the prompt between rolls (tokenizer issue?)
no lens effects
text is crap and misspells words
no reasoning (implementation missing, false advertising?)
introduces visual noise/particles intentionally as part of the aesthetic tune
no variance
one tag can completely overwhelm another
repeating an instruction bleeds really bad (tattoo on arm, tattoo on thigh)

all that being said hope it works out, not good on release.
>>
File: zimg_0087.png (982 KB, 832x1216)
alt samplers: https://files.catbox.moe/qq9rym.png
>>
>>107340862
>it sounds like a tough balancing act
i know but im still really annoyed that you cant have realism and booru tagging coexist. chroma is nice but its fucking slow as shit even on a 5090
>>
>>107340898
It just needs to be a significant step up from XL, and easy to train.
>>
File: ComfyUI_07530_.png (1.44 MB, 944x1280)
>>
>>107340902
Who is the dude
>>
File: 1751972887506650.png (178 KB, 1106x1495)
>>107340850
yeah it's not perfect, I asked a second time while adding qwen
>>
>>107340418
>>107340872
Although the seed variance is not good, it's not as bad as it was for qwen, which had 0 variance unless you used the qwen image edit lightning lora on qwen image to fix it. It also won't be a problem for any lora that can easily break free, but i do wish you could gacha it with basic tag prompts like you can with chroma
>>
>>107340928
Damian Lewis
>>
It's time. The new era of MEMES has begun.
VON BRAUN premiering TONIGHT exclusively on /ldg/.
>>
>>107340937
lmao
>>
File: zimg_0084.png (959 KB, 832x1216)
>>107340928
>angelina jolie and rihanna laying in bed together, bare shoulders, black silk sheets, steve buscemi on a chair in the background

i didn't want to give me ol' bug eyes
>>
>>107339853
>thinly disguised indian hate thread

k
>>
File: 1752408625597151.png (2.22 MB, 1880x1248)
faux compression artifact hell
>>
please care about flux 2
>>
>>107340951
pretty good
>>
>>107340902
prompt?
>>
I don't remember people shitposting and genning so much back when flux dropped, this feels huge
>>
>>107340958
What's with the obnoxious cloud noise pattern/artifacts on every z-image gen? seedream had that problem too.
>>
File: Z-Image-Turbo_00015_.png (2.02 MB, 1280x1280)
>>
>>107340951
thx!
>>
Zigger image ggufs when?
>>
>>107340979
bro using a 1050 ti
>>
..And after VON BRAUN catch a new SHORT by DISNEY: WE WUZ

(and yes the image seed variations issue is real. this image is almost identical across several other gen attempts.)
>>
>>107340979
It's 12GB you don't need gguf
>>
maybe someone can test a solid color or something like that zimage gen with different settings and resolutions at bf16 to see if things change, maybe try to get fp32 files and enable fp32 with the special comfy flag too
but personally i dont really see an issue
>>
File: z-i-t.jpg (108 KB, 1024x1024)
>>107340450
>GOD he is literally ME
i thought so. we're all geniuses or something. until ai takes over.
>>
>>107340972
i have yet to figure out specifically what triggers it during inference but maybe some other anon knows
>>
>>107340991
>>107340993
almost, 2070 super 8gb
i'd much rather use q8 than shitty fp8 or whatever the fuck
>>
>>107340655
>>107340762
They hated him because he spoke the truth.
>>
>>107340979
there's quants but they aren't gguf
>>
>>107341005
im on a gpu with 8gb vram and am using bf16
>>
File: VHcdiTNOQDBG.jpg (388 KB, 1536x1024)
>>107340972
comfy implementation is broken and not denoising completely
>>
File: GPT Image 2.png (60 KB, 875x454)
UH OH LOCALKEKKIES, LOOKS LIKE WE SPOKE TOO SOON
>>
>>107341003
it gets stronger with higher resolutions
>>
>>107340951
that doesn't look very safe, try that in flux 2 for a safer output
>>
>>107341018
>comfy implementation is broken and not denoising completely

fffffffffuck
>>
>>107341005
Should run decently on 8GB unless you have 4GB RAM
>>
>>107341018
>comfy implementation is broken and not denoising completely
HAHAHAHAHAHAHAAAA
>>
File: ComfyUI_01582_.png (1.32 MB, 1248x1024)
Okay yeah I can get why you like this model.
>>
Z-Image is neat, but that's about all it is. Nothing exciting.
>>
>>107341018
>comfy implementation is broken
what else is new?
>>
File: z-i-t.jpg (120 KB, 1024x1024)
>>
>>107340979
just use torch2multigpu node. works. already 50 gens
>>
File: fetchimage.jpg (150 KB, 600x582)
>>107341030
>>
anyone have recommended prompts to have gemini spruce up prompts? this model clearly works best with word salad
>>
File: wan22__00001.mp4 (959 KB, 480x480)
>>
isn't this just proof that large models are just fucking bloat?
>t. didn't even do a single gen on any new models, just going off vibes
>>
>>107340931
How the hell can zimage look so good while being 6B?
All other models are gigantic in comparison.
>>
File: COMFY SHILL.png (1.13 MB, 1503x1664)
[BREAKING NEWS]
ComfyOrg's blog was just updated
...to feature more paid API node partnerships! Still no mention of z-image release!
>>107341018
Comfy's implementation is intentionally sabotaged to promote API models. As has been the case with every Comfy local implementation.
>>
>>107341048
me off camera on the right
>>
File: zimg_0088.png (1.35 MB, 832x1216)
anyone wondering what aura flow does:
more grids: https://files.catbox.moe/p6k5hl.png
>>
>>107341067
>>107341067
>>107341067
>>107341067
>>
>>107341064
meds2
>>
File: ComfyUI_01608_.jpg (826 KB, 2048x2048)
>>
>>107341058
it's mostly proof that slop styles are fine as long as they're the flavor of slop they were looking for
>>
File: ComfyUI_08888_.png (1.31 MB, 1152x1152)
Prompting Z-Image at 2k, really not bad for these speeds.
>>
>>107341070
Nothing?
>>
>>107341064
ComfyUI needs to be removed from the OP. The continued push for API nodes while remaining completely silent about local releases only encourages companies to shift towards API, like we saw with WAN. Comfyorg is probably in talks with Alibaba right now to develop API-only versions of Qwen and Z-image to sell more ComfyCloud subscriptions.
>>
>>107341074
OK that's an amazing Asuka.
>>
>>107341077
>1152^2
>2k
???
>>
File: wan22__00002.mp4 (870 KB, 480x480)
>>107341068
git it
>>
>>107341023
if it does, it's only one part of it. other things contribute too
>>
>>107341074
Nice Asuka now make her pregnant
>>
>>107341079
not a fuggin thing
>>
>>107341074
i bet she tastes like burnt toast
>>
File: Z-Image-Turbo_00020_.png (2.91 MB, 1280x2048)
>>
>>107341074
I think you messed up your parameters somehow
>>
File: Flux2_00126_.png (1.61 MB, 1024x1024)
>>107341098
>>
>>107340976
Those are some good colours
>>
>>107341224
(°□°)︵ ━
>>
>>107341224
now prompt her getting impaled by the spear of longinus
>>
>>107339853
Just found out about z-image

https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

What's the /ldg/ consensus on it? How does it compare to SDXL and Flux?
>>
>>107342647
>>107341067
>>
>>107342647
sdxl v 2.0
very fast
good quality
better than flux

can do text easy
can do western comics from 1960 and onward
does realism (hands too)
prompt adherence is not 100% as some claim/will claim


