/g/ - Technology




File: ldg bake 25 11 28.jpg (3.23 MB, 4300x4836)
Discussion of Free and Open Source Text-to-Image/Video Models

Prev:>>107356595

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>Z
https://comfyanonymous.github.io/ComfyUI_examples/z_image/
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
File: ComfyUI_00376_.jpg (1.01 MB, 2048x2048)
Surely zimage is running a form of DyPe natively, right?
If not, the DyPe devs can surely make it work, since zimage and flux seem to work similarly.
>>
1girl, naked
>>
How did the chinks make a distilled/turbo model before making base model? Makes no sense
>>
>>107358401
make != release
Dumbass
>>
DyKe
>>
>>107358401
casual racism outside of /b/ is too edgy4me
>>
>>107358368
No matter how much you spam this board, no matter how many models you shitters get, it will never get as good as real art and you will never fill the void with slopping computer-hallucinated garbage. Soon the novelty won't be enough and you're going to kill yourselves.
>>
>>107358401
common ownership of the means of production
>>
>>107358425
We heard you the first time Sam
>>
>>107358425
An hero soon, artcuck
>>
>>107358425
>Soon the novelty won't...
4 more years lil bro
>>
Friendly reminder to support based and uncensored Chinese local models against censored API menace and "safe" western slop.
>>107358401
Because they made the base first and then distilled it? They just released the base first.
If they are still training, I assume they would do some finetuning or reinforcement learning, not training from scratch,
But more likely they are just running some tests or whatever.
>>107358425
Literally no one here gives a shit about it being real art seething retard.
We just like making shitposts and cooming.
>>
>>107358425
rent free lmao >>107358295
>>
>>107358425
You sound really upset, have you tried learning about new things instead of getting angry at them?
>>
>>107358445
*They just released the distill first.
>>
>>107358445
>If they are still training, I assume they would do some finetuning or reinforcement learning
then what they will release can't be called "base" anymore?
>>
File: ComfyUI_00378_.jpg (814 KB, 2048x2048)
>>107358388
>taking panels through two llms and zimage
>>
>>107358425
Hey buddy I think you got the wrong door, xitter is two blocks down.
>>
File: ComfyUI_00057_.png (2.51 MB, 1280x1280)
>>
>>107358445
>no one here gives a shit about it being real art
/ldg is the vanguard of new art. society will think of it as "early classic" Trump era renaissance art.
>>
>>107358472
>Calling all space niggas
>>
>>107358463
Maybe the main model was lacking in artistic stuff, and they wanted the distill to be realism oriented, so they decided to distill and release the model first before further finetuning the model on artistic stuff? Or some shit like that.
I dunno could be a lot of reasons, I don't see the conspiracy.
>>
File: ComfyUI_00110_.png (1.34 MB, 1200x800)
>>
>>107358425
kek sloppers really get mad about the truth
>>
>>107358445
I will support based uncensored Chinese API models
>>
>>107358472
this model has a killer combo, realism + details, impressive for a 6b model
>>
File: 1761623925764721.png (3 KB, 147x72)
>>107358425
>the novelty won't be enough
200k+ generated and 10k saved, the more time goes on the more ideas i have and the more ideas i get the more models i have to test those ideas on.

If I don't feel like spending much time on finding a good balance of settings on a new idea I load up some of my older favourite images and gen a few hundred similar images with a new twist or model.
Or taking one of the thousand saved images online, push them through a VLLM to get a few different descriptions of it, and then feed them into multiple different models to see what they come up with.

Any image or video model or lora that comes out for it means I regen all my characters and past prompts with those which takes days.

And this is all before we even have a proper text2edit model without a VAE that is gonna destroy the barrier of entry for getting the exact image you want without worrying too much about even prompting anymore.

Unless you are a brainlet with no imagination having access to create anything can't ever become boring.
>>
File: ComfyUI_00078_.png (2.25 MB, 1440x1440)
>>107358315
Nice
>>
File: AnimateDiff_00001.mp4 (3.7 MB, 1072x1072)
>>107358388
I need to work on my light lora combos..
>>
>>107358566
man he is literally me
>>
File: ComfyUI_04981_.png (1.44 MB, 1024x1024)
>>107358425
lolol
>>
>>107358425
This isnt the correct take. The correct take is that art itself peaked with ILLUSTRIOUS and NOOBAI.

Remain underfed so called "artist".
>>
>>107358580
>1072x1072
wtf, what gpu are you rocking anon? how is it so expressive? new wan movement lora?
>>
>>107358598
all illu gens have the same shit shading
>>
>>107358605
shitmix issue
>>
>>107358566
desu the effect is not really strong, but once again I shouldn't expect this kind of shit on the non base model
>>
File: AnimateDiff_00001.mp4 (3.9 MB, 1072x1072)
>>107358580
I guess gradient backgrounds just enhance the color/brightness shift no matter what.

>>107358603
Painteri2v, fill out the prompt.

"the woman looking at the viewer and smiles while closing her eyes tilting her head and then opens her eyes and looks at the hamburger that she is holding with both of her hands and fingers and then opens her mouth wide and begins to eat the hamburger as she puts it towards her mouth and with each of the several bites she takes of the burger pieces disappear from the burger with bite marks as crumbles of bread fall from the burger and her mouth and a ketchup mark is left staining her cheek and she continues to eat the burger violently as she shakes her head and eventually the entire burger has been devoured by the toman and she leans back and lets out a loud burp as her lips shake from the force of the burp."

Unending sentence, no commas, no periods.
>>
vramletbros should try out SDNQ
https://github.com/erosDiffusion/ComfyUI-ZImageDit this one worked for me
https://github.com/EnragedAntelope/comfyui-sdnq seems broken
quality is pretty much the same
>>
>>107358670
>Painteri2v
holy shit thanks!
>>
>>107358596
>got half the shit wrong
>didn't even look at the piece of garbage he generated
>still posts his slop
Sasuga saar
>>
>>107358558
Didn't read this ran vomit.
>>
File: 1747267141942599.webm (1.06 MB, 1070x604)
assuming I don't have dGPU, is it futile to run something like this?
>>
>>107358705
Yes.
>>
>>107358673
>SDNQ in cumfart
actually nice, I want the generic one tho, ill check it out
>>
>>107358705
I look like this
>>
>>107358425
imagine spendin basically your whole life on drawing multiple hours a day to land work for shit money as a cog in some globohomo company just so you can at least draw for a living and then almost overnight ai automated drawing before it even automated programming, and it can create images that would take you a lifetime of practice and time to do, that now any normie can get in 5 seconds for free
brooooooooooootal life
>>
File: 3192486162.png (3.04 MB, 1536x1536)
>>
>>107358753
I don't know but you sound like you have a massive learning disability.
>>
me personally i dont feel bad for artists, but i wouldnt wish whats happening to them to happen to other people, i just dont give a shit and ill keep on cooming with local image models
you'll never stop me
>>
>>107358673
Isn't the catch with unofficial SVD implementations that they lack the fused kernels, so you get less quality/performance?
Not like I really need it for 6B though.
Still waiting for a Wan 2.2 copechaku implementation however.
>>
so sick of ranfag being a drama nigger
>>
>>107358766
draw this in 5 seconds paintpiggie >>107358763
oh, you cant? rip
>>
>>107358763
>it starts turning into hebrew halfway in
sounds about right kek
>>
have y'all folx found a solution for translating english prompts to chinese in comfy?
>>
>>107358775
>less performance?
yes, there is no speedup for me even though i have to offload bf16
but the quality dropoff isnt EXTREME
only good for vramlets maybe, pretty sure SDNQ supports older pre-RTX cards
>>
>>107358795
my chinese gf does it for me
>>
File: file.png (208 KB, 1739x823)
>>107358775
>Still waiting for Wan 2.2 copechaku implementation however.
https://huggingface.co/wanvideoquant right on time
>>
>>107358806
how's her honeypot gig going?
>>
>>107358815
>its him
llmxy bros... WE WONNED?
>>
File: file.png (16 KB, 941x168)
OOOH AHAHAHAHAHAHAHAH FLUX KEKS ITS FUCKING OVER AHHAHAHAAHHA
WE ARE FUCKING BACK
I KNEEL XI
I FUCKING KNEEL XI
https://github.com/nunchaku-tech/nunchaku/issues/809
>>
File: ComfyUI_04979_.png (1.66 MB, 1024x1024)
>>
>>107358823
but FLUX needs nunchaku way more than Z-image
>>
>>107358823
2026 will be year of the pooh
>>
this time next year we'll all be speaking chinese
>>
>>107358823
man I just checked and this faggot still didnt merge in official qwen loras support.
not that we need it anymore thanks to ZIT.
>>
File: 1735909061001886.png (1.76 MB, 3381x1531)
I managed to integrate the prompt enhancer part into ComfyUI. I'm currently using Qwen 3 VL 4b instruct (there's a thinking version on the list, but when I use it, it also adds the thinking part to the prompt, so idk what to make of that)
I started with this short vague ass prompt
>Hatsune Miku eating popcorn while skateboarding, depicted in the visual style of a PlayStation 1–era game screen, some game Ui
and it gave me something detailed, that's pretty cool

I also used the official system prompt for Z-image so we're in the clear
https://huggingface.co/spaces/Tongyi-MAI/Z-Image-Turbo/blob/main/pe.py

you can test it out here
https://files.catbox.moe/mzo0r6.json

PS: Update your transformers package or else it won't recognize Qwen 3 instruct
PS2: Install flash attention so that it gets faster
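
On the thinking version leaking its reasoning into the prompt: a minimal sketch of a post-processing step, assuming the model wraps its reasoning in `<think>...</think>` tags (an assumption about the output format, check what your node actually returns). `strip_thinking` is a hypothetical helper, not part of any ComfyUI node or of the workflow linked above.

```python
import re

# Hypothetical helper, not part of any ComfyUI node: removes the reasoning
# section a "thinking" model emits before its final answer.
# Assumption: the reasoning is delimited by <think> ... </think> tags.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_thinking(text: str) -> str:
    # Drop complete <think>...</think> blocks.
    text = _THINK_BLOCK.sub("", text)
    # Some chat templates open the block themselves, so only a closing
    # tag shows up in the output; keep what follows the last one.
    if "</think>" in text:
        text = text.rsplit("</think>", 1)[1]
    return text.strip()
```

Running the enhancer output through something like this before it reaches the text encoder would leave only the rewritten prompt. If the tags differ in your setup, the regex needs adjusting.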
>>
>llama 3 70b tunes are still the best for rp
am i missing something? i know moe is all the rage now, but smaller models suck ass compared to a dense 70b. even deepseek has more isms than l3 70b. what are you guys using now that would be better?
>>
>>107358580
I was just using the light 1030 high model (not the lora) and it definitely had faster motion than the moe lora, at least for my sexy massage gens.
>>
>>107358840
to the victor go the spoils
>>
>>107358815
>nvfp4
Welp, still waiting.
>>
>>107358856
very nice! thank you for sharing it anon <3
>>
>>107358865
/lmg/ is that way
>>
>>107358865
>>>/g/lmg
>>
File: ComfyUI_00388_.jpg (787 KB, 2048x2048)
Arr rook same.
>>
>>107358425
half this thread are actual pedophiles by the way
not lolifag type, like actually jerk off to kids type
>>
>>107358913
I wish they stayed in sdg
>>
File: z_mod_00131_.jpg (717 KB, 1624x1288)
>>
File: baron.png (1.2 MB, 1408x768)
>>
>>107358931
They created /ldg/ though.
>>
File: 00008-1162403333.png (2.08 MB, 1272x1552)
>>107358605
Nope
>>
>>107358425
This would have made sense if we never moved on from the flux plasticslop era. But we finally just did that.
>>
>>107358931
>obsessed
>>
File: z_mod_00236_.jpg (877 KB, 1408x1680)
>>107358938
härkönen
>>
>>107358938
kek
>>
>>107358954
cummed to
>>
>>107358949
>posting the same image for 3 years straight
>>
>>107358905
>Arr rook same.
Art imitates life.
>>
File: 1757693258231050.png (1.48 MB, 1024x1024)
>>107358856
>PS2: Install flash attention so that it gets faster
you can get it here for the windows fags
https://huggingface.co/lldacing/flash-attention-windows-wheel/tree/main
For pircel the prompt was
>A magazine cover featuring Hatsune Miku involved in drug trafficking.
>>
File: z_mod_00244_.jpg (675 KB, 1408x1680)
>>
>>107358856
Nice, I tried to figure out how to do this earlier but settled on doing things manually.
>>
File: 00002-2776078806.png (1.9 MB, 1200x1464)
>>107358949
who is living rent free in your mind schizo?
>>
>>107359002
meant to link >>107358974
>>
File: file.png (79 KB, 639x322)
lul
>>
>>107359014
>>107359019
braindead retard kek
>>
>>107359002
Stop posting these trash images. Isn't your discord enough for you? Ask nigbo to join you, you can stroke eachother off 24/7 then.
>>
why is ran spamming this thread
>>
File: 00010-1293275323.png (1.85 MB, 1272x1552)
>>107359029
Who is nigbo? Why are you so mentally ill?
>>
File: ComfyUI_04984_.png (1.03 MB, 1024x1024)
>>107359002
>>
>>107359047
Why is it so blurry?
>>
>the netayume bugmen are seething at SDXL chads again
>>
File: 1739661496361009.png (1.38 MB, 1024x1024)
>>107358856
>8-bit NES style depicting Donald Trump battling 2B from Nier Automata, user interface elements, dramatic effects, particle effects, dynamic poses
>>
File: ComfyUI_00390_.jpg (965 KB, 2048x2048)
>>107358905
>using anon's simple prompt > LLM rewriting the prompt into a bigger one > prompt read as qwenvl descriptive and joycaption booru tags merging into the same prompt > sampler

HNNNNNNNNNNNNNG
>>
>niggerjak tricked into downloading cumfart again because she's illiterate and doesn't know forge has it
>>
>>107359055
>>107359075
>>107359090
samefag pajeet
>>
File: 00013-1977879980.png (1.74 MB, 1083x1552)
>>107359054
netayume niggas perpetually malding
>>
File: 1740189464162414.png (1.53 MB, 1024x1024)
>>107358856
>An illustration book describing how a model called "Z-Image turbo" managed to be more popular and loved than a model called "Flux 2 dev"
kek
>>
The miracle of Chinese SOTA only did so much to keep the XL sloppas at bay... They've finally returned but for how long?
>>
>>107358856
Any way I can customize samplers of the enhancer? (Temperature, min p, etc)
>>
File: 1736177433730948.png (111 KB, 772x1027)
>>107359127
yes, with the qwenvl advanced node
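
For anyone unsure what those two knobs do to the enhancer's output, here is a rough sketch of temperature plus min-p filtering over a list of logits. It is an illustration of the general technique, not the code the qwenvl node runs; `min_p_filter` is a hypothetical name, and real implementations may apply the two steps in a different order.

```python
import math


def min_p_filter(logits, temperature=1.0, min_p=0.05):
    # Temperature rescales logits before softmax: <1 sharpens, >1 flattens.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    # min-p keeps tokens whose probability is at least min_p times the top one.
    cutoff = min_p * max(probs)
    kept = [p if p >= cutoff else 0.0 for p in probs]
    norm = sum(kept)
    # Renormalize so the surviving tokens sum to 1 again.
    return [p / norm for p in kept]
```

Lower temperature and higher min_p both make the rewritten prompt more predictable; the opposite settings give more varied (and more error-prone) rewrites.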
>>
>>107359137
Thanks I will give this a shot.
>>
welp this model is going to be taken down and censored
>>
File: 1763678921961846.jpg (414 KB, 1200x1200)
>An ultrasharp, high-resolution aerial photograph looking straight down (top-down view) onto the Shibuya Scramble Crossing. The focus is laser-sharp on a beautiful Caucasian woman with a radiant, genuine smile, who is waving up at the camera.

I can't get her in the middle :(
>>
>hes still samefagging
>>
fuck this nigger lumina 2 ass model doesn't work in fp16 dtype
>>
Why is everyone pretending like this wasn't known information since the first hour the model was available?
>>
File: 1734526464832780.png (1.69 MB, 1280x1280)
>>107359119
kek, that one is good
>>
why is he so fucking mad
>>
>>107358866
>>107359055
WTF????
>>
File: ComfyUI_00342_.png (895 KB, 800x800)
>>
Can I show the prompt between these two with a node? Like a Show Any sort of deal?
>>
>>
File: ComfyUI_00339_.png (2.51 MB, 1152x1152)
portrait model btw
>>
>>107359237
Portrait of a landscape, yeah?
>>
>>107359237
is this where teletubbies from?
>>
>>107358856
>Doesn't work for NSFW
Fuck, it doesn't even want to describe a 2d pic as if it was a human
>>
>>
>>107359237
Neat
>>
Is there a "batch load prompts from all images in the folder and gen with those prompts in the current workflow node"?

Whats the best way to achieve this?
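
Not a node, but one way to do this outside the graph: a minimal sketch that reads the embedded workflow from each PNG in a folder and pulls out the prompt text. It assumes the images were saved by ComfyUI with metadata enabled, that the graph sits in a `tEXt` chunk keyed `prompt`, and that the text lives in `CLIPTextEncode` nodes as a literal string; workflows that wire the text in from another node (like the enhancer setups above) are skipped. All function names here are made up for illustration.

```python
import json
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks, pos = {}, len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, value = body.partition(b"\x00")
            chunks[key.decode("latin-1")] = value.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC
    return chunks


def prompts_from_png(data: bytes) -> list:
    """Collect literal text inputs of CLIPTextEncode nodes, if present."""
    # Assumption: ComfyUI stores the API-format graph under the "prompt" key.
    raw = read_text_chunks(data).get("prompt")
    if raw is None:
        return []
    texts = []
    for node in json.loads(raw).values():
        if not isinstance(node, dict):
            continue
        if node.get("class_type") == "CLIPTextEncode":
            text = node.get("inputs", {}).get("text")
            if isinstance(text, str):  # skip inputs wired from other nodes
                texts.append(text)
    return texts


def prompts_from_folder(folder: str) -> dict:
    """Map each PNG file name in a folder to the prompts found inside it."""
    return {p.name: prompts_from_png(p.read_bytes())
            for p in sorted(Path(folder).glob("*.png"))}
```

The resulting strings can be written to a text file, one per line, and fed to whatever batch or queue node you already use. Note that positive and negative prompts both come back, since both are usually `CLIPTextEncode` nodes.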
>>
>anons getting 3 day bans for replying to sfw ai generated imagery
it's so grim
>>
>>107359286
3 day? It's a perma ban.
>>
>>107359262
oof, is there some uncucked qwenvl finetunes there?
>>
>>
File: 1763908872120609.png (62 KB, 760x265)
>>107359290
You don't get permas if it's AI
>>
File: 1737166041618569.png (1.9 MB, 1280x1280)
>>107358856
>Create a four-panel manga explaining the concept of gravity.
what he did was dangerous though lool
>>
>>107359229
Kinda figured it out. But I am locked out of editing the prompt. Can I have this automated process be sent to the final prompt node but it stops to not generate, allowing me to adjust the prompt manually?
>>
>>107359300
I literally got a permaban message for merely replying to an image lol
>>
>>107359286
just dont be a pedo, simple
>>
How are there still no loras? It’s been a week, hasn’t it?
>>
>>107359316
kys nigga it hasn't even been 48 hours
porn has warped your brain
>>
>>107359313
seriously.. im fucking glad they're banned child rapists and wannabes
>>
File: file.png (1.45 MB, 2048x512)
>>107359237
landscape model btw
>>
>>107359330
hot
>>
>>107359324
so much this. trans rights are human rights btw
>>
>>107359337
shouldnt trannies be your friends given they are the highest likelyhood of being a pedo like you doe? uh oh
>>
File: 1760284117044196.jpg (686 KB, 1920x1920)
>>
File: file.png (234 KB, 2048x64)
>>107359330
Kinda impressive that this works at all.
>>
Local Diffusion
>>
File: ComfyUI_00348_.png (1.01 MB, 800x800)
>>
File: ComfyUI_00355_.png (950 KB, 800x800)
don't do the crime
if you can't do the time
>>
File: z_mod_00265_.jpg (877 KB, 1648x1288)
>>
>Z doesn't understand facesitting.
it's over
>>
File: wan22__00002_.mp4 (668 KB, 640x640)
>>
File: file.png (1.62 MB, 512x2048)
portrait model btw
>>
>>107359453
base model when
>>
>>107359471
possibly in 2027
>>
>>107359471
it's still cooking
>>
I wonder if with z edit it will be worth to vibe code a photoshop clone
>>
File: ComfyUI_00229_.png (1.09 MB, 1028x2068)
How much vram do I need to train loras?
>>
File: z_mod_00272_.jpg (856 KB, 1904x1288)
>>
File: file.png (1.67 MB, 512x2048)
I wanted longcat but this is fine too.
>>
>>107359490
6
>>
File: wan22__00003_.mp4 (781 KB, 640x640)
>>
>>107359471
Sunday
>>
File: file.png (235 KB, 500x489)
>>107359547
>Sunday
>>
>>107359446
is this reverse psychology or something?

a photo of a woman, sitting on top of a man's face on a bed. the man is lying below her, his head covered entirely by the woman's buttocks.
>>
File: zimage__00016_.png (1.15 MB, 832x1216)
ostris posted, i'm training a lora right NOW
>>
>>107359591
>ostris posted, i'm training a lora right NOW
this shit will be useless in 2 days when base will be released lool
>>
>>107359490
Depends on what model that is.
I am also guessing it doesn't really support that resolution.
>>
>>107359506
6 what?
>>
File: 6 7.png (921 KB, 1024x535)
>>107359506
>>107359613
6 7
>>
tried to train a lora with the fork of a literal pajeet and it did nothing
https://github.com/pyros-projects/diffusion-pipe

why do I keep falling for these scammers, I just wasted money on renting a gpu setting everything up and the loras don't work, I don't know why I even tried this, there are literally no other loras too lol
>>
>>107359622
Now generate this image with Z-image
>>
>>107359613
6 vrams
>>
>>
File: z image 13.jpg (441 KB, 2048x2048)
>>
>>107359622
kek
>>
File: 1755592627327701.png (2.46 MB, 1920x1080)
>>107359639
>>
>>
File: comparison.png (2.97 MB, 1920x1080)
Nano Banana Pro vs Z Image Turbo, who wins?
>>
File: ComfyUI_01922_.png (1.24 MB, 1024x1024)
>>107359605
Chroma, I'll just train 1024x1024
>>107359605
:3
>>
>>107359592
>implying i care
i'm posting on /g/ while the sun is out, you think anything i do matters?
>>
>>107359687
nano banana's jacket doesnt exist, too weird looking
>>
>>107359687
Z-image didn't do the expression
>>
>>107359687
nano wins since you can tell z only uses chinese background in its dataset

z wins since people can run it so it's good enough
>>
>>107359687
Nano looks off but can't put my finger on why.
>>
>>107359719
the lighting looks like its shot in a studio
>>
>>107359719
nigger is smiling
>>
File: 67.png (1.41 MB, 1024x1024)
>>107359622
>>
>training a lora using a distilled model
>>
>>107359687
zutt wins because nigger is jaywalking (trve)
>>
Still haven't figured out why it takes so long while guides say it's a matter of a few seconds.
>>
File: unnamed (8).jpg (195 KB, 964x568)
>>107359719
Runpod fags and locals cant compete with Nano and Grok imagine
>>
>>107359777
coz its using reasoning
>>
>>107359777
how long does it take
>>107359784
no he's using the instruct model not the reasoning model
>>
>>107359782
damn, this was made with nano banana pro?
>>
File: 1763330264974710.png (607 KB, 1572x773)
>>107359782
back in your cage cuck
>>
>>107359784
>>107359787
You can see it above the node, 108s, rtx 5090.
>>
buy an ad
>>
>>107359632
>I just wasted money on renting a gpu
AAAAAHAHAHAHAHAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHAAHAHAHAHAAHAHAHAHAAHAHAHAHAHA
>>
>>107359807
lmao, and this is with flash attention 2?
>>
File: 1764081156859182.png (349 KB, 774x987)
anyone trying?
>>
>>107359782
can grok do this?
>>107358866
>>107359055
>>
File: poor poor yuropoor.png (97 KB, 1139x484)
>>107359812
wait people do that really? LMAO
You can rent my GPU >>107359632 ;)
>>
my whole identity is me owning gpu
i have literally nothing else happening in my life :( anyone else like this?
>>
>>107359815
Yeah, guide was using like a xx60 card.
>>
File: unnamed (5).jpg (182 KB, 953x572)
>>107359802
Yes i have been making tv seethe all week
>>
>>107359820
I'm sorry, I can't comply with that request.
>>
>>107359827
based, gotta remind them how AI will replace hollycuck
>>
>>107359782
>>107359827
I've already seen these on twitter
>>
>>107359825
The system detected potentially unsafe content. Please try again later or adjust the prompt :)
>>
>>107359825
yes, me, i have a poorfag gpu and i get really insecure when people shit on vramlets itt
>>
>>107359782
>>107359827
Z-image edit could unironically have this level if it was a 14b model instead of 6
>>
>>107359844
really lol i made these particular ones.
>>
>>
>>107359855
fat bfl fingers typed this
>>
>>107359825
my whole identity is local models
i feel you anon
>>
>>107359825
Don't worry, you are based and redpilled. You can generate all the child porn as fast as you like you want unlike all the api cucks and vramlets.
>>
File: Z-image turbo.png (3.21 MB, 1920x1080)
>>107359782
it did its best, it can't do old images with the grain unfortunately
>>
File: Untitled.png (48 KB, 1109x464)
>>107359817
>>
>>107359855
Burgers cannot comprehend the slimness of the Chinaman
>>
File: Wan2.2-T2I_00023_.png (3.06 MB, 1440x1440)
>>107359687
Wan 2.2 T2I + distill lora
>>
>>107359883
anything special you need to do? setting up my venv right now
>>
>>107358856
>having to use Qwen 3 vl 4b for encoding the text
>having to use Qwen 3 vl 4b instruct to rewrite the prompt
why do I feel this is retarded, if the text encoder was the instruct model it could've done both of them
>>
>>107359894
looks good 8B or 16B?
>>
>>107359894
>overbright plastic shit
lool
>>
>>107359906
Wan2.2 AT2V 14b low noise model
Q8
>>
File: I COPE HARD.png (244 KB, 800x450)
>>107359782
>>107359827
just you see anon! this is how z-image base will look like
>>
>>107359894
how many steps
>>
File: Untitled.png (1.55 MB, 1105x770)
>>107359899
nope, so far so good
>>
>>107359931
white girl lora?
>>
>>107359931
what resolution are you training on to reach 15.4gb? block swapping?
>>
which node to extract a clean prompt string from an image?
>>
>>107359944
>>107359931
nevermind i just noticed. pictures for ants
>>
>>107359931
are you using ramflow by chroma?
>>
>>107359894
he's literally me
>>
>>107359894
giff workflow anonman
>>
File: 1750155244916772.png (3.26 MB, 1920x1080)
>>107359827
:(
>>
File: unnamed (10).jpg (155 KB, 1024x572)
>>107359880
maybe with image edit it will be better when it comes out, i used an image and said add crew with 80s equipment and grain etc.
>>
File: ZiMG_00089_.png (3.91 MB, 1568x2016)
howdy Z bros
>>
File: psxay.png (1.48 MB, 893x1152)
>>107359940
angel youngs body type lora, still have the dataset handy from flux

>>107359962
no sense in training a huge lora if it doesn't work

>>107359969
nope. i think 24GB should be fine
>>
>>107359218
Nice!
>>
>>107359994
What scheduler sampler
>>
>>107359894
The lighting is absolute shit.
>>
File: ZiMG_00093_.png (2.7 MB, 1344x1728)
>>107360003
euler bong tangent
>>
>tfw ukrainian
>the model is literally called Z
I hate chinese so fucking much
>>
File: 1734426616285673.jpg (1.21 MB, 3719x1187)
>>107359827
>>
>>107360071
>Soul
>Souless
>>
>>107360071
>original vs netflix remake
the woman is literally black on Z-image turbo lmao
>>
File: Capture.png (37 KB, 713x470)
>>107360071
May I suggest this node
>>
>>107360069
Named after agent z
>>
The fact that you can even compare them at all is funny
>>
>>107360071
this shows that even though z-image is really realistic it's still a little bit slopped, can't wait to see if the base model will improve on that
>>
File: Untitled.png (2.18 MB, 1265x851)
>750/3000 steps
>body, tattoos and hair already learnt
what the fuk
>>
>>107360120
Are you training on 24vram or cloud?
>>
>>107360120
>what the fuk
and this is only the turbo model, the base model will learn this shit even faster
>>
>>107360118
define slopped
>>
File: unnamed (11).jpg (177 KB, 1024x572)
Prompt i used in case any turb cucks want to try is

1994 Paramount Pictures soundstage, medium close-up behind-the-scenes photograph during filming of Star Trek: Generations, exact same framing and lighting direction as reference image but now shot from only a few feet behind the camera, Panavision Panaflex Platinum 35mm camera with anamorphic lens very close in foreground, camera operator's hands visible on follow focus, cinematographer leaning in, Rick Berman or Jonathan Frakes standing right next to camera watching small CRT video assist monitor, actors clearly visible and large in frame, 1990s crew in polo shirts khakis bandanas Nike sneakers fanny packs, Mole-Richardson 10K and Dino lights close by, C-stands and sandbags in foreground, thick
>>
>>107360130
define deez nuts
>>
>>107359812
yes I wasted a whole 2 bucks, nvm I'm trying ai-toolkit now
>>
>>107360137
>pircel
lmao, did you ask the model to zoom out from the original image?
>>
File: 1763037328431971.jpg (1.76 MB, 1536x2048)
>>107360069
>>
ai-toolkit released support for training z-image loras, has anyone tried it?
>>
>>107358557
>>107358472
What model is that? On a work trip so stuck phonelurking
>>
>>107360153
did you not read the thread? someone is training something right now
>>
>>107360153
check the thread you peabrain
>>
File: 1762768175960097.png (2.99 MB, 1920x1080)
>>107360137
>1994 Paramount Pictures soundstage, medium close-up behind-the-scenes photograph during filming of Star Trek: Generations, exact same framing and lighting direction as reference image but now shot from only a few feet behind the camera, Panavision Panaflex Platinum 35mm camera with anamorphic lens very close in foreground, camera operator's hands visible on follow focus, cinematographer leaning in, Rick Berman or Jonathan Frakes standing right next to camera watching small CRT video assist monitor, actors clearly visible and large in frame, 1990s crew in polo shirts khakis bandanas Nike sneakers fanny packs, Mole-Richardson 10K and Dino lights close by, C-stands and sandbags in foreground, thick
:(
>>107360155
Z-image turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
>>
>>107358580
Can you make the same video with Frieren?
>>
>>107360158
very helpful anon wow incredible why didnt i think of that? theres zero documentation on how to do shit how about someone shares
>>
File: ZiMG_00101_.png (3.52 MB, 1344x1728)
>>107360058
Z is godly
>>
>>107360150
yea just upload any image and say behind the scene image, or medium close up and etc.
>>
File: 1761043217727930.jpg (1.76 MB, 1536x2048)
It does actually feel like the model gets better at understanding expressions and emotions if you translate the prompt to chinese first.
>>
File: zimg_0003.png (1.35 MB, 1024x1024)
>>107360128
24GB 3090 and i can sorta gen in comfy while i train the fucking lora?
> https://files.catbox.moe/3j1t21.png
>>
>>107360180
can't wait to try that on z-image edit
>>
>>107360120
have you trained on nudes?
>>
File: ComfyUI_00426_.png (1.42 MB, 1200x800)
>>
>>107360185
oh hi! arent u the anon that made the hailey rose lora?
>>
>>107360197
dataset is nudes with face removed so we'll see
>>107360185
low-res test gen in catbox but we're very early in training
>>
Are sage and flash attention two different things that can be installed at the same time, or does one take over the other?
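A rough way to check this on your own install, assuming the usual import names `sageattention` and `flash_attn` (an assumption; adjust if your builds use different module names). They ship as separate Python packages, so both can sit in the same environment, and which one actually runs is decided by the UI's launch options rather than by install order. The helper names below are made up for illustration.

```python
import importlib.util


def installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


def attention_backends() -> dict:
    """Report which of the two attention packages are importable.

    The module names are assumptions based on the common pip packages.
    """
    return {name: installed(name) for name in ("sageattention", "flash_attn")}


if __name__ == "__main__":
    # Both entries can be True at once; nothing here uninstalls or overrides the other.
    print(attention_backends())
```

Run it with the same Python that launches your UI, otherwise it reports on the wrong environment.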
>>
File: ZiMG_00104_.jpg (481 KB, 1344x1728)
>>107360179
>>
>I changed the order of the prompt by translating to ching chong language
sigh
>>
File: 1746812561889634.jpg (1.97 MB, 1536x2048)
So what I get about this model is that my entire library of loras will be obsolete within a few months, freeing a shitload of space on my PC.
>>
>>107360229
my thoughts exactly
>>
File: 1744460941781911.png (624 KB, 1378x1647)
https://xcancel.com/bdsqlsz/status/1994336717587845601#m
hmm...
>>
>>107360206
yes sir
>>
File: ZiMG_00107_.jpg (464 KB, 1344x1728)
>>107360219
>>
File: 1759826607698883.jpg (1.61 MB, 1536x2048)
>>
>>107360249
Yes, its distilled alright, seems like its already destroying the quality of the model

I'm starting now too
>>
>>107360229
That AK/pistol hybrid is absurd but I love it.
>>
>>107360267
>seems like its already destroying the quality of the model
yes, that's why it's essential to get a good base model, just so that we can actually train it
>>
>>107360273
It defaults to AK every time you prompt a gun without specifying it. Sovl.
>>
Is there any way to gen a video from image on 1660S, even if it takes a long time?
>>
>>107358368
>Z-Image-Turbo
I wish it could generate dicks
>>
File: ZiMG_00110_.png (3.54 MB, 1344x1728)
>>107360252
>>107360249
good to see you back here! been a while
>>
>>107360229
Yes, unless it trains like shit, this model will own the entire image gen market
>>
>>107360284
the end game is to train a lora of your own dick
>>
A "distilled" high step model would be better than Z-Image-Base and I assume they're training just that
>>
>>107360320
no its a non distilled base model which means many here will seethe because itll be more difficult to get good images. on the upside, presumably, the quality ceiling of base will be higher than turbo
>>
File: Untitled.png (1.72 MB, 715x1181)
>>107360267
good luck with the training. i wouldn't take the sample images as any indication of final outputs though.

>>107360290
thanks, i've been going hard dumping all my flux loras to civit. moving to ZIT, since Flux.2 is way too big for me to train locally.
>>
>>107360320
no distilled, distillation makes the model too hard to finetune
>>
>>107360340
what resolution are you training on, im training at 768 and 1024 but seems overkill, its going to take forever

my_first_lora_v1: 2%|#3 | 53/3000 [07:05<5:28:13, 6.68s/it, lr: 1.0e-04 loss: 1.275e-01
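For reference, the remaining time on that bar is just steps left times seconds per step. A minimal sketch using the numbers from the line above (53 of 3000 steps done, 6.68 s/it); the function names are made up for illustration, and it assumes a tqdm-style bar, whose ETA is computed from a smoothed rate and so will not match to the second.

```python
def eta_seconds(done_steps: int, total_steps: int, sec_per_it: float) -> float:
    """Remaining wall-clock time, assuming the per-step time stays constant."""
    return (total_steps - done_steps) * sec_per_it


def fmt_hms(seconds: float) -> str:
    """Format a duration in seconds as H:MM:SS."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    remaining = eta_seconds(done_steps=53, total_steps=3000, sec_per_it=6.68)
    print(fmt_hms(remaining))  # 5:28:06, close to the 5:28:13 the bar shows
```

So at this speed the full 3000 steps is roughly a five and a half hour bake from start to finish.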
>>
File: ZiMG_00119_.png (3.45 MB, 1344x1728)
>>107360290
>>107360340
yes have been following you there. interesting to see how your bake will go on this one man!
>>
File: emoji.png (784 KB, 1406x863)
Emoji sort of work.
>>
>>107359980
Why is the res so low?
>>
File: ComfyUI_00425_.png (1.49 MB, 800x1200)
>>
>>107360337
>which means many here will seethe because itll be more difficult to get good images
Anon will almost immediately claim its DOA because their 1:1 comparison with their prompt and settings tuned for Turbo will look like ass on base
>>
File: 1755510402732898.png (52 KB, 220x268)
>>107360366
based
>>
>>107360366
lol
>>
>>107360366
use facemask emoji
>>
File: file.png (813 KB, 1920x1080)
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo/discussions/26
>>
Fresh

>>107360388
>>107360388
>>107360388
>>107360388
>>
>>107360366
lmao thats crazy
>>
>>107360358
training this one at 512 just to get it out the door, then i'll attempt higher ones and fine tuning settings depending on what i get
>>
>>107359719
he is not crossing in a designated place
>>
>>107360279
Yep, with grok imagine or sora 2
>>
>>107358466
How did you make this? Didn't realize AI could make cute feet like girl on the left kek
>>
>>107360366
sort of? looks like they work 100% to me


