/g/ - Technology

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107538552

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
It is finally done!
>>
File: Chinse culture status?.png (82 KB, 1252x361)
Small reminder that we won.
https://github.com/kohya-ss/musubi-tuner/pull/778
>>
File: 303243.png (328 KB, 608x608)
>>
>>107540693
Why isn't zit base in the links?
>>
File: 1758060415863983.png (3.11 MB, 1600x1344)
is the base model in the room with you right now?
>>
>>107540748
the model isn't released. just the training code
>>
>>107540752
>just the training code
based, we're getting there, we're saved!
>>
>>107540718
Will be training on this like squirrel on crack

Glory to the chinks!
>>
File: 041428_xxxx.mp4 (2.94 MB, 1184x704)
Miku also won
>>
>>107540785
kek
>>
>>
>>107540785
lmao
>>
>>107540785
kryptonite for jeets.
>>
Is there a way to lower lora rank on the go like model quants in loader?
>>
formally requesting the prompt for the mirror selfie image in the collage
>>
>>107540804
No
>>
>>
>>107540785
>>
File: whotao.png (2.02 MB, 1080x1920)
>>
>>107540815
Six finger hand and two deformed versions of the character in the background.
>>
>>
File: ComfyUI_00007_.png (2.5 MB, 1504x1024)
The antichrist doesn't want you to know this, but with ComfyUI you can just break models and bend them to your will.
>>
Might be a stupid question, but do you NEED to do the sampling during lora training or can you rawdog everything?
>>
has someone implemented fuzzy search for the model/lora/whatever lists in ComfyUI yet or do I have to code it myself?
and by myself I mean Gemini
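
For anyone who does end up coding it themselves, the matching part is small. Below is a minimal sketch using only Python's standard library (`difflib.SequenceMatcher`); it is not tied to any ComfyUI API, and the file names, the `fuzzy_filter` helper, and the `cutoff` value are all assumptions for illustration. Wiring it into the actual loader dropdowns would be frontend work that this does not cover.

```python
from difflib import SequenceMatcher


def fuzzy_score(query: str, candidate: str) -> float:
    """Return a similarity score in [0, 1]; higher means a closer match."""
    q, c = query.lower(), candidate.lower()
    if q in c:
        # Plain substring hits always rank first.
        return 1.0
    return SequenceMatcher(None, q, c).ratio()


def fuzzy_filter(query, names, limit=10, cutoff=0.3):
    """Return up to `limit` names scoring at least `cutoff`, best first."""
    scored = [(fuzzy_score(query, n), n) for n in names]
    scored = [(s, n) for s, n in scored if s >= cutoff]
    # Sort by descending score, then alphabetically for stable output.
    scored.sort(key=lambda sn: (-sn[0], sn[1]))
    return [n for _, n in scored[:limit]]


# Hypothetical file names, only for demonstration.
MODELS = [
    "z_image_turbo_bf16.safetensors",
    "chroma1_base.safetensors",
    "wan2.2_t2v_high_noise_14B.safetensors",
    "illustrious_xl_v1.safetensors",
]

if __name__ == "__main__":
    print(fuzzy_filter("chroma", MODELS))
    print(fuzzy_filter("ilustrious", MODELS))  # typo still finds the model
```

`SequenceMatcher` is slow on very long lists, so a folder with thousands of LoRAs might want a dedicated fuzzy-matching library instead.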
>>
>>107540848
Once you're comfortable enough you don't need to sample during training
>>
File: shallwe.png (2.18 MB, 1080x1920)
>>107540841
>>
File: 1751734628852401.jpg (386 KB, 1280x1280)
PromptManager gives interesting prompts.
>>
>>
>>107540848
No you can just rawdog it.
Sampling is there so that you can check live during training whether your lora is learning what it is supposed to learn well enough. If not, you can cancel it early.
I just rawdog it and see afterwards if it worked or not. If not retrain.
>>107540855
The latter
>>
>>107540865
>Sampling is there so that you can check live during the training
Yeah, I train in my sleep so whatevs...thx.
>>
>>107540861
>PromptManager
what's that?
>>
>>107540877
https://github.com/BigStationW/ComfyUI-Prompt-Manager
>>
Why doesn't ksampler advanced let us control all parameters at all steps?
>>
>>
File: 1735102758875070.png (3.44 MB, 1024x1536)
>>
File: ComfyUI_02432.png (3.9 MB, 1536x2048)
Where do you guys think Base will land, 24GB or 32GB of VRAM?
>>
File: goof.png (18 KB, 517x316)
From what folder does it pull? Neither my text encoder nor my llm folder shows up, just the cloudshit.
>>
>>107540918
i hope 32gb vram so most of you idiots can't do shit
>>
>>107540817
Nice
>>
>>107540926
It's guidance distilled so it'll be 6-12GB.
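
The 6-12GB range lines up with simple weights-only arithmetic for a 6B-parameter model (the size quoted for Z-Image later in the thread): parameter count times bytes per parameter. The sketch below is only that back-of-the-envelope calculation. It assumes 1 GB = 1e9 bytes and ignores activations, the text encoder, the VAE, and framework overhead, so real usage will be higher.

```python
# Bytes per parameter for common weight precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "fp8": 1.0}


def weight_gb(params_billion: float, dtype: str) -> float:
    """Weights-only size in GB (1e9 bytes) for a model of the given size."""
    # (params_billion * 1e9 params) * bytes / 1e9 bytes-per-GB simplifies to:
    return params_billion * BYTES_PER_PARAM[dtype]


if __name__ == "__main__":
    for name, size in [("6B model", 6), ("20B model", 20)]:
        for dtype in ("fp8", "bf16"):
            print(f"{name} @ {dtype}: {weight_gb(size, dtype):.0f} GB")
```

By the same arithmetic, a 20B model at bf16 needs about 40 GB for weights alone, which is why larger models lean on quantization or block swapping.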
>>
>>107540923
>From what folder does it pull?
you have to read what's written, anon
https://github.com/BigStationW/ComfyUI-Prompt-Manager?tab=readme-ov-file#instructthinking-llms
>>
>>107540893
do you find vram is released fast enough to prevent problems using llama?
>>
>>107540923
cumfart/models/LLM/GGUF/ ?
>>
>>107540935
why aren't you pulling it from the normal folders?
>>
>>107540950
how should I know?
>>
>>
File: Z-image turbo.png (1.36 MB, 1280x720)
>>
>>
>>107540971
still no model that can do ringworlds / halos, especially not with the shadow squares.
>>
>>
even if base drops, it will take at least a few months for decent finetunes to drop
i'm not holding my breath
>>
>>107541002
I'll have my fun with it, it'll be able to do edit shit
>>
>>107540974
Nice
>>
>comfyui update
>changes lots of stuff
>can't even click on the image preview - the click event's not even used.
>well maybe they fixed reroute to work on float
take a guess guys do you think reroute works on float?
>>
>>107541013
Thanks
>>
>>107541026
He's wrong.
>>
File: 1739786882467676.png (3.6 MB, 1920x1080)
>>107540991
>no model that can do ringworlds / halos
yeah :(
>>
>>107540860
>>107540841
Sexo with mother and daughter
>>
Poll
"If ONLY ONE of these were to be released, what would be your pick?"

https://poal.me/1ceqwm
>>
>>107541032
No thanks
>>
>>107541045
Sage Attention 3
>>
>>107541032
Seethe somewhere else! Fuck off!
>>
>>107541045
>what would be your pick?
the one that was promised and already has a training script lol >>107540718
>>
File: ComfyUI_00016_.png (2.88 MB, 1504x1024)
>>107540847
>>
>>107540946
VRAM isn't released at all by default. ZiT is working in the remaining vram + offloading. I'm still looking into what can be done better.
>>
File: ComfyUI_00342_.mp4 (399 KB, 640x640)
>>
>>
cute, hk416 is cute!!!!
>>
>>107541089
is this comfy?
>>
>>107541045
(You) are not going to be able to run Wan 2.5 in your system with sane speed, so what would be the point?
>>
>>107541020
shotted and dragged
>>
File: 113696833.jpg (465 KB, 3040x2080)
>>
>>107541101
My system has two 3090s, are you sure about that?
>>
and 2.5 would be pure ass without loras
>>
>>107541101
Runpod my friend
>>
>>107540848
No, it's just a sanity check to see how training is progressing, also the sampling output is usually quite far from the quality that you can expect when using a dedicated inference program like Comfy, Forge etc
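
To make the "sanity check" point concrete: preview sampling only reads the current weights, so turning it off should not change what gets trained. The toy below is a sketch under heavy assumptions. It fits a single weight with plain SGD as a stand-in for a LoRA, and the `train` function, `sample_every`, and `on_sample` names are hypothetical, not any real trainer's options. In a real trainer the difference is the time spent generating preview images.

```python
def train(data, steps, lr=0.05, sample_every=0, on_sample=None):
    """Fit y = w * x with SGD; optionally call on_sample(step, w) periodically."""
    w = 0.0
    for step in range(1, steps + 1):
        x, y = data[(step - 1) % len(data)]
        grad = 2 * (w * x - y) * x  # gradient of the squared error (w*x - y)^2
        w -= lr * grad
        # Preview hook: reads w but never modifies it.
        if sample_every and on_sample and step % sample_every == 0:
            on_sample(step, w)
    return w


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 4.0)]  # toy dataset, true weight is 2.0
    previews = []
    w_sampled = train(data, 50, sample_every=10,
                      on_sample=lambda s, w: previews.append((s, w)))
    w_raw = train(data, 50)  # no previews at all
    print(w_sampled == w_raw, len(previews))
```

The previews let you cancel a bad run early; skipping them just means you find out at the end.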
>>
>>107541020
Have they fixed the UI with the botched stop button at least ?
>>
File: inexcusable.png (13 KB, 458x128)
you niggers are retarded
>>
>>
>still forcing an unfunny meme
>>
>>107541111
>and 2.5 would be pure ass without loras
you're such a fucking idiot. you run it at 128x64 resolution and you have THE BEST LOCAL AUDIO GENERATION MODEL OF ALL TIME AS A FREE SIDE EFFECT
>>
>>107541142
Yes but we're also right.
>>
>>107541109
Yes. You wouldn't even be able to run a SAAS image generator like Dall-E 3 with sane speed on that machine, let alone video generation.
>>
>>107541137
yeah they added it back (after two weeks lmfao). it's a real shit show at cumfartorg
>>
File: goodmorningsaar.mp4 (789 KB, 480x842)
>>107541159
>it's a real shit show at cumfartorg
no kidding
>>
File: Z-image turbo.png (1.77 MB, 1280x720)
>>
>>107541155
well yeah, base is actually coming and wan is actually not, that's for certain. ltx2 is the only hope right now and everyone is already doomed about it because its implied (((they're))) pozzing the model in preparation for release as we speak
>>
File: ComfyUI_02479.png (3.5 MB, 1536x2048)
>>107541045
I don't think I have the patience for newer, heavier video generation. The 4-5min I spend now rendering 7sec of video is absolutely brutal.
>>
>>107541142
No, I would agree here, Wan is the bigger get. As much as I would like base ZImage, de-distilling Turbo, should we be forced to do it, is still pretty alright. Video models have also been stagnant: Wan 2.2 was almost 5 months ago, even though the last of the tools came out around 2 months ago.
>>
>>107541142
omg I'm retarded for not wanting a gigantic bloatmaxx vid model that would take a century to inference! how could I choose the image model that's perfect for dethroning sdxl!
>>
>>107541196
Ikr, you're so silly anon!
>>
File: z-image_00641_.png (1.4 MB, 1024x1024)
>>
File: originals.jpg (2.7 MB, 3528x4416)
Does this thread still have that one faggot who posts realistic images and refuses to answer any questions about what checkpoints/loras he uses because "someone posted my workflow on reddit and got upvoted for it" and he's butthurt that he missed out on his precious internet points?
>>
>>
File: ComfyUI_00024_.png (2.25 MB, 1504x1024)
I can now generate unlimited Wired Magazine covers.
>>
>>107541002
you people underestimate the power of the coomer side of the tech sphere
it took 24 hours to get non-pepperoni pizza nipples in turbo, a week later it's blowing up civitai. I give it a full month and we see the best of the best finetunes. one week and we get the best merges/loras from illustrious sent over to z-image.
seriously, people will be fast. the hype has been building up with no slowing down.
>>
File: Z-image turbo.png (3.32 MB, 1920x1080)
>>
I don't want a stupid Z-base and to wait days for a lighter version. just give us better adult Z-Loras
>>
>>107541176
twinflow will make it 2x as fast (taking it down from 4-step loras to 2-step loras) eventually

>>107541183
any anon who doesn't understand the value audio adds to video needs to use sora and feel the magic

>>107541196
ignoring my argument completely is not a good start
>>107541154
and a team would make a few-step version of it for the clout. the lightx2v team would be able to handle a model 3x larger than WAN 2.2 on their same hardware since Twinflow uses 3x less memory

and we already know the Twinflow team themselves have compute enough for a 20B model


>>107541214
who cares you can make all that with Z turbo now
>>
File: z-image_00650_.png (1.33 MB, 1024x1024)
>>
File: 1755806629582420.png (1.91 MB, 1120x1440)
that ultra flux vae is actually a straight upgrade, it just makes things sharper.
>>
>>107541232
you know what, training loras on Z-base might help us make better loras and we could use them on Z-turbo
>>
File: ComfyUI_00026_.png (1.99 MB, 1504x1024)
>>
>>107541254
>that ultra flux vae
can you provide the link anon
>>
>>107541254
>straight
hue hue
>>
>>107541239
>ignoring my argument completely is not a good start
for being such a pooper I'm ignoring everything you say
>>
>>
>>107541290
leave my scat fetish out of it
>>
>>107541254
someone do a side by side comparison, i won't habeeb it till i twinkie house it - i mean see it.
>>
>stealing /adt/ prompts now
>except this time with a hag
>>
File: ComfyUI_temp_cvmgc_00003_.png (2.64 MB, 1344x1856)
>>107541095
>>
>>107541230
Kino
Prompt?
>>
>>107541324
Nice
>>
File: z-image_00662_.png (1.31 MB, 1024x1024)
>>
>>107541334
here o/
https://files.catbox.moe/x61m7c.txt
>>
>>107541276
https://huggingface.co/Owen777/UltraFlux-v1/tree/main/vae

>>107541302
too lazy to download a single VAE file and test it on Z? whatever, here's the same pic with normal Flux VAE. Notice it's not as sharp. there is a slight line artifact at the edge sometimes with chroma, still worth it.
>>
File: 1739755612995898.png (1.36 MB, 1280x720)
lul it kinda knows Phryge if you do some boomer description of it
>>
>>107541308
Why have you been namedropping your general all day?
>>
>>107540898
You need to chain the ksamplers.
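
Chaining here means splitting one schedule across several KSampler (Advanced) nodes, each covering its own step range with its own settings. The sketch below only computes the per-node settings; it does not talk to ComfyUI. The field names mirror the node's inputs as I understand them (`steps`, `start_at_step`, `end_at_step`, `add_noise`, `return_with_leftover_noise`), while `chain_stages` itself and the example values are hypothetical.

```python
def chain_stages(total_steps, stages):
    """Build settings for chained samplers.

    stages: list of (end_step, overrides) in ascending order; the last
    end_step must equal total_steps. Overrides (cfg, sampler_name, ...)
    apply to that stage only.
    """
    configs = []
    start = 0
    for i, (end, overrides) in enumerate(stages):
        if not start < end <= total_steps:
            raise ValueError(f"stage {i} has an invalid end step: {end}")
        is_last = i == len(stages) - 1
        cfg = {
            "steps": total_steps,  # every stage shares the full schedule
            "start_at_step": start,
            "end_at_step": end,
            # Only the first stage adds the initial noise.
            "add_noise": "enable" if i == 0 else "disable",
            # Intermediate stages hand over a still-noisy latent.
            "return_with_leftover_noise": "disable" if is_last else "enable",
        }
        cfg.update(overrides)
        configs.append(cfg)
        start = end
    if start != total_steps:
        raise ValueError("last stage must end at total_steps")
    return configs


if __name__ == "__main__":
    for stage in chain_stages(9, [(3, {"cfg": 4.0}), (9, {"cfg": 1.0})]):
        print(stage)
```

Each extra stage is one more node in the graph, with the latent output of one wired into the next.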
>>
>>107541347
Prompt her doing something interesting.
>>107541355
Thanks
>>
>>107541324
>chromosome reveal
whew
>>
>>107541365
>too lazy
no, but also yes. and placebo/snake oil trends happen all the time in these threads so i don't want to waste time on more.
that said, looks legit. jesus thats almost a night and day difference. i'm giving it a shot.
>>
>>107541365
>https://huggingface.co/Owen777/UltraFlux-v1/tree/main/vae
interesting
https://imgsli.com/NDM0MzQy
>>
File: ComfyUI_temp_kvlzp_00001_.png (2.06 MB, 1176x1472)
>>107541214
yeah, (sadly) I'm still here, after all these months, you still couldn't find the model kek
>>
>>107541347
Make her do ass-to-ass.
>>
File: ComfyUI_temp_hfjmk_00002_.png (1.87 MB, 1176x1472)
>>
File: ComfyUI_02501.png (3.7 MB, 1536x2048)
>>107541254
Bit of ringing on real high contrast edges, but overall, not too bad.
>>
>>107541422
oh, a different model? isn't it just sdxl?
>hiding hands
>skintone issues
>>
>>107541476
the faults and glitches make me think of chroma
>>
File: ComfyUI_00036_.png (1.83 MB, 1504x1024)
>>
>>107540693
can this wan2.2 thing do nsfw video?
>>
even img2img is the best
>>
>>107541504
no. it's only really good for generating videos of rats swimming in bowls of soup.
>>
>>107541515
i fucking love soup
>>
>>107541515
>rats swimming in bowls of soup
my fetish!
>>
File: ComfyUI_temp_empdf_00012_.png (1.83 MB, 1176x1472)
>>107541462
>>
>>107541365
>>107541419
https://imgsli.com/NDM0MzQz
that's actually pretty cool, thanks for the vae anon
>>
>>107541422
>>107541530
we need a proper investigation to find this wf
>>
base base base base base
>>
File: ComfyUI_00038_.png (1.7 MB, 1504x1024)
a smiling cat
>>
File: ComfyUI_temp_empdf_00020_.png (2.04 MB, 1176x1472)
>>107541422
>>
>>107541365
>>107541532
Looks just like postprocess sharpening.
>>
>>107541573
sharpening, that's what i was thinking it's doing. i forget all the image manip terms but it's definitely doing something very specific and enshittening.
it definitely reduces how blurry z-img gens look, but it adds artifacts from what looks like post process sharpening. like >>107541470 and adding a slight halo to white colors.
>>
File: fluxlol.png (1.18 MB, 1365x908)
This new vae just makes the zit banding even worse lol.
>>
File: 1750434156237454.png (2.73 MB, 1920x1088)
>>
>>107541591
But it works tho. Although I'm not sure if it is actually so good or if it has a photoshop filter baked in.
>>
File: 00000-3391018998.png (572 KB, 1024x1024)
>>107541566
That just looks like some of the broken gens I get when the settings aren't correct
>>
File: ComfyUI_temp_empdf_00024_.png (3.17 MB, 1344x1728)
the ultimate slop thrill is grabbing other people's images and running them through a promptless img2img with a high denoise value
>>
>>
>>107541601
>it works
i guess if you wanna make all your gens look shitty and improperly upscaled kek. it does reveal that all z-img gens are blurred by the vae possibly.
>>
>>107541566
a smiling cat 2
>>
>>
File: ComfyUI_temp_empdf_00027_.png (3.11 MB, 1920x1088)
here saar, I made slop out of your slop
>>
File: ComfyUI_00040_.png (1.87 MB, 1504x1024)
>>107541602
that's not remotely similar.

>>107541619
here
>>
File: image_00130_.jpg (949 KB, 2048x2048)
Trying schizoprompting with Z, not having much luck
>>
>>
File: image_00134_.jpg (1.21 MB, 2048x2048)
>>107541636
a little better
>>
THEY CHANGED A FONT FOR NO REASON
>>
>>107541156
>SAAS image generator like Dall-E 3 with sane speed on that machine
Are you retarded? The only thing dalle 3 had special is the fact it was trained on more data
It mangled text and elements/anatomy every once in a while, it had signs the model was not that big

>>107541156
>let alone video generation.
I can run Wan 2.2 on my machine better than most people can since I can offload both models (high noise and low noise) to different GPUs
>>
>>107541666
saar, please understand! work is progressing! thank you for noticing our hard work saar
>>
>>107541652
do you have good luck with temple gens (like with columns, like giant churches)?
>>
File: ComfyUI_00044_.png (1.73 MB, 1504x1024)
zit is full of crazy loras. idk the triggers, but you can force them out.
>>
>>
>>
File: ComfyUI_00043_.png (1.89 MB, 1504x1024)
>>107541713
compare.
>>
>>107541718
:(
>>
>>107541723
tf is that cat doing there
>>
File: image_00142_.jpg (1.66 MB, 2048x2048)
>>107541700
postan a few
>>
File: image_00141_.jpg (1.45 MB, 2048x2048)
>>107541740
>>
>>107541156
I still want it leaked. I don't care if it took an hour to gen, I KNOW it can gen what I want and the stupid dog is the only thing blocking it.
>>
>>107541740
>>107541747
sd1.5 vibes
>>
File: image_00140_.jpg (1.28 MB, 2048x2048)
>>107541747
>>
File: ComfyUI_02858_.png (1.62 MB, 1024x1024)
>>
File: image_00144_.jpg (1.33 MB, 2048x2048)
>>107541752
>>
File: panel.png (7 KB, 329x53)
PLEASE EVEN MORE FIBER THIN FONT WHO ASKED FOR THIS
>>
File: image_00143_.jpg (1.41 MB, 2048x2048)
>>107541761
>>
>>107541754
Can the model do panties down? (in the leg shin area)
>>
File: vae.png (2.17 MB, 1391x925)
Crunchy as fuck, but for upscaled/resampled pics might be great. But the raw distill output is shit. Gonna try the vae on chroma.
>>
animanon won
>>
File: ComfyUI_00047_.png (1.97 MB, 1504x1024)
>>107541736
I chain samplers. It's in the second prompt.
>>
>>107541764
That's like 2 whole days grif- work, pls understand
>>
>>107541754
instantly thought of The Sims
>>
File: 00172-343929487.png (2.5 MB, 1344x1728)
>>107541422
It's hilarious that you think hiding the name of a model makes you some kind of elusive mystery man
>What's his model, what could it be???
>Mwa ha ha, I'll never tell!
This is highschool edgelord level of larping and it makes you look pathetic
This doesn't earn you respect; it just makes you seem like a fucking loser
>>
>>107541896
Yes, Ran. Now go back to your discord.
>>
>>107541677
You are retarded, it has a shit ton of params, hence it's really big, it will not fit in 48gb vram by a long shot which means tons of block swapping which means slow.
>>
File: image_00157_.jpg (1.16 MB, 2048x2048)
what?
>>
>>107541906
>See someone you disagree with
>Immediately assume they are some boogeyman from a previous thread
You guys have enough self-awareness to recognize that you sound like schizos, right?
>>
File: image_00158_.jpg (1.29 MB, 2048x2048)
>>107541920
Yeah Z is no good for schizoprompts, just turns into SD
>>
>>107541750
no it isn't.
>>
>>107541919
Show me a single dalle3 image that can do text better than Qwen-Image then, retard (a 20b model that does run on my machine)
>>
>base
it wont happen
>but..
no
>but look at this...
never
>>
>>107541932
Are you using any ddims or res_Xm?
>>
File: ComfyUI_temp_myczu_00016_.jpg (373 KB, 2048x1536)
Will we ever be free of the curse of sameface
>>
>>107541954
SKILL ISSUE
>>
>>107540750
can you change the font
>>
>>107541239
>audio
I'm not convinced having it done in the same model is fine unless you have a lot of pristine footage to train on and only Youtube really has that.
>>
>>107541954
turbo is a distilled model, that problem will be less obvious on base
>>
File: image_00151_.jpg (1.4 MB, 2048x2048)
>>107541951
rawdogging the default settings
https://files.catbox.moe/9nad6i.png
>>
File: wan_00043.mp4 (1.21 MB, 912x720)
>>
File: ComfyUI_00059_.png (1.93 MB, 1504x1024)
>>107541740
cool

>>107541761
yeah, it's realistic in sizes. idk why, but super fantasy (sci fi or magic tier) colossal scale is hard to generate. It's like tiny people, but furniture & doors remain somewhat constrained.
>>
>>
>>107541968
Can we stop with this meme already? I got you vramlets got thrown a bone with z-image, but the base model won't magically be Nano Banana Pro tier
>>
>>107542034
>but the base model won't magically be Nano Banana Pro tier
but unlike nano banana or bloatmodels you can work your way around the limitations once you run into them
>>
>>107540443
>13 cents a gen
>>
File: anon_is_retarded.jpg (871 KB, 2658x1328)
>>107541919
>>107541934
Left Qwen-Image, right Dalle-3.
>>
>>107540772
Hell yeah I converted my 100+ datasets of VNs already to natural language using Jailbroken Gemini 3
>>
File: z_00155_.png (1.46 MB, 1280x1024)
>>
>>107542084
Only boomers and millennials like me will remember this. Sadly most of the thread are zoomers
>>
File: 1741859945944716.png (2.76 MB, 1248x1824)
$66,697/week selling Ai at 18
>step 1: get a job
>step 2: run facebook ads for boomers and sell your ai slop service
>step 3: ???
>step 4: profit
https://www.youtube.com/watch?v=7IUjT0AUdi8
>>
>>
>>107541954
Skill issue, sweetie. So it has been for a long time.
>>
>>107542057
So sharp and good aesthetic Qwen....
>>
File: 1750552825176223.png (2.24 MB, 1440x1120)
>>107541896
pretty sure he's just using some variant of pornmasterpro noob or an offshoot shitmix like alchemistmix.

>>107542113
the problem with having an easy remote job is that I have no motivation to grind and get money like this, and would rather just gen for fun.
>>
File: z_00164_.png (1.44 MB, 1280x1024)
>>
>>107542017
The Bitch should have been beheaded.
>>
File: ComfyUI_02555.png (3.87 MB, 1536x2048)
>>107541718
I do not care for your very sad image, Anon. Make her smile or something!
>>
>>
vote on which will be released next:
>sd 1.6
or
>z image base
>>
Has the mayli anon trained a mayli lora for z-image already?
>>
>>107542218
My vote is neither and my vote is the truth.
>>
>>107542113
fake edited html crap earnings to entice people to use their affiliate links, everybody knows that normal people HATE absolute HATE ai phone calls, that just bs
>>
File: z_00172_.png (1.33 MB, 1280x1024)
>>
>>107542251
Truth hits hard at times
>>
>>107542251
lel nice
>>
I'm proud of this gen. It's the first time I think that the eye placement is actually wrong. The scale isn't really off, so it's only a little off the goal.
>>
File: ComfyUI_00069_.png (1.83 MB, 1504x1024)
>>107542262
>>
>>
>>107542113
Take it from a boomer. If it was this easy, everyone would be doing it.
>claim your services net you absurd amounts of money weekly
>build a community based on hopes and dreams
>profit off of that community based on your bs through ads, subscriptions, etc.
>>
>>107542288
yeah his actual business is youtube ad revenue
>>
Why do they insist on training models on both 2d and 3d
>>
File: 1747116858563110.png (2.23 MB, 1248x1824)
>>107542183
the only way to make money with ai is by scamming people and i dont like it
>>
>>107542288
>>107542299
his whole youtube page is just money baiting crap, he sells his "courses", that's all these zoomer entrepreneurs are doing, they all use the same tactic: show some edited html page with crazy earnings, write some money baiting crap like "this 18 year old just made 100k in one month", "my AI influencer just made me win 10k in a week", here is how, use my links to <insert AI SaaS crappy site affiliate link that will pay me for every user that registers>, oh and here is the link for my $100 course, and my coaching and my 1-on-1 session, etc etc
AI is the new gold rush, everyone is selling prompts now, courses, etc.

a few years ago it was NFTs, then dropshipping, and now AI of course, they just hop on every trend they get so they can grift money out of it
>>
>>107542326
yeah it's the oldest trick in the book, if you're a boomer like me and the other anon, you recognize it immediately
common sense is if you're making 60k a week you don't need to spend weeks editing and publishing videos that basically give out your supposed secret sauce to making money
this kind of thing was already popular in the late 90s with people selling guides on "how to earn money on the internet"
>>
>>107542326
>>107542349
All the big yt youtubers don't know how to manually focus a dslr. But they will tell you which camera is good :^)
>>
File: ComfyUI_00073_.png (1.38 MB, 1504x1024)
>>
>>107541227
Doesn't mean shit when 90% of the bakers apply some retarded dataset pruning strategy
>>
>>107542349
what's funny to me is that they don't even try with all money making methods, I mean AI Phone Calls, what kind of bs is that, lol. To me it's just so obvious that they just ask ChatGPT to write a business idea, script, everything, they just copy paste and read for their videos, I mean why would you listen to some 18 year old kid about how to run a business lmao
>>
>>107542183
Interestingly I have some experience with this. I made a porn game as a passion project and it actually sold well enough to make it my main job. It completely sapped the fun out of life and sent me into a year long depression. I don't like being in the wage cage, but I feel happier having somewhere to go to than being at home all the time.
>>
File: ComfyUI_00074_.png (1.42 MB, 1504x1024)
>>107542308
THIS IS THE POLICE!!! STOP SLOPPING!!!

>>107542392
>>
File: ComfyUI_temp_laryv_00003_.png (3.41 MB, 1536x1224)
>car
>res3m
Boom, marketing material
>>
File: z_00181_.png (1.63 MB, 1280x1024)
>>107542415
i've been remote for 5 years now, you really have to make a conscious effort to hang out with people outside of work otherwise you'll feel like a depressed hermit in no time
>>
>>107542051
It doesn't mean base will magically get the 'sameface' problem fixed, it's a 6b-parameter model after all
>>
>>107542251
Her expression is spot on
>>
>>107542472
No, but with local you can just train in other faces as loras.
>>
File: z_00182_.png (1.36 MB, 1280x1024)
>>107542472
i'm actually really interested in the seed variance of base, i wonder if it is a core limitation of the model or it was introduced by the turbo distillation
>>
base doubters became real quiet today
>>
>>107542472
sameface is caused by basically a lora.
>>
>>107542485
It doesn't work like that retard, otherwise people would have "fixed" sameface with Flux, SDXL and other models
>>107542489
Lack of seed variation does seem to be a distillation problem.
>>107542501
Chroma is basically an undistilled 'pretrained' model and the sameface problem was present there too. The parameters are just not enough.
>>
>>107542466
Nice image, 60s + whimsical = win
>>
File: zimg_0127.png (2.27 MB, 1080x1440)
you guys can scream about base all you like but i'm real happy with turbo and this promptmanager thing. best zero dollars i've spent all year
>>
>>107542519
Hypothetically, if there's a model without the same face problem, how would you get consistent faces with it?
>>
One trick that -could- help mitigating the sameface problem is by adding noise mid-inference, and the noise covering the faces being generated by chance, but of course that would not work 100% of the time
>>
>>107542519
What makes you think they haven't ?

Do you even know why you get sameface ?

It's because during training the model generalizes everything under a concept, so you end up with a generalized female face, male face, child, old person, etc. The only way to avoid this is to have specific sub-concepts, like 'Donald Trump male' or 'Taylor Swift female'; those are new concepts and will only be generalized within their sub-concept.

It's the same for everything: a photo of a tree will be a generalized representation of all photos of trees, a pine tree will be a generalized representation of all photos of pine trees, etc.
>>
>>107542553
i think the problem with z turbo is random noise always converges to the same thing given the same prompt, so not sure that would work. maybe if you'd add a different degree of noise to each face or smth
>>
>>107542519
>>>107542501 (You)
>Chroma is basically an undistilled 'pretrained' model and the sameface problem was present there too. The parameters are just not enough.
Why do you think they can remove a baked in lora?
>>
>>107542501
Yeah I've found ethnicity + woman yields a wide variety of faces. Sometimes loras can fuck with that
>>
>>107542540
Which lora is that?
>>
>>107542540
proompt
>>
>>107542540
Well, it's not really as if Base replaces Turbo.

Turbo is a distilled model made to create aesthetically pleasing images very quickly.

Base is a non-distilled model made to be further finetuned, it will be slower to generate with (though hopefully not much slower) and will not create as aesthetically pleasing images out-of-the-box since it hasn't been trained with such a bias.
>>
>>107542472
Qwen and qedit are massive models and they are just as rigid
>>
>>107542585
One thing I found works is frowning breaks the lora. no guarantees.
>>
>>107542489
Base is probably gonna be more varied but will need more careful prompting to produce highly aesthetic images because everything I've done with image gen so far suggests that distilled models and merge/finetune models that maximize anatomical correctness/image quality (as well as certain loras) make art styles and output variety much worse but make the output more consistent and defined.
>>
>>107542602
qwen and qedit were trained on synthetic data (z-image was only trained on real data), and those models were finetuned with some rigid RLHF shit, the base model doesn't have that either
>>
File: zimg_0129.png (2.41 MB, 1080x1440)
>>107542590
psxzstyle (zishy style)

>>107542594
courtesy of qwen so excuse the slop
https://pastebin.com/uZnmAU6i
>>
>>107542602
Like this anon said >>107542619 , Qwen was obliterated by training with a narrow dataset of synthetic 4o slop, it's not a good benchmark of how good large models can be
>>
>>107542625
>A woman with long, dark, wavy hair stands in the same room
lel
>>
File: wpx_00002.mp4 (315 KB, 512x512)
>>
>>107542619
>z-image only trained on real data
:^)

ahahah
>>
>>107542625
>psxzstyle (zishy style)
Thanks, I would have never suspected that given that the showcase pics from Civitai mostly look like generic pro photo slop
>>
File: ComfyUI_02682_.png (1.61 MB, 1408x1088)
>>
File: zimg_0149.png (1.94 MB, 1080x1440)
>>107542601
yeah i'm still looking forward to it, but we've barely even had time to get bored of what we have already is all.
>>
>>107542601
>will not create as aesthetically pleasing images out-of-the-box since it hasn't been trained with such a bias.

We don't know yet if the model had post-training or if it is a pretrained-only model
They could very well pull a BFL (Flux Pro/Dev) and fine-tune the >=30step >3cfg model on "good aesthetics"
I too would prefer if it was a model with zero post-training even if the outputs are not as pleasing, with lots of watermarked internet images, unprompted memes, lowres images etc, base Chroma is like that and it's soulful as fuck since it can do "organic" stuff and is easily steerable with loras
>>
File: FACE ME.png (140 KB, 640x360)
DO NOT HIDE. WHERE IS YOUR "CHINESE CULTURE" NOW? ANSWER US, COWARD
>>
>>107542308
>>107542113
This is not erotic, the feet are gross and uncanny.
>>
>>107542747
That's why spider schizo posted 1 time and then fucked off.
>>
File: 1765099061953846.jpg (459 KB, 1250x1566)
>>107542653
it's true though :(
>>
>>107542728
Ideally, a really good model should be able to do everything, ranging from ChrisChan-tier artworks (or shitty deviantart artworks people used to do back in the day), to high-quality "cinematic" slop AI labs love so much
>>
File: this.png (83 KB, 225x225)
>>107542747
>>107542765
Ikr, they all bark when everything is uncertain but once they realize they were doing retarded prediction they just leave instead of admitting they were acting retarded
>>
File: ComfyUI_02575.png (3.88 MB, 1536x2048)
>>107542276
What am I looking at here?
>>
>>107542803
a FAS goblin? why you askin me
>>
File: 1750988909282202.png (1.48 MB, 1216x1024)
>
>>
>>107542822
RUDE!
>>
>>107542853
kek
>>
>>107542865
Heh, kiddo, he is just stating facts.
>>
>>107541896
things we know about the lolcow catjak:
>half-black amerimutt
>on welfare
>not a single contribution to anything open source
>schizophrenic off their meds
>sonic and ben10 diaper scat enthusiast
>samefags
>uggo slopstyle gilfs
>worse than debo
>jealous of ani's talent
>had been malding for three years straight (and counting!)
>falseflags as people he doesn't like
>splitbakes
>>
File: chud_button_press_3.mp4 (419 KB, 640x480)
>>
>>107542880
his tiny hand is great
>>
>>107542790
>Concern trolling is when some faggot shows up in a thread pretending to give a shit about you or the topic, but they’re really just slipping in thinly veiled insults or trying to derail shit with fake worry. Like “oh no I’m just concerned you’re wasting your life on this game” while they stir the pot and farm (You)s. It’s faggy, low-effort bait and everyone sees through it after the first reply.
t. kimi 0905
though the "low-effort bait and everyone sees through it after the first reply" is not quite true but eh it is what it is
>>
>>107542886
yea, the first last frame for WAN 2.2 is breddy gud so far. Just wanted to test it out.
>>
>>107540334
kek, he looks like he was genned with an early SD1.5 finetune then shoehorned in that way

>>107539485
noice.
moar? plz
>>
File: zit_00004_.png (1.66 MB, 1504x1024)
>>107542625
thanks, turns out you can break zit without resorting to chaining samplers. Here, it's just busted using a prompt. I just put words in random extra places, copy/paste around, just butcher the text.

>>107542803
two prompts, basically. kinda like if you did i2i and you start the sampler at like 6/9. idk. the thing is each gen is different, which is why I'm a huge skeptic of online services that just let you control the prompt.
>>
>(((low angle view, from below)))
>get a high angle selfie
thanks
>>
>>107542625
>>107542933
oh yeah, I also use a strong weird NAG
>>
>>107542853
Finally the end of the jews!

I didn't think I could like Z-Image more, but now I do
>>
>>107542896
You are very gifted.
>>
>>107542936
>(((low angle view, from below)))
(((OY VEY)))
>>
>>107542944
It's weird, because it can't do swastikas.
>>
>>107542948
There's really not much to it, if you set up the first and last frames correctly
>>
>>107542979
Yeah but you are an artist.
>>
>>107542994
trapped within a cage of logic
>>
>>107542950
(((rubs hands))), 1girl
>>
>>107541072
nice, photorealism is for fags
>>
File: 1765661191000354m.jpg (85 KB, 723x1024)
local can't stop winning!
>>
>>107543032
The bubble is bubbling
>>
>>107543032
they're giving it to nvidia, who is giving it to... oracle? who is giving it to... I dunno, probably burning it in a big pile and dancing around it like pygmies
>>
File: zit_00005_.png (1.66 MB, 1504x1024)
>>107542933
I want to emphasize how weird this is. This gen looks like I messed something up.

All I messed with is the prompt, it's full of jumbled words that are messed up and nonsense.
>>
>>107543051
if that were the case then loss would be 0. there is a hole in the boat
>>
>>107543015
does your mind overflow with poetry and distant horizons?
>>
>>107543106
>>107543106
>>107543106


