/g/ - Technology

So Sweet and Naive Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107452633

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
>>107455917 (Cross-thread)
>google were too big of a pussy
desu they've gotten pretty based recently, nano banana pro is still outputting IP characters and celebrities, I thought they were going the OpenAI route, make the model based the first week to get engagement and then cuck it
>>
File: file.png (137 KB, 2627x336)
xir's right chud!1!1!!
>>
>>107455989
It's all so predictable. I feel like that guy from the matrix that says the green lines on the screen are people and stuff.

Where you see a promise to open source the model, I see a bait and switch scheme with a distill as the carrot.
>>
>>107455942
>IP characters and celebrities
Naked?
No?
>>
>localkeks really believed alibaba would spoonfeed them a fast uncensored 6b hentain base model trained on loli
brownoidal delusion at its finest. local is eternally losing. you will pay for novelai's finetune
>>
>>107456013
>Where you see a promise to open source the model, I see a bait and switch scheme with a distill as the carrot.
Alibaba promised Wan 2.1 -> they released it
Alibaba promised Wan 2.2 -> they released it
Alibaba promised Qwen Image -> they released it
Alibaba promised Z-image base/edit -> ...
See the pattern?
>>
baba paid bghira to pretend like she was going to report tongy so they could ban her and look like even more of an open sores hero
>>
File: 1739804598967758.jpg (1.38 MB, 2048x2048)
>>107456015
>but muhh coom
not everyone is an addicted coomer anon, I want to make quality memes, and NBP is the best place for that
https://www.reddit.com/r/nanobanana/comments/1pfhd7q/spaghetti/
>>
>>107456026
Yeah I don't see 2.5 or even the latest QIE. I see a company moving from open source to SaaS and I see a fool who defends them despite them being about to plunge a knife into their back.
>>
>>107456051
>2.5
was never promised
>>107456051
was never promised

See, this is how it works at Alibaba
>When they promise something -> they always release it
>When they don't promise anything -> It's likely API only
>>
File: file.png (91 KB, 2101x320)
>>107455989
even on trooncord comfy gets clowned on lmao
>>
>>107456065
>T-they released the distill before the base model because they wanted to epically pwn BFL!
>I-it's not because they can't and won't.

You are a retard. A fool. Incapable of understanding Chinese culture (a point you keep dodging around) and will never get the base model.
>>
File: that is correct.gif (118 KB, 220x204)
>>107456075
>T-they released the distill before the base model because they wanted to epically pwn BFL!
That is correct, and it's the most brilliant marketing strategy I've ever seen in the field of AI. They destroyed BFL's credibility with this move.
>>
>>107456093
He looked so young in that image.
>>
File: DO IT.png (346 KB, 2635x934)
>>107456070
do it Comfy, hire the turkish god!
>>
>>107456097
I feel like your 70's is your roughest decade, you're getting so old between 70 and 79, it's the real moment of decline
>>
>>107456051
This would be a far better argument against BF lab, which did fuck the community in every way possible and only answered with a wishy washy "someday now" for 7 months about releasing their new model.

As it is now, it's what, 3 weeks? I can wait another week. It's not like it's BFL buckbreaking people for 1 year straight. You're overreacting.

If Alibaba proves itself BFL tier then it would be a problem, but Alibaba has absolutely proven itself not BFL tier until now. Benefit of the doubt, you know?
>>
What happened to Qwen Edit? There was supposed to be a November release for that too.
>>
>>107456116
Oh, you just ASSUMED there would be a release? How entitled of you. Did they ever SAY there would be a release?
>>
>>107456123
>Did they ever SAY there would be a release?
they didn't, so yeah, can you guys stop making stuff up and only stick to the facts?
>>
>>107456116
>What happened to Qwen Edit?
the Qwen team saw how powerful Z-image edit is and they gave up kek, they know no one is gonna touch their giant 20b slopped model anymore once Z-image base and edit are released
>>
File: 1754872144818773.png (179 KB, 1551x418)
Something unexpected definitely happened.
>>
>>107456141
>and they gave up
...on open source releases that is!
>>
>>107456153
The model clearly didn't pass the Chinese open sourcing criteria.
I don't understand why people don't know this is a thing.
>>
>>107456167
>the Chinese open sourcing criteria.
yet the most uncensored model we've gotten those past 2 years was HunyuanVideo, this shit can make vaginas and dicks lol
>>
You guys are still arguing about the fucking base model?
>>
>>107456167
I have a simple question though, if they knew they would never release the base and edit model, why did they promise it then? They could've just said "we only release turbo and the rest will be API, just to be clear" and no one would give a fuck, lying about it makes everything even worse
>>
>>107456188
>why did they promise it then?
The didn't. You just do not understand Chinese culture and what they mean when they say things.
>>
Personally I'm fine with Turbo. Flux 2 proved how powerful distilled models can be. We as the community can just de-distill it ourselves. And SDXL is still such a powerful model, it's not like we need a new base anyway.
>>
File: 1761448121969895.jpg (1.14 MB, 1850x2625)
>>107456196
>The didn't.
>>
>>107456206
Why do you keep ignoring my point about Chinese culture?
>>
>>107456226
I don't care, a broken promise stays a broken promise, doesn't matter if the chinks think it's a normal thing
>>
>>107456238
I think you're in denial about the very obvious writing on the wall about this thing never being released.

>Constant pushing around of timeline
>Vague confirmations
>Imprecise language regarding hurdles to releasing base model

Maybe you only deal with large releases, so you're not used to this pattern, but to me this is a very obvious walking back of open sourcing ambitions.
>>
>>107456253
didn't alibaba betray the /lmg/ fags like that? I'm sure they promised some Qwen llm model at some point and it didn't happen
>>
>>107456300
I'm not 100% on the specifics but they were pissed for some reason.
>>
>>107456206
>Tengyi
>>
File: AnimateDiff_00001.mp4 (1.81 MB, 976x720)
A hero.
>>
>>107456376
never got this "grab them by the pussy" meme, he said quote: "those women LET ME grab them by the pussy", so it was consensual lol
>>
>>107456413
You forgot to ask me: I do not consent.
>>
>>107456419
you're not a woman though, you will never be a woman
>>
File: angel_0837.jpg (1.03 MB, 1768x1768)
Looking at flux 2's size, starting to feel like my increasingly venerable 4090 is well on its way towards obsolescence. Going to need to run it at, what, Q3 or Q4? Maybe Q5 could be squeezed?
>>
I got comfyui up and running for the first time on my amd gpu on windows, it works like 5 times and then I get a ksampler bug that freezes the screen and leaves the computer unresponsive
it's probably driver issues because I wasn't using even 10gb of the 16 gb vram pool, this stuff on windows is too experimental still
>>
File: media_1764539243.png (1.43 MB, 768x1280)
Animate her
>>
>>107456426
2 gpus on your normal desktop is the future until they start making 48gb mainstream gpus in 2030
>>
File: media_1764539238.png (1.46 MB, 768x1280)
Elf in dwarf's hat
>>
File: 1764650938983775.png (2.77 MB, 1920x1080)
https://www.reddit.com/r/comfyui/comments/1peywrf/100million_pixelworkflow_for_zimage
OMG GUYZ, 100 HECKIN MILLION PIXELS
https://www.youtube.com/watch?v=-LHpR8uYTIs
>>
>>107456426
i'm offloading about half of it and it's not as slow as you'd think
>>
>>107456413
You mean they let the rich and influential guy running the beauty pageant do it? In the similar vein as choir boys let the priest do it.
>>
File: oof.jpg (457 KB, 2371x1434)
>>107456426
what's the point of using flux 2 though? it's more slopped than Z-image and the anatomy is way worse
>>
File: Zurbo_00023_.jpg (740 KB, 3328x1792)
We still arguing about base?
>>
>>107456457
rich and influential guys shouldn't be allowed to have sex with women anymore then? that's your point?
>>
>>107456463
they are both kind of bad, picture 1 is sameface asian yet again while picture 2 is sloppy but has a stock photo feel at least
>>
File: MAKE IT STOP.png (135 KB, 330x413)
>>107456469
>We still arguing about base?
I can feel this is gonna be long, not only we'll have to wait for the base, but those bickering will not stop until then
>>107456474
>a stock photo feel at least
yeah I sure love my stock photo to have hands with 14 fingers
>>
>>107456463
Some styles and general non-human gens are very nice on flux. I really like it for that. Other than that, meh.
>>
>>107456479
>but those bickering will not stop until then
Get ready for eternal bickering then.
>>
>>107456425
>obsessed
>>
how are you guys working around the same-pose with different seeds on z-image?
>>
>>107456509
>I was the one who pretended to be a woman in order to create the possibility of being sexually assaulted by Drumpf. But let me tell you how you're the obsessed one.
>>
The more I look at zit gens the less real they look
>>
File: Wanimate_00112.mp4 (1.65 MB, 976x720)
>>107456376
>>
>>107456530
I got a surprise, they're not
>>
File: BURN MIKU.png (1.6 MB, 1280x720)
The base model is still not released because she's still with us, just saying.
>>
File: Z-image turbo.png (1.3 MB, 1280x720)
>>107456530
they still look pretty damn real to me
>>
>>107456527
-> /pol/
>>
can the zimage even make white people? I think it defaults to asians
>>
>>107456530
I mean, even Nano Banana looks fake to me
>>
>>107456556
-> /lgbt/
>>
File: kek.png (1.21 MB, 1280x720)
Next weekend right?
>>
Gemini can use images as prompts (not img-to-img. if the image has text or any design cues connected to the text prompt it will use those, you can create accurate art of characters or "adaptations" of books this way), is there any local model that can do the same?
>>
File: file.png (66 KB, 225x225)
>>107456539
THE TURKISH GOD DOES WHAT THE FUCK HE WANTS!
>>
>>107456559
>I think it defaults to asians
for an asian model that's coherent
>>107456559
>can the zimage even make white people?
of course, you have to say it though, "a caucasian woman" or "a white woman"
>>
>>107456439
I shan't be animating your saas giga-slop

faggot
>>
File: ancient-tree.jpg (2.29 MB, 2688x1536)
>>107456440
I'm probably out at that point.
>>107456452
Interesting, is that the Q8 quantization?
>>107456463
I feel like if I actually cared about the subject in the prompt, neither image is acceptable, but right is closer. Because the hands and other minor anatomy issues can be fixed easily enough with inpainting, but the background scene is closer to looking like a real argument, albeit one of the men looks more like laughing than arguing. Also a little more variety in face and hair on the right, maybe 2face compared to the 1face on the left. Left looks like the same two people at two different moments in overlapping frames while right looks like an actual dynamic scene albeit with various problems.
>>
>>107456048
This picture is obviously fake
>>
im so disappointed i will never be able to call anon fucking retarded when base releases and it's bigger than 6B.
>>
>>107456048
how much are you paying for making memes lol

chump
>>
File: ZiMG_0786.jpg (446 KB, 1344x1728)
hows your weekend goin fellow proooooooooompters?
>>
>>107456781
>how much are you paying for making memes lol
says the localkek paying Nvidia thousands of dollars to get a beefy gpu btw
>>
i swear some of zit loras on civ are fucking placebos

i can turn it off and it shows the same shit but less blurry
>>
>>107456742
>but the background scene is closer to looking like a real argument
they're all smiling though, where's the argument, it looks like they are throwing a party
>>
>>107456806
Historically civitai loras have mostly been garbage made by retards. 97.5% is like this.
Same level as their example images and prompts...
>>
>>107456806
ostris made a v1 adapter, then a v2 adapter, then he undistilled turbo so that the loras trained on it get better, it'll take time but it'll get better
>>
>>107456806
no shit. most loras do things the model already knows.
>>
>trying to use ultimate sd upscale
>zit keeps genning body parts and vaguely melted faces into random background objects
Spoopy
>>
File: Flux2Img_00023_.png (2.15 MB, 1152x1440)
warm, waterproof, easy to clean, clear garments are clearly :) the future
>>
File: ZiMG_0792.jpg (373 KB, 1344x1728)
>>107456789
>>
>>107456905
are they posing in front of a sun? jesus that is so bright
>>
>>107456093
>They destroyed BFL's credibility with this move.
retard
>>
>>107456908
ZiMG struggles with reflections in mirrors
>>
My current favorite pastime is opening the gens itt and trying to figure out the hyper specific fetish of the anon who posted them
>>
File: ComfyUI_temp_mlzyt_00019_.png (3.2 MB, 1088x1920)
>>107455809
Just i2i the image with sdxl
>>
>>107456929
You won't call me wrong though.
>>
it is wrong because with Flux2 being what it is, it was dead anyway
>>
File: ZiMG_0797 .jpg (1.38 MB, 1254x1620)
>>107456908
>>107456935
it works most of the time, some fuckups, yes.
>>
File: oof.png (33 KB, 800x232)
>>107456929
>>107456093
>Number of loras on Z-image turbo: 396
>Number of loras on Flux 2 dev: 26
oof
>>
>>107456999
Trips of truth
>>
>>107456048
So it's censored
Got it
>>
>>107456999
actually this is a major win for Flux 2. Flux 2 takes way longer to train and is way higher parameter count. If you compare compute-hours, Flux 2 is in the lead for hours spent on lora training. I think we should start switching towards Flux 2
>>
Number of loras on Chroma:
Hard mode: Number of non-furry loras on Chroma:
>>
File: he's good.png (175 KB, 640x610)
>>107457032
>I think we should start switching towards Flux 2
>>
>>107456999
>Z-image is 5.3x smaller than Flux 2 dev but has 15.2x more loras
lmao
>>
>>107457029
local models are censored too, can you do Clint Eastwood holding Marilyn Monroe with a local model? nope, because they're censoring their dataset so that it doesn't know celebrities
>>
File: ComfyUI_00856_.png (2.17 MB, 1152x1536)
>>
File: ComfyUI_00854_.png (2.27 MB, 1152x1536)
>>
>>107456999
You can easily OOM on a 6000 Pro training Flux.2 and most people seem to have 8GB GPUs or less, so...
>>
File: GREAT SUCCESS.png (279 KB, 600x302)
>>107457077
it's convenient to make a big model desu, everything is excused when it's big
>yeah it looks like shit and it's slopped... but you say that because you can't run it!!
>yeah people prefer to make loras on the smaller model that can make actual realistic humans... but it's just because they can't train the big model!!
Just make a big model and every criticism towards the actual quality of the model can be ignored! GREAT SUCCESS

Btw, HunyuanImage 3.0 is the best model since it's the biggest local model of them all, it's that simple
>>
>>107456999
>Number of loras on Z-image turbo: 396
Most of them are synthslop desu and shouldn't count. There's no QC on civit.
>>
>>107457077
damn, this just shows how dedicated the flux 2 community is! flux 2 really is the future of local
>>
>>107456742
What model made that image?
>>
File: mermaid-aquarium.jpg (1.74 MB, 2704x1544)
>>107457122
Pixelwave flux. I'd post the workflow but it is an old picture and I don't have it anymore.
>>
File: ZiMG_0802.png (3.83 MB, 1344x1728)
>>107456997
>>
>>107456997
that girl needs to eat!
>>
>>107457060
Projection overload
>>
>>107457150
holy moly what an ugly woman
>>
>>107457194
What do you mean?
>>107456977
>it is wrong because with Flux2 being what it is, it was dead anyway
Z-image changed the paradigm of everything, when only Flux 2 dev was here people were like
>yeah it's big and it's not much of an improvement, but what can they do more? maybe it's not possible to optimize more without stacking more layers, so I'll excuse BFL on that one
Then Z-image turbo came out and people were like
>Holy shit you can achieve this with a 6b model? BFL is actually an incompetent company when all things are considered!
>>
Kleinsisters, do we have an ETA?
>>
>>107457218
I think it'll be the first time we'll get a size distilled model right? will this be hard to train too?


