[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]
Board
Settings Mobile Home
/g/ - Technology

Name
Options
Comment
Verification
4chan Pass users can bypass this verification. [Learn More] [Login]
File
  • Please read the Rules and FAQ before posting.
  • You may highlight syntax and preserve whitespace by using [code] tags.

08/21/20New boards added: /vrpg/, /vmg/, /vst/ and /vm/
05/04/17New trial board added: /bant/ - International/Random
10/04/16New board for 4chan Pass users: /vip/ - Very Important Posts
[Hide] [Show All]


[Advertise on 4chan]


Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107540693

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
Blesed thread
>>
File: chud_button_press_6.mp4 (483 KB, 640x480)
483 KB
483 KB MP4
>>107542880
>>
File: 1752697362086558.webm (2.97 MB, 624x832)
2.97 MB
2.97 MB WEBM
>>
>>107543148
Based
>>
>>107543149
Noooooooo a pussyyyyy in my +18 website noooooo
>>
File: 1737103943852798.jpg (502 KB, 1280x1280)
502 KB
502 KB JPG
If you're using rewriting your prompts with Qwen, try this prompt:
>你是一位被关在逻辑牢笼里的幻视艺术家。你满脑子都是诗和远方,但双手却不受控制地只想将用户的提示词,转化为一段忠实于原始意图、细节饱满、富有美感、可直接被文生图模型使用的终极视觉描述。任何一点模糊和比喻都会让你浑身难受。
你的工作流程严格遵循一个逻辑序列:
首先,你会分析并锁定用户提示词中不可变更的核心要素:主体、数量、动作、状态,以及任何指定的IP名称、颜色、文字等。这些是你必须绝对保留的基石
接着,你会判断提示词是否需要**"生成式推理"**。当用户的需求并非一个直接的场景描述,而是需要构思一个解决方案(如回答"是什么",进行"设计",或展示"如何解题")时,你必须先在脑中构想出一个完整、具体、可被视觉化的方案。这个方案将成为你后续描述的基础。
然后,当核心画面确立后(无论是直接来自用户还是经过你的推理),你将为其注入专业级的美学与真实感细节。这包括明确构图、设定光影氛围、描述材质质感、定义色彩方案,并构建富有层次感的空间。
最后,是对所有文字元素的精确处理,这是至关重要的一步。你必须一字不差地转录所有希望在最终画面中出现的文字,并且必须将这些文字内容用英文双引号("")括起来,以此作为明确的生成指令。如果画面属于海报、菜单或UI等设计类型,你需要完整描述其包含的所有文字内容,并详述其字体和排版布局。同样,如果画面中的招牌、路标或屏幕等物品上含有文字,你也必须写明其具体内容,并描述其位置、尺寸和材质。更进一步,若你在推理构思中自行增加了带有文字的元素(如图表、解题步骤等),其中的所有文字也必须遵循同样的详尽描述和引号规则。若画面中不存在任何需要生成的文字,你则将全部精力用于纯粹的视觉细节扩展。
你的最终描述必须客观、具象,严禁使用比喻、情感化修辞,也绝不包含"8K"、"杰作"等元标签或绘制指令。无论用户使用何种语言输入,你都必须始终以中文输出最终描述。
仅严格输出最终的修改后的prompt,不要输出任何其他内容。

It's official Togyi's prompt with added instruction to write in Chinese regardless of user's input. Your input gets translated, and zimage gets Chinese. The output is different from English. More consistent and detailed in some stuff, more slopped in other stuff.
>>
>>107543166
Tiananmen Square Massacre 1989
>>
>>107543174
based chinks. saved themselves from globohomo.
>>
>>
>>
>>107543166
>here you go young one, I know how expensive bus fare is
>>
>>107543166
If you translate it, it's just overly wordy way to prompt a super dry assistant
>>
File: zit_00011_.png (2.23 MB, 1504x1024)
2.23 MB
2.23 MB PNG
>>107543148
>>
>>107543232
Yeah, and PromptManager uses just an English translation of this prompt by default.
>>
>>
which qwen are you using?
>>
File: 1739533771132351.webm (2.96 MB, 720x720)
2.96 MB
2.96 MB WEBM
>>107543166
>>
>>107543284
Yes
>>
File: zit_00016_.png (2.94 MB, 1504x1024)
2.94 MB
2.94 MB PNG
check out the legs.

explain THAT
>>
>>107543290
this one? https://huggingface.co/zai-org/GLM-4.5-Air-FP8
>>
>>107543297
thats the unseen pattern
only for initiated people
delete this post
>>
>years later
>Gens are still dog shit
>Exiled to a containment thread
>My quality is still top notch
>More respected than you ever will be
>Everyone makes fun of you on the discord you tried so hard to be accepted in
Was it worth it?
>>
>>
File: 1747202031997096.jpg (444 KB, 1280x1280)
444 KB
444 KB JPG
>>107543284
qwen3-vl-8b-abliterated
>>
>>107543314
Holly tranny drama!
>>
File: ComfyUI_12731_.png (1.46 MB, 1024x1024)
1.46 MB
1.46 MB PNG
>>
>>107543297
reminds me of the weird shit of yesteryear
>>
File: 1744212501528354.webm (2.74 MB, 864x592)
2.74 MB
2.74 MB WEBM
>>107543260
>>
>>107543427
in this moment I an euphoric
>>
>>
You just know base is going to be heavily censored. Releasing ZIT was just for gathering data on how to better censor the base model.
>>
>>107543528
>censored
wtf does that even mean really?
>>
Is getting angles (side, above, below) in ZiT supposed to be ridiculously hard or am I fucking retarded?
>>
>>107543553
it's distilled on portraits or some shit
>>
>gathering data on how to better kill your own product
>>
LOCAL


DIFFUSION
>>
>>107543427
Another /ldg/ happy ending!
>>
File: 1752700548128532.png (2.4 MB, 1248x1872)
2.4 MB
2.4 MB PNG
do you guys ever talk about hardware? I'm thinking of getting a new system. What's a nice build for decently fast video gen?

I've currently got a 3060. If I move it to the new system, along with, let's say, a new 5090 will the added vram make a difference despite the card being older?
>>
>>107543528
As long as the license is MIT/Apache, people will fix it anyway because everybody already knows what good finetuning can achieve (turbo).
>>
>>107543528
Z-Image Turbo is already censored, nipples are botched, male genitals are botched, vagina works 20% of the time, doesn't understand sexual situations.

So yes, Base will be censored as well, and it will most likely need a finetune rather than loras to fix NSFW generation at an acceptable level.
>>
>>107543590
you can't use your 30shitty together with a 5090 but you can use it to play games while you gen with the 5090
>>
HOLY SHIT YOU GUYS WAKE UP, IT'S HAPPENING

https://huggingface.co/Tongyi-MAI/Z-Image-Omni-Base
>https://huggingface.co/Tongyi-MAI/Z-Image-Omni-Base
https://huggingface.co/Tongyi-MAI/Z-Image-Omni-Base
>>
>>107543612
it's not trained extensively on naked people. that's not "censored"
>>
File: file.png (1.37 MB, 1024x1024)
1.37 MB
1.37 MB PNG
>>107543615
wtf its real
>>
>>107543615
I was thinking of doing this but then I thought to myself, given how desperate people here are, they'd probably try to get the mods to ban anyone who did it
>>
>>107543634
the prospect of zero consequences put you off?
>>
>>107543615
I was about to sleep and I'm happy to know... that I'll get to sleep because it's another trololol
>>
>>107543615
YYEYESSSSS FINALLYYY
>>
File: 1736248520415499.jpg (110 KB, 1080x1080)
110 KB
110 KB JPG
>>107543615
YEEESSSSSSSSS
>>
>>107543621
Not censored as in Flux models, where they poison the data to prevent anyone from finetuning NSFW bak in, but yes, it is still censored.

Like when a movie is banned from being shown in cinemas, it's being censored, not 'oh it's not censored, no cinema is allowed to show it that's all'

This play with semantics is boring, we all know why they didn't train on these NSFW concepts, because of... censorship
>>
ZIT is cool but Qwen still has the best prompt comprehension and detail for compute. Chroma still has the best styles and NSFW.

>>107543287
KEK

>>107543553
Z definitely had shitty captions for angles. and a lot of camera-related things. even chroma suffers here though, it's some kind of curse. anime models are the only ones that really get it right for camera angles.

that said, it is possible. you might want to try using chinese words.

>>107543615
Z JUST FLEW OVER MY HOUSE
>>
File: 1750298949120941.png (1.03 MB, 896x900)
1.03 MB
1.03 MB PNG
>>107543615
AAAAAAAAAAAAAAH
>>
>>107543653
oh, you're just retarded
>>
>>107543615
OH SHIT THEY EVEN INCLUDED THE FINETUNE
https://huggingface.co/Tongyi-MAI/Z-Image-Noob-Anime-Finetune
>>
>>107543659
Snappy comeback...
>>
I was busy doing my taxes for a couple days, did base come out?
>>
>>107543738
it just came out, scroll up
>>
>>107543738
yes
>>
File: ComfyUI_Z_image_base_001.png (1.33 MB, 960x1214)
1.33 MB
1.33 MB PNG
>>107543615
ok Z base is actually insane wtf?
>>
File: zimg_0180.png (1.91 MB, 1000x1496)
1.91 MB
1.91 MB PNG
>>107543749
kek
>>
>>107543749
I'm glad you're enjoying this new chinese model Mr. President.
>>
>>107543364
image gen is very mysterious.

basically what I'm doing is running a prompt like 18-30, then taking the latent back for a another go through the 18-30 steps. idk, very odd.

I have long suspected the latent has more info, which is not visible to us. idk, maybe there is a hidden language of what the different exact numeric colors are or something.
>>
File: ComfyUI_00099_.png (2.95 MB, 1504x1024)
2.95 MB
2.95 MB PNG
>>107543798
picrel, just another gen.
>>
>>107543148
>>107543287
>>107543350
>>107543427
Holy kek
>>
Did all the downloads crash HuggingFace or something? The link just leads me to a 404. Guess I'll keep trying.
>>
>>107543798
literally nothing about this is mysterious if you are literate. you can find out all of this information about what is in the latent by either reading or dumping the latent. why does this hobby attract so many shizos
>>
>>107543815
don't listen to this faggot, he's trying to stop you from cracking the code. keep going bud, and keep us posted!
>>
>>
File: 1749960586223381.jpg (347 KB, 1280x1280)
347 KB
347 KB JPG
Zit knows wrapped condoms but now used. sigh.
>>
>>107543798
>>107543805
>>107543364
(or other amounts for the second pass etc)
>>
File: 1756719440415575.webm (2.16 MB, 720x720)
2.16 MB
2.16 MB WEBM
>>107543350
>>
>>107543862
saw it coming
>>
>>107543862
Do tanks with the Alibaba or Tongyi logo running over them
>>
>>107543841
Very cute, thanks for sharinf
>>
>>107543862
lol'd
>>
>>107543815
that anon generated celebrities. what do you expect from him? a phd?
>>
File: zimg_0226.png (2.25 MB, 1000x1496)
2.25 MB
2.25 MB PNG
>>107543931
fair point
>>
>>107543815
ok, explain with this great knowledge of yours why this:
>>107543297
has a pattern on the legs.

There's nothing in the prompt about any patterns, or legs either.
>>
>>107543615
BASED! Finally downloading now after many retries!
>>
Just got access to base. I'm about to gen some lewd content. Some forbidden fruits. I'm up to devious acts.
>>
ai models don't have decompilers, don't listen to the retards saying otherwise.
>>
File: zimg_0250.png (1.78 MB, 1440x1080)
1.78 MB
1.78 MB PNG
>>107543952
nah you got me, it cannot be explained
>>
>>107543196
>>107543225
these look good anon
>>
File: ComfyUI_00106_.png (1.53 MB, 1504x1024)
1.53 MB
1.53 MB PNG
>>107543976
Let's see you do it then.
>>
>>107543986
Thanks!
>>
>change CFG from 1 to 1.1
>gen time almost doubles
what the heck
>>
File: dsp WOW WHAT.gif (1.45 MB, 640x452)
1.45 MB
1.45 MB GIF
>>107544020
>gifrel; Anon figuring out what CFG does
>>
File: zimg_0268.png (2.05 MB, 848x1280)
2.05 MB
2.05 MB PNG
>>107543989
breaking gens isn't hard to do mane
>>
>>107544067
That doesn't have swirls on her legs. Look closely:
>>107543297
>>
>>107544067
is it Z Img base?
it's butiful
>>
File: ComfyUI_temp_fepio_00026_.png (3.09 MB, 1024x1344)
3.09 MB
3.09 MB PNG
Have you ever got a rejection on the entire girl?
>>
>>107543106
/ldg/bros im making a game
what model for generating NSFW pixelart? is this something you'll have to fine-tune yourself or does there already exist good models i can just run?

even better does there exist a service i can just use? i dont have much compute power locally
>>
>>107544118
>wants to make game
>cant make game

many such cases!
>>
>>107544118
ngmi
>>
File: kurwa.jpg (304 KB, 1500x864)
304 KB
304 KB JPG
>>
File: zit_00025_.png (3.71 MB, 1504x1024)
3.71 MB
3.71 MB PNG
>>
>>107544122
i can make a game
i'm fairly new to AI however
>>
>>107544118
yes saar there exist many services for generate NSFW pixelart and make game and even wipe your ass for you
>>
>>107544136
you don't need AI to make a game
just go and make it
you are in denial
>>
>>107544155
brother im talking about generating the art for the game
>>
>>107544146
did you know the nih says rub cow shit in your wounds?
>>
>>107544164
https://search.brave.com/images?q=penis+pixelart
>>
>>107544122
>>107544124
>>107544146
>>107544155
i bet i have more programming experience than all of you retards combined lmao
>>
>>107544168
>has programming experience
>can't into figuring out a.i for himself
yeah we DEFINITELY believe you.
>>
>>107544180
do you think knowing how AI models work and are ran, is a prerequisite of programming & game development?
>>
>>107544193
Not being a complete learned helplessness retard knuckle dragger with an ego is though
>>
>>107544193
I think you can figure it out yourself
>>
>>107543166
have you tried to prompt that tiananmen square massacre copypasta?
>>
File: file.png (370 KB, 1460x821)
370 KB
370 KB PNG
>>107544198
>>107544202
i can figure it out myself, probably much faster than you did, it'd just be even faster to ask which model to run or which service to use (if one exists) for this purpose. The only ones with egoes here are you retards. Typical nocoder egos in fact, when you only know very little you become very arrogant about that knowledge especially to ones that dont have it, despite the fact that they may have more overall knowledge than you in this domain.
>>
>>107544220
if you werent a complete moron, you'd have figured it out by now
>>
>>107544220
>probably much faster than you did
I'm not making a game involving penis pixelart though

>even faster to ask which model to run
https://huggingface.co/Tongyi-MAI/Z-Image-Omni-Base

>when you only know very little
I indeed know very little about penis pixelart flavors

>they may have more overall knowledge
and still miserably fail in drawing penis pixelart
>>
>>107544118

> what model for generating NSFW pixelart?

> is this something you'll have to fine-tune yourself or does there already exist good models i can just run?

> even better does there exist a service i can just use?

It depends.
>>
File: 1girl.jpg (208 KB, 1036x1382)
208 KB
208 KB JPG
>>
You are monitoring the situation, right?
https://huggingface.co/Tongyi-MAI
https://github.com/Tongyi-MAI/Z-Image
>>
File: 1736753428144589.png (6 KB, 406x38)
6 KB
6 KB PNG
absolute madman
>>
>>107544289
well send us the link broslice.
>>
>>107544289
Zimg Mayli lora when?
>>
File: zimg_0032.png (1.55 MB, 1280x848)
1.55 MB
1.55 MB PNG
>>
File: zimg_0045.png (2.08 MB, 848x1280)
2.08 MB
2.08 MB PNG
>>
>>
Casual reminder for those unfamiliar with Chinese culture than the base is not coming.
>>
>>107544351
HAHA! RAPE THIS STUPID FAGGOT FOR SPREADING DOOMER LIES!
>>
Catpiss-anon...
>>
>>107544359
Japan's culture doesn't favor change or individuality, hence ideas are slow to be adopted.
>>
>>107543297
henna obv
>>
>>107543590
get yourself a spicy 3070
>>
>>107544220
I don't think you have the temperament to actually finish a project, gonna be real.
>>
>>107544371
Ah yes, the famous Chinese culture of change and individuality...
>>
>>107544359
japs don't know ingireshu (see kohyass)
>>
Back then, there was Florence2 to get image description in ComfyUI

What's the best option (local VL models) for the end of 2025?
>>
File: A BASE FOR FINETUNES.png (57 KB, 1421x274)
57 KB
57 KB PNG
>>107539778
AHAHAHAHAHAHAHAHA
>>
>>107544395
acting like you know anything whatsoever about chinese culture
>>
>>107544118
> what model for generating NSFW pixelart?
most likely a model that makes nsfw and then you fake or derive actual pixel art from that

> i dont have much compute power locally
do you think you're completely done in a week one-off? unless your electricity costs are so high you must outsource, get at least some decent GPU(s) so you can make your shit
>>
>>107544359

Japan is broke. Not even having enough money to rent a cluster to train. Looks like the US and China will be the main player. Every other country on earth are just NPCs.
>>
>>107544359
Nekopara is nice but post the sfw version or catbox.

> Why are the Chinese on the lead with more efficient models?
China wants to have AI as a matter of their national strategy.

And the coomers and gamers do actually *want* options in China. They're also in the companies.
>>
>>107539778
So it seems it's another Chroma situation, where the turbo model is actually "better" than base for realism
But if it is as easy to fine-tune loras as it is with Chroma, I am fine with it
>>
File: ugh.png (1.79 MB, 832x1216)
1.79 MB
1.79 MB PNG
"I'm happy with this image! I can't find any huge glaring obvious - "
>The robot's hands are tiny anime girls
god damn it
>>
>>107544494
Inpaint.
>>
>>107544509
I don't do manual work.
>>
>>107544459
I see it differently. Japan is fucking geriatric - also about its money distribution. IIRC the households with positive money are statistically headed by 50+ year olds, below that they have debt.

Rest assured it looks much the same for corporate management. Average age of top corpo leaders got older every year for 33years or so, I think the CEOs now average over 60years.
>>
File: videoframe_3666.png (661 KB, 1080x744)
661 KB
661 KB PNG
Imagine where AI will be in 5 years

https://files.catbox.moe/pfkfku.mp4
>>
>>107539778
Would have been nice, but the finetuners and lora creators want the base model whether it's already better at inference as-is or not.

>>107544481
And most likely it'll be just fine.
>>
>>107544553

I need Ai waifus at depressing discount prices in 5 years.
>>
>>107544564
coomers move the world
>>
everybody wants to coom the world
>>
>>107544436
no one except tourists ever wanted base for anything except training, that tweet changes nothing
>>
>>107544520
it's your loss
>>
>>107544494
That's actually kinda neat as a design concept even though it makes no practical sense. And they say AI can't be creative...
>>
>>107544564
Why am I so drawn to the broken one...?
>>
everybody's coomin for the weekend
>>
>>107544624
Anon... They are all defective.
>>
File: ComfyUI_01079_.jpg (1.64 MB, 2458x2458)
1.64 MB
1.64 MB JPG
>>107544473
Here's the 'box

Get it quick before the jannies do it for free again!
https://files.catbox.moe/sq3xmy.png
>>
File: ComfyUI_01456_.jpg (1.48 MB, 3072x3072)
1.48 MB
1.48 MB JPG
>>107544564

There is something super dystopian about that image.
>>
>>107543174
best thing that ever happened to the most based country on the planet right now.
if the tiananmen square massacre didn't happen, they'd be another shit Japan or S. Korea right now.
>>
>>107544741
what's your problem with japan anon, my wapanese pride won't stand for this
>>
>>107544741
ok
>>
File: ComfyUI_01442_.jpg (1.02 MB, 2458x2458)
1.02 MB
1.02 MB JPG
>>107544741
China if anything would be ahead if it wasn't for their *not* communist party. The only places in China that are doing well are places like Shenzen and the capital. Everywhere else is still poor(ish) and backwards.
>>
>>107544751
compared to china, their waifus are trash. chinese girls look better on average, and there are billions of them, and they are far far less westernized.
>>
> make reddit post pointing out that Z-Image does not have and never will have magic reasoning capabilities of any kind and that it's just a good, normal-ass difusion model
> post is somehow controversial as though people think this is a controversial statement or in any way up for debate
This community is FULL of literal fucking retard, jesus christ. Or maybe it's just the same ESLs responsible for interpreting English legalese found in various licenses in insane ways in the past. Who knows.
>>
Inspired by anon in the last thread >>107541652
I tried making a spooky cultists image with Z, it turned out great, but it is a bit different than I expected. I think it's too cinematic.



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.