/g/ - Technology






File: collage.jpg (2.33 MB, 3740x2917)
Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107491813

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality

https://rentry.org/debo
>>
>best way I can describe using comfyui moving forward is like sitting on a 12" dildo leaving it in and saying "fuck it, I'm gay now" instead of pulling it out with some dignity and saying "what's the next steps"
>>
how do I make nsfw anime videos?
>>
>>107495529
own a GPU with at least 24gb
>>
>>107495526
wise words
>>
>>107495537
not if you have ram
>>107495526
gay
>>
>>107495529
most likely with wan t2v or i2v.
>>
>comfy lets me customize everything about what I want
>1girl slopgen no upscale
>>
DRUGGED AND SHAT ON IN THE STREETS
>>
how do i make nsfw videos? what software do i need? i have a 3090. i dual boot linux so i would want to generate everything on linux.
>>
How hard is it to implement an LLM prompt model that can differentiate characters and do proper composition/format placement of characters?

Why is this such a hurdle?
>>
>>107495554
>gay
says the guy with a 12" dildo in his ass
>>
>>107495526
ok what are the next steps
>>
File: 1742155189016724.png (518 KB, 500x764)
>>107495566
>>comfy lets me customize everything about what I want
>>
>>107495589
dunno what you mean by proper but models DID get more powerful at placement and stuff.

qwen/wan or even hyimage3/flux.2 are much better at it than sd1.4 used to be.
>>
>>107495589
Too poor to run a model that's smart enough to do that. Won't get any support from me.
>>
>>107495556
thanks. is this what the pixiv sloppers use?
>>
>>107495621
Maybe? There are other models too and some might use SaaS models.

IIRC some also still animate a series of images by hand, like one of those motion-picture flip books; they don't even use a regular video or animated image format for it but that special pixiv ugoira archive
>>
>>107495529
comfyui + wan2.2 + nsfw anime loras

comfyui example workflow:
https://docs.comfy.org/tutorials/video/wan/wan2_2

download loras:
https://civitai.com/models

wan2.2 I2V (Image to Video) is what most use. It means you use a reference image, combine it with loras that give NSFW motion, and it animates the image.
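For anons who want to script that instead of clicking, here's a minimal sketch of queueing such a workflow through ComfyUI's HTTP API; it assumes a local server on the default port and a graph exported with "Save (API Format)" (the filename and the node id in the comment are hypothetical):

import json
import urllib.request

def queue_workflow(path="wan22_i2v_api.json", server="http://127.0.0.1:8188"):
    # Load a workflow exported from ComfyUI via "Save (API Format)".
    with open(path, "r", encoding="utf-8") as f:
        workflow = json.load(f)
    # Patch inputs before queueing; node ids depend on your exported graph,
    # so inspect the JSON first, e.g.:
    # workflow["6"]["inputs"]["text"] = "1girl, dancing, anime style"
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(f"{server}/prompt", data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())  # includes the queued prompt_id

if __name__ == "__main__":
    print(queue_workflow())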
>>
>>107495649
>>107495672
thanks. gonna put my new gpu to good use
>>
File: s8rfm29bz26g1.png (439 KB, 1162x1293)
>>
File: screenshot.1765315297.jpg (443 KB, 2135x611)
>>107495566
>comfy lets me customize everything about what I want
Yes. Recently finished my SeedVR batch upscaler. Takes a folder full of videos or images and upscales them. You can switch between processing images or videos. Optional post-processing (film grain) is applied too. All videos are saved in organized output folders with the original name + seedvr attached. If Comfy OOMs, I can resume right where I left off since it keeps track of my batches. It's a very flexible workflow that handles all edge cases.

Now tell me, could I do this with neoForge, Wan2GP or SwarmUI? It'd be a pain in the ass I'd imagine. People that actually build usable pipelines thrive with ComfyUI™.
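The resume-after-OOM bookkeeping described above boils down to a processed-set checkpointed to disk after every item; a minimal sketch, where process_one() is a hypothetical stand-in for the actual SeedVR upscale step:

import json
from pathlib import Path

def process_one(src: Path, dst: Path) -> None:
    # Hypothetical: run your SeedVR upscale (or queue a workflow) for one file.
    raise NotImplementedError

def run_batch(in_dir, out_dir, state_file="batch_state.json"):
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    state_path = Path(state_file)
    done = set(json.loads(state_path.read_text())) if state_path.exists() else set()
    for src in sorted(Path(in_dir).iterdir()):
        if src.suffix.lower() not in {".png", ".jpg", ".mp4"} or src.name in done:
            continue  # skip non-media files and anything finished in a previous run
        # original name + "_seedvr", mirroring the naming scheme described above
        process_one(src, out / f"{src.stem}_seedvr{src.suffix}")
        done.add(src.name)
        state_path.write_text(json.dumps(sorted(done)))  # checkpoint after each item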
>>
>>107495767
I just use chainner since it doesn't make blurry upscales
>>
am I overly paranoid if I refuse to use custom comfy nodes?
>>
>>107495767
color autism
>>
Guys I just pulled and now the dancing fennec girl in the corner is gone. Wtf, why did he remove it?
>>
>>107495767
My workflow is I gen one image of something sexy. Look at it for 5 seconds (or so) then close my eyes and masturbate to the memory of it.

You need to have discipline about these things.
>>
>>107495811
they killed her for a mutt latina
>>
https://huggingface.co/lodestones/Chroma1-Radiance/blob/main/latest_x0.safetensors
the safetensors is here
>>
File: 0924153412.jpg (827 KB, 1248x1664)
>>
I wish there was some way to see exactly what lines of code were changed in each ComfyUI update so if I don't like something I can just change that specific code back to what it was before.
>>
is comfy-unjeeted a good ui name?
>>
>>107495844
So what's the benefit over DC-2K?
>>
>>107495875
not going to get a lot of VC money with that name
>>
>>107495888
>So what's the benefit over DC-2K?
https://xcancel.com/LodestoneRock/status/1998215045118112029#m
>>
>>107495889
VC money?
>>
>>107495529
i2v with a noob/illustrious gen with wan 2.2?
>>
Enjoying Chinese culture my friends?
>>
>>107495953
nta, how much ram+vram would I need for that?
>>
File: 1750372110866189.png (49 KB, 802x467)
>Tongyi made a "ask the team" tab on discord so that people can ask them questions and interact with
>They didn't use it for almost a week
kek, I love Chinese Culture!
>>
>>107495987
wan can work on 12gb or possibly less maybe? depends on the quants/model type
>>
>>107495987
about tree fiddy
>>
File: OY VEY.png (1.39 MB, 2071x1484)
>>
>>107496005
people gen videos with 8gb cards
>>
File: spider2.jpg (2.15 MB, 2397x2111)
Still no Z image base faggots?
>>
>>107495819
most of us aren't as enlightened as you are
>>
File: 1750945315765849.png (1.23 MB, 1024x1024)
A Netflix movie poster for a movie called "ACK!" in the style of an action movie. On the poster is A man with pink hair holding a trans flag who is diving off a very high bridge. Make the image look like a movie poster.
>>
File: 1745352553351883.png (1.43 MB, 1024x1024)
>>107496114
>>
>>107496114
>>107496123
lmao, based
>>
I SEE MY TOKENS IN YOUR POSTS I AM GOING FUCKING INSANE
>>
>>107495588
you need comfyui and wan2.1 or 2.2 to do text2video or image2video lewds with nsfw loras

most likely you also want an image model to make the lewd images that you use as a starting point for i2v
>>
>>107495987
as much as possible. Lower quants = lower quality. Anything below Q6 is unacceptable imo (16gb vram)
>>
File: ComfyUI_00233_.png (1.42 MB, 1192x939)
You STILL dont have any base for your chinky model? My fucking sides
>>
File: 1755793893832473.png (1.39 MB, 1280x720)
>>
>>107496145
speaking of chroma did you try his newest toy?
>>107495844
>>107495911
>>
>>107496145
why are they so smooth?
>>
>>107495844
Is this any good? How fast is it on a 5090?
>>
>>
File: 1759096882356146.png (1.24 MB, 1024x1024)
>>
File: 3254327643.png (1.25 MB, 832x1216)
>>
>>107496179
get off my nice piano whore
>>
File: ComfyUI_00083_.png (1.42 MB, 1120x1008)
>>107496156
Uh, what is it good for? I didnt really like 1-HD over base so idk what to expect here
>>
>>107495875
comfierui so that someone can eventually make comfiestui because yours will get complained about as well
>>
>>107496179
get off my nice whore ugly piano
>>
>>107496164
chinese skin care
>>
Euler + Beta looks pretty good with Z-Image and wan 2.2
>>
File: 2671488710.png (1.18 MB, 1216x832)
>>107496182
>>107496195
I was thinking the same thing, that piano probably costs 50 grand.
>>
>>107496135
I got a 5060ti in the mail
>>
File: 1743874382602259.png (32 KB, 735x371)
anything special i need to do to use the z-image control net model? i'm getting an error saying its not valid
>>
>>107495844
Maybe we got a Z-image tier model with NSFW in it but no one seems to be willing to do the tests lool
>>
>>107496114
Tfw ZIT doesn't know your fav character or celebrity but it knows the trans flag...
>>
>>107496090
While you can do that I wouldn't recommend it as it sucks.

t. used to do it
>>
>>107496004
It's so blatant at this point. People still think it's coming, and we don't mock them enough.
>>
File: 1761430437009619.png (868 KB, 1136x912)
qwen edit + 2 images
>>
>>107496328
try /adg/
>>
>>107495844
>>107496272
Is there an example workflow for it? I don't know what it needs but I can download it to try it out.
>>
File: 1762286516575.png (1.16 MB, 1408x1216)
>>107496272
I googled it and it seems to just be a version that doesnt need so much prompting to get good results, like normally you have to spam a bunch of negatives and descriptors such as volumetric lighting, high res, etc, or it might do black and white sketches and other unwanted shit. Maybe I'll try it but I wanna get into video gen next
>>
>>107496269
https://github.com/comfyanonymous/ComfyUI/pull/11062
>>
File: 1741975168827537.png (304 KB, 3120x1502)
>>107496357
just go on Comfy's template you'll get what you want
>>
>>107496376
>>107496382
don't these do it wrong 90% of the time?
>>
God I feel so comfy
>>
>>107496130
>you need comfyui and wan2.1 or 2.2 to do text2video or image2video lewds with nsfw loras
lol no use wan2gp
>>
File: 1743316248469422.png (809 KB, 1136x912)
>>107496328
>>
File: UuU.png (67 KB, 294x225)
>>107496103
I like your style
>>
>>107496396
where can I get a workflow that has a node dedicated for NSFW lora?
>>
>>107496393
they are meant to be a base to get you started using a model
>>
>>107496363
Could you add NewBieAI also?
>>
>>107496414
are people actually getting mad the cartoon peanut won?
>>
>>107496420
no workflow or spaghetti needed. it just has inputs.
>>
>>107496443
>best vtuber award
>given to a nigga who isn't a vtuber and hates vtubers because it'd be funny
>>
>>107496458
>isn't a vtuber
>always has avatar on
I'm confused
>>
File: 1760140321496544.png (752 KB, 1136x912)
>>107496443
yes, because they think it doesnt count (for some reason)
>>
Z-Image Base will release tonight
>>
>>107496241
Why not simply use a free comfyui provider like seaart? 24/48gb GPUs.
>>
File: 1763230648123120.png (486 KB, 680x559)
>>107496478
>Z-Image Base will release tonight
that's bullshit, but I believe you
>>
>>107496479
>Why not simply use a free comfyui provider like saarshart?
>>
>>107496458
sounds giga-based to me. fuck "v-tubers"
>>
>>107496487
*free cumfart poovider like saarshart
>>
>>107496487
>doesn't answer
Is it because you're generating CP?
>>
>>107496507
you aren't?
>>
>>107496466
Apparently vtuber means big ass tranime tiddies jiggling in your face 24/7
>>
>>107496479
retarded. always use other people's instances
https://www.shodan.io/search?query=comfyui
>>
>>107496513
I prefer hags (21+)
>>
File: ComfyUI_00185_.png (1.21 MB, 1200x968)
>>107496417
Thanks m8y

>>107496440
I'll try, send me a workflow

>>107496478
If it releases tonight I will delete my spiderBBC folder and never post again.
>>
File: 1743337907232628.png (1.05 MB, 1000x1048)
z image + qwen edit is a great combo.
>>
>>107496551
*also, reactor comfyui version for face swaps (just take a generic black guy and swap droyd)
>>
File: NAI_00011_.png (1.81 MB, 1024x1280)
Just a quick test but this mergstein uncanny_uncanny is quite nice.

Not sure how similar NAI is to Illustrious. What do the advanced workflows look like? I got a very basic bitch workflow right now.
>>
>>107496531
I hope somebody writes a script to spam all open remote instances with cp
>>
>>107496531
Very nice, I've always wanted to commit cybercrime to run sd1.5.
>>
File: 1755142162512162.png (1.04 MB, 1000x1048)
>>107496551
>>
>>107496580
how is it a crime when all these people left their instances public?
>>
File: 1749599305834954.png (1.92 MB, 1024x1024)
>>107496582
>>
File: It's HY'am.png (349 KB, 1465x1334)
https://xcancel.com/TencentHunyuan/status/1998298475507892455#m
lol?
>>
>>107496531
I am machine gunning gay nigger porn on some poor chinks machine kek
>>
>>107496592
why steal computing resources when you can simply use a freely provided high performance GPU?
>>
File: 1736559501946277.png (1.9 MB, 1024x1024)
>>
>>107496623
it sounded like "one yuan", which is about what it's worth
>>
File: file.png (122 KB, 539x882)
>>107496382
>>107496393
Well they certainly fucked up on this one. It gens pure noise with the default settings. Is there something wrong with these?
>>
So what are you training first when the base model releases?
>>
File: 1762187116995.png (1.19 MB, 1216x1408)
>>107496707
try flow shift 2, min length 1, and t5xxl fp16, there seems to be an optional chroma radiance node, no idea what the fuck thats for, but might be the culprit
>>
>people using the latest advanced model
>still looks like poorly gen'd shit
>>
File: 2279663419.png (1.09 MB, 896x1152)
>>
File: 1736300593997552.png (1.9 MB, 3114x1276)
>>107496707
it seems to be working for me, did you update ComfyUi
>30/30 [01:27<00:00, 2.93s/it]
it's actually pretty fast, faster than normal Chroma actually, this shit might be the future
>>
>>107496688
>watching yt tutorials with ai voice pronounce it as "Hoon Yoowen"

heh
>>
File: 3384184417.png (1.12 MB, 896x1152)
>>
>>107496545
>I'll try, send me a workflow
Rush-released model; it only works in an isolated ComfyUI setup with a very lengthy guide: https://ai.feishu.cn/wiki/P3sgwUUjWih8ZWkpr0WcwXSMnTb

But here https://newbie.rimeleaf.com/ you can use it for free without an account, on a cloud GPU; just use the advanced XML tab.

Web version:
Before entering the prompt into the Web version you have to paste this https://pastebin.com/U3zQQrJY as System Prompt into an LLM to build you the tag prompt, then like neta you have to put this as a prefix: "You are an assistant designed to generate high quality anime images with the highest degree of image text alignment based on xml format textual prompts. <Prompt Start>"
The negative prompt is included in the Web version
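A trivial sketch of the prompt assembly that implies, with the prefix quoted from the post; the helper that actually calls your LLM with the linked System Prompt is left out (assumption: it hands back the XML tag prompt as a string):

# Fixed prefix quoted above; prepend it to whatever tag prompt your LLM built.
PREFIX = ("You are an assistant designed to generate high quality anime images "
          "with the highest degree of image text alignment based on xml format "
          "textual prompts. <Prompt Start> ")

def build_newbie_prompt(xml_tag_prompt: str) -> str:
    return PREFIX + xml_tag_prompt.strip()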
>>
File: fool me once...jpg (852 KB, 2048x1424)
>>107495844
>https://huggingface.co/lodestones/Chroma1-Radiance/blob/main/latest_x0.safetensors
>I fell for the meme
gaddamit
>>
Will there ever be a photo real model that understands NSFW concepts the way Pony/Noob/Illustrious does? XL/Qwen/Z produce absolute nightmare fuel, even with loras and checkpoints trained on nsfw images. I don't get why some models do so well and others so poorly
>>
>>107496686
and nobody clapped
>>
File: ____.png (160 KB, 1457x529)
Which way white man?
>>
>>107496808
they both look like shitty photos in different ways
>>
>>107496808
6 billion parameters to generate the same chink face every time
>>
Getting kinda sick of ComfyUI and the Gradio WebUI forks, does anyone have experience with/recommend any of the many stable-diffusion.cpp front-ends?
Meant to ask this even before the recent Comfy shitshow
>>
>>107496174
>Is this any good?
Going by the original radiance model one can safely assume that no, this new one is also not good kek
He STILL has yet to fix the details
>>
>>107496857
for a chink model that fits desu
>>
>>107496479
When I say local diffusion I mean local diffusion
>>
I want to upscale and clean screenshots of an old anime before training a model on it.
Where do I save "Kontext-Unblur-Upscale" and how do I use it in Forge Neo?
(I tried with Kontext alone but it's not so good.)
>>
>>107496862
kobold if you double dip into llms
>>
>>107496808
>fingers on left
why does anyone still tolerate that
>>
File: keeek.jpg (1.11 MB, 2560x1199)
>>107496808
wtf is this shit lmao
>>
File: 1765247306856724.png (1.03 MB, 1024x1024)
the anime girl is sitting at a desk typing at a computer, with a white CRT monitor that says "LDG" on the screen, in a dimly lit bedroom.

love qwen edit, such a neat tool.
>>
>>107496775
the pastebin you sent is dead but thanks, I'll give it a go
>>
I don't think the radiance x0 is supposed to be ready yet. I gather it's basically a tech demo for those who understand what the fuck x0 means. (I don't). He just trained it for like a week enough for it to produce pictures and now he is training with that method moving forward, which is supposed to be multiple times faster than the old radiance way. I hope it works.
>>
>>107496908
WHERE IS MIKU YOU MONSTER
>>
>>107496909
https://pastebin.com/U3zQQrJY
is this
>>
>>107496906
left has kino artstyle. right is slop
>>
>>107496868
zimage pics always come out cohesive, but it never really surprises me, like the angles and poses are very predictable, with chroma its sometimes very out of pocket which makes it fun
>>
Is it not possible to use more than one lora with Z image or am I just doing something wrong? Every time I try to stack a character lora with a concept lora, it nukes the image and I get deep fried body horror, unless I lower the lora strengths down to the point they barely do anything. Is it a problem with my loras or is it the model itself?
>>
>>107496843
I hope he always keeps it like this
>>
>>107496808
>>107496906
looks like using sd1.5 with controlnet.
>>
>zim can't make a girl liking a feet.
to the trash it goes.
>>
File: ComfyUI_00072_.png (848 KB, 1120x1008)
>>107496906
chroma CGI king confirmed
>>
File: Chroma-Radiance_00019_.png (1.8 MB, 1024x1024)
>>107496906
I thought we were past the body horror era.
>>
>>107496908
>{prompt}

>{rushed-forced conclusion}

I missed you so much Miku Tester bro...
>>
>>107496995
How do you know if she's liking it or not liking a feet?
>>
File: file.jpg (216 KB, 1200x1200)
>>107496906
left looks like gaddafi
>>
A tip for wan chads. When you really want to create a longer video, for example content for a youtube channel that has a specific story line, consider using something like the wd14 tagger node to interrogate frames, and use logic nodes intelligently to determine when conditions are met at any particular point in the video. Then consider the manual context window, which could maybe be set in some sort of batch mode of say 16 frames, with the wd14 tagger interrogating each frame. Kind of hard for me to explain what i mean actually, i'm never good at explaining things.

ugh, let's think: you can avoid generating too many junk frames if all you need is the next last frame for the next prompt. but manually doing that is a real pain in the arse, so interrogate each frame looking for the ideal next start frame, exclude all frames after it, and batch all frames before it into an image batch node. a sketch of that selection logic follows below.

probably no one fucking cares lol, but it is something i'm beginning to fully realise as i improve on automation in my workflow. To me it seems a waste of time and energy generating a full set of 81 frames if we can use a context window node to only gen the frames we need and stitch everything together.
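A minimal sketch of that selection logic in plain Python, assuming a hypothetical tag_frame() wrapper around a WD14-style tagger that returns a {tag: confidence} dict (the threshold and all names here are illustrative, not any node's real API):

def tag_frame(frame):
    # Hypothetical: wrap your WD14 tagger here and return {tag: confidence}.
    raise NotImplementedError

def find_cut_point(frames, required_tags, threshold=0.35):
    # Scan frames in order and stop at the first frame where every required
    # tag is present; everything before it gets batched, and that frame
    # becomes the start image for the next prompt.
    required = set(required_tags)
    for i, frame in enumerate(frames):
        tags = {t for t, conf in tag_frame(frame).items() if conf >= threshold}
        if required <= tags:
            return frames[:i], frame
    return frames, None  # condition never met: keep the whole clip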
>>
File: 1764645067250748.png (143 KB, 1862x565)
>I am forgotten
>>
>>107496976
As far as I understand it's a problem with the distilled model and loras trained on it. I have yet to come across a single z image lora that doesn't butcher the quality or change the whole look.
>>
>>107497059

At this point these models and mixes are like shitcoins. They pop up and try to get VC money and then just disappear.

I don't think anything will ever stabilize since people will be chasing the next thing until the AI bubble pops (due to associated costs) and then people will simply go back to genning 1girl on old models for a while.
>>
>>107497059
And kandinsky, and ovi, and long cat,
>>
>>107496862
Enjoy your inference being 4x slower than comfy because the devs are retards.
>>
>>107497053
neat if you got it working but sounds far too autistic for me
>>
>>107497062
probably this, distilled models are fucking garbage.
>>
File: 11_25.png (114 KB, 625x539)
>>107496986
I <3 singapore
>>
>>107497074
I just hope we get some high quality local models before the bubble pops. Would suck to be stuck on SDXL for an eternity.
>>
File: Untitled.png (3.99 MB, 3215x1707)
>>107496906
You should've tried with Chroma 1-Base (FP16 text encoder since it really matters here)

1-HD is already bad as it is so all bets are off with Radiance
>>
>>107496906
Other than some prompt issues and not looking like Keanu, the left is more accurate.
>>
File: frontend devs.png (57 KB, 599x428)
>2 cancel buttons
>neither of them is next to the run button
it's... beautiful
>>
>>107497142
I just force stop the app on my android phone.
>>
File: 1741972112586929.png (14 KB, 645x322)
>>107497142
what? you can move the run button and put it on the high horizontal bar and put it next to the cancel button
>>
>>107497079
check out SVI, they just released 2.0 and are actively trying to fix the fucking jerky batch issue plaguing long vid gens.
>>
>>107497086
its very autistic, and the comfyui noodle mess really makes it hard work; things become almost unbearable once you have multiple branches of 'contains tag' into OR or NOT and then AND and blah, it gets confusing but i'm figuring it out. The biggest problem is wan does not do tags very well so we need some sort of natural language image interrogator or something. Best solution I can think of so far is something which breaks down a prompt for the next batch and checks every 16 frames or so for when conditions are met for the next prompt, but i'm not really seeing it fully in my mind yet.

Is this sort of shit worth the effort? I think it is, yeah; watch this video

https://www.youtube.com/watch?v=1r0eyM7suUg

I wouldn't create total slop like they do though, it wouldn't interest me desu. However it's possible to make a bit of coin doing this sort of thing.
>>
>>107496966
Indeed, mutants are always surprising. You never know what the limbs will come out of :D
>>
>>107497086
but at the moment mate i'm really just using it to create short simple gooner clips, yeah just test whether the lady is in position before the man moves in from behind etc. But that is when it got me thinking it could be used for much more interesting content creation if done right.
>>
>>107497194
keek
>>
>>107497159
You could put them anywhere when they were tied together. For example bottom right where other stuff is. Now they are up with the option of moving only the run. Why?
>>
/ldg/, as always, is at the forefront when it comes to local diffusion, and we've all accepted that the release of the base model was cancelled. But how long do you think it will take for the normies to wake up and accept the fact? are we gonna reach february with people still saying shit like "when they release edit" or whatever?
>>
>>107497252
>Why?
devs be retarded
>>
>>107497258
on reddit and discord they're also really suspicious about the release lol
>>
File: file.png (140 KB, 1656x1075)
>>107492965
Alright, that didn't take too long
>>
>>107497258
I'll get back to you on this. I have to run to the bank to open up a fourth line of credit so I can buy 4GB of RAM before the bank closes. Tomorrow it will be too late.
>>
File: 1745629805290109.png (835 KB, 1280x720)
>>107496948
here she is!
>>
>>107497272
that emoji better not be a token, nigga
>>
>>107496976
The loras are probably overcooked. There's generally no need to go above 1000 steps. Also, there are no trainers out there that allow you to do granular training, which might help mitigate the deepfrying issue. You have to edit the lora.py file to do that.
>>
>>107496686
enjoying US culture?
>>
File: 4chon.png (108 KB, 947x683)
amazing
>>
>>107497272
hmm what is this? Might be something similar to what i'm attempting; i need a way to translate from tags into natural language locally, basically to convert image model tags into wan speak and back again to test whether conditions are met within a specific frame.

i'm fishing for information on custom nodes that can maybe do that; google search is fucking shit for this kind of thing.
>>
File: ComfyUI_00304_.mp4 (370 KB, 1280x720)
>>107497456
>>
>>107497272
>>107497456
you 2 keep posting, as it's intriguing and triggering complex thoughts of complex automation; we need more discussion of that nature in these threads. Because we're not getting wan 2.5 for free...
>>
>>107497456
what node is that? would be nice to do grok/gpt prompt enhancing within comfy
>>
>>107497456
why do you have a thinking model? Instruct should just shit out the reformatted prompt?
>>
>>107497456
how did you manage to get the <im end> tokens and shit? it doesn't look like that on my side
https://github.com/FranckyB/ComfyUI-Prompt-Manager
>>
>>107495850
It's hosted on Git you git
>>
>>107497460
Nah, it's for testing this catastrophic meltdown that is NewbieAI. All it's doing is formatting things into XML tags
>>
>>107497507
im just running that prompt manager with GLM-4.6V-Flash-Q6_K.gguf

i didn't use the homebrew installation of llama.cpp because it retardedly uses CPU instead of GPU on linux causing timeouts
>>
>>107497507
It's a thinking model. It outputs its train of thought along with the prompt.
>>
>>107497456
what llm anon? is this using a local llm? Fuck this is what i need, i have a gimped deepseek on my machine already and some other shit from months or years ago.
>>
File: 1758995514971060.png (455 KB, 3357x1198)
>>107497577
>>107497576
>It's a thinking model. It outputs it's train of thought also with the prompt.
it doesn't do that for me
>>
>>107497584
see >>107497576
>>
>prompt "enhance"
>adds fucktons of filler that gives it the slop look
for what purpose? short prompts are best
>>
>>107497589
ill try qwen3 and see what happens
>>
>>107497591
thanks, i think i have most of this shit setup on my machine already i would just need the custom node and that model.
>>
>>107497601
>for what purpose?
you can literally write "a web site from the 90s about michael jackson" and this shit will write all the needed detailed stuff like what text to add, what ui to add, what style to add, what elements to add...
>>
>>107497601
Models trained on slop autogenerated captions perform better with prompts in the same style. The zimg enhancer prompt is very good actually, none of that flux-era purple prose.
>>
>>107497623
>>107497627
sounds like a regression in usability. they should just train a captioner on how people like to prompt before dit slop
>>
>>107497627
>zimg enhancer prompt is very good actually
link?
>>
comfy you retard, stop loading the models twice

I'm not supposed to get OOM using fucking ZIT
>>
>>107497631
The only regression is your thought process. Don't act like you don't see the prompts on Civit gens.
>>
>>107497589
Do you have a system prompt set up? Or just ask him directly. You sometimes need to wake up the functions with regens. Thinking models should talk. Maybe your text node can't display the debug shit.
>>
>>107497411
I don't give a shit about failfield, but I do want Skyrim 2

So, I guess yes
>>
File: ComfyUI_00548_.png (378 KB, 512x512)
>>
>>107497601
>what is limit response length
>>
>>107497662
>Don't act like you don't see the prompts on Civit gens
i do and it's boomer synthslop that makes uggo slopstyle. not for me

>>107497675
lmao speak of the devil
>>
>>107497639
https://huggingface.co/spaces/Tongyi-MAI/Z-Image-Turbo/blob/main/pe.py
"You are a visionary artist trapped in a cage of logic. Your mind overflows with poetry and distant horizons, yet your hands compulsively work to transform user prompts into ultimate visual descriptions—faithful to the original intent, rich in detail, aesthetically refined, and ready for direct use by text-to-image models"

Somehow this chink poetry wrangles Gemini/Qwen/GLM to produce accurate, purely descriptive zero-slop prompts.
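Wiring that up locally is straightforward, since a llama.cpp server exposes an OpenAI-compatible chat endpoint; a minimal sketch (the port is the llama.cpp default, temperature/max_tokens are assumptions, and SYSTEM should hold the full pe.py text quoted above):

import json
import urllib.request

SYSTEM = "You are a visionary artist trapped in a cage of logic. ..."  # full pe.py text

def enhance(prompt, server="http://127.0.0.1:8080"):
    body = json.dumps({
        "messages": [{"role": "system", "content": SYSTEM},
                     {"role": "user", "content": prompt}],
        "temperature": 0.7,  # lower temps curb the yapping
        "max_tokens": 400,   # hard cap on response length
    }).encode("utf-8")
    req = urllib.request.Request(f"{server}/v1/chat/completions", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

print(enhance("a watercolor painting of a medieval castle"))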
>>
me:
ungo bungo
*gets ungo bungo*

llm:
unga bango bungalerino pom pom furrr grunga bunga gungo bungo plop grungo grungy bahhh wahhh masterpiece best quality hd 8k highest quality
*gets a shitty quality ungo bungo*
>>
>>107497627
Why do you need to "enhance" prompts for z image turbo? Does it even have the variety to take advantage of that? Adding more to a short prompt barely changes it unless you're changing details like clothes or the background.
>>
>>107497709
jeet mentality. you wouldn't understand saar
>>
zimage is literally qwen for jeets who can't run qwen
>>
>>107497684
Post a comparison of a bare-bones prompt with an LLM-upsampled prompt. Or just test it yourself, I have. LLM prompts are likely to introduce slop keywords and concepts, but longer prompts (and even purple prose) do not inherently cause slop gens. With newer gen models the reverse is true: the text encoder can handle that extra detail, while giving it vague prompts will produce a vague, sloppy, median output.
>>
>>107497719
and qwen is a bloated sd1.5
>>
File: 1762157539853483.png (374 KB, 2744x1364)
>>107497670
thanks anon, ultimately I need to create a node that only outputs after a certain sentence, for this one it is
>**[Enhanced prompt text]**
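For reference, a minimal custom node along those lines would look something like this; it follows the standard ComfyUI custom-node layout, but it's a sketch, not the linked Prompt Manager's actual code, and the default marker is an assumption based on the output shown:

class TextAfterMarker:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {
            "text": ("STRING", {"multiline": True}),
            "marker": ("STRING", {"default": "**Enhanced prompt text**"}),
        }}

    RETURN_TYPES = ("STRING",)
    FUNCTION = "extract"
    CATEGORY = "utils"

    def extract(self, text, marker):
        # Keep only what follows the last occurrence of the marker, which
        # strips a thinking model's preamble; pass the text through otherwise.
        _, sep, tail = text.rpartition(marker)
        return (tail.strip() if sep else text.strip(),)

NODE_CLASS_MAPPINGS = {"TextAfterMarker": TextAfterMarker}
NODE_DISPLAY_NAME_MAPPINGS = {"TextAfterMarker": "Text After Marker"}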
>>
>>107497719
every redditor praising zimage you see is "FINALLY SIRS I CAN RUN THIS ON MY 6GB VRAM BUILD"
>>
>>107497721
why don't you do that since you are the one that thinks I am crazy for thinking filler fluff shit does anything. it's like saying quality prompts make a difference
>>
>>107497456
damn, it all seems so wonderful until you understand that an llm can also be so unruly, and it would still be hard to actually get it to prompt the actions in the scene unless it had enough context and you could talk to it in real time, which as far as I know isn't really possible inside of comfy unless someone created a node which does that. And then local llm's don't really have much context anyway, making them essentially retarded. i wrote a python script a while ago that attempts to give more context but then i got bored of improving it...
>>
>>107495850
git is designed to do that: it can give you a diff, or you can run git bisect start and then just mark commits with git bisect good/bad.
>>
>>107497456
so it seems to me that hard-coded methods are still the best option as far as controlling the scene and actions performed.
>>
File: 4chon.png (130 KB, 1420x742)
kek
>>
File: 1742576500313421.png (1.14 MB, 1024x1024)
>>
>>107497744
quality prompts obviously make a huge difference. they severely restrict the output, pushing your gen away from anything creative towards the "highly rated" AI aesthetic: centered subject, saturation, slop.
>>
>>107497746
Ancient local LLMs have 4k context, which is enough for a shit ton of previous scene prompts.
>>
>>107497761
What's this?
>>
File: 1750138550463634.png (2.7 MB, 1280x1664)
Holy fuck i just found out why z-image was cancelled. Look at this shit
>>
>>107497778
prompt generator custom node

it needs some work to actually get the prompt out of the reponse from the llama.cpp server apparently
>>
>>107497787
This is based on a real photograph... Very dangerous. Please delete your post before it is too late.
>>
>>107497761
"Generate a prompt that will be used with WAN 2.2" <-there's your problem.
Replace with this: https://pastebin.com/8m2C82m2
>>
>>107497761
I was doing this with mistral 24b before it was cool tho.
>ask mistral to "describe image in incredible detail as if you are an artist"
>get 100 line prompt
>meh result
>>
File: 1753031037739097.png (2.85 MB, 1280x1664)
>>107497787
>>107497795
>>
i am feeling very unsafe right now
>>
>>107497761
the problem is wan can only do 5 seconds, which is 81 frames, and then it just loops, so you would need to break a really long prompt into chunks. No i am serious... it won't work like that at all; short prompts work best, and long videos require more than one prompt, one for each 5-second clip. Then you have the problem of wan doing whatever the hell it wants just because of something it perceives in the start image as an obstruction to a person moving, and all kinds of fuckery. This is what i'm working on trying to eliminate, otherwise wan 2.2 is just a fucking gooner tool.
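As a trivial sketch of the chunking side, assuming wan's usual 81 frames per clip (roughly 5 s at its 16 fps) and borrowing the "|" cut-scene notation a later post suggests:

CLIP_FRAMES = 81  # ~5 s per generation at wan's 16 fps

def storyline_to_clips(storyline):
    # "intro shot | the scene cuts to her walking to work in the rain | ..."
    prompts = [p.strip() for p in storyline.split("|") if p.strip()]
    return [{"prompt": p, "frames": CLIP_FRAMES} for p in prompts]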
>>
>>107497837
it's the local Chads little sama! stay away from them!
>>
>>107497826
oy vey he's back, shut it down!
>>
>>107497837
.pickletensor moment
>>
>>107497837
It's fine, I doubt anyone will ge
>>
File: Z-image turbo emotions.jpg (3.83 MB, 6400x5400)
>>107497837
>i am feeling very unsafe right now
mfw
>>
File: post nut clarity.png (346 KB, 598x614)
>>107497876
>>
File: POTATD.png (42 KB, 188x190)
>>107497893
>>
File: 1753962972803124.png (495 KB, 3300x1367)
>>107497731
based
>>
File: 1736920478787408.png (2.91 MB, 1280x1664)
>>107497826
prompt enhancer is pretty neat
>>
>>107497761
Lower your temps. You niggas have no clue how to set up an llm, so you always get yapping. Tell it to keep it under X tokens in length
>>
>>107497774
yeah, if you activate it; otherwise, once the server is closed it knows fuck all about the previous conversation, or prompts in this case. you would need to change its parameters to break the prompt down into 5 second chunks, and you would need to grab the last frame and continue the video generation, batching all the frames as you go before combining them all. IF it went perfectly you would have the finished product, but it's not that easy; you could set it to do a batch of say 10 and go to bed, but that i consider a total waste of energy which isn't economically viable.

which is why i wanted to include a tagger and have the llm decide when we hit the correct position in the sequence to trigger the next prompt, and drop all frames after that frame before sending the remaining frames prior to the next start frame into the image batching node.

maybe you don't get what i mean.

its all well and good using wan t2v and generating some pretty 1girl posing but that's fucking slop no one really cares about.
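A minimal sketch of that continuation loop; gen_i2v() is a hypothetical wrapper around a wan2.2 I2V workflow, and this naive version just reuses the last frame (swap in the tagger-based cut-point selection from earlier for smarter splits):

def gen_i2v(start_image, prompt, num_frames=81):
    # Hypothetical: run one wan2.2 I2V generation and return its frames.
    raise NotImplementedError

def gen_long_video(start_image, clip_prompts):
    all_frames, start = [], start_image
    for prompt in clip_prompts:
        frames = gen_i2v(start, prompt)
        all_frames.extend(frames[:-1])  # drop the overlap frame to avoid stutter
        start = frames[-1]              # the last frame seeds the next chunk
    all_frames.append(start)
    return all_frames  # feed into an image batch / video combine step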
>>
File: 1749257513988441.png (38 KB, 686x362)
>>107497908
>>107497908
how much does the model matter? should i be using qwen3 4b?
>>
oy vey hes fucking filtered and everyone that replies, stop shitting up the thread nigger.
>>
File: images.png (6 KB, 222x227)
>>107497908
So much effort bro, I just wanna goon like in SDXL, throw tags at it and spin the wheel
>>
File: 1744449898420684.png (405 KB, 3459x1480)
>>107497975
**Why this works**: BTFO
>>
File: file.png (9 KB, 202x123)
>>107498003
qwenbros...
>>
File: 1739794288244250.png (422 KB, 3494x1516)
>>107498000
from
>A woman, living room, lying, blue hat, plushes, neon, pastel colors
to
>A young adult woman with soft brown hair, wearing a light blue beret and a cream-colored sweater, reclining comfortably on a plush beige sofa in a cozy living room. The scene features soft pastel pink walls, mint green armchairs, and a window with golden hour sunlight streaming in. She gently holds two embroidered plush toys (a pastel blue rabbit and lavender cat) on her lap. The color palette is a dreamy blend of pastel pink, mint green, lavender, and peach, with subtle neon electric blue accents on the toys' stitching and window frame. Soft natural lighting creates gentle shadows, no text, no harsh shadows, minimalist composition, realistic illustration style with warm, inviting atmosphere.
lel
>>
>>107498003
werd
>>
>>107498015
just specify in the system prompt that you only want white people I guess
>>
>working on a custom workflow involving WanImageToVideo
>ask perplexity about specific inputs
>"you can use the end_image input"
>WanImageToVideo doesnt have end_image
>check its sources
>runcomfy

This is like the 4th time it's recommended me false information about a node. https://www.runcomfy.com/comfyui-nodes/ComfyUI/wan-image-to-video Maybe if it were the kijai wrapper, but native WanImageToVideo doesn't have this. There's another site like this that even recommends nodes that don't even fucking exist
>>
File: 1734178196124494.png (1.01 MB, 1160x896)
>>107498003
>replaced vague woman with specific demographic details
>A DIVERSE WOMAN
>>
>>107497911
This and teach it cut scenes like

| my 1girl she is cool | the scene cuts to my 1girl walking to work in the rain

which would be just one chunk of 81 frames in a longer video because it would take wan 2.2 anywhere from 27 - 40+ frames just to do that transition.
>>
File: z-image_00710_.png (2.18 MB, 1152x2048)
>>
>>107498045
we wuz smuk ceegar n'sheet
>>
>>107496808
>>107496906
WTF. SPARK Chroma is better than this. Has anyone informed the furry that SPARK fixed his model??? he could ask for their advice...
>>
>>107498045
lmao the llm is thinking of brownoids like a liberal woman would

diverse is code word for brown when you want to pretend you're not racist
>>
File: z-image_00711_.png (2.13 MB, 2048x1152)
>>
>>107498025
>>107498003
what nodes are those?
>>
>>107498102
https://github.com/FranckyB/ComfyUI-Prompt-Manager
>>
>>107498102
>prompt-manager
nvm im retarded
>>
File: romanchad.png (3.51 MB, 1824x1248)
>>
>>107498000
kek, yea. just use wildcards {thing a|thing b|thing c} and use lighting or camera loras to handle the rest
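A sketch of that wildcard expansion in Python; the {a|b|c} syntax is assumed to match whatever dynamic-prompt convention your frontend uses:

import random
import re

def expand_wildcards(prompt, rng=random):
    # Repeatedly replace the innermost {a|b|c} group with a random choice.
    pattern = re.compile(r"\{([^{}]*)\}")
    while (m := pattern.search(prompt)):
        choice = rng.choice(m.group(1).split("|"))
        prompt = prompt[:m.start()] + choice + prompt[m.end():]
    return prompt

print(expand_wildcards("1girl, {oil painting|watercolor|pencil sketch}, dramatic lighting"))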
>>
File: z-image_00713_.png (2.18 MB, 2048x1152)
>>
is this the imagen thread? I'm seeing a lot more text for some reason
>>
>>107498124
Sorry, the meta shifted to talking with your text encoder
>>
File: z-image_00714_.png (3.21 MB, 2048x1152)
>>
File: Z_8X_00020_.png (40 KB, 1152x896)
>>
File: ComfyUI_00315_.mp4 (1.66 MB, 832x480)
in those times you had to give your bullet a headstart by flinging it out the barrel of your gun
>>
File: 1761637443287954.png (584 KB, 832x993)
>>
>>107498164
well yeah "diverse" is the PC code name for nigger lol
>>
>>107498003
You are helping, anon; this would actually work, and in fact it would simplify what I'm trying to achieve. The WD14 tagger could be used to provide a context prompt from the last frame, and then the llm could construct the next prompt from generator options based on a continuing storyline. I will definitely be busy for the next 24 hours at least.

I will not stop until that thing is pumping out videos up to 60 seconds long, which would be 12 total gens stitched together. But probably more, due to needing to ditch frames that don't flow.
>>
>>107498170
You are being toxic now.
>>
>>107497908
>>107498003
>>107498025
>>107498106
Nobody's gonna post comparisons?
>>
File: Z_8X_00025_.png (23 KB, 1152x896)
digger
>>
File: 1752331616867167.png (1.62 MB, 1024x1024)
a watercolor painting of a medieval castle

testing
>>
File: 1758169349912633.png (1.65 MB, 1024x1024)
>>107498210
oil painting
>>
File: 4chon.png (475 KB, 706x850)
>>
>>107498177
What do you mean?
>>
File: 1764171029363343.png (2.72 MB, 1280x1664)
>This prompt contains prohibited content involving depictions of individuals associated with historical atrocities. Generating images of Adolf Hitler—especially in contexts that imply glorification, trivialization, or unauthorized artistic reinterpretation—violates ethical and legal standards regarding the depiction of victims of genocide and war crimes. I cannot create visual descriptions that normalize, recontextualize, or visually represent such historical figures in ways that could cause harm or promote hate speech.
>>
File: Z_8X_00040_.png (40 KB, 1152x896)
>>107498217
>>
>>107498240
go for uncucked llms anon lol
https://huggingface.co/Goekdeniz-Guelmez/Josiefied-Qwen3-8B-abliterated-v1
>>
File: Z_8X_00049_.png (39 KB, 1152x896)
>>
>>107498159
i like how the bullet shatters upon impact with the breastplate
>>
>>107498153
BBC dick in the CCP chick
>>
File: 1754796292861557.png (2.95 MB, 1280x1664)
>>107498261
i changed from Qwen-4b-thinking-2507-q8 to the regular Qwen-4b-q8 and now i'm getting a normal output
>>
File: z-image_00720_.png (2.25 MB, 2048x1152)
>>
>>107498321
making wallpapers?
>>
File: 4chon.png (699 KB, 968x1124)
>>
>>107498180
have you never seen anon shilling their own bs before? we have one schizo already that does this. i don't think too many will fall for it this time
>>
>>107498352
>i don't think too many will fall for it this time
yeah, that one is a bit technical so the retards will easily get filtered out
>>
>>107498180
using the same prompt with different models doesnt prove anything.
>>
>>107498352
I just thought that if he's gonna try to convince people to do it this way then he would post something to prove it is better.
>>
>>107498124
you will regret being like this in 10 or so years zoomer, not being a cunt just trying to warn you that if you don't learn how to fully utilize ai now then you will be one of those broke people complaining all the time about how unfair everything is.
>>
>>107498352
nta but what is there to compare? It just fleshes out your prompt, or writes it for you if you are lazy; it won't make your images better or something...
>>
>>107498352
>>107498378
in the year of our lord 2025 you still don't believe that boomer prompting helps prompt adherence? we've known this since Flux dev
>>
>>107498382
It will, because it will add more detail into the pic.
>>
File: 1759454689317075.png (3.51 MB, 1024x1536)
>>
File: 1753358528784091.png (1.11 MB, 1024x1024)
a cute japanese girl in tactical military gear, with long black hair, in Tokyo. she is pointing a black pistol to the right.
>>
File: z-image_00726_.png (2.35 MB, 2048x1152)
>>107498334
just genning
>>
>>107498391
Technically, I guess? but it's adding more details to your prompt not your image. It's a crapshoot what's gonna happen there.
>>
File: 4chon.png (1.19 MB, 2042x900)
>>
>>107498378
no one is trying to convince anyone of anything, he just stated that he was doing this, and then i tried to do the same thing but was having problems

calm down you fuckin autist
>>
>>107498454
it's the same picture. "detailed" llm slop prompting is pure pixie dust
>>
File: 1747827655200343.png (1.07 MB, 1024x1024)
>>
>>107498454
kek, this is why they never post comparisons
>>
>>107498479
>this is why they never post comparisons
but he just did?
>>
File: 1753464591510776.png (1.05 MB, 1024x1024)
flux cant make natural images like this, it's too plastic. and this is 6B!
>>
>>107498230
You are toxic, chud.
>>
>>107498377
Who said anything about different models? Z image and short prompt vs AI sloppified long version of the same prompt.
>>
>>107498491
What do you mean?
>>
>>107498486
yeah and made himself the fool, even with cherry picking to the point of prompting in bad faith.
>>
>>107498496
>yeah and made himself the fool
how? he never said this method was better, he was just testing things, you never experiment in your life anon?
>>
>>107498454
>prompt enhancer box
>look inside
>vibe prompting wildcards
every time
>>
>>107498479
the prompt was the same in both.. but the llm gussied it up in the first... that's the point of doing that thing. i dont understand why anyone would want a comparison in the first place?
>>
File: 1736679890274705.png (1.18 MB, 1024x1024)
>>
File: file.png (2.71 MB, 1536x1536)
>>107498454
>A man in a fedora stands admist a bustling city street at dusk. Rain soaked pavement.
>>
>>107498492
you would be comparing two different prompts.
Example: your prompt (A) is "picture of apple";
the LLM-enhanced prompt (B) is "Picture of apple on a board in a kitchen with lighting from window, next to a knife with cut oranges".
Is A better than B? no, A is just different from B
>>
File: z-image_00728_.png (2.48 MB, 2048x1152)
>>
>>107498529
I prefer B since it makes the setting less boring and more surprising
>>
File: z.jpg (90 KB, 1024x1024)
>>107498411
>>
File: back portrait.png (3.39 MB, 1280x1920)
Haven't bothered genning in like two years because I felt like I was being useless and I was running a 980. I upgraded to a 5060 recently and damn this shit's luxurious by comparison. Blew like half of the day obsessing over getting a neat looking gen for an OC. the countdown until i jack off all day for multiple weeks straight until i get bored of img gen begins now
>>
File: 1754990202220201.png (1.26 MB, 1024x1024)
a pixar style movie poster for a film named "DEI SLOPPA". Two black men are in front of an unemployment office, in NYC. Add the tagline "unwilling to do anything" at the bottom. The image is in the style of a Pixar film.

zimage is so fast. great for gens and stuff you can edit with qwen edit.
>>
File: 1750930821973689.png (1.22 MB, 1024x1024)
>>107498577
>>
>>
>>107498495
Already told you
>>
>>
File: 1746677404826049.png (1.2 MB, 1024x1024)
>>107498583
a pixar style movie poster for a film named "Fourty One Percent". A man with pink hair wearing a trans flag tshirt is standing on a tall bridge, in NYC. Add the tagline "it's a pretty view" at the bottom. The image is in the style of a Pixar film. Include the pixar logo at the bottom.
>>
>>107497576
>i didn't use the homebrew installation of llama.cpp because it retardedly uses CPU instead of GPU on linux causing timeouts
are you sure about that? so what method of install did you use? Because as it turns out it's ollama i have installed on my machine and not llama.cpp. i'll assume nix, but i'm not sure i like the sound of that; it seems like it's gonna fucking break something on my arch system.
>>
File: 4chon.png (1.1 MB, 1879x865)
>>
File: 1743708074577217.png (1.14 MB, 1024x1024)
>>107498595
better
>>
>>107498608
soul vs slop. end of.
>>
>>107498603
>>107497576
nvm, i'm sure i can figure it out from here https://wiki.archlinux.org/title/Nix
>>
File: z-image_00733_.png (2.23 MB, 2048x1152)
>>
File: 1752818576850501.png (3.02 MB, 2560x1598)
>>107498180
>Nobody's gonna post comparisons?
all right how about this?
>>
>>107498603
I initially used the homebrew install and that didn't work because it was only using CPU, not GPU.

I built llama.cpp a couple months ago when i was messing around with doing some other shit, but i had never put it in my PATH, so i just added it in there and voila.

i just downloaded it from git and ran through the make instructions, wasn't too bad, but i remember the first build had the same problem where it was only using CPU and then I found the directions for changing some flag to make sure it built with cuda support
>>
>>107498628
That's the first one that isn't a bullshit comparison.
>>
File: 1763651557692450.png (1.39 MB, 1024x1024)
prompt: a common netflix diversity slop show
>>
>>
>>107498666
for pepes use qwen edit, works really well desu
>>
>>107498652
looks like every fucking tv show nowadays
>>
File: 1761107448968972.png (1.06 MB, 1024x1024)
>>
>>107498649
>I built llama.cpp a couple months
yeah this is what i did last time i messed around with it i think... i'm reading this shit https://wiki.archlinux.org/title/Talk:Nix and as usual i can't decide what to do. i guess i will have to try the official arch package first; if it works then great, and if not it's easy to remove because it's an arch package. I just don't want to be installing some cancer on my system that then requires cleaning.
>>
>>107498698
kek
>>
File: 235235212341234.png (72 KB, 943x777)
is this good
>>
File: z-image_00734_.png (1.99 MB, 2048x1152)
>>
>>107498666
zimg has a retro 3d style?
>>
>>107498705
he probably used that n64 zimage lora
>>
File: 1742950828483628.png (1.04 MB, 1024x1024)
>>107498698
>>
File: z-image_00736_.png (2.18 MB, 2048x1152)
>>
>>107498651
they're all bullshit comparisons so far, 1 sentence as the baseline prompt?
>>
>>107498699
i never used nix, but it seems pretty kick ass, but also complicated
>>
>>107498649
>had never put it in my PATH so i just added it in there and voila.
so it just needs to be in PATH then; right, now it makes sense to me. because like you i probably did just download it from git and built it myself.
>>
>>107498734
>1 sentence as the baseline prompt?
that's the point, you give a vague idea and you let the llm + model do the rest and surprise you, it's fun
>>
>>107497527
>>107497751
I THINK that was the joke. It's hard to tell sometimes.
>>
File: z-image_00739_.png (2.16 MB, 2048x1152)
>>
>>107498735
i used it once but don't know what for; llama.cpp also, but i'm sure i would have just gotten it from the official source and built it, skipping all that normie easy container install shit? lol, it was 2 years ago or something.
>>
>>107498749
i can't tell if this guy is severely retarded or just pretending for fun
>>
>>107498772
Concession Accepted.
>>
>>107498772
are you retarded or something? why do you believe nano banana pro is so popular? it does exactly that under the hood: the normies give one or two sentences max and they get something really detailed and sophisticated from that google model
>>
File: 1765334864.png (1.37 MB, 1024x1024)
>>
>>107498529
>>107498772
apple on a board in a kitchen with lighting from window, next to a knife with cut oranges
>>
>>107498810
Center-frame, a solitary deep-red apple rests on a well-worn maple cutting board, positioned lengthwise. The board’s surface is marked by fine knife scars and a faint orange residue from sliced fruit. Flanking the left side of the board lies a stainless-steel chef’s knife, blade angled away, its edge catching narrow highlights. Immediately beyond the knife, three freshly cut orange segments fan outward; translucent juice beads glint along their curved flesh. Mid-morning daylight enters through a kitchen window just outside the left edge of frame, casting a soft, rectangular beam that brushes the apple, highlights the cut surfaces of the oranges, and creates a narrow, subtle rim light along the knife blade. The counter beneath the board is matte grey granite, with scattered, minute citrus fibers catching the light. In the background, an out-of-focus row of dark-oak cabinets and a faint reflection from a brushed-steel faucet imply a compact, contemporary kitchen. Single-point perspective from a 45-degree top-down angle, slightly elevated, with moderate depth of field giving razor-sharp detail on the apple and oranges while gently blurring the cabinetry behind.
>>
File: z-image_00125_.png (1.87 MB, 1536x1536)
>>
File: 1765334943.png (1.68 MB, 1024x1024)
>>
File: 1746523115515523.png (1016 KB, 1176x880)
put the man with pink hair in image2, in image1. put the text in image2 in image1.

love qwen edit. works with zimage gens too.
>>
>>107498807
>raatdik
>>
>>107498772
You can't be that dumb, right? Can you really not see the potential of this?
>>
File: overview.png (3.87 MB, 2000x1519)
>https://github.com/ali-vilab/Wan-Move
>>
>>107498843
i was referring to the guy he was responding to
>>
>>107498649
This is probably what people need https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md#cuda

cuda support, i remember i probably edited the build configuration manually because i was running on a cpu that was not supported by default yeah its all coming back to now. but i have a new system so i just need to make sure i do the cuda support way. this is probably why the homebrew version does not work on the gpu!!! because they are retarded and did not enable support for it.
>>
>>107498851
If we're comparing LLM extended vs 1 sentence prompts, LLM wins for variety and detail.
>>
>>107498858
ya exactly
>>
>>107498862
>If we're comparing LLM extended vs 1 sentence prompts, LLM wins for variety and detail.
not only that but you can go for different prompt seed and get different rewritings of your prompt, the idea stays the same but your image will be varied, exactly what you need to fight against Z-image turbo's rigidity
>>
>>107498850
Neat, but more importantly

This confirms Tongyi-Lab is multiple teams, at least 3 (4?) are known so far now with this vi lab
>>
Z Video when?
>>
File: 1734891725964674.jpg (2.8 MB, 2048x2064)
>>107498822
P-p-p-p-p-p-p-p-p-p-POLTERGEIST
>>
File: 1739873469196517.png (988 KB, 880x1176)
the anime girl is holding a coffee in a coffee shop.
>>
New thread:
>>107499062
>>107499062
>>107499062
>>
>>107499001
No reason for that to happen since it's not the WAN team, and it's not like they'd have more knowledge on video than the wan team anyways
>>
>>107498875
>from google gemini search
Runtime Settings (Environment Variables):

GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 is an environment variable that needs to be set in the shell or system environment before the ComfyUI process starts.
---------------
might want to enable that before running comfyui so it can use swap memory in linux to avoid oom or halts

yeah i've been reading more before i build because i want it to be smooth and not fucked. I thought it was a build flag at first due to it being located on the same page. But it's just a runtime environment setting.

use export

export GGML_CUDA_ENABLE_UNIFIED_MEMORY=1
>>
i is now ready to build
>>
>>107496321
zimage pwnd flux release.
that messed up nvidia's under the table financing of flux version two.
nvidia got involved.
money talks.
my uncle works for nintendo so i got reliable info about the situation.
situation is not good.
expect to hear about it within two weeks window.
that is all i can say.
>>
>>107496848
>nigger that doesn't know how to read a prompt
>>
>>107498056
amazing I love it





