/g/ - Technology

Armchair Engineering Edition

Discussion and Development of Local Image and Video Models

Previous: >>108535361

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
File: 1752857555200056.png (2.75 MB, 1536x1024)
lmao
https://xcancel.com/Pirat_Nation/status/2040912110901674137#m
>>
>mfw Resource news

04/06/2026

>UNICA: A Unified Neural Framework for Controllable 3D Avatars
https://github.com/zjh21/UNICA

>WSVD: Weighted Low-Rank Approximation for Fast and Efficient Execution of Low-Precision Vision-Language Models
https://github.com/SAI-Lab-NYU/WSVD

>When Negation Is a Geometry Problem in Vision-Language Models
https://github.com/fawazsammani/negation-steering

>Take-Two laid off the head its AI division and an undisclosed number of staff
https://www.engadget.com/gaming/take-two-laid-off-the-head-its-ai-division-and-an-undisclosed-number-of-staff-182824338.html

04/05/2026

>ComfyUI-ZImage-Triton: Triton-accelerated W8A8 quantization
https://github.com/newgrit1004/ComfyUI-ZImage-Triton

>ComfyUI Assets Manager v2.4.4 update
https://github.com/MajoorWaldi/ComfyUI-Majoor-AssetsManager/releases/tag/v2.4.4

>From RTX to Spark: NVIDIA Accelerates Gemma 4 for Local Agentic AI
https://blogs.nvidia.com/blog/rtx-ai-garage-open-models-google-gemma-4

>FLUX.2-klein-9B — PolarQuant Q5: 9B rectified flow transformer
https://huggingface.co/caiovicentino1/FLUX.2-klein-9B-PolarQuant-Q5

>Qwen3.5-9B-Neo-PolarQuant-Q5: 9B on any GPU with PolarQuant
https://huggingface.co/caiovicentino1/Qwen3.5-9B-Neo-PolarQuant-Q5

04/04/2026

>STAGE: Storyboard-Anchored Generation for Cinematic Multi-shot Narrative
https://github.com/escapistmost/Storyboard-Anchored-Generation

>Regularizing Attention with Bootstrapping
https://github.com/ncchung/AttentionRegularization

>LTX2.3-Multifunctional: Functionality optimization based on LTX desktop version
https://github.com/hero8152/LTX2.3-Multifunctional

>Gemma 4 31B IT NVFP4 model is quantized with NVIDIA Model Optimizer
https://huggingface.co/nvidia/Gemma-4-31B-IT-NVFP4

>AP Netflix VOID – ComfyUI Custom Nodes
https://github.com/adampolczynski/AP_Netflix_VOID

04/03/2026

>JoyAI-Image: Awakening Spatial Intelligence in Unified Multimodal Understanding and Generation
https://github.com/jd-opensource/JoyAI-Image
>>
>mfw Research news

04/06/2026

>Not All Frames Deserve Full Computation: Accelerating Autoregressive Video Generation via Selective Computation and Predictive Extrapolation
https://arxiv.org/abs/2604.02979

>VOSR: A Vision-Only Generative Model for Image Super-Resolution
https://arxiv.org/abs/2604.03225

>LumaFlux: Lifting 8-Bit Worlds to HDR Reality with Physically-Guided Diffusion Transformers
https://arxiv.org/abs/2604.02787

>Evaluating AI-Generated Images of Cultural Artifacts with Community-Informed Rubrics
https://arxiv.org/abs/2604.02406

>MMPhysVideo: Scaling Physical Plausibility in Video Generation via Joint Multimodal Modeling
https://shubolin028.github.io/MMPhysVideo-Page

>Salt: Self-Consistent Distribution Matching with Cache-Aware Training for Fast Video Generation
https://arxiv.org/abs/2604.03118

>Learning from Synthetic Data via Provenance-Based Input Gradient Guidance
https://arxiv.org/abs/2604.02946

>Gram-MMD: Texture-Aware Metric for Image Realism Assessment
https://arxiv.org/abs/2604.03064

>Can Nano Banana 2 Replace Traditional Image Restoration Models? An Evaluation of Its Performance on Image Restoration Tasks
https://arxiv.org/abs/2604.03061

>VERTIGO: Visual Preference Optimization for Cinematic Camera Trajectory Generation
https://arxiv.org/abs/2604.02467

>CAMEO: A Conditional and Quality-Aware Multi-Agent Image Editing Orchestrator
https://arxiv.org/abs/2604.03156

>Can VLMs Truly Forget? Benchmarking Training-Free Visual Concept Unlearning
https://arxiv.org/abs/2604.03114

>QAPruner: Quantization-Aware Vision Token Pruning for Multimodal Large Language Models
https://arxiv.org/abs/2604.02816

>LumiVideo: An Intelligent Agentic System for Video Color Grading
https://arxiv.org/abs/2604.02409

>VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors
https://arxiv.org/abs/2604.02486

>Forget Many, Forget Right: Scalable and Precise Concept Unlearning in Diffusion Models
https://arxiv.org/abs/2601.06162
>>
i know this is a shill thread but can we at least remove the off-topic links from the op ffs
>>
Just ignore it.
Oh wait, it's about you LOLOLOLOLOLOLOLOLOLOL
>>
>>108538702
my use case is
>vpn into the client that hosts comfyui
>that instance is on an unstable network and will reconnect itself to the vpn host
>vpn server only allows dhcp, so the instance may get a new ip after reconnecting
I guess my best option is to make comfyui listen on localhost and redirect traffic that arrives on the vpn interface at comfyui's port to that local port
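
A rough sketch of that redirect idea, assuming ComfyUI is bound to 127.0.0.1 on its default port 8188. The function name `start_forwarder`, the listen port 8189 in the usage comment, and the choice of a userspace forwarder (rather than firewall rules or a reverse proxy) are assumptions for illustration, not something anyone in the thread specified:

```python
import asyncio


async def pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass  # peer went away mid-transfer (e.g. the VPN link dropped)
    finally:
        writer.close()


async def start_forwarder(listen_host, listen_port,
                          target_host="127.0.0.1", target_port=8188):
    """Accept TCP connections on listen_host:listen_port and relay them
    to target_host:target_port. Returns the asyncio server object."""

    async def handle(client_reader, client_writer):
        try:
            up_reader, up_writer = await asyncio.open_connection(
                target_host, target_port)
        except OSError:
            client_writer.close()  # backend not reachable
            return
        # Relay both directions until either side closes.
        await asyncio.gather(pipe(client_reader, up_writer),
                             pipe(up_reader, client_writer))

    return await asyncio.start_server(handle, listen_host, listen_port)


# Hypothetical usage: listen on every interface, so a changed DHCP
# address on the VPN interface does not matter, then run forever.
#
#   async def main():
#       server = await start_forwarder("0.0.0.0", 8189)
#       async with server:
#           await server.serve_forever()
#   asyncio.run(main())
```

Binding to 0.0.0.0 sidesteps the changing VPN address but also exposes the port on every other interface, so in practice it would need a firewall rule limiting it to the VPN interface.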
>>
>>108538726
35 stars status??
>>
>>108538762
bro I dont understand, you dont have a reverse proxy? dont you jail/dockerize your shit?
>>
>>108538762
There is probably a way to have a static device on DHCP, but if not you can look into making a virtual network device, forwarding the packets, and binding to that if you don't want to just use a container.
>>
>>108538726
>can we at least remove the off-topic links from the op
but we already removed anistudio's links from the op "anon"
>>
>>108538776
explain how docker helps in this context and how to route things so that they work under the same vpn subnet
>>
>>108538799
I'm not sure how your setup is done, but typically you have a k8s cluster with multiple nodes (or if you're poor just a single VPS/VM) where you run any number of pods/containers. The containerization part takes care of the networking, what you expose is set through the k8s yaml files (or docker compose if you're poor), and it also depends on if you have something like traefik as your ingress (valid for both use cases) or a dedicated nginx ingress controller.
Meaning that you don't really care if your app binds to everything, because its binding is going to be limited to the relevant container, then you forward the needed ports.
>vpn into the HOST
why need to do this? just VPN in the same network and access it through your host's interface.
networklet faggot
>>
>>108538811
thanks for sperging and not solving anything, retard
>>
File: comfy__479.jpg (2.05 MB, 1536x2304)
>>
>>108538823
oy vey!
>>
>>108538694
>OMG DOOD PEOPLE DEAD
who cares
>>
File: 1769618086864702.png (191 KB, 400x400)
>>108538872
I'm talking about OpenAI fucking around (by collaborating with the US government to help them win the war against Iran) and finding out (Iran trying to destroy OpenAI's data centers)
>>
gpt image 2 releases soon, openai won
>>
>>108538818
ok gay retard
>>
>>108538889
bruh no one cares I'm here to ultracoom and gen 1girl, standing, looking at viewer
>>
>>108538889
good
>>
>>108538898
>I'm here to ultracoom and gen 1girl, standing, looking at viewer
must be a superpower to coom to SFW images lol
>>
>>108538905
I have comfy connected to sillytavern, I am literally erping with my bitch while she sends me her nudes and theres NOTHING you can do about it
faggot.
>>
>>108538911
>theres NOTHING you can do about it
why do you think I give a damn about what you do? you're a fucking nobody
>>
adamw8bit
batch4
100 epochs
2e-3
I fixed all your problems
>>
>generate multiple 5 sec wan video clips
>extract keyframes needed
>use ltx2 to refine and get coherent motion
will this work?
>>
Mayano top gun cunny
>>
>>108538944
kinda, too many frames and it won't tween, too few and the motion is still jank or outright spastic
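
A small sketch of the frame-count tradeoff above: pick a fixed number of evenly spaced keyframes from a clip so the count can be tuned up or down. The helper name `pick_keyframes` and the 81-frame clip length in the usage note are assumptions, not values from the thread:

```python
def pick_keyframes(total_frames, count):
    """Return `count` evenly spaced frame indices from a clip of
    `total_frames` frames, always including the first and last frame."""
    if total_frames < 1 or count < 1:
        raise ValueError("total_frames and count must be positive")
    if count == 1 or total_frames == 1:
        return [0]
    count = min(count, total_frames)  # cannot pick more frames than exist
    step = (total_frames - 1) / (count - 1)
    return [round(i * step) for i in range(count)]
```

For a hypothetical 81-frame clip, `pick_keyframes(81, 5)` gives `[0, 20, 40, 60, 80]`. Raising or lowering `count` is the knob for the "too many / too few" problem described above.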
>>
>>108538823
They don't look like that
>>
>>108538676

>>108536566
>And preview 2 is not the final version.
Is it in any way better than the first one?
>>
>>108538987
Some do
>>
>>108538898
>haha bro, I don't care!!!!
>proceeds to care, very much
>>
...how do i get bloodshot eyes on anima
>>
>>108539049
it's local, you already know the answer...
>>
>>108539059
i'm lazy
>>
>>108539063
generate the base image then ask api to make the eyes bloodshot
>>
35 stars?
>>
>>108539073
do you realize that this behavior is not healthy? get your head checked ran, you are obsessed and it's causing harm to people around you
>>
>>108539117
>you are obsessed
>ran
ironic
>>
Put tdrussell's smug lolis in the collage next time baker
>>
>>108539117
>do you realize that this behavior is not healthy?
Do I really have to take advice from the biggest schizo in town?
>>
>>108538941
jokes on you, my dataset is fucking trash.
>>
>>108538989
It's slightly more stable and character knowledge is moderately improved in my comparisons.
Nothing revolutionary, but unless you are dependent on a lot of loras trained on the 1st version no reason not to move on.
>>
how do I write my own comfyui plugin? any best practices?
>>
File: bloodshot eyes.png (300 KB, 704x400)
>>108539049
masterpiece, best quality, an eyeball covered in red veins, guro,

Eyeball and guro are both key here. You would think with over 1k images tagged "bloodshot eyes" that would work but no, that just means "red irises" apparently.
>>
>>108539248
You mean custom node, and 90% of users just vibe code them using Claude now. Claude Pro using Opus 4.6 will yield the best results. If you want to make something that requires interacting with Comfy's API, make sure to also link the docs for Claude.

https://www.reddit.com/r/comfyui/comments/1scpgiv/maybe_im_late_to_the_party_but_claude_and/
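
For reference, a minimal sketch of what a custom node looks like, whether written by hand or generated. The class name `InvertImageExample`, the category, and the display name are made up for illustration. The attribute layout (`INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION`, `CATEGORY`, `NODE_CLASS_MAPPINGS`) follows the usual ComfyUI custom node convention, but check the current docs before relying on it:

```python
class InvertImageExample:
    """Hypothetical example node: inverts every image in a batch."""

    @classmethod
    def INPUT_TYPES(cls):
        # One required input socket named "image" of type IMAGE.
        return {"required": {"image": ("IMAGE",)}}

    RETURN_TYPES = ("IMAGE",)   # one IMAGE output
    FUNCTION = "invert"         # name of the method ComfyUI calls
    CATEGORY = "examples"       # menu location (assumed name)

    def invert(self, image):
        # IMAGE values are float tensors shaped [batch, height, width,
        # channels] in the 0..1 range, so this handles batches too.
        return (1.0 - image,)


# ComfyUI discovers nodes through these module-level mappings, typically
# exported from the __init__.py of a folder under custom_nodes/.
NODE_CLASS_MAPPINGS = {"InvertImageExample": InvertImageExample}
NODE_DISPLAY_NAME_MAPPINGS = {"InvertImageExample": "Invert Image (example)"}
```

The node method returns a tuple even for a single output, matching `RETURN_TYPES`.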
>>
>>108539248
>Only support your own personal use case, for instance if you only use one image with your node then make sure it breaks when users attach batched images
>Make sure to package one useful node with dozens of other nodes that do the same thing as already existing nodes, but have surprising twists in functionality like obvious feature gaps, bugs, or incompatibilities
>Be sure to require a specific version of a popular package like numpy, diffusers, etc. Make sure your application breaks when the wrong version is installed so it is incompatible with other node packs.
>For a fun trick, ensure that a custom wheel must be installed to use your node pack, but still register it with the manager so that people will install it through the UI without realizing that custom installation steps are necessary
>If possible, try to use vendor-locked features like RT core acceleration, even if it only shaves milliseconds off your node's execution time
>This one requires some luck, but ideally your node should function for 3-6 months before breaking mysteriously due to not specifying a package version number and future updates making the package no longer compatible
>>
>>108539320
You forgot the most important one about leaking memory
>>
>>108539205
Subjectively it feels like I get less unwanted letterboxing/pillarboxing from it, although it still comes up sometimes. (I also learned the word "pillarboxed", which helped with neg prompting.)
>>
>want to goon
>accidentally create the literal perfect futa
>tfw not even a faggot
>>
wow, there's totes so much "development of local image and video models" in this thread
>>
I have two load image nodes and one of them is disabled. what node can detect which is disabled?
>>
>>108539755
If you want anything boolean there is an Impact Pack "ifnone" node. You can pair that with "Lazy Switch" from KJNodes, for example, to switch between two states.
>>
>just run this arbitrary code from the web goy
>>
>>108539867
>>108539073
>>
or you can broadcast them with anything anywhere node. requires you to bypass the aa node too though.
>>
>>108539276
Tried claude for the first time some months ago, and it's fucking insane, will programmers even be a thing within 5 years? Feels like everyone will become a weird walking jack of all trades with access to multimodal llms that can solve most practical issues.
>>
>>108539867
where should I get my code from instead?
>>
>>108539907
>will programmers even be a thing within 5 years?
Yes after the models collapse from synthslop and software stops working due to labyrinthine slopcode that nobody understands.
>>
File: 1775475431319652.jpg (704 KB, 1845x1295)
>>108539907
>will programmers even be a thing within 5 years
will AI even be a thing in 5 years
>>
>>108539950
just try to take my goon models from me
>>
>pedaling for 30mins on an exercise bike just to generate 1girl
>>
>>108539971
/g/ would become healthier than /fit/
>>
>>108539950
>he doesn't have a doomsday usb with backups
>>
File: o_00896_.png (1.13 MB, 968x1080)
>>
File: o_00900_.png (1.24 MB, 968x1080)
>>
haha melty
>>
>>108540119
>ranni***r (the baker)
proof?
>>
>>108540119
can't be a good /ldg/ thread without some anifart meltie kek
>>
>>108540119
35 stars status?
>>
use ani's dead mouth as a toilet
>>
File: o_00902_.png (1.41 MB, 968x1080)
>>
>back to whining about links after getting his jimmies tdrustled on easter
buckbroken
>>
>>108540119
>ranni***r is so fucking pathetic he will report you for posting ni***r on fucking 4chan.
What's wrong Ani? Are you short of proxies to ban evade now?
>>
File: file.png (631 KB, 895x1566)
>>108540119
>he knows 4chan has some rules you mustn't break
>breaks the rules anyways
>got banned
>whines about it
he did the meme lol
>>
Is local still not viable for visual storytelling? aka comics/manga?

I'm guessing it still has trouble reproducing an original character with consistency.
>>
File: 1744423740886322.jpg (86 KB, 832x1216)
>>108539666
Catbox?
>>
File: 1759117052988418.png (531 KB, 640x847)
>>108540173
>>108538775
>>108539073
I don't lurk here that much anymore. What is this "35 stars" trolling I keep seeing?
>>
>>108540257
>What is this "35 stars" trolling I keep seeing?
https://rentry.org/animanon
>>
>>108538941
that's 400 steps retard
>>
>>108539755
rgthree switch any will pass the first non null input
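
The "first non null input" behaviour described above, as a plain Python sketch. This is only an illustration of the selection logic, not rgthree's actual implementation, and the function name is made up:

```python
def first_non_null(*inputs):
    """Return the first input that is not None, or None if all are."""
    for value in inputs:
        if value is not None:
            return value
    return None
```

With two load image branches where the disabled one yields nothing, `first_non_null(None, image_b)` would hand back `image_b`.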
>>
>>108540325
Isn't the total step count also based on the amount of images you train? 100 epochs means "iterate over the entire data set a hundred times"
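
The arithmetic behind this, as a quick sketch. It assumes one optimizer step per batch, a partial last batch that still counts as a step, and no gradient accumulation. The `repeats` parameter is an assumed stand-in for per-image repeat settings, and real trainers may count steps differently:

```python
import math


def total_steps(num_images, epochs, batch_size, repeats=1):
    """Total optimizer steps for a run: one epoch is a full pass over
    the dataset (times repeats), split into batches."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs
```

So at batch 4 and 100 epochs, 16 images give `total_steps(16, 100, 4)` = 400 steps, while a 300 image dataset gives 7500. The same epoch count means very different amounts of training.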
>>
>>108540393
yes and i have only one (1) image in my dataset
>>
>Discover how edit works on my rig
>Start browsing pinterest for girls that I then edit into slop
>Actually becomes some sort of a fetish and addiction
halp
>>
>>108540419
> then edit them
never use wan with shootz lora
>>
File: 1771139804945682.jpg (76 KB, 784x1168)
>>108540401
for what purpose?
>>
is ltx 2.3 worth it?
>>
>>108540585
does it matter
>>
File: o_00905_.png (1.36 MB, 968x1080)
>>
File: o_00906_.png (1.48 MB, 968x1080)
>>
>>108540608
That sounds like a recipe for severe over fit even at a low step count. Then again I've never attempted training a concept/character Lora on a single image so maybe you know more than I do
>>
>>108540598
it's good for talking heads
>>
How do i make ComfyUI run on an old 2010s CPU? Anyone have any tips?
>>
>>108540691
the point is without batch and dataset size epoch numbers mean nothing
>>
>>108539666
Nice try Satan, but we all know you are a gay faggot.
>>
>>108540703
I've trained loras from 30 pics to 300 all on 100 epochs just fine.
>>
>>108539666
>likes dicks
>"I'm not a faggot I swear"
https://www.youtube.com/watch?v=hpbGz9JPadM
>>
ai toolkit doesnt even use epochs. bad way to measure quality
>>
and nobody uses ai toolkit so it evens out
>>
>>108540598
depends, usually it's just a disappointment
>>
>>108540699
get a job is the tip
>>
>>108540598
it's good for sfw memes, which is pointless for local since cloud models can do those much better, so it has no real use.

WAN 2.2 will be the NSFW king for a long time.
>>
File: 00000-672825110.jpg (935 KB, 1448x1616)
>>
>>108540903
you monster
>>
I hate comfyui's new loading screen
>>
Is ZIT or Flux-Klein better for image editing?
>>
>>108540944
zit
no contest
>>
zit isn't edit model retard
>>
File: 1746630363372768.png (637 KB, 1584x1298)
>>108540903
*got replaced by Claude*
now what?
>>
>>108540944
flux
no contest
>>
>>108540953
dumbass
>>
>>108540944
>ZIT
it doesn't do image edit, you'll have to wait for Z-image edit to happen (unlikely though, we've been waiting for this shit for 4 months at this point)
>>
>>108540955
>profit surge
They are deep in red what is this fake news.
>>
Isn't Nanobanana 2 the best for image editing?
>>
So which model can do copyrighted characters?
>>
>>108540998
Literally none. you'll need loras on top of loras for that.
>>
>>108540998
pony v7
>>
>>108540998
wai-illustrious-sdxl
>>
File: it's over.png (2.08 MB, 1586x991)
>>108540998
none, we have cucked models on local, while API models stop being cucked, we got the bad end anon
https://xcancel.com/filicroval/status/2040855037182685299#m
>>
>>108540119
You're literal human garbage and deserve all the bullying from anons you pedophile lolcow
>>
>>108541024
why is it all furry models? feels like such a waste

>>108540998
>>108541045
nah even chroma can do some shit, it just sucks in general and I was hoping for a model that isn't 90% trained on furries
>>
>>108540998
Just use grok, nano banana or GPT image 2 once that comes. Local is done for and likely not going to see any progress from here on out.
>>
>>108541065
does any of that work without account?
>>
>>108541065
Hope you get paid for your shilling and it's not just mental illness
>>
>>108541065
>grok
not interested in csam, thanks
>>
>>108541045
they start not being cucked then get cucked. I still don't get the purpose of fake game screenshots with nothing really changed however. do you like looking at screenshots of games?
>>
>>108541158
reterd
>>
>>108541158
>I still don't get the purpose of fake game screenshots with nothing really changed however.
that's his example not mine, it just shows you can copy some cool IP styles, it's up to you to add migu on top of it
>>
>>108541176
how mature of you. also show me an interesting gpt output for once
>>
>>108541185
we still haven't seen a migu yet or a luigi huffing mario farts
>>
i'm 1girling and no one can stop me and my 15 loras
>>
Lads,

What's on the FUD menu today? Leftovers or did anon cook up something fresh
>>
>>108541126
mental illness is spending hours trying to get something done with a local model when it can be done in minutes with API
>>
>>108541218
No that's called having a hobby and being interested in tech you pay-pig
>>
>>108541233
holy cope
>>
>>108541233
you're interested in tech but refuse to look at advancements in it?
>>
>>108541252
cool. what are your other hobbies? playing video games and troubleshooting your linux install? kek
>>
>>108541218
mental illness is to shill API on a local thread, if API shit is so cool and popular, why do you not have an /aicg/ equivalent for API diffusion models?
>inb4 muhh dalle3 thread
this shit is even worse than /sdg/ in terms of spamming slop and not talking to each other
>>
>saar please connect your credit card with your goonprompts saar
>>
Leftovers again it seems
>>
>>108541213
im afraid its only the usual ani's api shill larp sir
>>
>>108541218
still waiting faggot >>108541204
>>
>>108541218
Nah spamming cucked saas shit in a local thread is
>>
Supporting local in 2026 is like trying to eat a long rotten fruit and gaslight yourself it's good.
>>
Get new material julien
>>
>When I get time I want to train at least one "official" style or character lora, and release it, with full dataset and training config file and everything. Just to show it works fine.
Please please please I want to train but don't want to waste time with potentially bunk anon hyperparams
>>
>>108541333
Why'd you release it just to be scraped by jeets?
>>
>>108541305
what do you mean?
>>
>>108541361
At least the jeets then would create better anime pics. Key word "better". Not "good" because they are still jeets.
>>
how do I make the for loop nodes in easyuse work on custom range e.g. [0, 1, 2, 5, 6] ?
>>
File: deCC_zi_00046_.png (2.78 MB, 1920x1033)
>>
>>108541510
Fuck off
>>
>>108541510
very cool


