/g/ - Technology

Discussion and Development of Local Image and Video Models

Previous: >>108756500

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
cocky go boing boing
>>
>>108757904
progress report anon?
>>
>mfw Resource news

05/05/2026

>Decision Boundary-aware Generation for Long-tailed Learning
https://github.com/keepdigitalabc-svg/DBG

>Motion-Aware Caching for Efficient Autoregressive Video Generation
https://github.com/ywlq/MotionCache

>SIFT-VTON: Geometric Correspondence Supervision on Cross-Attention for Virtual Try-On
https://github.com/takesukeDS/SIFT-VTON

>Linear-Time Global Visual Modeling without Explicit Attention
https://github.com/LeapLabTHU/WeightFormer

>Local Dream 2.4.3 - SDXL support, tag autocomplete and more
https://github.com/xororz/local-dream/releases/tag/v2.4.3

>Sora’s downfall signals broader problems with AI’s creative utility
https://theconversation.com/soras-downfall-signals-broader-problems-with-ais-creative-utility-280013

05/04/2026

>UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors
https://houyuanchen111.github.io/UniVidX.github.io

>BlenderRAG: High-Fidelity 3D Object Generation via Retrieval-Augmented Code Synthesis
https://github.com/MaxRondelli/BlenderRAG

>It's Never Too Late: Noise Optimization for Collapse Recovery in Trained Diffusion Models
https://akoepke.github.io/divgen/index.html

>Walkyrie 1.3B — Text-to-Image
https://huggingface.co/kpsss34/Walkyrie-1.3B-v1.0

>Caption Creator: Fast and portable tool for image captions and tags
https://github.com/Merserk/Caption-Creator

>VulkanForge: Vulkan-based LLM inference engine in Rust
https://github.com/maeddesg/vulkanforge

>FastSDCPU release v1.0.0-beta.301
https://github.com/rupeshs/fastsdcpu/releases/tag/v1.0.0-beta.301

>Deepbooru TagWalker
https://github.com/Elliezrah/deepbooru-tagwalker

05/02/2026

>Sulphur 2: An uncensored video generation model based on LTX 2.3
https://huggingface.co/SulphurAI/Sulphur-2-base


05/01/2026

>Representation Fréchet Loss for Visual Generation
https://github.com/Jiawei-Yang/FD-loss

>Caption Generator Pro
https://github.com/CoolGenius-123/Caption-Generator-Pro
>>
>mfw Research news

05/05/2026

>TrajShield: Trajectory-Level Safety Mediation for Defending Text-to-Video Models Against Jailbreak Attacks
https://arxiv.org/abs/2605.01761

>SteeringDiffusion: A Bottlenecked Activation Control Interface for Diffusion Models
https://arxiv.org/abs/2605.01653

>Disciplined Diffusion: Text-to-Image Diffusion Model against NSFW Generation
https://arxiv.org/abs/2605.01113

>ScribbleEdit: Synthetic Data for Image Editing with Scribbles and Text
https://arxiv.org/abs/2605.01135

>AttnRouter: Per-Category Attention Routing for Training-Free Image Editing on MMDiT
https://arxiv.org/abs/2605.01480

>SwiftPie: Lightning-fast Subject-driven Image Personalization via One step Diffusion
https://arxiv.org/abs/2605.01510

>CSGuard: Toward Forgery-Resistant Watermarking in Diffusion Models via Compressed Sensing Constraint
https://arxiv.org/abs/2605.01479

>MOC-3D: Manifold-Order Consistency for Text-to-3D Generation
https://arxiv.org/abs/2605.01743

>VAnim: Rendering-Aware Sparse State Modeling for Structure-Preserving Vector Animation
https://yukinonooo.github.io/VAnimProject

>Skipping the Zeros in Diffusion Models for Sparse Data Generation
https://arxiv.org/abs/2605.01817

>Unifying Deep Stochastic Processes for Image Enhancement
https://arxiv.org/abs/2605.01568

>MIRL: Mutual Information-Guided Reinforcement Learning for Vision-Language Models
https://arxiv.org/abs/2605.01520

>GEASS: Training-Free Caption Steering for Hallucination Mitigation in Vision-Language Models
https://arxiv.org/abs/2605.01733
>>
>>108763579
>>108763585
thanks!
>>
Russell abandoned us
>>
File: 1763347421810252.png (534 KB, 1152x842)
534 KB PNG
>>108763550
You dropped >>108758016
>>
>>108763536
>old 1.2
meaning? Anyways, I don't speak Japanese either, but this hobby introduced me to the music of those two particular artists and I like them. I always follow along with romaji translations, but now I have some motivation to actually learn Japanese.

>the vocals are heartfelt. anyway, what's the theme of this one?

Theme is whatever Claude gave me a few days ago to test the LoRA kek, here's the lyrics, then followed by romaji and its English translation (Given to the model entirely in standard Japanese, which either the merge or the LoRA has improved).
https://files.catbox.moe/9ors79.txt

In that case it skipped a line or two, but in my testing it's actually very rare now compared to how often it happens in the Turbo model.
>>
big russ... please, come back...
>>
File: ComfyUI_temp_sbvyi_00012_.png (2.87 MB, 1920x1152)
2.87 MB PNG
>>
File: anima_00036_.png (1.33 MB, 1024x1024)
1.33 MB PNG
20 seconds on a 4gb gpu but at what cost
>>
>>108762944
>>
File: 1778042092486317_.png (1.41 MB, 1024x1024)
1.41 MB PNG
>>108763702
why is she outside the building
>>
>>108763636
>>108763689
Let him cook
>>
>Curious to see if Sogna artist is alive.
>The artist is alive
>Is a woman.

Didn't expect that.
>>
>>108763636
tdrusted
formerly trusted
>>
>>108763667
sorry, I meant the old ace step 1_3_5.

>>108763350
>>108763365
tencent was involved in songbloom.
https://github.com/tencent-ailab/SongBloom

It could be kind of nice. xmas gen:
https://files.catbox.moe/pxu1ia.mp3
The reference audio was iirc Mariah Carey.

The lyrics are "

problem: it basically was really unpredictable. Prompting was a wishlist.

I wouldn't want to go back lol. BUT, as you can hear, they're not jokers.
>>
>>108763782
>The lyrics are "
based on God Rest Ye Merry Gentlemen
>>
>>108763767
pretty style
>>
File: 1760764205433215.png (1.03 MB, 1344x768)
1.03 MB PNG
>>
>>108763530
>>Sulfur can't do pussy

It's a base model; LoRA trainers will improve this. Also, Sulphur 2.5 is waiting for funding.
>>
File: debo_anima_00009_.png (2.38 MB, 2048x1117)
2.38 MB PNG
>>
warning! a diss track:
https://files.catbox.moe/21bzys.mp3
>>
File: ComfyUI_00151_.png (598 KB, 1040x544)
598 KB PNG
>>
File: Parental-Advisory-Logo.png (54 KB, 3840x2160)
54 KB PNG
>>108763893
>>
>>108763906
I didn't want anyone to take it personally lol
>>
>>108763893
fuck you nigger faggot
>>
ayo I is gittin oppress
>>
File: jordan rudess.jpg (226 KB, 940x535)
226 KB JPG
>>108763893
>>
File: 1757029316324595.jpg (909 KB, 1536x1536)
909 KB JPG
>>
File: ComfyUI_00006_.jpg (1.95 MB, 2286x4096)
1.95 MB JPG
waiting for the API schizo to match this level of detail, this gen is 1 year old btw.
>>
>>108763971
(thumbs up)
>>
File: 1770031980345642.png (1.35 MB, 1344x768)
1.35 MB PNG
>>108763827
>>
anyone else firing up the kino factory?
>>
>>108763550
>https://huggingface.co/circlestone-labs/Anima
>405k downloads last month
woaw
>>
>>108763782
They made a slightly better one actually, it was called https://github.com/tencent-ailab/SongGeneration
It still left so much to be desired and was basically ACEStep 1.0 tier. ACEStep had a 1.5 version which was much better than that, and then it was iterated on and 1.5 XL is even better than standard 1.5. We are very fortunate to have ACEStep. Local audio models were in a very sad and dire state beforehand.
>>
>>108764079
neat. Still getting updates.
>>
File: 1761211995361161.jpg (747 KB, 1536x1536)
747 KB JPG
>>
>>108764079
have you tried running your generations through a mastering model? i remember someone suggesting that in a past thread. never tried it
>>
>>108764115
>a mastering model
is there one that actually works? I have 5 tracks that needs cleaning
>>
>>108764115
All the generations I posted yesterday (Fate Gear, Zutomayo, Miku) are mastered, actually. I do it like second nature before sharing gens, because Turbo is just slightly worse at capturing details. I'm the one who recommended it after finding out about Matchering 2 through Discord. Some songs are better than others for mastering, so I just rotate around an album and it usually only takes 1-2 tries. The base merge does not need as much mastering as Turbo anymore, as in the initial output is not as noisy and the voice is crisp, but that only means it sounds way better when mastered, so I go ahead and do it anyway.
>>
Nofap day 10, honestly, keep remembering that motorcyclist's cleavage.
>>
>>108764147
>I have 5 tracks that needs cleaning
Matchering 2
https://github.com/sergree/matchering
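The Matchering 2 workflow described here can be scripted with its Python API; a minimal sketch, assuming the package is installed via pip (the file names and folder layout are placeholders, not anything from the posts):

```python
# Hedged sketch: batch-master raw gens against a well-mastered reference track
# using Matchering 2 (https://github.com/sergree/matchering).
from pathlib import Path

def master(target: str, reference: str, out: str) -> None:
    """Match the target's loudness/EQ to the reference and write a 16-bit WAV."""
    import matchering as mg  # deferred so the sketch loads without the package
    mg.process(target=target, reference=reference, results=[mg.pcm16(out)])

if __name__ == "__main__":
    # Process every raw gen in a hypothetical gens/ folder.
    for wav in Path("gens").glob("*.wav"):
        master(str(wav), "reference_master.wav", f"gens/{wav.stem}_mastered.wav")
```

The reference track matters a lot here (as the replies below note); `mg.process` simply pushes the target toward the reference's RMS, frequency balance, and stereo width.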
>>
>>108764170
>Matchering 2
I've tried it actually, just wasn't super happy with the results. Perhaps I used a bad reference song.
>>
>>108764154
can you post a before and after?
>>
>>108764177
I recommend trying Web Audio Mastering, but Matchering 2 with a good source can sound much better than that. Some songs have poor mastering that lowers the volume a ton; finding a good one is just research, find a song that sounds clear. Googling for well-mastered songs in the genre you're targeting may help. Look through playlists. Tip: movie scores tend to be well mastered due to insane budgets. LLMs like Gemini can give ideas as well. You'd be surprised. I don't search anymore because I have good songs and albums that apply to everything I have (just downloading albums from a few target artists is enough).
>>
>>108764014
kino
>>
I bet that motorcycle zoomer wants a gen x nofap boyfriend.
>>
>>108764198
nofap is not supposed to turn you gay. rebuke the devil
>>
>>108764203
>:(
he said show my penis to drivers on the highway
>>
>>108764225
he is trying to use you as a grim reaper
>>
>>108764237
lol
>>
File: lily 3.png (1.31 MB, 848x1264)
1.31 MB PNG
>>108763721
is this local? reminds me picrel claimed to be local too
>>
>>108764186
Turbo gens since I'm AFK, these are from weeks back (before I figured out the DiT only trick so gens might be more boring than what the Turbo model can do).

https://desuarchive.org/g/thread/108694497/#108695746
Top is mastered, bottom is pre-mastered.

Pre-master
https://vocaroo.com/1a4VSBjqkuSX
Master
https://vocaroo.com/1b0F41rAgXqR

Depending on the gen, it doesn't get rid of every issue, but it can make it go from unacceptable on decent headphones to okay sounding.
>>
>>108764192
Do you use 32bit float, just normal settings?
>>
>>108764274
oh, i thought it would fix the metallic sound. sounds like some basic stereo manipulation. someone needs a model that actually rebuilds the full song to make it HD, kind of like asking image editing models to make something higher quality
>>
>>108764283
16 bit WAV files. Default settings on UVL, I use this desktop app
>https://ultimatevocalremover.com/
>>
>>108764300
you can, it often works. i think it's easier on wan though.

usually it's a matter of time and offloading to vram if yourt gpu's vram isn't so large
>>
>>108764274
what do you mean by 'mastered'? What are you using?
>>
File: he sold.jpg (134 KB, 1086x1448)
134 KB JPG
>>
What do you guys actually use these generated images for? I liked the process of getting image and video generation working on my system, but now I'm stumped; I really have no use for any of this. Just generating random images is a bit meh.
>>
File: lily 5.png (1.26 MB, 848x1264)
1.26 MB PNG
>>108764270
Which nigga claimed that was local?
(These are Nano Banana + editing in GIMP.)
>>
>>108764398
making kinos for my amusement
>>
35
>>
>>108764495
it was in adt
adt for local
fuck you nigger
>>
>>108764578
adt is for anime
>>
>>108764586
for anime generated locally
do you see any mentioning of nai or other cloud shit
kys
>>
https://files.catbox.moe/x5z448.mp3
The text:
https://webstersdictionary1828.com/Dictionary/joy

Needed cfg scale to be 13.

>>108764398
similar to above. for memory stuff.
>>
>>108764617
fyi, if I had fed that into udio, at least at the time I left, it would have never worked. You had to really slave to get uneven verses working.
>>
>>108764672
ie udio, at least in the past, wants to talk instead of sing.
>>
>>108764734
Hmmm nyo
>>
>>108764597
Fuck you, /adt/ is for anime diffusion in general. Here is for local realism and western art
>>
File: 345956.png (759 KB, 1835x1047)
759 KB PNG
anyone else making ltx kinos?
>>
>>108764781
I am glitching out with Sulphur. What are your steps/cfg/distill LoRa strength?
>>
>>108764781
it can do 40 seconds?
>>
>>108764781
upload i wanna see
>>
>>108764757
nigger
> do you see any mentioning of nai or other cloud shit (in the op)
fuck you
>>
>>108764787
i am using the default settings on wan2gp. it seems to be 8 steps and i don't know what the rest of the settings are since it's hidden from the UI
>>108764790
it can go as far as you want. i'm just being meticulous with small extensions
>>108764808
i will once i reach the bridge after the chorus
>>
>>108764734
>>108764511
>>
>>108763567
I captioned the dataset late night and said fuck and started training. Just woke up.
It's done training but preliminary tests seem to show disappointing results. Maybe earlier steps are better, or it needs a different prompting meta than I am using. (Cope, I know, but I need to test it more.)
I honestly didn't expect to get it right first try, but still.
I will make a training help blogpost later.
>>
>>108764781
i was trying sulphur, it still seems quite tricky compared to wan tho
>>
>>108763550
Do you know how to configure ComfyUI ZImage to work with the Hammerai website? I have the portable version for Windows. I have a problem with it not being able to find the models.
>>
>>108765059
the prompting for ltx seems to be very sequential. you have to write it out in the order in which things should be seen. if you mention that she has boots at the end of the prompt, the camera will aim down to look at her boots only at the end of the video
>>
hello? retards?
>>
>>108766065
retard here how can i help
>>
>>108766094
where are you guys
>>
>>108766065
hello? is this thing on? am I all alone? is anyone there?
>>
>>108766102
>>108766065
anima posters moved to anime generals
>>
>>108766239
they’re discussing loras and stuff in /hgg/
>>
Give it a rest pal no one believes you
>>
File: Anima_02242_.png (1.05 MB, 832x1216)
1.05 MB PNG
So here is an interesting thing about anima's @ keyword for styles.
Anima understands on a fundamental level that "@" is associated with styles to the point that it will hallucinate watermarks and patreon links on occasion if you feed it a nonsensical style tag.
(This was with @real photo and my lora disabled for testing comparisons.)
You don't see anything similar in sdxl.
When there is a watermark or link hallucinated, it is almost always gibberish and not strongly related to anything on your prompt.
>>
>>108766300
what was your captioning like, and how many steps was that?
i was curious to see realism trained as an @, but that doesn't look promising.
>>
File: 1588629222545.gif (933 KB, 220x220)
933 KB GIF
i've received warnings for racism (what) in the past, but look at the attitude of the Chinese. they're now ignoring the users who made them famous. that's not a respectful attitude, is it?
>>
wat is weeb labs?
>>
File: ComfyUI_temp_gqkgg_00022_.png (2.53 MB, 1520x1040)
2.53 MB PNG
After playing around with the Spark.Chroma 1024 model, my conclusions are as follows:
1 - this seems like a sidegrade compared to 512 and a downgrade to preview
2 - It is better at following some specific prompt details, like film grain
3 - It is worse at replicating non-photographic styles (unable to do American comic books, worse at black-and-white manga, to give two examples)
4 - It seems worse at generating faces in a crowd, something I didn't feel with the 512 model
5 - the preview model is still the best overall, making a good compromise and being better at non-photographic styles
6 - Chroma is still the best for NSFW work, being the only one able to generate correct male genitalia (not just on the man that is penetrating, but also on males in a crowd or by themselves) and blood/gore (being able to render carcasses, body interiors, blood drips etc. much more coherently than other models)

https://files.catbox.moe/5js4so.png
>>
File: Chroma_final_00028_.png (2.31 MB, 1520x1040)
2.31 MB PNG
https://files.catbox.moe/c8bcl1.png
>>
File: Chroma_final_00034_.png (2.54 MB, 1520x1040)
2.54 MB PNG
https://files.catbox.moe/1h72q7.png
>>
File: Chroma_final_00005_.png (2.22 MB, 1040x1520)
2.22 MB PNG
https://files.catbox.moe/j94wjf.png
>>
File: Chroma_final_00032_.png (1.84 MB, 1920x800)
1.84 MB PNG
https://files.catbox.moe/ao3ew5.png
>>
File: Anima_02243_.png (1.13 MB, 832x1216)
1.13 MB PNG
>>108766338
I already posted a bit about the captioning last thread but here is an example caption:
>@real photo. A young White woman in her early 20s with vibrant, wavy red hair leans back against the thick, gnarled roots of a large tree in an autumnal forest. She gazes thoughtfully upward and away from the camera, her face framed by her bright hair and accented by dark, plum-colored lipstick. She wears a black long-sleeved turtleneck sweater and white pleated trousers, with her arms crossed comfortably over her chest. Small, white flower-shaped earrings are visible on her ears, and her fair skin has a soft, rosy glow on her cheeks. The massive, grey tree roots cradle her body, while the ground behind her is covered in a thick layer of fallen orange and brown leaves. The background trees and the leaf-covered slope are softly blurred, creating a shallow depth of field that keeps the focus entirely on her.
I had 5 epochs with 1300 images, so 6500 steps. Fuck it, might as well turn this into a help-me post. Batch size 2, 1024p, AdamW, LR 0.00003, cosine, WD 0.001, betas=0.9/0.99, rank 96, alpha 48, dropout 0.05, sigmoid_scale 1.3, max_grad_norm 1.0
Any ideas how to proceed?
As embarrassing as it is, here is the same image with the lora. There are better images and worse ones.
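For anyone puzzling over those settings: the cosine schedule means the effective LR is not 3e-5 for most of the run. A self-contained sketch of the decay it implies (base LR 3e-5 over 6500 steps; the warmup knob is hypothetical, the post doesn't mention one):

```python
# Sketch of cosine LR decay as commonly implemented by trainers:
# linear warmup (if any), then cosine from base_lr down to min_lr.
import math

def cosine_lr(step: int, total_steps: int, base_lr: float = 3e-5,
              warmup: int = 0, min_lr: float = 0.0) -> float:
    """Per-step learning rate under warmup + cosine decay."""
    if warmup and step < warmup:
        return base_lr * step / warmup  # linear ramp up
    progress = (step - warmup) / max(1, total_steps - warmup)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

print(cosine_lr(0, 6500))     # full 3e-05 at the start
print(cosine_lr(3250, 6500))  # half the base LR at the midpoint
print(cosine_lr(6500, 6500))  # decayed to ~0 by the final step
```

So by the back half of training the LoRA is learning at well under half the nominal LR, which is worth keeping in mind when deciding whether to extend a run versus raise the base LR.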
>>
comfy will pay you minimum wage to make workflows
https://docs.google.com/forms/d/e/1FAIpQLSdCrLN2UBKjeqz30__wMjTAXoVqhypTTq1Gl08y2nvKaEf98A/viewform
>>
File: Anima_02225_.png (1.27 MB, 896x1152)
1.27 MB PNG
>>108766637
Posting a slightly better example to feel less shame.
>>
File: Chroma_final_00024_.png (2.12 MB, 1040x1520)
2.12 MB PNG
https://files.catbox.moe/oak48j.png
>>
>>108766637
desu i just use the diffusion-pipe defaults
>>
File: ComfyUI_temp_epzdk_00005_.png (3.14 MB, 1824x1248)
3.14 MB PNG
https://files.catbox.moe/qqx56k.png
>>
File: ComfyUI_temp_yhnyn_00037_.jpg (1.07 MB, 2880x1632)
1.07 MB JPG
https://files.catbox.moe/0kv33c.png
>>
>>108766681
I mean I don't think this deviates strongly from the settings in the rutkowski lora.
>>
File: ComfyUI_temp_yhnyn_00076_.jpg (1.25 MB, 2880x1632)
1.25 MB JPG
https://files.catbox.moe/rlv0j1.png
>>
>spend a few hours carefully curating more data for my lora
>train next version
>its worse than the precious version
>>
File: ComfyUI_temp_varei_00002_.jpg (707 KB, 2880x1632)
707 KB JPG
https://files.catbox.moe/38yj18.png
>>108766782
many such cases
>>
>>108766637
6500 steps doesn't seem like nearly enough with 1300 images. 1300 is kind of a huge amount of images for a lora of a single concept.

Just try baking it for longer. If you have a pretty diverse data set you might wanna take it all the way to like 20000 steps, just make sure you're saving checkpoints often.
>>
>>108766637
>>
what the fuck is the obsession with making realistic loras for an anime model? makes no sense, it looks like shit
>>
what the fuck is the obsession with making an anime model by using cosmos a realistic model? makes no sense!
>>
>>108766786
Are the tall dudes invisible to the bottom dudes?
Also second dude from left has an arm overlapping the middle cape.
>>
>>108766814
6500 takes close to 8 hours on my GPU.
I guess I can train for a day after taking so long to curate the dataset.
But I wonder if I can pull off faster convergence without frying with higher LR?
>>108766880
Troll post but I want to use its character and NSFW knowledge for realism.
>>
>i-it's a troll post if i get told the truth
>>
>>108766880
it's a small model, it has great prompt adherence, it's easy to train, nsfw out of the box, it has a lot of concepts trained into it that negate the need for specific character/style loras, and it's fun to fuck around.
plus a lot of people are looking for something to replace legacy models, anima will be that model, so may as well get a jump on it now.
>>
>>108766786
i wish we had z-anima, the parameter count for anima is just too small to do scenes like this in it.
>>
>>108766880
The real question you should be asking is why hasn't anyone made a 3DPD porn model even though it has the most original and high quality data on the internet. It's been years and not even a hint of one being made. Very sad for pig enjoyers.
>>
>>108766924
preview 3. no sense in wasting compute until the model is done.
>>
>>108766910
>anima will be that model
almost every lora for anima has a barren sample section, nobody is using the model and the people that use it are making the exact same slop they did with sdxl so creativity has peaked
>>
>>108766907
What model do you use when you want to see a photo of a fictional character taking multiple cocks into her asshole?
>inb4 Chroma
>>
>>108766971
>taking multiple cocks
even in your fantasies, you watch.
>>
If I use a distilled model that runs at 8 steps, does that mean the image has reached full convergence? If so, why does the image still change when I increase the steps? For example, with ZIT or Ernie around 10% of the image will still change between 8 and, say, 20 steps. It will also produce some kind of patchy texture, why? With other models like Chroma Flash this is even more pronounced, again why? Also, all these models should in theory work best with ODE samplers and a simple noise scheduler, so why do SDE samplers seem to work well?



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.