/g/ - Technology

Discussion of Free and Open Source Diffusion Models

Prev: >>108076014

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
File: 00003-49464576.jpg (1.18 MB, 2560x2048)
>>
>>108078691
>same garbage collage
at least its a fagollage I guess
where the fuck are the 1 girls faggot
>>
>>108078782
>>108078788
It's "@pigeon666, masterpiece, realistic" and the description of everything you see in the picture there. Now what?
>noo I need the entire prompt
Well I'm not giving it to you, that's my OC donut steal, I'm not donating my OC to your dirty grabby hands. This is the bit that's relevant to the style
>>
File: 0158331.png (1.15 MB, 928x1120)
>>
File: 45246.png (1.71 MB, 928x1120)
>>108078817
was meant to be like this
>>
>>108078817
good gen but
>no slight pantyshot
sad
>>
File: 1.jpg (394 KB, 1320x1568)
>>108078817
See how soulless that style is compared to the awesome style of the old models I use
>>
>>
File: 10523217.png (1.56 MB, 1024x1024)
>>
File: 00005-3411327429.jpg (2.32 MB, 2560x2048)
>>
>>108078813
>no qwen boilerplate
>highly simplistic prompt
Duh it "looks good" on whatever slopmix. With raw finetunes such as Anima you need to be autistic with your prompt. I'm assuming you weren't one who used regular Noob and instead preferred some downstream mix.

It's fine to prefer slopmixes because they are easier to handle, just don't whine about not getting good outputs from raw finetunes because you only use very simple prompts.

You will call this cope but the real cope is needing to wait for someone to mix a bunch of models together so that "1girl, standing" doesn't look like ass.
>>
>>
>>108078863
without metadata its pretty pointless to continue honestly
>>
>forced tags
>@
>"you are a helpful..."
tripleslop
>>
>>108078856
yo who is this?
>>
>>108078863 (cont)
The entire idea of mixes is a tradeoff between ease of use and elasticity/expression, this is not a controversial concept. It is simply a fact.
>>108078882
Combined with the fact that they aren't even the same resolution. But I'm guessing it's moreso stubbornness than purposeful trolling. I saw the same thing when anon would refuse to acknowledge that Illust was better than pony.
>>
File: 9357.png (3.16 MB, 1088x1904)
>>
>>108078845
how can u like this illuslop
u have shit taste
kill urself
>>
>>108078897
jay effkay
>>
File: i4u-kero.jpg (1.86 MB, 1800x1260)
>>
>>108078863
>no qwen boilerplate
NTA, but do people really? It's making no noticeable difference in my testing so far, other than feeling stupid.
>>
File: 1759446052394249.png (3.65 MB, 1152x1888)
okaeri!!!!!!!!
>>
File: 1764651025344789.png (3.35 MB, 2016x1120)
yuribros!!
>>
>>108078913
>>>/g/dalle
>>
>>108078845
This is more of a GPU RNG vs CPU one, GPU RNG is superior at expression
>>
File: 1766896355957496.png (3.98 MB, 1728x1248)
today its frieren friday
>>
File: 1753276062350623.png (3.86 MB, 1152x1888)
>>
>>108078914
The more complicated the prompt the more the effect is noticed.
>>
File: 1769682403629668.jpg (655 KB, 1536x1536)
>>
>>108078920
>>
File: 1754847444097549.png (3.69 MB, 1216x1824)
ready for the stark date
>>
>>108078934
Is that still the case for tag-style prompts?
>>
>>108078955
Yeah. I don't use any NLP with Anima anyway. But my prompts are still paragraphs long.
>>
File: 1.jpg (296 KB, 1320x1568)
>>108078863
>simple prompts
I want the style of the artist, dumbass. I don't want that
>digital painting, highly detailed, cinematic lighting, sharp focus, concept art, trending on artstation, award winning, unreal engine 5, deviantart, octorender, 8k, 4k, 16k, alphonse mucha, ilya kuvshinov, artgerm, greg rutkowski, magic the gathering art, d & d character
prompt that you think is the peak of style, I'm just not interested.
>>108078905
If this is slop lock me up in a pigsty
>>108078898
>he thinks style changes based on resolution
You have literally never used a booru model have you
>>
>>108079005
illuslop is 100% recognizable because all illu gens have the same style/shading
its uncanny
SLOP
>>
>>108079005
The idea of longer prompts extends to the overall ability and adherence of a model. Again, it is not new information that raw finetunes necessitate autistically tagged verbose prompts. If you do not wish to take advantage of the elasticity provided by non mixed models then that is your prerogative. The situation you find yourself in, preferring old mixes over newer better finetunes, is neither new nor unique - given enough time, unless a better model drops before, I'm sure someone will release a mix you will be happy with. That's just how model timelines go.

>You have literally never used a booru model have you
It's just one more thing to knock you on, it's not a true 1:1 comparison.

I won't dog on you for preferring slopmix styles but you shouldn't be surprised when very simple prompts don't "look good" on non mixed models. Again slopmixes are unironically predicated on being easier to use at the expense of a stronger "default" (read: slop) style.
>>
>108079093
nice llm reply shill
>>
>>108079116
It is a little humorous considering this same kind of conversation took place during the early days of Illust, but unfortunately you are incorrect
>>
File: 1740104729141163.png (1.78 MB, 1312x1568)
>>108079005
>>
>>108078944
>dongload
>>
>>108078928
Plastic ears
>>
https://github.com/sdbds/ACE-Step-1.5-for-windows/tree/qinglong?tab=readme-ov-file#-installation

acestep 1.5 with cover functionality, get the portable zip, like comfyui portable.
>>
File: o_00197_.png (1.55 MB, 1280x768)
>>
I don't think cutting out artist styles is a good idea but I don't get why people care about a model having too much of a specific look. Every single model does that, even SaaS slop like NBP that probably trained on everything. I think the future is going to be a bunch of artists training their own individual finetunes to be nothing but their own style rather than there being a single "do everything" model. At least once the current generation of anti-AI luddites die off.
>>
>>108079264
>native Comfy nodes are still crap
>kana112233/ComfyUI-kaola-ace-step doesn't work
Might as well try this.
>>
>>108079276
>I don't get why people care about a model having too much of a specific look
it makes the model less interesting. predictable.
>>
>>108079300
But all AI models do this like I said. Complaining about it seems a little pointless and it's probably going to become the norm in the future because copying someone else's style is even less interesting.
>>
>>108079276
Okay, catjak.
>>
uh oh meltie incoming
>>
>>108079295
try it, there is a shitload more functionality than the comfy workflow. also not that large (zip)
>>
>>108079351
It isn't but it's downloading all safetensors again. Fucking hell.
>>
>>108079361
it's worth it for the cover/repaint options, pretty funny even though im still figuring it out.
>>
File: KekstoneDoesItAgain.jpg (3.75 MB, 2496x1664)
>>
>>108079320
some have less a default style than others, i dont get why youd want more of a specific default look even if you assume its inevitable, which i disagree with
>>
>antichroma schizo again
>>
>>108079005
how do these models handle the (actually a lot of) older Booru artists who never published any work at even 1 megapixel? I.e. the native resolution of their entire body of work is below that
>>
>>108079320
Emulating and combining multiple artists' styles will ALWAYS be more interesting than relying on a model's default look kek
>>
File: 1709947290459.png (12 KB, 715x174)
>>108079371
More interested in picrel there but I hate Gradio and don't have patience to prepare an actual audio dataset.
>>
File: 1755387787780410.jpg (274 KB, 1179x1627)
>pulled
>ModuleNotFoundError: No module named 'comfy_aimdo'
>>
>>108079399
update the python deps retard
>>
File: laughing oiran.jpg (57 KB, 852x480)
>>108079399
>he didnt update requirements.txt
>>
>>108079379
It's less about wanting it to happen and more about it being inevitable. If even Google can't stop their model from being easily detectable after its gens get spammed billions of times I don't get why people expect some random finetuner will figure it out.
>>108079390
That's what model merging is for. Combining artist styles from just prompts is just a band-aid for our current era where it's still really difficult to train full models without being rich and hoarding terabytes of data.
>>
>>108079374
I would fund this
>>
>>108079393
>Setting constrained decoding max_duration to 240s based on GPU config (tier: tier5)
Well shit. In Comfy I tried to do 480s on 16GB and oom'd, if my limit is this low it isn't happy news.
>>
>>108079414
Google actually has an incentive to make their outputs homogenized so if they are brought to court they can easily say "the average person would know that's a gen because all our gens look the same".
>>108079414
>That's what model merging is for.
No, models like NoobAI or Anima are adept at combining artists' styles BECAUSE they do not suffer from the same default look as something like WAI. I'm having trouble understanding what you're trying to say...
>>
>>108079371
also might try this:

https://www.reddit.com/r/StableDiffusion/comments/1qxs5qv/acestep_15_full_feature_support_for_comfyui_edit/
>>
File: o_00203_.png (897 KB, 1280x768)
>>
File: 1761354563519452.jpg (102 KB, 1004x1235)
>>108079401
>>108079408
>had to run the updaters multiple times
>and paste in three separate strings before it stopped complaining
>>
>>108079282
>am I supposed to use natural language or not?
nah its not needed, just an added benefit if youd like to do it with spatial prompting and whatnot
>>
>>108079447
>Google actually has an incentive to make their outputs homogenized so if they are brought to court they can easily say "the average person would know that's a gen because all our gens look the same".
Fair enough but I can usually tell which base model is being used for gens no matter how diverse it supposedly is at different styles so I don't think anyone has figured this out yet.
>I'm having trouble understanding what you're trying to say...
Because you're only looking at it right now. At a certain point when training becomes less demanding and costly, models aren't going to be focusing on doing everything when there will be hundreds of other finetunes doing the same thing. It's just speculation but I doubt we'll still be in this state where you need huge amounts of data and GPU power to train an individual style with the same level of depth as a full model instead of just a lora.
>>
File: z-turbo_00011_.png (1.82 MB, 1536x1152)
>>
>>108079276
>I don't get why people care about a model having too much of a specific look
because it's more fun to play around with different styles?
>>108079414
>That's what model merging is for.
That's more effort for less flexibility. Better to have one versatile model

also something you might not be getting here: these models need as much data as they can get, in order to generalize. only training on one artist will make for a very overfitted piece of shit. the model can get better at one style by mastering other styles and being able to handle as many perspectives and scenarios as possible. this gives it the capability to be creative, because it can pull from its deep knowledge and better understand the style you're going for, as it can relate it to other styles
>>
>>108079569
IMO dedicated loras are always better at capturing styles than native even if a model is "really good" at it. And it sounds like you'd agree with that. That's probably why I don't follow your speculation I think.
>models aren't going to be focusing on doing everything
I'd classify "everything" as realism AND anime as opposed to just all-of-booru. Like, since 1.5 we've had a separation of models for realism and models for anime. Sure, some of the realism models are decent at illustrations but I cannot recall any that were REALLY good at both realism and anime at the same time. We are already at the point of hundreds of finetuners doing basically the same thing, no?
>>
>>108079569
>but I doubt we'll still be in this state where you need huge amounts of data and GPU power to train an individual style with the same level of depth as a full model instead of just a lora
you can train XL lora on a shitcard already tho?
>>
cozy levels of the bread off the charts
>>
Playing with Ace step gradio version right now. It fucks up the lyrics a lot. Not at all like the examples you guys posted here. Is it just bad seeds?
>>
>>108079414
>about it being inevitable.
It's really not unless the future is comprised solely of training on the outputs of other models. Thankfully based china proved you don't need to do this to get a good small model.
>>
File: 968.png (2.64 MB, 1136x1408)
>>
>>108079654
are you using any of the speedup options like flash or sage-attention?
>>
>>108079659
those disgusting green/purple smudges make me irrationally angry
>>
>>108079667
What smudges?
>>
File: ComfyUI_temp_optta_00178_.png (3.94 MB, 1824x1248)
I'm gonna slop out
https://files.catbox.moe/440v8o.png
>>
>>108079659
i dont like cats, stop posting them
>>
>>108079667
Could be a monitor issue?
>>
File: ComfyUI_temp_optta_00169_.png (3.3 MB, 1824x1248)
https://files.catbox.moe/wrnvcg.png
>>
File: 0700.png (2.8 MB, 1136x1408)
>>108079674
enhanced saturation/vibrance so you can see, they are mostly visible in the white of the package, I will generate some polystyrene so its more obvious too
>>
>>108079687
Can't see anything.
>>
File: ComfyUI_temp_optta_00159_.jpg (353 KB, 2880x1632)
>>108079659
I like cats, please post more of them
https://files.catbox.moe/6yh6c2.png
>>
File: ComfyUI_temp_optta_00156_.png (3.72 MB, 1824x1248)
https://files.catbox.moe/e7t411.png
>>
File: ComfyUI_temp_optta_00150_.png (3.85 MB, 1824x1248)
https://files.catbox.moe/qb1tzd.png
>>
>>108078691
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
how did these end up in the OP again?
>>
>>108079662
I don't think so. There don't seem to be any speed-up options that I can see in the setup file, nor do I see anything mentioned in the terminal on load-up. i'm not using any --switches
Workings of the gradio app seem a little obscure though
>>
https://files.catbox.moe/6rbcoj.png
>>
File: 83156959.png (2.93 MB, 1536x1344)
>>
File: 2792.png (3.63 MB, 2047x1024)
>>
File: ComfyUI_temp_optta_00134_.png (2.68 MB, 1824x1248)
>>108079722
helps with engagement; nothing like some ragebait to put some life into the thread
https://files.catbox.moe/nv5tpn.png
>>
>>108079742
What's the difference in prompt or settings between left and right?
>>
You are a tiger mother. Your son is the user. He is has no job, and hasn't applied for work in months. He owns a computer but has never been on a date. Your task is to honor your ancestors by producing grand children through him, your sole heir. He likes to be called "anon".
>>
File: 1753588181915869.jpg (643 KB, 1344x1728)
I want anima trained on 1024px. 512px is hurting details. But at least it's clear that it can do nice composition.
>>
File: ComfyUI_temp_optta_00125_.png (3.35 MB, 1824x1248)
https://files.catbox.moe/dbyur7.png
>>
>>108079744
Don't engage with tran, comfy, debo and trani
>>
>He is has no job
>>
>>108079654
Depends on BPM, duration of song, etc... when they are very wrong you miss a lot of lyrics, but it will get most lyrics right if they're okay. Try keeping duration at -1 and playing with BPM (small 5 increment/decrements).
>>
>>108079762
same image but with vibrance/saturation enhanced on photoshop to make the smudges more obvious
>>
new comfyui bug. the bouncing asset panel image. It looks like it "wants attention". This is when the thumbnail is smallest.

it sometimes only does text.

And it isn't consistently present.
>>
File: ANIMA_P___00031_.png (1.64 MB, 1024x1024)
anima. prompting is hard.
>>
File: o_00204_.png (1.13 MB, 896x1152)
https://voca.ro/1kXiYH5IX4ru
took 20 minutes
>>
I can use local generation for most things, but I cannot create a 1080p 10 second image to video using WAN 2.5. My only option is using Civitai. However, that would cost 3000 buzz each try.
I heard you can get 400 buzz a day with the daily challenge. I don't give a crap about Civitai nor the challenge. I just want the 400 buzz a day. Can I use any model that I generate locally (like Z Image Turbo) and submit 20 trash images to the daily challenge to collect my buzz?
>>
>>108079727
what about duration? supposedly leaving it at -1 instead of setting a specific duration helps with lyrics
>>
File: 02033-2598046212.png (1.1 MB, 896x1152)
>>108079777
thanks, will play with bpm
Jesus trips of righteousness
>>
kill ani
>>
>>108079838
On the gradio app 4 minutes is max. This is what I used a lot but also -1 doesn't really improve things much for me.
>>
File: 1764470866879104.jpg (536 KB, 1728x1344)
>knee sticking out
nice detail
>>
>>108079883
nice arms retard
>>
https://www.reddit.com/r/StableDiffusion/comments/1qxs5qv/acestep_15_full_feature_support_for_comfyui_edit/

try the edit or cover workflow + nodes, pretty neat
>>
>>108079687
i already told you once, i don't like cats, stop posting them
>>
File: ANIMA_P___00041_.png (1.65 MB, 1024x1024)
>>108079796
>>
>>108079883
I like the optical illusion. notice the thing that's a roof... or is it a hallway?
>>
>>108079933
stop making me horny
>>
>>108079888
checked.

Still don't see that amputee lora.
>>
File: ComfyUI_10315.jpg (3.11 MB, 1440x2160)
What's wrong with cats?
>>
File: Anima_00008+9.png (3.1 MB, 2560x1600)
@mapyuhin za puremiyamu is surprisingly effective despite only having like 80 pics on Danbooru, or 123 on Gelbooru. (But then, it also works with Wai.) We need to go obscurer.
>>
File: ANIMA_P___00044_.png (1.56 MB, 1024x1024)
>>108079949
>>
>>108079964
somehow sakimichan didn't work at all last time I tried on Anima, despite having a large number of images.
>>
acestep 1.5 in comfy with the aio workflow template

https://files.catbox.moe/lvzcas.mp3
>>
File: ANIMA_P___00045_.png (1.61 MB, 1024x1024)
>>108079977


