/g/ - Technology






/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101664954 & >>101657582

►News
>(07/31) Google releases Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T
>(07/24) Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: ComfyUI_00039_.png (887 KB, 1024x1024)
►Recent Highlights from the Previous Thread: >>101664954

--Maid adds text-to-speech support with babylon.cpp library: >>101667393 >>101667490 >>101667471 >>101667537
--Largestral leads in ERP capabilities poll, followed by Command R+ and Mistral Nemo: >>101665997 >>101667532 >>101667594 >>101667598 >>101667624 >>101667705 >>101668093 >>101668248 >>101668302
--Gemma Scope and Claude analysis: >>101664998 >>101665132 >>101665157
--Gemma 2 2B and 27B generate enjoyable sibling stories: >>101672105 >>101672164
--Flux 12b model discussion, image generation, and GPU limitations: >>101669149 >>101669894 >>101670742 >>101670797 >>101672266 >>101672295 >>101671128 >>101671179 >>101671327 >>101671190 >>101671594 >>101671637 >>101671639 >>101671657 >>101671191 >>101671195 >>101671267
--FLUX.1-dev image model discussion and testing: >>101668155 >>101668479 >>101668569 >>101669064 >>101669193 >>101669242 >>101668789 >>101668964
--Clarification on static and imatrix quants: >>101665702 >>101665764 >>101669793 >>101669861
--BitNet model revival and skepticism about scalability: >>101668170 >>101668751 >>101668920 >>101668984 >>101669486 >>101669525 >>101669603
--Anon discusses running AI models with q8 or 8bpw and shares their setup upgrade plans: >>101667179 >>101667212 >>101667275 >>101667318 >>101667443 >>101668007
--Vulkan backend for llama.cpp and mixed GPU environment: >>101671136 >>101671339 >>101671402
--Image gen model can handle spatial relationships and some explicit content: >>101670204 >>101670256 >>101670640
--Black Forest Labs' new model compared to DALL-E and Midjourney: >>101671299 >>101671362 >>101671398 >>101671566
--Schnell models are lighter on resources than dev models: >>101670815 >>101670872
--Anon asks about torchchat for local LLM deployment: >>101669054 >>101669086
--Miku (free space): >>101669042 >>101670071 >>101671643 >>101671941 >>101672007 >>101673018 >>101673187 >>101673237 >>101673246

►Recent Highlight Posts from the Previous Thread: >>101664959
>>
>>101673831
Heyyy it's my pic, yayy
>>
>>101673831
You missed a couple of Mikus, recapAnon.
>>
Someone should make a "I don't want to play with you anymore" meme for this.
>>
we should just ban all shitters that run models lower than 70b, heck, running Command R+ locally is a requirement to be considered white
>>
>>101674116
Lergestral is the meta, old man. Are you living under a rock?
>>
>>101674116
I propose we merge /lmg/ with /ldg/. Anyone with less than 24GB VRAM can go back to /aicg/ or make their own /smg/.
I don't care which, just not here.
>>
>>101674142
>censored
no thanks, how else will I generate BBC gangbang cuckolding stories, am I right lmgsisters?
>>
File: output_000000002.png (707 KB, 1024x1024)
>>
File: 1692233388587013.png (161 KB, 303x262)
smol languag modls
>>
>>101674193
BIGSUNE NIGGU
>>
File: output_000000004.png (1.08 MB, 1024x1024)
>>
File: output_000000011.png (969 KB, 1024x1024)
>>
File: output_000000022.png (1.32 MB, 1024x1024)
Flux is pretty good
I'm impressed
>>
>>101674150
I propose that you fuck yourself rectally with broken glass, along with anyone else who has a superiority complex associated with their VRAM. And no, before you imply otherwise; it's not the fact that you've got more of it that bothers me. It's your completely unnecessary attitude that does.
>>
How much VRAM do I need to make good models? Should I buy a 3090?
>>
>>101674354
What... does that mean?
Training from scratch? Stop. You won't.
Finetuning? Use cloud compute or buy a dedicated server rack, one 3090 is nothing
Loras? Might need multiple 3090s for bigger models/loras (or a lot of patience and ram)
Merging? Just a bit of ram iirc
>>
>>101674345
Touched a nerve there? No one wants to ban vramlets due to a superiority complex.
If anything the problem is some of you have this inferiority complex and drown out normal discussion with subjective opinions and stupid claims like fotm 3b model is better than 100b models as a coping mechanism.
Vramlets on average are also not as invested in this hobby and only come here for tech support and low effort erp which attract sloptuner shills like flies.
All our problems would be solved if you just had your Nemo vs Stheno discussions elsewhere.
>>
Pretty okay.
>>
>>101674477
>noo people are talking about local models I don't use in the local models general, please ban them janitors
this general is always filled with off-topic posts, the fact that you choose to complain about people that actually discuss local llms means that you're one of those spammers and you should fuck off instead
>>
>>101673831
>Largestral leads in ERP capabilities poll, followed by Command R+ and Mistral Nemo
How is nemo better than any 70b? No Miqu or llama 70b?
>>
>>101674477
>you're one of those spammers and you should fuck off instead
This, get out of here with your intellectually dishonest bullshit you jannie wannabe faggot.
>>
>>101674689
I guaran-fucking-tee you if all vramlets were banned the shitposting would come to an immediate end.
>>
>>101674718
It's creative.
>>
File: 1708811302667.png (522 KB, 774x776)
>>101674740
Imagine a thread without Americans...
>>
>>101674768
"household" not surprising knowing jeets live by groups of 30 in the same house.
>>
File: dogolie2.jpg (259 KB, 1024x1024)
Yeah, flux is pretty good in terms of controllability and consistency. A finetune would be huge.
>>
>zucc releases 400b
>suddenly talks about AI not being profitable from every jew outlet in the universe
I doubt 400b is close to SOTA, but he is signaling that he can wipe out any big AI player by doing open source gibs, probably blackmail going on in the background also. Big money not gonna buy into this situation ofc.
Local won basically, but we all gonna lose anyway.
>>
File: 1713302348393689.png (506 KB, 898x890)
https://x.com/nisten/status/1818529201231688139
Bitnet bros?
>>
>>101674740
>>101674477
Thing is unless you want to move to a Discord where you only approve people upon proof of not being a vramlet, you're just gonna have to deal with it and seethe, cause you're not a janny you don't decide who can or can't post here.
>>
>>101674795
What's cheaper for some company that wants to use AI, paying openai for api use or renting space to run a 405b?
>>
>>101674795
>>suddenly talks about AI not being profitable from every jew outlet in the universe
Nah, they've been saying this for like 2 years
I'm still waiting for the collapse that will kill us all and delete mistral large from my computer
>>
look guys if you don't have given nvidia at least 80 thousand dollars a month then you are a vramlet and should be killed and raped alongside the rest of your family you poorfag fucks
>>
>>101674795
>>101674803
>>101674825
>>101674899
This, but unironically.
>>
Could some flux friend help me with this error in my ComfyUI?
module 'torch' has no attribute 'float8_e4m3fn'
>>
>>101674116
Yes, this, seriously and unironicaly. Most consumer mobos have ram limit of 128gb. You can buy 128gb of ram for $275, which is not much. You don't even need a GPU. I'm not an elitist. If you fail to do something as basic as that, you must be living below 20°N and have a skin browner than #C4A484. In that case I wouldn't want to hear your opinion about ANYTHING. No politics, no anime, no llms. Fucking subhuman niggers.
>>
>>101674930
try updating pytorch, the float8 type is relatively new
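If you want a sanity check before touching anything, something like this (run in the same python env ComfyUI uses) tells you whether your torch even has the dtype - the fp8 types only landed around torch 2.1, iirc:

import torch
print(torch.__version__)                 # anything much older than ~2.1 won't have fp8 dtypes
print(hasattr(torch, "float8_e4m3fn"))   # False -> upgrade torch and the error should go away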
>>
>>101674986
People told me having 4 sticks on my am5 would run slower since it could be unstable at higher speeds, so I only have 2x48. Never tried it out with more though.
>>
dude just run all models at 0.4t/s or sell your asshole to jensen huang lmao
>>
>>101675027
>sell your asshole to jensen huang
I would but they wont release the 5000 series
>>
File: ayy.jpg (64 KB, 933x928)
>load the model
>rp for a few turns, surprisingly good responses, no repetitions
>have I finally found good settings for Nemo?
>model seems to randomly put [/INST] in the middle of responses
>suspicious
>look up what is wrong
>I mistakenly loaded Stheno instead of Nemo
ayy lmao
>>
>>101675036
go to sleep sao it's like 1am
>>
>>101675005
Yeah, it's quite unstable at base clocks. Just loosen the timings without decreasing MT/s and it will run without problems.
>>
>>101675036
hi sao
>>
>>101675035
>consumer cards for dumb goyim
ngmi
>>
File: test_flux_distilled.png (1.02 MB, 1024x1024)
>100%|| 4/4 [18:59<00:00, 284.77s/it]

Flux spilling to RAM isn't a good experience...
>>
>>101675053
Oh okay, so I could get 4x48gb and still run 6000+?
>>
qlora-pipe anon here. I think I'll try making a pipeline parallel training script for diffusion models. I've been wanting to do this for a while, and now flux basically makes it necessary. Should be able to train a lora on 2x3090 with it.
>>
>>101675027
I'm fine with .4, mistral large is worth the wait.
>>
>>101675076
Depends on mobo and cpu, if they both can handle 192gb, then sure.
>>
>>101675099
Happy feelings
>>
>>101675109
well i'm not so i've already sold my house and cut off my cock so that i could offer it all to my jewish overlords and my god jensen huang for an a100 cluster
>>
>>101675099
make a bitnet one too
>>
Will people actually make their own fanmade anime when video exists and will it be good or is it all just a meme?
>>
>>101675050
>>101675054
I've yet to see working settings for Nemo, I tested every single json file posted here over the last week, and it still doesn't work like it should. Stheno just werks.

Maybe finetunes will be better.
>>
So what's the recommendation for Nemo currently?

I heard Mini Magnum is meant to be the best?
>>
>>101675174
buy an ad
>>
>>101675070
>six fingers
lmao these are the legendary "really good hands" you shills keep talking about?
>>101673666
>>
>>101675174
DRY.
>>
>>101675172
sure, in 10 years
>>
>>101675197
What would take 10 years about it? They would probably run out of video data less than 4-5 years from now, if that
>>
>>101675186
I don't know about mini magnum, but dory is alright
>>
File: Miku Basada.png (1.06 MB, 1024x1024)
Finally I got this to work.
>>
>>101675186
i prefer mini magnum
>>
>>101675194
It's the float8 quant, so you can't really blame the model... much.
>>
>>101675196
Even if you get it to not repeat phrases it's still going to end up getting boring after a while. I assume people who like it just stop after 4k context for whatever reason and move on to something else. Bigger models are still way better overall for long interactions.
>>
>>101675257
I'm using it for 30k long chats so not really
>>
>>101675213
Hardware must be way better for video, also models must have time to mature. We had LLMs for like 6 years and they still are trash, video is way more complicated than text.
Now that I've written it I see that I was mistaken. 15 years may be a better estimate if we're talking about studio tier anime.
>>
>>101675276
And that really goes better than your 30k long chats with 70b+ models?
>>
>>101675174
Try Celeste.
It uses some of the same dataset as Stheno.
>>
>>101675421
can c.ai do this?
>>
>>101675447
only if you are a huge mikufaggot
>>
File: file.png (1.09 MB, 1024x1024)
>>
>>101675421
cringe text and art 3/10
>>
>>101675471
you are seething and coping
>>
>>101671339
you may get better performance if you use the RPC backend. build llama.cpp with CUDA, and the RPC server with HIP or the other way around, and this will allow you to use CUDA and HIP at the same time.
>>
https://www.youtube.com/watch?v=22dquW4EYsQ&t=1506
This video shows how normalfags see AI. Main points for those who don't want to watch:
>Useless for any practical tasks
>Lame personalities
>Boring jokes
>"Spicy" personalities are too tame
Corpos only have themselves to blame.
>>
>>101675376
>trained on nemo instruct, not base
>trained at 8k context
>trained on reddit writing prompts
I don't know dude, that doesn't sound like it would work right
>>
What will AI be like in 15 years?
>>
>>101675471
Not cringe, just a terminal mental illness.
>>
>>101674804
100% this. So much this. We can't do anything without jannies. It is only thanks to their hard work and dedication that we don't have people discussing cloud shit in here and drinking piss for free proxies, right?
Just look at how effective they are at dealing with this faggot >101675421
>>101675548
Don't question it. Our expert finetuners know what they're doing.
>>
>>
>>101675246
>>101675217
Do you have any nemo conversation/instruct templates I can use?
>>
File: a5iwjwlkhwfd1.jpg (88 KB, 1024x694)
>>101675447
i like to check the cai leddit now and then and watch things get worse and worse. its interesting to watch people cling to it knowing it went to shit a long time ago, yet somehow continues to get even worse
https://old.reddit.com/r/CharacterAI/comments/1egv2fp/more_censorship_on_cai/
>>
File: file.png (2.02 MB, 1024x1024)
Fuck this is fun but I only got 12gb, it's so over.
>>
File: Miku Online.png (1008 KB, 1024x1024)
Sneak peek of my new online game, Miku Online, powered by MythoMax
>>
>>101676110
>>101676119
what are the requirements for this?
>>
>>101675492
>RPC backend
Oh yeah, somehow forgot about it releasing.
When I get off my ass and shuffle cards around I'll give it a shot.
>>
Well, I'm bored of it already. How do we finetune flux?
>>
>>101676119
What prompt did you use for this?
>>
>>101676119
I thought it was a real game.
>>
File: out-0.png (1.14 MB, 1024x1024)
>>101676240
>Screenshot of a online 2D sidescroller game. a Hatsune Miku cat girl character is on the center of the Green Hill Zone map, wearing gold armor, the text "Anonymous" can be seen right below it. The chat box on the bottom corner has the text "[Anonymous] We are so back!" written on it, below multiple lines of other players talking.

It took me 4 tries though.
>>
>>101676134
16GB~24GB vram
>>
File: Untitled.png (239 KB, 898x954)
Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data
https://arxiv.org/abs/2404.01413
>The proliferation of generative models, combined with pretraining on web-scale data, raises a timely question: what happens when these models are trained on their own generated outputs? Recent investigations into model-data feedback loops proposed that such loops would lead to a phenomenon termed model collapse, under which performance progressively degrades with each model-data feedback iteration until fitted models become useless. However, those studies largely assumed that new data replace old data over time, where an arguably more realistic assumption is that data accumulate over time. In this paper, we ask: what effect does accumulating data have on model collapse? We empirically study this question by pretraining sequences of language models on text corpora. We confirm that replacing the original real data by each generation's synthetic data does indeed tend towards model collapse, then demonstrate that accumulating the successive generations of synthetic data alongside the original real data avoids model collapse; these results hold across a range of model sizes, architectures, and hyperparameters. We obtain similar results for deep generative models on other types of real data: diffusion models for molecule conformation generation and variational autoencoders for image generation. To understand why accumulating data can avoid model collapse, we use an analytically tractable framework introduced by prior work in which a sequence of linear models are fit to the previous models' outputs. Previous work used this framework to show that if data are replaced, the test error increases with the number of model-fitting iterations; we extend this argument to prove that if data instead accumulate, the test error has a finite upper bound independent of the number of iterations, meaning model collapse no longer occurs.
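The linear-model framework they mention is easy to poke at yourself. Here's a throwaway numpy sketch (mine, not the paper's code): fit a slope to data, generate synthetic data from the fit, then either replace the dataset each generation or keep accumulating it.

# toy replace-vs-accumulate experiment, not the paper's actual setup
import numpy as np

rng = np.random.default_rng(0)
w_true, n, noise, gens = 2.0, 100, 1.0, 30

def fit(x, y):                  # least-squares slope through the origin
    return (x @ y) / (x @ x)

x0 = rng.normal(size=n)
y0 = w_true * x0 + noise * rng.normal(size=n)   # generation 0: "real" data

for mode in ("replace", "accumulate"):
    xs, ys = x0.copy(), y0.copy()
    w_hat = fit(xs, ys)
    for _ in range(gens):
        x_new = rng.normal(size=n)
        y_new = w_hat * x_new + noise * rng.normal(size=n)   # synthetic data from the last fit
        if mode == "replace":
            xs, ys = x_new, y_new                            # old data thrown away
        else:
            xs = np.concatenate([xs, x_new])                 # old data kept around
            ys = np.concatenate([ys, y_new])
        w_hat = fit(xs, ys)
    print(mode, "final slope error:", abs(w_hat - w_true))

Replace drifts like a random walk; accumulate stays pinned near the true slope because the original real points never leave the fit, which is the paper's point in one screen of code.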
>>
>>101676385
not the regular papers anon, but i saw this mentioned in /ldg/ and don't remember seeing it here
>>
>>101676306
Holy fucking kino
>>
>>101676306
DALL-E bros? Our response?
>>
File: OIG2.jpg (194 KB, 1024x1024)
>>101676450
>This is the best D3 could do
It's so fucking over
>>
File: ComfyUI_00092_.png (1.11 MB, 1024x1024)
>Flux isn't overbaked on 2B like Dalle is so I can actually prompt my waifu consistently now without it looking like a 2B clone
I'M SO BACK.

Also feels like the model benefits from boomer prompting. I had to explain in great detail how her hair is covering one eye and that she's not winking in order for it to (consistently) gen her this way, or else it'd have her winking half the time instead.
>>
File: out-0.png (884 KB, 1024x1024)
>>101676306
A couple tries is fine by me. I'm trying to make images I can possibly chop up for sprite tilesets and I've been dealing with SD's shitty clip prompts for so long that asking for what I want in normal human prose is wild and crazy and counterintuitive now.
>>
File: 4586768-.webm (550 KB, 960x960)
>>101674795
>Speaking on the former Facebook's Q2 earnings call, founder Mark Zuckerberg explained Meta is "planning for the compute clusters and data we'll need for the next several years" and assumes that building its next model – Llama 4 – will likely need an "amount of compute … almost 10x more than what we used to train Llama 3." The latter model is believed to have required 16,000 GPUs. Zuck enthused that Meta aims for Llama 4 "to be the most advanced [model] in the industry next year."
>>
>>101676110
Same boat.
Though it looks like they won't let me see it without giving them a blood sample so fuggit.
>>
File: 1697999955025257.png (655 KB, 1024x1024)
>>101676450
its over flux won
>>
>>101676577
Did Zuck not hear about the net of 1.58 bits?
>>
File: vramlets btfo.png (1.01 MB, 1024x1024)
>>
>>101676604
Too risky. Moar parameters and moar tokens is tried and true path to success.
>>
>>101676616
>don't talk me
So close.
>>
>>101676616
prompt!
>>
>>101675492
NTA but coincidentally I just attempted to get l3 405b i1-iq2_m (130 gb) running on a 2x3090 ti + 128 gb ram PC and a macbook pro m1 with 64 gb ram. That's 48 gb of decent speed (3090 ti's), ~50 gb of ok speed (macbook), and the remaining 32 gb would be on ddr4.
It didn't work. Not sure why.

llama_new_context_with_model: n_ctx = 131072
llama_new_context_with_model: n_batch = 2048
llama_new_context_with_model: n_ubatch = 512
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base = 500000,0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: failed to allocate buffer for kv cache
llama_new_context_with_model: llama_kv_cache_init() failed for self-attention cache
llama_init_from_gpt_params: error: failed to create context with model './llama3.1-405b-i1-iq2_m.gguf'
main: error: unable to load model
>>
File: actually decent.jpg (117 KB, 612x1153)
Ok, I dunno how the fuck but cheers >>101675246

This is actually so much better than the normal Nemo instruct I was using... I'm pretty new though, can anyone assist me with the proper temps?

I'm also using the base Mistral instruct/conversation as I can't find any alternatives. It's super good but I know it can be refined a little
>>
So how much information should I be putting in prompts for the model to reference? I think there's something wrong with my config, but it feels like it's parroting chunks from it too often, ignoring instructions not to parrot exactly from the system prompt.
>>
Whats the best model you can run with 12GB of vram right now? I want to make the most of my 3080 Ti.
>>
>>101675548
Yeah, all 3 items apply to Stheno too.
>>
File: vramlets btfo 2.png (1.01 MB, 1024x1024)
>>101676635
>Screenshot (frame). Computer game. Visual Novel. A Hatsune Miku cat girl character sprite is in the center with a grumpy face. The text "[Hatsune] Huh, you have less than 16GB VRAM? Ewww, don't talk to me." can be seen in the chat box at the bottom left corner of the screen.

picrel one is:
>Screenshot (frame). Computer game. Visual Novel. Scholl in the background. A Hatsune Miku cat girl character sprite is in the center with a grumpy face. The text "[Hatsune] Huh, you have less than 24GB VRAM?! Ewww!! Don't talk to me!" can be seen in the chat box at the bottom left corner of the screen.
>>
>>101669338
The problem is not getting it, the problem is using it. I don't know of a service that lets you rent a bunch of H100's anonymously.
>>
>>101676703
Holy fucking shit
>>
>>101676579
there's a fp8 version and doesn't require giving info
>>
>>101676703
man, this MOGS openai. how did they fall so far behind?
>>
>>101676802
their gpt-4o can generate images (check the announcement), but they still haven't release that feature. They're too afraid to ship fast.
and dalle3 was king for almost a year, which is not a small amount.
>>
Time for 48GB cards. Flux is only the beginning. 20B model soon.
>>
>>101676703
Those are so good that it's not even funny. It doesn't even use all that much VRAM compared to the results.
Stable Diffusion? More like Stable Confusion!
>>
File: 1693568656236151.png (865 KB, 1024x1024)
>>
These are amazing, but Miku is probably one of the easiest characters to generate considering the massive amounts of Miku content available to train on. The text stuff is pretty sweet though, I guess.
>>
Does this mean we're finally out of the two year SD 1.5 death spiral
>>
>>101676878
most people can't run flux
>>
File: ComfyUI_00091_.png (1.44 MB, 1024x1024)
>>101676703
>>
File: 1713969160389409.png (693 KB, 1024x1024)
Who needs controlnet QR monster when you have Flux?
>>
>>101676884
Why? Because it's 12 GB? Maybe we can quant it.
>>
>>101676884
3090 is $900 CAD. Less if you buy from Facebook Marketplace.
>>
>>101676898
>Because it's 12 GB?
It's 24GB, 12B
>>
>>101676903
So? Most people wouldn't want to spend that much just for an image model.
>>
>>101676904
Oops, yeah that's what I meant. Quant time.
>>
>>101676854
people are impressed because of the coherent text anon, not by the ability to generate miku
>>
File: vramlets btfo 3.png (1.02 MB, 1024x1024)
>Visual Novel. Classroom background. A big breasted Hatsune Miku sprite is in the center with a disgusted face and arms crossed under her breasts, looking at the shooing viewer. The text "[Hatsune] Huh, you have less than 48GB VRAM?! Ewww!! Don't talk to me!" can be seen in the chat box. The buttons to "Quick Save", "Quick Load", "Save", "Load", "Settings" can be seen in the bottom of the screen. The top right of the screen has the text "Day Time" with a sun.

This one was almost perfect if it wasn't for the "Load" "Load" instead of "Save" "Load", dang.
>>
needs quanting, do it bros
https://huggingface.co/Kijai/flux-fp8
>>
>>101676645
the lack of a more specific error message means it is likely the rpc server that failed the memory allocation, probably because it ran out of memory. try a lower cache size.
>>
>>101676923
oh shit, i was not expecting such a big model drop for image gen
>>
>>101676923
>This one was almost perfect if it wasn't for the "Load" "Load" instead of "Save" "Load", dang.
inpaint
>>
>>101676913
Image models don't quantize well, even 8bit has a noticeable quality drop from FP16. Below 8 is pretty much totally incoherent
>>
File: file.png (95 KB, 228x326)
>>101676854
Yeah, I can't get it to gen Teto :(
But at least it gets the color of hair/eye correct
>>
File: kek.png (1.17 MB, 1024x1024)
>>101676889
>>
>>101676940
Thanks. Yeah I noticed one of the RPC servers actually printed a cuda OOM message. Also realized I forgot to pass a -c flag so it tried to load with 128k context.
>>
>>101676987
now gen nipples and you'll see how shit the model is at them
>>
The grapes... they're kinda sour...
>>
>>101676922
Yeah, I get it.
>>
>>101676119
cool
>>
>>101676306
>>101676119
what did you make it with?
>>
>>101677033
flux
>>
>>101677033
flux-dev
>>
>>101676925
How worse are the gens with fp8 compared to the original?
>>
File: ComfyUI_00091_.png (992 KB, 1024x1024)
>>101676991
>captcha: TRVV
>>
>>101676981
Grab yourself a lora, there's like... a trillion of them
>>
>>101677097
>day 1 flux teto loras
damn that's fast
>>
>>101677097
dumb ass retard alert
>>
File: would you.png (894 KB, 1024x1024)
>>
After using fp16 and fp8, I feel like fp8 really is worse. The quality isn't bad on it, but it includes less of the things in my prompt.
>>
>>101677126
Oh, I'd fixime, alright.
>>
>>101677138
It's the same thing with textgen.
>>
>>101677138
post examples
>>
>>101677147
A man of culture.
>>
File: ComfyUI_01458_.jpg (508 KB, 1728x1152)
>>101677138
>The quality isn't bad on it, but it includes less of the things in my prompt.
this but I am a lowly single gpu monkey

>>101676925
there is already one built in why use this shit?
>>
>>101677175
Maybe the built in quanter sucks ass and that's why you're getting poor results?
>>
>trani post
Disregarded
>>
>>101676990 (me)
I just realized that llama-cli is taking a bunch of GPU VRAM as well as the rpc-server. It keeps OOMing even though I'm only trying to load 75 layers with 4096 context. Maybe I should do CUDA_VISIBLE_DEVICES= to the llama-cli command. The docs do not mention this. In fact the docs moronically assume I am running the llama-cli command on an unrelated machine not part of the rpc-server cluster. Why the fuck would they assume that.
>>
>>101677223
I'm not comparing code, I implemented my own model today and I want a break

this thread has the roaming schizo too?
>>
File: 39__00002_.png (984 KB, 1024x1024)
it's official, cloud is dead and local won
>>
>>101677003
i just bought a 3080 Ti 12gb and I want to fucking headblast toaster bath right now.
>>
https://www.youtube.com/watch?v=Y0sLyMD5mV8

This nigga is saying you can convert any model weights to 1.58 bits, finetune it afterwards and it would work. Someone must test this and compare the performance with the original model, if it is similar we can have 7x less size at 70b.
>>
>>101677329
Dalle-4 release soon. Twice the safety, twice the resolution.
>>
>>101677329
>he doesn't know about gpt-4o image gen
>>
>>101676577
>petra, kling anon, and dr. evil anon were all the same person
The personality list is getting big...
>>
>>101677388
>3 months ago
why didn't he do it yet then
>>
>>101677388
but why do i have a feeling nobody is going to try it out...
>>
Oh nice, flux can do decent anime nudes if you corral it right.
>>
>>101677440
give catbox
>>
>>101677412
>>101677416
Why don't the retard slop tuners actually do something useful for once. This has great potential.
>>
File: vn.png (1.21 MB, 1024x1024)
77 CLIP tokens is just not enough for adding details.

>Visual Novel. 90s Anime. A nighttime background with sakura trees, and petals floating in the air. A young girl with long blonde hair, large green eyes, and a serious face is in the center, she is barely visible because of the darkness of the night, being only illuminated by the moon light, wearing a red dress with white sleeves and a blue ribbon in her hair. The girl is touching one of the trees. The text "----I'm a useless wizard who can only see projected dreams----" is displayed in the chat box at the bottom of the screen.
(I'm having a hard time trying to get the model to do any other lighting condition than "fully lit", maybe it is just impossible.)
>>
>>101677487
so the new thing is just that it has better text?
>>
>>101677266
Last time I tried it, I only ran rpc-server on the remote pc, and on the main one I ran llama-server normally, with --rpc flag and the IP of the other machine.
>>
File: sakura.png (1.24 MB, 1024x1024)
>>101677500
it's better than Dall-E 3 at almost everything
>>
>>101677522
what happens if you try a lewd teto dating sim
>>
>>101677522
>those hands
>>
>>101677522
Are we making da capo image packs now? Gens are looking really nice though
>>
>>101677522
How good will this stuff be when it's maxed out? Will it be like an app like paint everyone has on their computer? Come to think of it, why do people's computers have much power at all when it can be run from the cloud?
>>
>>101677388
>>101677416
>>101677453

Kye Gomez is a grifter
>>
>>101677572
>why do peoples computers have much power at all when it can be run from the cloud
Chromebooks exist and they suck dick to use as personal computers
>>
File: file.png (806 KB, 800x600)
>>101677559
lol, you noticed? yeah, i was just doing some experiments by trying to replicate the picrel cg.

>>101677572
>why do peoples computers have much power at all when it can be run from the cloud
privacy, maybe?
>>
File: 39__00020_.png (1.05 MB, 1024x1024)
>>101676981
Translation: Skill Issue
>>
>>101676925
Anyone use this? When I load it without the --fp8 commands, it still maxes out my 3090's memory with low vram mode. What's going on here?

Another weird thing is that I can load the fp8 weights with the --fp8 command, and it DOES actually take the expected amount of VRAM now.

Interestingly, it seems to follow my prompts more closely compared to loading fp16 in fp8. FP8 without the --fp8 flag seems to do better than with the --fp8 flag, and both are better than fp16 loaded with --fp8. So actually yeah I think whatever real-time quanting method Comfy is using might not be optimal.
>>
>>101677620
dang, good job, did you describe the hair?
>>
>>101677631
I will just wait for multi-GPU support.
Or spend all of my money on A6000.
>>
>>101677606
Will cloud for "AI" (GPU) related stuff eventually stop being the main way people use it, and instead it be replaced by local?
>>
File: ComfyUI_temp_qtpdf_00029_.png (1.03 MB, 1024x1024)
>>101677631
The full model runs in a single 3090, if it isn't connected to a screen.
>>
>>101677514
I think I got it past the memory limitations, but it barfed a "the provided PTX was compiled with an unsupported toolchain" at me and crashed. Tried upgrading the nvidia drivers to 555 and rebooted, so back to trying again. (Each attempt takes like 15 minutes.)

I assume llama-server is not compatible with the RPC stuff, as it crashed when I tried it. Would be nice if that worked, so I could do more than a single prediction before waiting 15 more minutes.
>>
>>101677660
Oh really? What resolution can you gen at before it OOMs?
>>
>>101677677
What resolutions does it support? Not much more than 1024 with FP16 everything.
>>
File: preset shill.png (1.02 MB, 1024x1024)
I saw a few people asking for *stral presets and mine was feeling pretty dialed-in, so I decided to share it. Also, I wrote an unnecessary rentry to go along with it.
https://rentry.org/stral_set
Mainly for largestral but should work fine with nemo et al.
Thank you for your time.
>>
Are the X posts about people easily making games in the future using AI a meme or is that really going to start to happen?
>>
>>101677731
Dang. I'm not sure what it supports. It'd be nice if we could do FHD.
>>
File: test_flux_distilled_3.png (972 KB, 1024x1024)
I tried this prompt >>101676703 with the fp8 schnell model... I can't say I liked it.
>>
>>101677821
Damn billions must buy VRAM...
>>
>>101677733
I'll check out what you wrote in a bit, did you test it with nemo at all? I might try it, though I've grown attached to large that I just cope with the slowness of using ram.
>>
What are the new popular models people are running, uncensored preferably? I've been gone for a bit and this stuff changes every day. Gemma 2 2B looks interesting but the
>prioritizing safety
part plus it being Google and all makes me think it's gimped to shit.
>>
>>101677888
I did very quickly just to verify it wasn't completely broken, and it seemed alright. I haven't done anything beyond that though.
Even if it stinks for it, it should at least be a good foundation to tweak off of, I hope.
>>
File: 1694745045780392.png (220 KB, 1570x780)
>>101677670 (me)
TAKE-OFF!
27 GB used on macbook (no idea why it didn't use more).
23.4 + 23.9 GB used on the 3090 ti's.
Not using any CPU RAM for some reason. Where did the remaining ~50 GB of model weights go?
Anyway, the prompt was (no formatting whatsoever) "Hello, my name is".
>>
>>101677937
Well I won't hold it against you if nemo still agrees with me saying U is element 82.
>>
File: 1721952375998090m.jpg (88 KB, 1024x1024)
Has anyone else tried grabbing the smallest, most retarded model you can find and making it do shit? Asking qwen 4b to give me a chicken skewer recipe has been really fun, fucker suggested I boil the chicken while frying it in a pan and said I should put on the marinade (which was about 50 ingredients) AFTER I had started cooking the chicken.
>>
>>101677670
>I assume llama-server is not compatible with the RPC stuff
it should be compatible with any example
>>
For some reason loading nemo in ooba causes "ValueError: Failed to load model from file".
And I can run the same model with kobold just fine. Weird.
>>
>>101677733
Does miqu use the mistral stuff too? Do you ever use that?
>>
Update: the FP8 model linked above actually IS the same as the FP8 quant that Comfy does natively, just a different type of FP8. Turns out there are two types of FP8 quants Comfy can do. I was using --fp8_e5m2, but --fp8_e4m3fn-unet is apparently the type the downloaded quant actually is. If you load the fp16 model with --fp8_e4m3fn-unet, the outputs are exactly the same as the uploaded FP8 quant when also loaded with --fp8_e4m3fn-unet.

And it seems to me that --fp8_e4m3fn-unet is better than --fp8_e5m2, at least for this model. But I haven't done a huge amount of gens to AB test against. At least for the few ones I've tried, it definitely feels like e4m3fn is better. So for now that's what I'm going to recommend.
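Makes sense once you look at what the two formats actually are. Quick check, assuming your torch is new enough to have the dtypes:

import torch
for dt in (torch.float8_e4m3fn, torch.float8_e5m2):
    fi = torch.finfo(dt)
    print(dt, "max:", fi.max, "eps:", fi.eps)
# e4m3fn: max ~448, 3 mantissa bits  -> more precision, less range
# e5m2:   max ~57344, 2 mantissa bits -> more range, less precision
# weights are mostly small numbers, so the extra mantissa bit in e4m3fn usually wins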
>>
File: tetoliteral.jpg (226 KB, 1024x1024)
>>101677639
Yeah, not perfect but its ability to understand the prompt makes up for it.
Dalle/Bing has a significantly harder time, picrel.
>>
ok flux is cool, but will fluxdiff run on my 24gb potato?
"FluxDiff is a text-to-video model that leverages past frames for conditioning, enabling the generation of infinite-length videos"
https://blackforestlabs.ai/up-next/
>>
>>101678115
rad pic
>>
>>101678115
Those drills will pierce my heart
>>
>>101678123
Buy A6000.
>>
>realize i wasnt offloading as many layers as possible
>70b at 1.3t/s now
oh this is nicer
>mistral large 0.7t/s
fffffffff
>>
File: 1664011689.png (1.28 MB, 1488x752)
which side are you on /lmg/?
>>
Anyone getting an error when trying to gen at 1920x1080 using fp8?

einops.EinopsError: Error while processing rearrange-reduction pattern "b c (h ph) (w pw) -> b (h w) (c ph pw)".
Input tensor shape: torch.Size([1, 16, 135, 240]). Additional info: {'ph': 2, 'pw': 2}.
Shape mismatch, can't divide axis of length 135 in chunks of 2
>>
>>101678061
Miqu uses almost the same prompt format, maybe you'll need to add a space before [INST] and [/INST] in the instruct sequences. Other than that, I'd expect it to work fine - I think most of the prompting is pretty model-agnostic, with the exception of the one bit about not waiting for {{user}} to respond that's in there because largestral is autistic.
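Written out as strings, this is the difference I mean (illustration from memory, double-check against your model's actual chat template before trusting it):

# not copied from any official template, just showing where the extra spaces go
largestral_turn = "[INST] {user_msg}[/INST] {bot_msg}</s>"
miqu_turn       = " [INST] {user_msg} [/INST] {bot_msg}</s>"   # space before [INST] and before [/INST]
print(miqu_turn.format(user_msg="hi", bot_msg="hello"))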
>>
Will T4 work instead of a 3090?
>>
>>101678172
It probably has to be a multiple of 8, 16, 32 or something like that
>>
>>101678115
>Dalle/Bing has a significantly harder time, picrel.
dalle on the api could probably do it, but yeah it'd be more effort
>>
Would you watch genned anime video, or would you get bored of it?
>>
>>101678159
White lives matter
>>
>>101678197
>implying it can do anime
every time realistic footage is in the dataset it's over
>>
File: miggu.png (2.48 MB, 1024x1024)
>>101677452
Sure, here's migu topless
https://files.catbox.moe/3pbilx.jpg
>>
>>101678285
Yeah, as I suspected, bad nipples.
>>
>>101678305
Yeah, that's inpainted a few times.
>>
>>101678159
Hate everyone, sex both of them
>>
>>101678159
miku is always the best
>>
>>101678216
Will people spend a lot of time watching video
>>
File: ComfyUI_temp_vvjht_00531_.png (921 KB, 1024x1024)
>>
>>101678358
considering the length of time video models are interesting to people, probably a few days then drop it
>>
>>101678396
pible
>>
I will donate to and shill for whoever finetunes this on anime. This is close to Dalle3 levels of imagen, but local.
>>
>>101678416
It does anime pretty well out of the box. We need pr0n finetunes to bankrupt jewish american porn industry. I'm sick and tired of it.
>>
File: Image.jpg (472 KB, 1344x768)
local bros this is the most comfiest image gen model i've ever used
>>
File: 39__00073_.png (869 KB, 1024x1024)
>>101678316
Unironically all it needs is a good fine-tune
>>101678316
Is this still the meta for inpainting on comfy? https://comfyanonymous.github.io/ComfyUI_examples/inpaint/
>>
>>101678524
I'm not sure, I grabbed comfy for the first time just to run flux. I actually generated it on my server then inpainted locally with automatic webui
>>
>>101678305
T5 is already stupid when it comes to naughty bits, which is going to make training it shitty
>>
>>101678591
damn didn't know that, fuck
>>
>>101678591
Actually just finetune T5 on some smut then finetune the model on the danbooru set after. Should work.
>>
>>101678596
what's worse is the t5xxl takes up more RAM with nothing really to show for it over the t5xl
>>
>>101678615
It has no visual component. It can't learn from images, only text. you have to fry the clip again and clip L is a piece of shit on its own to do that kind of lifting
>>
Is there information on the differences of t5xxl compared to clip so I can proompt it better?
>>
>>101678653
T5 is just a language model, no reason you can't finetune it. The image model itself then uses the T5 embeddings, so you'd just need to finetune it on images with generated captions afterwards. Unless I'm misunderstanding you.
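The mechanics really are just a normal HF training step. Rough sketch with t5-small standing in for the xxl encoder and a made-up one-line corpus; a real run would use T5's span-corruption objective, mask pad tokens in the labels, and so on:

import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tok = AutoTokenizer.from_pretrained("t5-small")                  # stand-in for the xxl encoder flux uses
model = T5ForConditionalGeneration.from_pretrained("t5-small")
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

batch = ["an example caption from whatever corpus you scraped"]   # hypothetical data
enc = tok(batch, return_tensors="pt", padding=True)
labels = tok(batch, return_tensors="pt", padding=True).input_ids  # lazy self-target just to show the step

loss = model(input_ids=enc.input_ids, attention_mask=enc.attention_mask, labels=labels).loss
loss.backward()
opt.step()
print(loss.item())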
>>
Just tried 2560x1440. Seems to go kind of schizo. Maybe 2048 is the most they trained with.
>>
>>101678709
Yes, read the official announcement:
https://blackforestlabs.ai/announcing-black-forest-labs/
>All FLUX.1 model variants support a diverse range of aspect ratios and resolutions in 0.1 and 2.0 megapixels, as shown in the following example.
>>
>>101678672
you prompt it like you are a nlp vlm. it has no specific styles (the funnest part imho) only generic ones like anime or cartoon. for some reason painting styles almost never proc
https://research.google/blog/exploring-transfer-learning-with-t5-the-text-to-text-transfer-transformer/?m=1
>>
Got a proxy that has gemini pro how will I know if the newer one is available
>>
>>101678409
people still post ai generated art though, that wont go away
>>
File: Untitled.png (730 KB, 720x1573)
AI-Assisted Generation of Difficult Math Questions
https://arxiv.org/abs/2407.21009
>Current LLM training positions mathematical reasoning as a core capability. With publicly available sources fully tapped, there is unmet demand for diverse and challenging math questions. Relying solely on human experts is both time-consuming and costly, while LLM-generated questions often lack the requisite diversity and difficulty. We present a design framework that combines the strengths of LLMs with a human-in-the-loop approach to generate a diverse array of challenging math questions. We leverage LLM metacognition skills [Didolkar et al., 2024] of a strong LLM to extract core "skills" from existing math datasets. These skills serve as the basis for generating novel and difficult questions by prompting the LLM with random pairs of core skills. The use of two different skills within each question makes finding such questions an "out of distribution" task for both LLMs and humans. Our pipeline employs LLMs to iteratively generate and refine questions and solutions through multiturn prompting. Human annotators then verify and further refine the questions, with their efficiency enhanced via further LLM interactions. Applying this pipeline on skills extracted from the MATH dataset [Hendrycks et al., 2021] resulted in MATH2 - a dataset of higher-quality math questions, as evidenced by: (a) Lower performance of all models on MATH2 than on MATH (b) Higher performance on MATH when using MATH2 questions as in-context examples. Although focused on mathematics, our methodology seems applicable to other domains requiring structured reasoning, and potentially as a component of scalable oversight. Also of interest is a striking relationship observed between models' performance on the new dataset: the success rate on MATH2 is the square on MATH, suggesting that successfully solving the question in MATH2 requires a nontrivial combination of two distinct math skills.
interesting
>>
>>101678694
booru prompts are tag lists. t5 is only nlp so you would either have to feed the booru tags into something that converts to nlp then train which could be a while then do the DiT model. It is possible but a gigantic pain in the ass. you would also need multi GPU support to train without quanting too. There is a method to use bitnet and resize back up but it's not really field tested.
>>
>>101678824
I'm very surprised nobody's done auto-captioning with prompting techniques (ranking, self-critic, etc) across the whole image set with some multimodal model yet.

Any local models good enough and censor free enough to do it?
>>
>>101678799
>people still post ai generated art though, that wont go away
it can run on pretty much all consumer compute so yeah people will use it. when video gen is controllable other than a single prompt it will be cool but I don't like my outputs when I use txt2vid models
>>
File: Untitled.jpg (68 KB, 372x447)
is l3 instruct the proper template for non-instruct 3.1? without instruct enabled, would i put my system prompt above the start header, or between the end header and #if system?
>>
>>101679229
>is l3 instruct the proper template for non-instruct 3.1?
No. That's for instruct. Just remove all the <|ble|> bits. I'd even say remove everything and just type your prompt there. I don't know why they add so much shit to something that simple.
>>
>>101679315
>remove all
i actually did try that and it seems ok, even alpaca worked fine. i wanted to make sure i was using the correct one though since the model can be weird sometimes. meta's hf page for the non-instruct model literally shows nothing specific. i hate these useless model cards so much
>>
>>101679412
>meta's hf page for the non-instruct model literally shows nothing specific
Non-instruct models don't use any chat template at all. No alpaca, no chatml, no llama3, nothing. If they even follow an instruction, it's because they can read past the noise of the template just like they can read past typos.
Only instruct models use chat templates.
>>
>>101678948
I just wonder what the main kinds of videos they make will be
>>
>>101678866
I am trying to get something going from the top civit images which makes the most sense. obviously trim the fat like "quality" tags but everyone would rather rush to be first than plan a good dataset and logistics
>>
>>101679450
i guess thats why so many other models worked well with the alpaca rp one i modded and left in place. i'll leave the st stuff then paste my old prompt in without any extra things then, thanks for clarifying.
>>
>>101679482
porn
>>
>>101678039
Ooba outdated.
>>
>>101679615
You could even set up a model and make a sfm with prompting
(or whatever is used instead of sfm now)
>>
>>101679450
Why are so many people trying to use the non-instruct for chats?
>>
>>101679483
Your time would probably be better spent on the danbooru dataset. Way more diversity.
>>
>>101677979
OK, will try when the next one finishes. I just waited a long while for it to generate only to have it write a few words and then EOS my ass. Forgot --ignore-eos flag...
>>
File: ComfyUI_01499_.png (1 MB, 1024x1024)
>>101679699
maybe but I think 2D animation would make a comeback
>>
>>101679721
how would I train how people prompt off a tag list? that's a little counterintuitive and what people do already. you need a vlm to be able to do that before tagging real data for imagen training which is what I'm talking about. synthetic images in imagen don't work
>>
>>101679701
I often use non-instructs. They're ok for book-like co-op writing and they don't have the assistant bias baked in as much. Just few-shot something like
>peep1: dialog
>peep2: ble
>peep1: glup
and off it goes. It's easier to introduce new characters by just adding peep3 or whatever. I've had logs with entire casts of people. Much like what people struggle with when doing multi-persona chats. Easier to add a narrator voice as well.
>>
>>101679779
Is this Flux? Damn.
>>
mini magnum almost handled my card but it has poor chain of thought.
>>
>>101679701
in my case i find base instruct and various tunes to be dumber than miqu for rp. i'm now trying non-instruct 3.1 to see if its any better or worse
>>
>>101679779
>Should it go armor or armour... fuck it. plausible deniability...
>>
>>101679806
You'd generate captions for the images, that's what I've been talking about. Use a multimodal LM to generate captions using a prompted format, use self criticism or some other prompting technique to generate highly descriptive captions, and use that as your dataset.
>>
File: 1721014691349569.jpg (43 KB, 680x656)
>can never get good roleplay no matter what settings or card I use
>last time it happened it was back on mistral 7b and nothing since has worked
It's like my PC died inside. feels bad bros
>>
>>101679834
yes but lacking NSFW like early dalle. I explained the t5 encoder earlier but it is accurate. maybe lacking in the style department a bit too. the furry that did fluffyrock is already on it and getting 80gb of compute ready for training. So maybe we might get a filthy model after all

>>101679858
kek

>>101679861
we are on the same page then I don't know where we got lost in translation
>>
>>101679779
Isn't this almost manga-tier
>>
>>101677733
I wonder how good these presets will be considering the user is so fucking retarded he writes *stral instead of mistral, meaning his stupid shit will never be found in a search
>>
>>101679876
>ooba
>ST
>Stheno
>Llama3 context and instruct preset in ST
>temp 1.2
>topK 50
>topP 1
>typP 1
>MinP .075
>Rep Pen 1.1

It just werks. Learn how to prompt. After cooming your brains out for awhile, go ahead and run other models to see what you like.
>>
>>101680005
Is ooba better than koboldcpp?
>>
>>101679876
just checking that you are not as retarded as I am, but do you have instruct mode enabled in silly tavern? I did not rofl
>>
>>101680012
I like ooba better, but kcpp is good too.
>>
>>101680017
I do. I always enable it, although I don't know if that's right either.
>>
File: out-0.jpg (84 KB, 1088x896)
>>101678948
>I don't like my outputs when I use txt2vid models
get better at prompting then. prompting for video models is fundamentally different than prompting for image models, you should mention camera position, lighting etc
>>
>>101679909
>I don't know where we got lost in translation
Hah, same. I guess I misunderstood you. Any idea on what models are up to the task? VisionLlama seems prudish/not great.
>>
>>101680025
What about all the other options in the context/instruct settings? You know how the right side of the screen has a bunch more of stuff, including custom stopping strings or something? Do you mess with anything on that side?
Also how are your cards formatted? Just plain text?
>>
>>101680005
you shouldnt need top k with minp, 0.05 min p is usually good but might need to go up a bit if you're using a low quant
>>
Is it worth paypigging for Runpod?
>>
>>101676306
nice
>>
>>101680005
There's no reason to use Stheno when Celeste is finetuned with a similar dataset on the 3.1 and Nemo models.
>>
>>101677979
Worked. Must've done something wrong when I tried it before. Thanks anon.
>>
>>101680162
wait celeste got a nemo update?
>>
File: ComfyUI_01507_.png (1.19 MB, 896x1152)
>>101679971
picrel but outfits with low variation do hold consistency much better... or just use a character the model knows. loras and finetunes are probably going to be sparse.

>>101680035
I am the father of conditioning keyframing since SD 1.4, if I can't control when and how things play out during an animation it's boring. txt2vid is mostly a slot machine for something that isn't slop

>>101680115
cute game
>>
>>101680172
https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9
>>
>>101680184
time for me to put an unreasonable amount of hope on to another small model. A story writing finetune of nemo sounds perfect for what I need.
>>
>>101680032
>although I don't know if that's right either.
It is if the model you're running is instruct.

>>101680047
>What about all the other options in the context/instruct settings
All you should have to do is select the llama 3 instruct preset for your context template and instruct mode. Leave the rest alone.

>Also how are your cards formatted? Just plain text?
Write in plain english. Use proper grammar, sentence structure, and paragraphs. Write most of the card in Description. I try to not use advanced definitions. Scenario can be useful. Stay away from Dialogue examples unless you really want your card to follow it closely.

I also like to refer to the user and character as {{user}} and {{char}} when I write my cards, although I'm unsure if this makes a difference compared to just referring to the actual names.
>>101680062
Noted and thanks.
>>
>>101680162
>After cooming your brains out for awhile, go ahead and run other models to see what you like.
>>
>>101680232
Buy an ad, sao.
>>
>>101680162
>>101680172
>>101680184
>>101680201
Super organic and natural.
>>
>>101680244
Who are you?
>>
>>101680251
yes
>>
>>101680254
Petrus.
>>
>>101680262
I don't know who you are.
>>
File: ComfyUI_05293_.png (975 KB, 1024x1024)
Read the room retards, unless you're tuning flux no one gives a shit
>>
>>101680333
flux deez nuts, haha.
>>
>>101680333
shut the flux up
>>
File: J8CNdBG5HKywhtrY38p1o.jpg (183 KB, 1024x768)
pure gold
>>
>>101679779
How many pulls did it take to get the text that good?
>>
>>101680333
>tuning flux
good luck without the base model lmao
>>
Do I need a specific branch/PR of comfy to run flux or does it just works?
>>
>>101680590
Newest comfy just werks
>>
>>101680571
>good luck without the base model lmao
whats wrong with tuning a distilled model?
>>
File: ComfyUI_01513_.png (1.17 MB, 896x1152)
>>101680513
picrel but it's pretty dependant on how complex it is.

>>101680571
it's in main. check the examples page on the repo. you can also swap the custom sampler to something else. I just use the regular ksampler at 1.5 cfg so I can neg prompt

I'm getting wasted before bed. wish me luck in the coming days because I'm putting out a cpp ui for diffusion. I'll leverage sd.cpp and comfy for now while I make something not webdev cancer first
>>
>>101680641
So it's better than DALL-E3 at text then?
>>
>>101680708
probably. Dalle has artist styles which is the thing I want most. we haven't gotten a model that had that much variety since
>>
>>101680708
It's about as good as DALL-E3 in general, slightly better at text generation, slightly worse in general knowledge
>>
>>101680732
>since
since 1.5
>>
Biggest big if true
https://x.com/nisten/status/1818529201231688139?t=a2_oszg66OrDGlwweQS1iQ&s=19
>If this works then should be able to do finetuned Meta-405b running low-loss in just 60 gigs.
>>
>>101679986
Fine.
>Looking for mistral largestral nemo mixtral preset context instruct settings prompt prompts sysprompt sequence json jsons import sillytavern?
>>101677733
>>
>>101680708
by a lot
>>
I can only run the q3 of mistral large, will that affect it if I use it to generate code? Will it make mistakes? I was previously using wizardlm8x22b so it just would have to be better than that.
>>
>>101680634
https://github.com/black-forest-labs/flux/issues/9
>>
>>101680846
I'm not sure why it would be an issue. Even if it was distilled, you should be able to use a normal image objective, even if you need to rewrite the training code. Maybe that's what they mean, the code won't work out of the box.
>>
File: NemoSettings2.png (140 KB, 1330x1300)
>>101679876
Same issue, found out modifying the settings even semi-well is the same as upgrading 3-4 quants. I'm reading that most people are using the new DRY settings to really force models to be original, a bit of Min_P to stop low-probability weird answers, one/two TOP_ samplers active for flavor, and a little bit of rep_pen to force different words. Pic related was working for me in Nemo-Dory in a story I wanted to be a bit zany, but isn't the best (Top_K is way too high).
>>
>>101680846
Has anyone gotten flux working using the official inferencing repo, not through ComfyUI? It should only require 16GB, but it fills up all the VRAM and OOMs.
>>
>>101680833
Y-you mean 96GB won't be VRAMlet tier anymore?
>>
>>101680845
Use Llama 3.1 70B instead.
>>
>>101680833
I'm more interested in 70b in 10GB being evaluated extremely fast.
>>
how to memory, aka write a entire book
naked rat girls on post it?
>>
>>101680903
Really? It's better than both those for the task?
>>
whats a good model for brainstorming ideas. I can't run anything past 8b though.
>>
>>101680947
For code just use claude 3.5. It is so night and day better than anything else that using anything else is just shooting yourself in the foot.
>>
Alright how do I get comfy ui to use more than just GPU0 so it doesn't use cuck memory mode?
>>
>>101680958
Doesn't that cost money?
>>
>>101680947
Definitely better than Wizard. Probably better than Large at Q3 if you can use a bigger quant with 70B.
>>
>>101680977
Not if you use the code to make money.
>>
>>101680972
You don't, no multi-GPU support.
>>
>>101680833
Imagine a cut down 8B being used for speculative decoding of a cut down 405B. We'd be so insanely back. Might even work well on mostly CPU layers.
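If you want to play with the idea already, llama.cpp ships a speculative decoding example; a rough sketch (file names hypothetical, double-check the flag names on your build):
>./llama-speculative -m llama-3.1-405B-Q2.gguf -md llama-3.1-8B-Q4_K_M.gguf --draft 8 -p "your prompt here"
The small model drafts a handful of tokens per step and the big model only verifies them in one batched pass, which is where the speedup comes from.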
>>
>>101680885
Nevermind, I figured it out.
>>
>think of a million different ways to try and make the model get EVERYTHING from the prompt
>what if I use exclamation marks on the stuff that the model misses often
>kind of works?
>try TWO exclamation marks!!
>it works better
Hmm.
Now the question is if this wasn't just a fluke as I only tested on one seed.
>>
>>101681102
Square brackets take precedence if you use them.
In dialogue you can use double asterisks like **this**; the models seem to pay special attention to that.
>>
File: conceptual granularity.png (1.02 MB, 895x916)
1.02 MB
1.02 MB PNG
Okay, what the fuck. Now it didn't get it all. But I'm pretty sure that's still well beyond the level of conceptual granularity DALL-E3 can handle. There are definite knowledge gaps... likely a dataset issue, but I would reckon this is smarter than D3.
>>
>>101681123
I tested those and at least on this seed they didn't work, but somehow !! or !!! did.
>>
>>101681013
Wait so how do you guys load this thing? I have 2x3090. Thought it would be no biggie.
>>
>>101681180
Load it at 8bit, it's an option in comfyui.
>>
>>101681139
More steps = better understanding too, if you're not doing 50.
>>
>>101681180
--fp8_e4m3fn-unet
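(assuming a recent comfy checkout, that's just a launch flag, e.g.
>python main.py --fp8_e4m3fn-unet
which stores the unet weights in fp8 and roughly halves their VRAM footprint vs fp16)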
>>
>>101681190
Yeah, but anon above was comparing it to f16 and saying it was bad. I assume he was able to load it somehow, and wasn't using a 40+ GB card...
>>
File: 1710793075640.png (2.47 MB, 1229x1229)
2.47 MB
2.47 MB PNG
>>101681139
Model9 didn't have this problem
>>
>>101681203
Seems to get the rock at 50 steps, but still no frog on a log. Either way PonyFLUX when?
>>
>>101681225
Totally. It just ignores the other half of the prompt.
>>
>>101680162
>>101680251
I tried Celeste with aaaaaaaaaall of the settings they recommend and it was utter dogshit
>>
>>101681244 (me)
my name is sao, btw
>>
>>101681251
t. nothingisreal
>>
>>101681257
t. drummer false-flagging as both to make everyone else look bad
>>
>>101681244
>>101681251
>>101681268
I'm going to reinstall that garbage, config it and post screenshots just to fuck your shit up and prove Celeste is bad
>>
File: 1710122200739.png (2.47 MB, 1229x1229)
2.47 MB
2.47 MB PNG
>>101681241
That's an older one, didn't prompt for the lizard
>cat playing a banjo on a rock on a turtle
This one was cat playing a banjo on a turtle on a rock
>>
File: file.png (27 KB, 693x157)
27 KB
27 KB PNG
>>101681276
already off to a bad start
>>
>>101681276
Nah, I'm sorry but it's great. A lot of money was spent finetuning it, so go elsewhere if you want to complain.
>>
>>101681276
Post side-by-sides with Stheno to show how superior it is too.
>>
File: ComfyUI_00019_.png (395 KB, 512x512)
395 KB
395 KB PNG
I went down to 512^2 so I could run them in batches and at 50 steps flux will sometimes give the lizard frog-like features and put it on top of a log. So about 50% of the time it's getting all but 1 concept in the prompt.
>>
>>101681284
Not a banjo. Not dissing any of the models, really. Last time I did image gen, the thing could barely distinguish between left and right. Both seem impressive.
>>
Will Flux finetunes finally deliver a good model for interspecies anthro furry porn?
>>
>>101681307
How many gens did it take to get it?
>>
>>101681320
Only if Sao starts doing diffusion finetunes.
>>
File: file.png (28 KB, 779x381)
28 KB
28 KB PNG
>no system prompt
>actually do use this one
ok celeste, whatever you say
>>
Hi all, Drummer here...

>>101681268

Don't get me involved in this shit show, please.
>>
Wake me up when Sao is done with the meltdown.
>>
File: file.png (53 KB, 756x416)
53 KB
53 KB PNG
ANOTHER system prompt what the fuck
>>
>>101681360
>you are {{char}}
>>
File: ComfyUI_05344_.png (1.64 MB, 1024x1024)
1.64 MB
1.64 MB PNG
>>101681388
>increase the humanness
>we trained on human stories
>human-like
The word human is doing a lot of heavy lifting here, isn't it?
>>
File: notreallyanthro.png (1.24 MB, 981x910)
1.24 MB
1.24 MB PNG
Alright so I did a comparison. I had to remove the references to anime and anthropomorphism on the bing prompt to avoid getting filtered...(1/2)
>>101681342
Probably about 2 or 3
>>
File: bingtho.png (699 KB, 1131x754)
699 KB
699 KB PNG
>>101681483
(2/2)
It seems to give bing a bit of an aneurysm. Even with a less complicated prompt to follow it's starting to blend concepts together or just miss them entirely and do whatever the fuck it wants.
>>
>>101681493
Oh I had to replace hugging with shaking hands too, of course. Since nothing offends AI ethicists more than love.
>>
File: celeste.jpg (872 KB, 2550x1263)
872 KB
872 KB JPG
>>101681244
>>101681251
>>101681268
>>101681288
>>101681295
>>101681395
>>101681458
Woooooooooooowwwwwwww guys, Celeste is so fucking goooooood
>>
File: flux1.jpg (177 KB, 1845x1007)
177 KB
177 KB JPG
>>101681493
Flux got it close, wolf looking a little too happy though
>>
>>101681547
>Name (Wife)
also she unbuttoned her shirt twice
>>
>>101681547
Now I want to see this exact prompt on all the 8b and nemo fine tunes just to see which ones stand out.
>>
File: lots of shit.png (1.39 MB, 1203x1401)
1.39 MB
1.39 MB PNG
>>101681579
It's also pretty bad at goats.
Another head-to-head with d3 via bing
It's as though Flux is actually making a sincere effort whereas d3 just has a fucking aneurysm.
>>
>>101681547
>Repetition showing up in literally the first message
>"pleasing expression" appearing two, maybe three times; will put money on the fact that the model name tooltip is covering it
ngmi, I recommend recycle bin
>>
File: file.png (116 KB, 761x271)
116 KB
116 KB PNG
>>101681547
>>
>>101681639
*pleading
>>
>>101681621
Q4_K_M is well above the cutoff point for determining whether something is shit; it's not like he's using an IQ quant like some disingenuous fuckers.
We're talking 12B here, not some shitty 7B from a year ago.
>>
>>101681547
>>101681580
>>101681616
>>101681639
>>101681640
Wooooooooooooooow guys Celeste is so immersive, it gets my persona down exactly. This model is incredible!!!

>>101681621
>NOOOOOOOOO IT DOESN'T COUNT! THE MODEL AIMED AT LOWER-END SYSTEMS ONLY WORKS WELL WHEN YOU USE IT ON A HIGH-END SYSTEM
oh word?
>>
>>101681677
>waah waah gib monies pls
ko-fi drying up already?
make yourself useful and tune flux, otherwise stfu
retards can't even astroturf without showing their ass all over the place
>>
Anyway, my final thoughts on FLUX: everything it's weak at seems like it's probably a dataset issue, and everything it's strong at, it gives D3 a good beating on. Sadly, unless and until finetunes of it exist, SDXL is just better by virtue of accessibility and of having finetunes available for specific use cases.
>>
>>101681639
Yeah, it's just like when I see people say nemo is good. I don't trust their judgement. I'm sticking with mistral large.
>>
File: kek.gif (170 KB, 360x346)
170 KB
170 KB GIF
>>101681681
>mouse cursor moving halfway through the rant
>>
>>101681678
> not like he's using an IQ quant like some disingenuous fuckers
*looks up from llama3.1-405b-i1-iq2_m.gguf* huh? wait what?
>>
FLUX GGUF WHEN???
>>
File: ewewewqeweq.jpg (31 KB, 426x341)
31 KB
31 KB JPG
I'm confused, I only use GGUF because they're way easier (one file).

Do I download all of the files here - https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9/tree/main

And which one would I load into kobold if yes?
>>
>>101681714
>time to press stop
>hold on, let me show these guys how bad it is
>>
>>101681681
What's with the underlined parts of the text?
>>
>>101681968
Grammarly thinks that text could be improved.
>>
>>101682019
>>101682019
>>101682019
>>
>>101682023
I chortled.
>>
>>101681933
All of them, and you have to convert them to GGUF yourself, then quantize if you need to. There should be a convert_hf_to_gguf.py or something like that somewhere in your kobold dir:
>./convert_hf_to_gguf.py path/to/dir/
Just point it at the directory where you downloaded the files; it writes an unquantized .gguf there. huggingface has a CLI program to download repos if you want to make it simpler.
Once converted, use llama-quantize (or whatever kobold calls it) on that .gguf, like
>./llama-quantize path/to/dir/*.gguf Q8_0
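If you'd rather script the download step, something like this should grab the whole repo (assuming a recent huggingface_hub; repo name taken from your link):
>pip install -U huggingface_hub
>huggingface-cli download nothingiisreal/MN-12B-Celeste-V1.9 --local-dir MN-12B-Celeste-V1.9
Then point convert_hf_to_gguf.py at that directory.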
>>
>>101681920
https://github.com/leejet/stable-diffusion.cpp/issues/323


