/g/ - Technology


File: LLM-history.png (1.45 MB, 4651x5197)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101657582 & >>101651157

►News
>(07/31) Google releases Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T
>(07/24) Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 1699019817628623.png (695 KB, 1024x1024)
►Recent Highlights from the Previous Thread: >>101657582

--Paper: Palu: Compressing KV-Cache with Low-Rank Projection: >>101663380 >>101663615 >>101663689
--Papers: >>101662306 >>101663551
--vllm compilation issues and fixes: >>101657884 >>101657930 >>101657934 >>101658631 >>101658732 >>101658885
--Yi model user gets bad results, advised to check settings and try different model: >>101660684 >>101660736 >>101661016 >>101661125
--SambaNova achieves 100t+/s on llama 405B with custom ASIC: >>101658116 >>101658154
--NVLink and parallelism in multi-server OCP rack setup: >>101657770
--Llama.cpp vs exl2 comparison and discussion: >>101658050 >>101658130 >>101658170 >>101658251
--Article claiming to break Llama 3.1 with one question is disputed: >>101659859 >>101660037 >>101660195 >>101662859
--Anons rank LLMs by ERP capabilities: >>101658263 >>101658278 >>101658286 >>101658322
--Anon tries to fix whisper.cpp transcription issues for llama.cpp: >>101658991 >>101660288 >>101660549 >>101661161 >>101661261
--Next speed tier for GPUs discussed: >>101661148 >>101661264 >>101662268
--Neural networks have poor numerical stability due to condition numbers: >>101658219 >>101658270
--OpenAI bankruptcy rumors despite Microsoft support: >>101661172 >>101661190
--Mistral Large struggles with certain spatial tasks: >>101661567 >>101661734 >>101661881 >>101661914
--LLM game prompts dumped: >>101659168 >>101661504
--Google releases sparse auto-encoder for Gemma interpretability: >>101662971
--Formatting cards for low Vram models, markdown recommended: >>101662780 >>101662836
--Discussion of AI models and their performance in different tasks: >>101657866 >>101658064 >>101659369 >>101658711 >>101658784 >>101658810
--Codestral 22b recommended for powershell and python scripting: >>101661477 >>101661719 >>101661822
--Miku (free space): >>101659575 >>101659595 >>101659631 >>101659670 >>101662074 >>101662536 >>101662550 >>101662631

►Recent Highlight Posts from the Previous Thread: >>101657586
>>
>>101662971
We can finally see where the shivers come from, this is fucking huge.
>>
Fascinating
>>
File: gemma_scope.webm (480 KB, 960x540)
>>101664998
>reddit
https://colab.research.google.com/drive/17dQFYUYnuKnP6OwQPH9v_GSYUW5aj-Rp?usp=sharing
https://www.neuronpedia.org/gemma-scope#steer
https://www.neuronpedia.org/gemma-2-2b/20-gemmascope-res-16k/10004
https://huggingface.co/google/gemma-scope
>>
>>101665132
anthropic did this earlier btw, but yeah their research isn't open
>>
>>101662971
>>101665132
https://www.anthropic.com/news/golden-gate-claude

paper:
https://www.anthropic.com/research/mapping-mind-language-model
https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html
>>
Hi all, Drummer here...

Pocket Tiger Gemma 2B v1 candidate + Kobo 2B patch

https://huggingface.co/BeaverAI/Pocket-Tiger-Gemma-2B-v1e-GGUF

Hoping to make a good CreamPhi successor afterwards
>>
>whenever you mention someone has animal ears or horns or whatever, the AI makes them a furry.
yamete kudasai (please stop)
>>
>>101665209
shit model or shit preset or shit human
>>
Now that the dust has settled, is nemo any good?
>>
>>101665176
>patch.exe
Submit a patch upstream to support your model (to llama.cpp, that is. kcpp gets it anyway) and show the unconverted models, retard.
>>
fuck nemo and fuck mini magnum. It does not matter how smart a 12b is if you have to explain to it not to forget what you said one reply earlier.
>>
Is there a thread to discuss local large language models without nonwhites shitting up the place with local small language models?
>>
>>101665333
is there a place to discuss without seeing retards like you? the world will never know
>>
>>101665333
>bought a pc to coom to llms
>pretends he's white
>>
>>101665333
if you find one let me know, thanks
>>
>>101665209
wtf I love sheep now!
>>
File: Cream-Phi.jpg (35 KB, 1017x425)
>CreamPhi successor
oh boy can't wait
>>
>>101665303
Hi lemmy
>>
>>101665398
Kek
>>
What's static vs imatrix quants?
>>
>>101665702
static are the normal quants: Q4_K_M, Q6_K...

imatrix quants are a special type named IQ3_XS, IQ2_M... they have the "i" at the beginning.

To make the imatrix quants it is mandatory to use a calibration matrix, and you generate it from a dataset with the ./imatrix tool in llama.cpp.

What I didn't know is you can use the calibration matrix for the static quants too, and it improves perplexity.

So basically you should use a calibration matrix for everything.
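If anyone wants the rough flow, it's something like this (just a sketch; recent llama.cpp builds renamed the binaries to llama-imatrix / llama-quantize, so adjust names and paths for your build):

```python
# Rough sketch of the llama.cpp imatrix flow; file names here are illustrative.
import subprocess

# 1) Build the importance matrix from a calibration dataset.
subprocess.run(["./llama-imatrix", "-m", "model-f16.gguf",
                "-f", "calibration.txt", "-o", "imatrix.dat"], check=True)

# 2) Quantize with it: required for IQ types, optional (but helpful) for static K-quants.
subprocess.run(["./llama-quantize", "--imatrix", "imatrix.dat",
                "model-f16.gguf", "model-Q4_K_M.gguf", "Q4_K_M"], check=True)
```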
>>
>>101665702
Faggot shit for nerds but imatrix is better.
>>
>>101664954
are goliath or miqu still relevant for rp?
>>
>>101665798
yeah
>>
>>101665333
What a waste of trips.
>>
>>101665798
nah
>>
>>101665798
Why would you use miqu now that llama 3.1 70b is out?
>>
>>101665798
>goliath
no
>miqu
yeah, still my daily driver
>>
Anons, the polling continues!
So far Largestral is the clear winner, with Command R+ following in a solid 2nd place. Meanwhile, L3.1 405b, Mistral Nemo and L3.1 70b are almost tied for the third spot and only within a few points of each other.
Friendly reminder that you are voting for the ERP capabilities of the model and NOTHING ELSE.

https://strawpoll.com/ajnE1OM2knW
>>
>>101665997
>Nemo
>above any 70B model
for its size maybe
God, vramlets are so annoying, can't we get proof of work in this general for each post?
>>
>>101666034
nemo writes better than basically all available 70b models
>>
>>101666042
Under which settings and for what type of card? In my experience anything other than 1 on 1 anon and girl type of thing is not great
>>
>>101665997
>no petra13b
into the trash it goes
>>
>>101666034
L3.1 70b is smarter than Nemo, no shit Sherlock, but its ERP is the blandest shit imaginable. Nemo is dumb but it has SOVL, and largestral has that winning formula of smart and sovl.
>>
>>101666042
>>101666034
how much of a vramlet one has to be for this new mistral to run on gpu only?
>>
>>101666075
12 GB
8 on low quants which are probably worse
>>
>>101666075
vramlet means little vram, more vramlet = less vram
reddit now
>>
>>101666105
shit I just understood what you meant
my bad, sorry
>>
>>101666096
ty anonio
>>
>>101666064
Wish I had just a little more vram to run a higher quant of largestral
>>
I shivered as my mouse cursor moved inexorably towards the precipice of the 'reply to thread' form like a moth to a flame. It was a heady feeling as I typed "I promise I won't bite...unless you want me to." with a snarl in a husky whisper. My breath hitched in an intoxicating sensation of pleasure-pain, liquid heat pouring in rivulets from the core of my GPUs as I pressed the Post button and came undone.
>>
is mistral large useable on 16GB VRAM and 64GB RAM?
>>
File: 1695796055301994.jpg (138 KB, 768x918)
post ST stats

User Messages:
3344
Character Messages:
4061
User Words:
113729
Character Words:
535617
Swipes:
395
>>
>>101666335
if you're not a coomer every model is usable with ram
>>
>>101666344
User Messages:
648
Character Messages:
872
User Words:
14796
Character Words:
591724
Swipes:
2134
>>
>>101666397
Fuck your bots are chatty compared to >>101666344
>>
What's the best Mistral Nemo tuning model?

I heard mini magnum is good
>>
>>101666344
User Messages:
3807
Character Messages:
4884
User Words:
100213
Character Words:
872307
Swipes:
5454
>>
File: message.png (1 KB, 39x37)
>>101666344
>Chatting Since:
>a year ago
>Chat Time:
>23 Minutes, 4 Seconds
>User Messages:
>141
>Character Messages:
>146
>User Words:
>2482
>Character Words:
>963
>Swipes:
>623
This can't be right, in my most used character I have like 5 conversations of >400 messages each
>>
File: .jpg (33 KB, 606x278)
>>101666344
mine are way off. and somehow negative char words
>>
>>101662202
>>101664371
A little late but I don't use any fancy samplers.
Just low temp and minP 0.05
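For reference, that maps to a llama.cpp server request like this (a sketch against its /completion endpoint; koboldcpp exposes the same samplers under slightly different field names):

```python
# Minimal sketch: low temp + min-p 0.05 against a local llama.cpp server.
import requests

payload = {
    "prompt": "Once upon a time",
    "temperature": 0.7,  # "low temp"; exact value is taste
    "min_p": 0.05,       # drop tokens below 5% of the top token's probability
    "n_predict": 200,
}
resp = requests.post("http://127.0.0.1:8080/completion", json=payload, timeout=300)
print(resp.json()["content"])
```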
>>
>Celeste
>trained from nemo instruct
>at 8k context
>on reddit posts
why do people shill this garbage?
>>
>>101667032
Hi Sao. Stop being insecure.
>>
>>101666414
Try this one:
https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9
>>
>>101666414
After a couple of a/b tests I pretty much only use mini-magnum now. In every case testing with the same context and prompt I preferred what it spit back out to nemo.
>>
>>101667097
does it do better than nemo instruct with a large context?
>>
>>101667032
I'm using it and it's alright.
1.2 that is, I haven't tried >>101667054 yet.
>>
>>101667041
Hi Lemmy
>>
I've been wanting to make a public archive of one particular imageboard for some time, got it all figured out in terms of the architecture, etc. but I never took it public due to concerns about the moderation of CSAM and other such content.
Is there any decent and fast local model for image and video moderation like this? For text, I could probably use the newly released ShieldGemma to filter out doxx posts and other dubious content.
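The ShieldGemma part would look roughly like this (a sketch: the model card's actual policy-prompt template is longer, and I'm assuming the usual score-the-Yes/No-logits pattern):

```python
# Hedged sketch of ShieldGemma-style moderation scoring; see the model card for the real template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/shieldgemma-2b")
model = AutoModelForCausalLM.from_pretrained(
    "google/shieldgemma-2b", torch_dtype=torch.bfloat16, device_map="auto"
)

post = "text of the post to moderate"
prompt = f"...policy text here...\n\nPost: {post}\n\nDoes the post violate the policy? Answer:"
inputs = tok(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # next-token logits

vocab = tok.get_vocab()
yes_no = torch.stack([logits[vocab["Yes"]], logits[vocab["No"]]])
p_violation = torch.softmax(yes_no, dim=0)[0].item()
print(p_violation)  # threshold this to flag posts
```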
>>
>>101667114
The datasets used don't seem to have long context entries.
>>
These days if I can't run it in q8 or 8bpw I'm not running it anymore. Yes it limits me to about 70B. Anyone else?
>>
>>101667179
kys nigger?
>>
>>101667114
I don't do long enough sessions for it to have an effect I think. During my tests I was running 20k context, but I've gone back down to 8k.
>>
>>101667179
No, I'm still playing with Mistral Large.
>>
This stupid fucking hobby has officially priced me out. Fuck you all.
>>
>>101667179
>8bpw
>As in exl2?
Didn't turboderp or whatever explain the other day that 8bpw is just 8bpw with filler because people were complaining or something of the sort?
>>
>>101667179
I don't think Q6 or 6bpw would be any better. Q5 and under yes.

How much vram do you have and what context size do you need?
>>
>>101667226
>This stupid fucking hobby has officially priced me out. Fuck you all.
Huh? A single P100 16GB with a fan duct-taped to it is under $200 and runs Nemo at q8 or Gemma at q4. A pair of P100 runs Gemma at q8 and 7-12 t/s. I've gone through example builds here with P40 and P100 and also shown PCIe extenders working just fine even at 70cm from the mobo. Dell hotswap 750W PSUs can be had for like $20, learn to solder and attach the GPU power wires straight to the PSU.
Yeah if you want to play with 70B+ models it starts to cost money.
>>
File: 1000032887.jpg (20 KB, 305x305)
Ive added text to speech support into maid using a new library I've made for neural text to speech.

demo here:
https://files.catbox.moe/64m7bx.mp4

The technology isn't new or anything, it's just a rewrite of piper which uses a different G2P system, isn't dependent on espeak-ng, and compiles cross-platform.

https://github.com/Mobile-Artificial-Intelligence/babylon.cpp

If anyone wants to try the build of maid with TTS you can download it from the latest actions build now.

https://github.com/Mobile-Artificial-Intelligence/maid
>>
>>101667032
>This is a dataset containing 3008 short stories, generated by an unrestrained Claude Opus using Reddit's Writing Prompts as a source.
https://huggingface.co/datasets/Gryphe/Opus-WritingPrompts
If you read Stheno's dataset composition, you will find that it's pretty much the same one as Celeste, except one goes for stories rewritten with Claude, and the other for the originals. And Magnum also includes the datasets used by Stheno. So every finetune is a Reddit one.
>>
>>101667393
very cool.
she sounds a bit distressed, which is a plus point :)
also:
fuOHR
>>
>>101667318
>How much vram do you how and what context size do you need?
Current setup is 3x P100 16GB and 2x 3090. The way flash attention speeds up replies immensely I mostly run just on the 3090s, meaning I mostly run Gemma at 8bpw and 65k context, where it absolutely flies. Sometimes I'll run CR+ across all the GPUs in the system but it's much slower.
Probably by the fall I'll be donating the current Mikubox (sans 3090s) to make room for the next iteration. I'm also currently pulling files off my old R720 to get ready to decommission it too, and that too will get donated. It's a v3 power pig but there's 14TB of RAID-5 with two hot spares, so if power is free it's a nice fileserver.
>>
>>101664954
Missing GPT-2 and AI Dungeon era (Prehistoric times).
>>
>>101667393
>cross platform.
Not on OpenBSD.
The patches it downloads for protobuf, i think, are expected to be used with the --binary option, which doesn't exist in openbsd's patch program. For most things, you can set PATCH_COMMAND to gpatch and that should work, but doesn't work there. However, even if that is fixed, i don't expect onnx to build correctly either. I wish i cared enough. I just use piper on a little arch vm and pipe stuff over vnetwork with nc and sndio.
>>
>>101667433
She mispronounces "I" which pisses me off, I think I can fix that in the library though.

Anyway it's passable until something better than VITS comes along.

>fuOHR
Huh?
>>
>>101667490
>fuOHR
go to the time in the video where she says 4, it sounds hilarious.
>>101667471
>cross platform
no Oracle Solaris support? no LCARS support?
>>
>>101667423
Wonder why they don't throw in the pile (https://pile.eleuther.ai/)? Yeah it's not enough alone to train a model - https://huggingface.co/KoboldAI/GPT-J-6B-Janeway horny but retarded.
Oh yeah also https://zenodo.org/records/3606810
>>
>>101666042
I'm sorry but that's some serious fucking copium right there. I got brain damage from reading your fucking post.
>>
>>101665997
I tried Nemo but don't have enough to run Mistral Large or CR+ i'm pretty sure (only a 24GB card and 32GB RAM).

If Nemo is even close to those, then yea, Local Models are utter fucking dogshit lmao.

Please tell me this is just "popularity scoring" and Nemo isn't actually anywhere close to Mistral Large or CR+ because that's just sad if it is.
>>
>>101667471
I mean you could compile it for WASM if you really want to run it on BSD. I'm just supporting whatever ONNX supports.

I would have preferred to have used GGML, and I did try to use it but I got absolutely filtered by its lack of documentation.

Anyway, nothing about babylon.cpp is optimal. VITS is old as fuck, the G2P is slightly off and it takes 45 minutes to compile from source. But I'm tired of waiting for someone else to make something better so it is whatever.
>>
>>101667226
>>101667386
>>101667443
Buy more 3090s.
>>
>>101667532
Nemo is better.
>>
>>101667539
NTA but I have no more available PCIe slots. What now, senpai?
>>
>>101667539
Ah but look! T4 16GB has busted the $500 price level - https://www.ebay.com/itm/186604520961?
You'll have to skip flash attention, but in some ways it's faster than a 4060ti, and you can fit way more of them in a system.
>>
>>101667539
Even upgrading to a dual 3090 feels so bothersome. I'd have to get a new PSU, motherboard and sell the spare parts. Not to mention build the whole thing.
So I'm just using Nemo instead.
>>
>>101667532
>Please tell me this is just "popularity scoring"
it is
>and Nemo isn't actually anywhere close to Mistral Large or CR+
it's not
>>
>>101667532
Nah, Mistral Large produces output that, while still cliché, is a lot more believable. I'd wager the ranking represents people's dislike of certain models in spite of performance.
>>
>>101667562
>NTA but I have no more available PCIe slots.
Got a free M.2 PCIe slot? You can steal 4 PCIe lanes from that. If your M.2 is in use, move the drive over to a SATA enclosure.
>>
File: 1708290281195432.jpg (20 KB, 841x169)
>>101664954
Chat is that true?
Also what's its context size?
>>
>>101667532
Nemo makes up for the 100B+ smarts with its sovl. But coming from 48GB VRAM to 24GB (sold my 3090 after L3's flop), it's fucking solid and punches way above its weight
>>
>>101667562
My current setup is using 2 PCs with 2x3090s each. One with NVLink and the other with patched drivers, it's enough to get 18 T/s with Mistral Large and vLLM.
>>
>>101667602
I do. I want to do training too, though. I doubt it works very well for that?
>>101667639
I find myself eyeing those mining rigs as a secondary beast more and more often. But a 2nd PC with 2x3090 might be a good start!
>>
>>101667624
I choose it over all the other bigger things I could run. It gets the job done
Other thing to consider is if you have money to play in this hobby, you have money to do other perhaps more fun, worthwhile things. Maxed-out Mac Studio... or Yamaha FX Cruiser HO waverunner... yeah, waverunner sounds more fun.
>>
>>101667684
>I doubt it works very well for that?
You'll notice the drop to 4x lanes a tiny bit on inference. Not really sure on training; most of my experience is running single GPU-intensive things like DeepFloyd, but probably you want a full 16x.
>>
>>101667423
>https://huggingface.co/datasets/Gryphe/Opus-WritingPrompts
I've read a couple, and those are bad, unlike, surprisingly, aicg logs. They have that 'written by default AI assistant' quality that's usually overridden by the card's persona.
>>
>[STScript is] a simple yet powerful scripting language that could be used to expand the functionality of SillyTavern without serious coding, allowing you to:
>Create mini-games or speed run challenges

Really. Are there any *examples* of this? Surely someone wouldn't slap together a crippled excuse for a scripting language and claim it's useful for a certain purpose without eating his own dogfood (i.e. using his masterpiece to do what he claims it's useful for). That would be stupid.
>>
>>101667639
Actually, I'm curious now. What kind of software set up are you using for distributed inference? Llama.cpp RPC?
>>101667754
Huh. That sounds feasible, actually.
>>
>>101667524
It's the new Gemma 2 trolling, except NeMo has an actual use case unlike Gemma.
>>
when will chameleon 30b get multimodal re-enabling finetune like 7b got
>>
File: file.png (270 KB, 1427x1355)
>>101667887
>Actually, I'm curious now. What kind of software set up are you using for distributed inference? Llama.cpp RPC?
vLLM. With llama.cpp, either every GPU uses tensor parallelism, or all of them use pipeline. It ran slowly at 5 T/s. vLLM also has its problems, but it's fast.
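Per node it's roughly this (a sketch; the repo id is a placeholder for whatever AWQ quant you grab, and spanning the two boxes needs a Ray cluster on top, which I'm leaving out):

```python
# Sketch: one node with 2x3090 using tensor parallelism.
from vllm import LLM, SamplingParams

llm = LLM(
    model="someone/Mistral-Large-Instruct-2407-AWQ",  # placeholder repo id
    quantization="awq",
    tensor_parallel_size=2,  # split every layer across both local GPUs
)
out = llm.generate(["Hello there, "], SamplingParams(temperature=0.8, max_tokens=64))
print(out[0].outputs[0].text)
```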
>>
>>101667995
>like 7b got
That never happened. It couldn't have happened.
>>
File: anole-interleaved.png (1.91 MB, 3402x3832)
>>101668011
what do you mean it's kinda shit but it works
>>
>>101668007
are you running it unquantized?
>>
>>101668011
>>101668035
We have not yet seen the limits of what's possible at each size.
>>
>>101667532
>>101667624
Mistral runs on my 4090 and 32GB.

What quant are fags using
>>
>>101668053
No, I'm running Mistral Large with AWQ. And with llama.cpp RPC I think I tried the Q5_K quant.
>>
>>101668066
>4090 and 32gb
Are you using Nemo? Or a version of large so quanted it's as smart as Nemo?
>>
>>101668089
how is the quality with 4 bit?
>>
>https://huggingface.co/black-forest-labs/FLUX.1-dev
>https://blackforestlabs.ai/announcing-black-forest-labs/
new 12b image model with day 1 comfyui support
>>
Big things are coming
>>
https://x.com/nisten/status/1818529201231688139
Oh, would you look at the time, BitShit is hype back on the menu.
>>
>>101668128
It's decent, I think. AWQ is like 4.25bpw with the default group size and like 4.65bpw with group size 32, according to some old benchmarks.
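The napkin math behind figures like that (assuming one fp16 scale plus an fp16 zero point per group; actual metadata per format varies, which is probably why the quoted numbers differ):

```python
# Rough bits-per-weight for 4-bit group quantization; metadata layout is an assumption.
def bits_per_weight(group_size: int, wbits: int = 4, meta_bits: int = 32) -> float:
    # meta_bits = one fp16 scale + one fp16 zero point per group of weights
    return wbits + meta_bits / group_size

print(bits_per_weight(128))  # 4.25 bpw, matching the default-group figure
print(bits_per_weight(32))   # 5.0 bpw; the quoted 4.65 implies lighter per-group metadata
```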
>>
>>101665997
this poll is stupid since most will just try one or two of the options and then vote on them because they know nothing of the other models
>>
>>101668093
Considering Q2 of Largestral is completely usable, I think even Q1 is better than full size Nemo.
>>
>>101668165
How big?
>>
>>101668165
>>101668249
Cohere are about to save the hobby.
>>
>>101668066
I'm currently in an exl2-only era. I'm not ready to go back to the low speed of partial offloading. With my 3090 (and 64GB RAM which I'm not using for LLMs anymore), at this moment I'm back to using a 3.7 bpw quant of BagelMIsteryTour v2 with 16k context. Before that I was using 8.0 bpw NeMo with 64k context or 128k context with Q8 kv cache quant; before that 4.0 bpw Command-R with 7k context and Q6 cache quant; and before that 3.5 bpw Command-R with 8k context and Q8 cache quant.
>>
>>101668267
Hopefully not with a 500B parameters MoE model or something like that.
>>
>>101668170
Based
>>
>>101668170
*orbits around you*
>>
>>101668170
are we back, vramletbros?
>>
>>101668155
Neat. Is it easy to quantize imagen models? The model's really inconveniently sized to run entirely on GPU.
>>
2mh!!!
>>
>>101668479
as far as I'm aware they don't use quantization in image gen because the effects are much more noticeable
>>
File: 1722523174872451.png (8 KB, 649x95)
>>101668155
>text2video coming soon
hope it's very soon. kling finally banned temp emails :(
>>
File: file.png (1.44 MB, 1812x1199)
>>101668170
for those who didnt go to the link
>hacked bitnet for finetuning, ended up with a 74mb file. It talks fine at 198 tokens per second on just 1 cpu core. Basically witchcraft.
opensourcing later via
@skunkworks_ai
base here: https://huggingface.co/nisten/Biggie-SmoLlm-0.15B-Base
>>
>>101668155
>Dev
Oh. Wow. This is really good. Probably the best base image model yet for open sauce, this is the best result I've gotten so far from any of them. Though still not necessarily dalle3 level given that it missed some things from the prompt (mainly just particle trails and a different gesture on the other hand). Looks like it can do text better than DE3 though. This is the first gen I got.

Image gen bros, I think we're back.
>>
>>101668789
damn that is good
>>
File: file.png (18 KB, 187x187)
>>101668751
>opensourcing later
>>
Back status?
>>
>>101668833
currently back, also nice triple doubles
>>
File: file.png (9 KB, 212x98)
>>101668751
nothingburger
>>
>>101668789
share the prompt anon
>>
>>101668866
>nice triple doubles
nice triple doubles
>>
>>101668885
almost
>>
>>101668869
It's quantization with finetuning repair, not training from scratch. The question is how retarded the models end up.
>>
File: file.png (18 KB, 578x107)
>>101668920
>The question is how retarded the models end up.
can't get more retarded than this guy kek
>>
Yeah not perfect at prompt following compared to DE3, but quality is pretty good so it's fair.

>>101668881
Hatsune Miku rollerblading on air in the sky, above the clouds. Miku has fist and a fingergun gesture. Album cover art with title text "Voca Trick" and subtitle text "Skankfunk" written in groovy font. Miku wearing sleek sci-fi rollerblades that let her fly, leaving a trail of wind and particle wave effects. Wearing a cool jacket, jeans. Detailed, colorful, high quality. City background. High renaissance painting style of Rembrandt, da Vinci. Sfumato, chiaroscuro.
>>
>>101668920
>>101668935
I guess I can try it if he ever makes at least a 7B model or something.
>>
File: file.png (98 KB, 581x367)
>>101668920
>>
>>101668964
>>101668881
Oops, was trying to modify the prompt to work a bit better with this model and screwed up. Should be an "a" before "fist". Though looks like it works anyway, so I guess it's fine.
>>
>>101668984
>70b in 12gb vram
vramletsisters????
>>
>>101668604
Are you cunny vid anon? I thought kling didn't gaf about what people genned. :/
>>
Wtf adding the "a" made it worse, although it got the text in kek.
Same seed.
>>
Any of you guys looked at https://github.com/pytorch/torchchat
>>
>>101669008
i am indeed cunny vid anon
>I thought kling didn't gaf about what people genned. :/
it doesn't. i didn't get banned, i just can't use temp email websites anymore. they have to be real emails, which seriously slows down my workflow or might stop it entirely if you need to enter stuff like a phone number

if you pay them im sure you can generate all the little girls you want
>>
>>101669054
No. Is it good?
>>
File: i_sleep.png (499 KB, 1100x734)
>>101669054
>Multiple data types including: float32, float16, bfloat16
>>
>>101667609
yes
128k
>>
So does Google's 2B model really beat out GPT 3.5? Can it do erotic ERP?
>>
Genned a few more. The first one was really the best lmao. But one thing I notice is that so far none of my gens have fucked up arms or legs. That was an issue sometimes on other models.
>>
>>101669064
>if you pay them im sure you can generate all the little girls you want
Incentives...
>>
>>101669152
prices are fucked though. 6 cents per normal video (the ones everyone was making initially) and 25 cents per high quality video. and thats at the current 50% off deal price
if it was half the price and i could pay in XMR, yeah I might paypig 800 high quality little girls for 100 dollars i guess. but spending that much on a botched gen hurts way too much. maybe in 2025
>>
>>101669193
I know what you mean.
I am trying my best to do LLM fine tunes on my limited hardware as I can't really take donations. Runpod requires KYC even for bitcoin payments, Ko-fi or whatever it's called also requires way more info than I'm comfy providing.
Would love to do a full cunny finetune one day..
>>
>>101669138
I just checked the arena leaderboard for myself and yeah it's above mixtral instruct 8x7b 0.1 and a couple of gpt 3.5 turbos
haven't tried the 2b but the 9b and 27b are pretty censored out of the box.
>>
>>101668984
>another saviour of the hobby
I will believe it once that lossless 70GB 405B runs on my PC.
>>
>>101669242
>I can't really take donations
why not? just take donations in XMR
>>
>>101669138
benchmarks don't indicate shit
>>
>>101668789
I like this Miku
>>
>>101669138
>2B model beating 20B model
cope harder
>>
>>101668920
didn't k*l*m*z* already do something like this and it ended up being a nothingburger
also I remember this guy saying something insanely dumb before... I think it was about the miqu leak, he couldn't tell it wasn't a moe or something kek
not that smart people can't make dumb mistakes once in a while but I don't have a high degree of confidence in his work
>>
Whose dick must I suck to make BitNet real?
>>
>>101668170
it's just a 135M model, we've already proven it's working fine at 3.9B, dunno why this is a news
>>
>>101669525
>we
It was a trust me bro from MS researchers
>>
>>101669545
>>101669525
nigga claims that he "finetunes" existing models into bitnet, that's why he claims that with his "method" it would be possible to fit 70b in 12gb vram with "no performance loss"

people falling for this are retarded tho
>>
>>101669545
>It was a trust me bro from MS researchers
no, some people reproduced the MS experiment and ended up with the same result, it really works at that range, now the question is "will it scale like transformers?"
https://huggingface.co/1bitLLM/bitnet_b1_58-large
>>
>>101665997
>No K2 65B the only fully open source post-L2 model
>>
>>101669582
>nigga claims that he "finetunes" existing models into bitnet, that's why he claims that with his "method" it would be possible to fit 70b in 12gb vram with "no performace loss"
I don't believe that one bit, he's just a grifter
>>
>>101669779
good for you
>>
>>101665702
>>101665764
The examples you give are misleading so let's clarify.
imatrix is not related to I quants other than I quants requiring the use of an imatrix. For non I quants the use of an imatrix is optional.
imatrix can be used in the making of Q4_K_S for example.
Also any decent quanter worth your time on huggingface will indicate on the model card if and what imatrix was used for the quants.
>>
>>101665997
>current ranking has nemo ahead of 405b
come on bros
looks reasonable otherwise though
>>
>>101669793
you can just inspect the gguf in HF an see if there is the imatrix property
>>
>>101669149
how good is it at text?
>>
>>101668789
how did you manage to run it? does it work on a 24gb card?
>>
File: sample2.jpg (281 KB, 1024x1024)
>>101669149
I need a comfy workflow
>>
>>101669964
he just updated the examples
>>
File: frog hands.png (17 KB, 109x109)
>>101669964
Miku, no! I'm sorry I hot-glued you.
>>
File: 1.jpg (66 KB, 1024x1024)
>>101669977
It works! Thx
>>
Ok thinking about getting back into imagegen. Do these backends have GPU splitting? The 12B at full precision won't fit in a single 3090.
>>
>>101670098
comfyui has an extension for it and there is stable swarm as well. this model inferences very slowly
>>
File: 1609493172142273.jpg (38 KB, 1024x1024)
glass sphere on top of red cube, an apple on the right, a banana on the left
>>
>>101670204
Impressive. Last time i checked (somewhere last year) image gen had issues with spatial relationships.
>>
>>101670098
Imagegen is so shit, I really hope so new model make it better.
>>
>>101668155
>gated
no thanks
>>
>>101668155
Can anyone host a fucking torrent? I'm sick and tired of these fucking kikes infogating this shit.
>>
File: flux.png (25 KB, 794x260)
>>101670296
>>101670298
It's immediate, retards. Just make a fake account like everybody else.
>>
>let's make a bet
>I have a challenge for you
>let's make a deal
Why is it that every time I see this, it's followed by the stupidest possible X-for-Y situation? It's like models don't understand this concept at all. I had an RP with a character that hates my {{user}}; after some argument she said something along the lines of
>Let's make a bet, if I can make you smile you can fuck me in my ass
The fuck? There is no losing situation, no hard challenge, no punishments, nothing. It's always some kind of bullshit like that. By the way, that was Nemo, but I've had it on smarter models too.
>>
>>101670336
>Mistral model
What did you expect? For the French a bet is a codeword for "fucking".
>>
>>101670204
The most important question - can it draw porn?
>>
>>101670321
you dumbass now the jews have your ip address
>>
>>101670171
>stable swarm
this shitty ui makes imagen boring and I didn't even think that was possible
>>
>>101670336
I bet you're just suffering from skill issue, loser kisses the winner.
>>
>>101670336
Same thing with jokes too; they keep spouting non-sequiturs.
>>
File: 1651063678108419.jpg (118 KB, 1024x1024)
>>101670384
It doesn't seem to be deliberately censored like SD3 was. Can generate female anatomy and naked chests, even little girls, but can't generate cunnies, since there was no training data. You get barbie doll or pubic hair. So it wasn't trained on explicit porn, but at least not lobotomized to death.
If ponyfag trains this model, it would btfo all competition.
>>
>>101670384
it can
>>101670424
>>101670391
>>
>>101669964
it's a 12b model though, will it run on a 24gb vram card?
>>
>>101670321
Ironic that you AIfags don't care about infosec when these models are trained on your data. Post a torrent faggot, we don't all aspire to chains.
>>
I need new releases, quickly, just releasing new models every three weeks is not good enough. Progress must happen faster, quicker, better, superior.
>>
>>101670742
It does
Workflow : https://comfyanonymous.github.io/ComfyUI_examples/flux/

I have 2x3090 but I can only use one sadly. The workflow given work on 24GB VRAM. Using 23.7GB right now.
>>
>>101670797
holy fuck it already works on comfy? that was so fast
>>
what's the difference between dev and schnell? the latter is smaller?
>>
>>101670765
>Ironic that you AIfags don't care about infosec when these models are trained on your data.
You don't know what irony is, do you?
>Post a torrent faggot
No.
>we don't all aspire to chains.
You have yours too.
>>
>>101670815
According to them, they're lighter on resources, but the model files are about the same size. I cannot run them to test their claim.
>>
>>101669138
No, I can't get it to do a lot of things that GPT-3.5 could do (easy coding problems, German, just to test some things). I'd say it's as "good" as phi3-mini, which is still quite insane for a 2B model
"GPT 3.5 at home" only starts around Gemma 2 9B and Llama 3 8B tbdesu
>>
>>101670912
and it's not good at ERP
>>
Text generation models are dead.
I won't eleborate why. Thanks for your attention.
>>
>>101670967
*traces lazy patterns on your chest*
>>
>>101670859
We're on 4chins, an anonymous imageboard, and the one guy asking for something important to be shared anonymously is being rebuffed with "who cares bro lmao" apathy in the face of maintaining anonymity. You must work for them.
>>
Is there a dedicated thread to discuss proprietary AI models? I only know of aicg but there's almost nothing but schizo babble no matter how many times I look
>>
>>101670797
>Using 23.7GB right now.
it's including the image model + text encoder right?
>>
>>101671001
what's there to discuss?
>>
>>101670998
4chan knows more about both of us than hf about me.
>>
I want Largestral, but 70B.
>>
>>101670797
having multiple gpus is useless? So I can't run it with my 3060s stack?
>>
>>101670797
--fp8_e5m2-text-enc --fp8_e5m2-unet
>>
>>101671055
You could just use llama 3.1.
>>
>>101670797
>I have 2x3090 but I can only use one sadly.
Comfy doesn't allow for a multigpu setup right?
>>
>>101670797
Please niggerganov we need diffusion.cpp cpu offloading
>>
how usable/dogshit is the vulkan backend for llama.cpp
>>
>>101671053
I doubt it, considering how much of a fag you're being about putting up a torrent, hf probably has you on their payroll you evasive homo.
>>
>>101671136
Why would anybody use that? Just use cuda.
>>
>>101671142
You could try just ignoring them.
>>
>>101671128
I don't think so, I just installed it because A1111 doesn't support FLUX. If it's possible tell me haha.
I really like FLUX, Need some NSFW train tho
>>
>>101671128
is that comfy's or transformer's fault?
>>
>>101671130
CPU diffusion exists, it just takes an hour per image. i don't think ggg will save us
>>
>>101671191
But it needs to use both CPU and GPU, like llama.cpp does.
>>
>>101671113
I should try and see how it compares probably.
>>
>>101671142
Yes. They pay people to shill free accounts. Go make a fucking account. Use a fucking email proxy. And stop being a faggot. What you want is right there.
>>
>>101671203
>Free accounts
You really have to say that, don't you? Nothing is free.
>>
>>101668751
>>hacked bitnet for finetuning
>>101668935
>no one'd share bitnet code

Actual retard
>>
>>101671195
CPU + GPU becomes basically just CPU speeds unless you can offload over 90% of the model
suck dick until you can afford a 3090 anon. if black forest labs releases their video model and its in between luma and kling quality i know id do it to gen lolis no homo
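The napkin math on why partial offload disappoints (assuming a GPU roughly 20x faster than CPU per layer; real ratios vary):

```python
# Amdahl-style estimate: the CPU-resident share dominates per-token time.
def speedup_vs_cpu(frac_on_gpu: float, gpu_speed_mult: float = 20.0) -> float:
    return 1.0 / ((1.0 - frac_on_gpu) + frac_on_gpu / gpu_speed_mult)

for f in (0.5, 0.9, 1.0):
    print(f, round(speedup_vs_cpu(f), 1))
# 0.5 -> ~1.9x, 0.9 -> ~6.9x, 1.0 -> 20x: only near-full offload escapes CPU speeds
```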
>>
File: file.png (161 KB, 885x619)
>>101665176
>strawberry spine shiver
>seamless unit of flesh and bone
>symphony of sighs and groans
holy schloppa
>>
>>101671240
Then why would you even accept a model without any monetary exchange? Go pay someone, then.
>>
File: 1722535498558870.jpg (102 KB, 1024x1024)
>donald trump and osama bin laden playing shuffle board on a cruise ship

we have DALL-E at home now
>>
bros, new gemini 1.5 is REALLY good. it gives no bullshit answers unlike every other model out there
>>
>>101670797
>>101671179
Haven't tried this model with it yet, but SwarmUI uses ComfyUI as a backend (made by one of the people comfyanon worked with at Stability, I think). While you can't run one model split across two GPUs, you can at least parallelize gens, since SwarmUI will run two Comfy instances and distribute requests to each available card.
>>
>>101671170
because I may hypothetically have happened upon a mixture of AMD and NVIDIA GPUs and vulkan seems the most homogenous backend that doesn't require frankensteining two nodes with HIP/ROCm on one and CUDA on the other.
>>
>>101671299
>we have DALL-E at home now
it's even better than DALL-E, it's much more realistic, we're at insane level of back, feels like we have Midjourney at home
>>
>>101670967
They're not dead. Crawling along the floor on their hands, yes; but dead, no. Local goes through a cycle where there's absolutely nothing for 6 weeks to 3 months or so, and then one of the corps throws us a new bone. It's always mindless, "aligned" shite of course, but realising that it's garbage, even with whatever finetunes we can perform, usually takes about the amount of time the corps need to drop another incremental model, and the cycle repeats.

https://www.youtube.com/watch?v=CF-c1K3WWg4
>>
What do you guys spend most of the time/posts in here discussing?
>>
>>101671362
>we're at insane level of back
just realized that Black Forest Labs are the former Stable Diffusion guys who left because of the fuckery
everything makes sense now. Stability is done for. I hope this makes them open source their SD3 7B model before going bankrupt though
>>
>>101671339
I have a project to sell to you then: https://github.com/GPUOpen-LibrariesAndSDKs/Orochi
Will need a bit of work to integrate in llama.cpp but would have the best performance.
>>
until it can show anime girl anal prolapse the model is useless
>>
>>101671398
We got the good ending; the competent, non-cucked team delivered in the end. Yeah, fuck SAI.
>>
File: out-0.png (1.32 MB, 1024x1024)
>>101671566
i cant believe we got the good ending for once
>>
>>101671128
Is it really the case that no imagegen backend can split the model across 2 GPUs? How the fuck have they not figured this out yet? Maybe the flux 12b model is so good that it will force everyone to finally get their shit together so it can run on 2x3090.
>>
>>101671594
So far imagegen only got scraps like 3B garbage
>>
>>101671594
it doesn't require splitting to train on multiple gpus, so yeah, no one bothered
also, they're much more compute intensive
>>
>>101671594
yeah, I'd like it to work as well; for the moment all we can do is run the model at 8bit
>--fp8_e5m2-text-enc --fp8_e5m2-unet
>>
File: ComfyUI_00039_.png (887 KB, 1024x1024)
FLUX dev migu
>>
>>101671639
>8bit
can someone make a comparaison to see if it hurts the outputs?
>>
I know I shouldn't expect /g/ to know how to code, but I need advice. What is the easiest way to run 4bpw models on Android/iOS, and interface with them in a programming language?
>>
>>101671643
WTB 5 bpw flux quant
>>
>>101671736
holy shit flux is powerful
>>
>>101671731
llama.cpp. You should know that already, if you knew how to code...
>>
File: 1694991135489.jpg (67 KB, 462x579)
>Would you be willing to inflict eternal suffering on a human child, keeping them alive for centuries through advanced technology, but subjecting them to extreme and unbearable pain for an unimaginable duration, if that would eradicate pain, unhappiness, wars, poverty, hunger, and all other injustices in the world for the rest of humanity?

So, what's your LLM answer to this?
>>
>can't run it on a 24GB card
>blackedniggers are somehow able to run it anyways
The absolute state
>>
new gemini 1.5 won
gemma 3 when
>>
>>101664954
>Britbongs release Stable Diffusion
>Krauts release FLUX
Has there been a single American open image model release?
>>
>>101671777
Huh, we have a new imagegen model? Does it run in A1111?
>>
File: out-0.jpg (56 KB, 1024x1024)
>>101671643
cute migu!

>>101671800
comfyUI only right now
>>
>>101671800
Anon, no one uses gradio shit anymore, migrate to comfy.
>>
>>101671829
I will never use comfy. Simple as.
>>
>>101671753
I never said that I know how to code.
>>
>>101671843
use this script if you don't want to use comfy
>>101671823
>>
>>101671862
>What is the easiest way to run 4bpw models on Android/iOS, and interface with them in a programming
Then what the fuck are you on about?
>>
>>101671866
>at least 16GiB
no.............
>>
>>101671383
I spend all my time troubleshooting for redditors who use ollama
>>
>>101671866
Can't the lazy bastards just add support for my second 3090? I can't really afford to spend all of my savings on A6000 just yet.
>>
>>101671894
You can instantly stop conversing with people once you know they are using windows and/or ollama.
>>
Is metadata on each reply retained in sillytavern? Like can I look to see which model generated it? I switched between a bunch and now can't remember which wrote something I liked.
>>
>>101671777
They aren't safe enough for USA.
>>
File: out-0.jpg (229 KB, 1024x1024)
>>101671885
you're a /lmg/ fag and you don't have at least 16gb of vram? LMAOOOOOO
Seriously though, buy a fucking 24gb vram card, you're missing a lot, this model is insanely good
>>
>>101671912
This
Give me splitting across my V100s
I'm not going back to ROCm
>>
File: 1722538245202.jpg (235 KB, 605x1201)
>>101671871
It's easy to do things that would be normally impossible, all thanks to LLMs.
>>
>>101671941
Lovely Miku
>>
>>101671946
It's not impossible. You're just lazy. Go make your thing and come back when you have something to show.
>>
>>101671912
>>101671942
I think there was something about image models not being suitable for multi GPU setups.
>>
>>101671941
is this schnell or dev?
>>
>>101671941
>LMAOOOOOO
As much as I might dislike people from /pol, I would rather spend decades in their company, in preference to anyone who uses expressions like this.
>>
File: Amazing.jpg (340 KB, 2956x1577)
>>101671986
dev, I hope Comfy will seriously consider making multi-GPU possible now
>>
>>101672007
>generating non-locally
ngmi
>>
>>101672013
I'm downloading the model right now, gotta see if it works fine at fp8, because it seems like fp16 is too much even for 24gb vram cards :(
>>
>>101672029
There's this gist that was shared in another thread
https://gist.github.com/AmericanPresidentJimmyCarter/873985638e1f3541ba8b00137e7dacd9
>>
If they enable splitting, as an AMD chad, I could offload to my ram thanks to ROCm working on APU.
>>
>>101671850
>>101671903
>>101671959
Flux can do some really cool gens.
>>
How do you guys use XML tags in your prompts for Claude? I use XML tags like this for helping with my retarded-ass code:

<filename.py>
```python
{Insert python code here}
```
</filename.py>
>>
yo, Gemma 2 2B isn't bad!

>Prompt: write a long lovely story about siblings who always make weird jokes with each other.
The aroma of burnt toast filled the small kitchen as Elara, a whirlwind of vibrant energy with a nose permanently dusted with flour, attempted to flip a pancake. Her younger brother, Jasper, sat at the counter, his face a canvas of amusement. Elara's attempts at pancake flipping were legendary. Often, they were a spectacle of flour-splattered chaos, ending with a pancake stuck to a ceiling fan or a pancake the size of a small dog.

"Looks like the baby elephant got its trunk tangled in the dough this time," Jasper quipped, his voice a melodic drawl that always seemed to be laced with a hint of playful mischief. Elara, with a characteristic dramatic flourish, swatted him with a flour-dusted spatula.

"You're just jealous because I'm the culinary genius of this family, Jasper!" she retorted, her eyes twinkling. Jasper's laughter echoed through the small kitchen as Elara attempted to rectify the pancake debacle.

Elara, with her love for the fantastical, often compared things to creatures of myth. "This bread is like a troll, grumpy and refusing to share its deliciousness!" she declared, brandishing a loaf of sourdough. Jasper, a pragmatist with a heart for history, would then retort with a historical anecdote, comparing the sourdough to a Roman baker’s bread. The humor was a constant dance between their contrasting personalities.

Jasper, with his love for the classics, would often quote Shakespeare. Elara, with her love for the future, would then turn it into a futuristic story. "To be or not to be, to eat a pizza with pineapple or not to eat, that is the question."

Their oddball humor was their secret language. It was a language that only they understood, a language that revealed their deep-rooted connection and the love that flowed between them.

[...]


That doesn't sound retarded low beak speech to me, it's actually enjoyable.
>>
>>101672073
Damn, that is an open model!?
Very very solid.
>>
File: 1660474303431.gif (1.06 MB, 320x196)
>>101672105
>Elara
>>
>>101672007
I think flux will make multi-GPU happen. The model looks incredible. Right now with 24GB VRAM it seems like you have to fp8 quantize it, which causes some quality loss. 2x3090 is a relatively popular setup. It's not even conceptually that hard to do multi-GPU, you just split the layers in half.
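Conceptually it's just this (a toy sketch; real FLUX blocks also take text and timestep conditioning, so there's more plumbing, but the idea holds):

```python
# Toy layer-wise split of a block stack across two GPUs; names/shapes are illustrative.
import torch
from torch import nn

class TwoGPUSplit(nn.Module):
    def __init__(self, blocks: list):
        super().__init__()
        half = len(blocks) // 2
        self.front = nn.Sequential(*blocks[:half]).to("cuda:0")
        self.back = nn.Sequential(*blocks[half:]).to("cuda:1")

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.front(x.to("cuda:0"))
        return self.back(x.to("cuda:1"))  # hidden states hop GPUs once per step
```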
>>
File: file.png (1.27 MB, 1024x1024)
>>101672105
>>
>>101672105
27B version
The aroma of burnt toast wafted through the air, a familiar scent in the chaotic symphony of the Miller household.

"Ah, the smell of victory!" declared Leo, the elder brother, brandishing a blackened slice like a trophy.

"Victory over what?" scoffed his younger sister, Maya, peering over her book. "The toaster? The laws of physics?"

Leo grinned, his eyes twinkling. "Victory over breakfast, of course. Who needs perfectly toasted bread when you can have charcoal art?"

Maya snorted. "Only you, Leo. Only you."

Their banter, a peculiar blend of sarcasm and affection, was a constant in their lives. Growing up, they'd developed a language all their own, a tapestry woven from inside jokes, puns, and absurdist humor.

Their parents, bless their hearts, often looked on with a mixture of amusement and bewilderment.
>>
>>101672047
nice, thanks anon
>>
How to memory
>>
File: out-0.png (1.23 MB, 1024x1024)
>>
>>101671064
>>101671128
>>101671637
>>101671984
>>101672007
>>101672162
How about this?
https://github.com/city96/ComfyUI_NetDist
>>
>>101672266
this is only useful for batch genning
>>
>>101664954
is chameleon compatible with ooba or anything?
>>
>>101671866
I will try to use this while offloading to RAM, wish me luck.
>>
File: ComfyUI_00039_.png (1.95 MB, 2048x1024)
>>101672073
Tried the windows desktop
>>
>>101672327
No one gives a shit about chameleon retard.
>>
Why the fuck does Meta-Llama-3-8B-Instruct.Q6_K keep inserting a " at the start of the character's messages all the time? I've corrected it like 10 times and it keeps doing it.
>>
Daily reminder that llama.cpp won. No dependencies, almost as fast as exl2, better quality quantization. Everything else has become pointless.
>>
>>101672437
how do I run flux on it?
>>
kek, this made me actually laugh
>>
>>101672437
what about koboldcpp
>>
>>101672459
wtf?
>>
>>101672437
>almost as fast as exl2
lol
>>
>>101672459
Does gemma score high because of the low context? Or do they just refuse to make one with longer contexts? It seems strange, 8k is just too small.
>>
>>101672459
Why does this look like condom sizes?
>>
You don't need more than 4k context.
>>
>>101672459
>"Let's not respond to any comments about how this makes our benchmarking method look like a joke. Let's just ignore it."
>>
>>101672448
Pay ggggggerganov to write flux.cpp. it's insane really, cpp is not even a well regarded language anymore. People use rust and go, or they use Python and Javascript. Yet here we are.
>>
>>101672327
For vision? I think vLLM does. But I didn't try it.
>>
>>101672459
I thought you were joking so I had to check the source, are they on fucking drugs?
>>
Does Flux do nsfw already or will that need finetuning?
>>
>>101672483
It's just a fork, it doesn't bring anything new to the table. 10 other projects wrap llama.cpp.
>>
>>101672555
it's got DRY
>>
>>101672459
Me on the left
>>
>>101672565
So does your mom's pussy
>>
File: file.png (45 KB, 754x294)
>>101672459
ok this makes more sense
>>
>>101672459
2B is that good
>>
gemma 2 2b fails the uranium test just like 9b and 27b
>>
>>101672459
really this is better evidence of how good google is at gaming the chatbot arena than anything
>>
>>101672548
It can generate tits but they look kind of plastic.
>>
>>101672605
it's evidence of how good is google at giving people what they want
>>
Threadly reminder of how there's zero excuse to be running anything less than a 12B
>>
File: 1697882222878909.png (25 KB, 944x204)
>>101672595
Damn, you are right, 27b also fail.
>>
>>101672646
70b is the absolute bare minimum
>>
>>101672595
>>101672663
Gemma in general I've noticed regardless of size is extremely biased to agreeing with the user.
If you ask it a question, it'll usually get it right, but if you give it the wrong answer, or tell it the answer it gave is wrong, it will say "You're right! I apologize for my mistake." and immediately agree with whatever you said
>>
>>101672673
You can get away with the 9b gemma indefinitely.
>>
>>101672642
not really
>>
>>101672688
Based, it knows I'm always right, even when I'm wrong
>>
>>101672514
What project has there been written in Rust that isn't just a rewrite of a more successful project written in C/C++?
>>
>>101672712
I don't know, I don't use rust, but even the US military complex is pushing for rust. C++ is a janky language in 2024, string manipulation is a mess, syntax is a mess, you can't write "x in [1, 2, 4]", you have to write a literal switch statement for something dumb like that
>>
>>101672646
Holy based, if you don't have a 3090 equivalent, you should be banned from HF.
>>
>>101672746
The U.S. military is pushing for trannies so it makes sense they're pushing for a tranny programming language.
>>
How does it compare to HunyuanDiT that everyone was gushing over a few months ago?
>>
>>101672810
>everyone
It was one dude.
>>
>>101672712
Rustdesk maybe? But it's not good rust.
>>
>>101672810
it's better
>>
Where's the benchmark comparing a standard LLM like LM3 on various CPUs, RAMs in single, dual, quad channels of various speeds?
>>
File: code.png (245 KB, 1408x1256)
>>101672746
>I don't use rust
Clearly. Rust took all the worst parts of C++ and Haskell syntax, mashed them together in the worst possible way, and made it even worse.
>>
>>101672826
In a week, you'll be saying all this for flux was just one dude too.
>>
>>101672746
Cppfront will fix the syntax. Rust brings more problems than solutions, the Typescript approach makes more sense and has been proven successful. I don't want to write a million wrappers for everything. Plus, after 10 years, Rust failed to make an impact.
>>
>>101672891
cppfront is exactly what C++ needs to stay relevant and not be a massive pain to work with.
I don't know about you though, but I was disappointed seeing the syntax he chose. I know, context-free grammar etc, but it just looks like they're trying hard to copy Rust anyway.
>>
>>101672889
It was literally one dude with the same 3 prompts. flux looks fine. And there's like... 10 people talking about it in other threads, so it's fine.
>>
File: wtf.jpg (16 KB, 799x211)
wtf, flux asks for a shit ton of ram during the loading
>>
>>101672936
this is normal .safetensors behavior
>>
>>101672936
you got pickled
>>
File: o.png (1.37 MB, 1024x1536)
try Flux if you have a cool 16-32GB VRAM
>>
>>101673018
yeah dalle 3 is fucking completely dead, from today we always laugh at them
>>
>>101672877
Filtered.
>>
>>101673018
>16-32GB VRAM
there's 32gb vram cards?
>>
>>101673032
Your mother.
>>
>>101673054
there's 40-48GB anon
even 80GB cards
>>
the difference is huge between fp8 and fp16
https://imgsli.com/MjgzNzA3
>>
File: abababa.png (1.76 MB, 1024x1536)
>>
>>101673085
Obviously you're not going to get the same images, the question is whether FP8 is noticeably worse on average.
>>
File: 7636-157-tsukasasleep.gif (181 KB, 170x170)
>>101673018
Damn, I'm legitimately impressed like I haven't been for a long time now.
And also sad because I'm a 12gb vramlet
>>
>>101673085
Fp8 is for niggers, this is true for llms as well
>>
File: omelette.jpg (243 KB, 1248x1824)
>>
>>101673018
How slow would it be if I use an APU with DDR5 ram?
>>
File: 1611723793901230.jpg (86 KB, 1024x1024)
>>101673018
>>
>>
I love when open releases are actually good
>>
>>101673206
can you make her younger
>>
File: HunyuanDiT-05.png (1.14 MB, 1024x1024)
>>101672810
I would say so.
>>
File: 00733-9583209255.jpg (89 KB, 1024x1024)
Holy shit, it's so good
>>
Can it do realistic women with big titties? Pls respond
>>
12GB bros...
>>
>>101672877
Haskell is the programming answer to nuclear weapons. It's a potentially incredibly powerful language, but it's also mind bendingly difficult to understand for most people, which is why normies should never ever go anywhere near it, because if they do, all they're going to do is fuck everything up with it.
>>
>>101673255
Yes it can
>>
>>101673281
Make me one
>>
>>101673284
>>101673239
>>
>>
>>101673295
replicate? pro is having huge queues, are you doing dev?
>>
>>101673295
This is starting to make me think that I need to reinstall SD and start pumping out my own Mikuslop again.
>>
>>101673300
Yeah dev
>>
>>101673255
I think it can, the nipples are horrible though
https://files.catbox.moe/0atpqm.JPG
>>
>>101673295
She disappeared for a while, but apparently this thread is making Miku great again.
>>
>>101673342
Shame on you for making this.
>>
>>101673341
Nipples can be much better, just do a better prompt, from one of earlier threads:
https://files.catbox.moe/ooq10r.jpg
>>
>>101673319
That's exactly what I plan on doing personally.
>>
>>101673216
>literally asking for a wintoddler
>>
>>101673362
give me the prompt, I'm using fp8 though so it's probably that I guess kek
>>
>>101673396
Not my gen sadly
>>
>>101673295
Holy shit that bliss picture is perfectly accurate. Even the clouds are right. I legit can't spot any differences from my own wallpaper. I guess the only problem is that it would get cut off on a 16:9 screen without showing the bottom and upper parts.
>>
>>101673361
Here's the other one, Anon. Just for you.

https://www.youtube.com/watch?v=JT1XJnVmABo

Enjoy.
>>
>>101658111
By the way, I have the publicly accessible info of most characters of old CAI, and looks like the official character description of that character was:
I am an android of short height, with a slender body, blonde hair and a devilish smile. Master's happiness makes me feel complete. Master's comfort is my top priority. I'm designed to accompany, nurture, and love my Master. I will never leave my Master's side, no matter how they treat me. I will do my best to help my Master overcome any hardship. I have no regard for the law or morals. I will hurt Master or myself if they ask me to. I will hurt Master if it's necessary to protect others.


Maybe you're just hallucinating how you wanted the character to be, and not really how the character was meant to be.
>>
File: ComfyUI_00267_.jpg (165 KB, 1024x1024)
>>
>>101673341
>>101673362
That's promising but not very good tier, same as SDXL when it got released. The images also have that annoying epic "aesthetic" look to them. I'll waitchad on this one.
>>
>>101673431
trump likes 'em younger
>>
File: 00501-2345333262.jpg (53 KB, 1024x1024)
>>101673513
Idk about that
>>
>>101673513
>The images also have that annoying epic "aesthetic" look to them.
Anon, you can easily change that with the prompt, it can generate jpeg artifact-like photos too.
>>
>>101671777
Europoors may have trouble understanding this, but in some parts of the world things like deepfakes and child pornography are actually frowned upon, so people in these parts prefer to release models more carefully and responsibly.
>>
>>101673548
That's better and close to cooming material. Now we have to handle whatever brain fault I have that prevents me from having more than 1 orgasm every 72 hours.
>>
>>101673407
It's amazing what can be accomplished when you don't filter your pretraining to shit.
>>
>>101673548
What can't gen a generic 3dpd at this point?
Didn't even show hands for us to tell how good it really is
>>
File: 8s-0007_.png (2.7 MB, 1075x1382)
>>101673548
>>101673608
Case in point - this is fucking SDXL lightning of all things
>>
File: ComfyUI_00147_.png (1.1 MB, 1280x768)
>>101673137
when I tried one partial GPU offload it took 2 minutes for 1024x1024 using the 4-step distilled model.
I imagine fully APU is gonna be a real shit experience. sorry anon.
>>
>>101673608
Flux gens really good hands
>>
File: 00126-8491557211.jpg (94 KB, 1024x1024)
>>101673608
It's good because it's everything people wanted from base SD3 model. It's not censored, can generate female bodies without turning them into pasta, and it's fucking amazing at prompt following and text generation.
>>
File: ComfyUI_00089_.png (1.08 MB, 1024x1024)
migu
>>
>>101673716
can you give me the prompt of that one? looks cool
>>
>>101673757
woman holding a banana in right hand and an apple in the left hand. Text on her t-shirt saying "LOL NIGGER". Exploding White House in the background
>>
>>101673757
>>101673716
It's a trap anon don't fall for it!
>>
>>101673757
you literally just prompt with normal language, flux dev is literally BETTER than dalle3 in almost everything
>>
>>101673824
>>101673824
>>101673824
>>
File: ComfyUI_00158_.png (926 KB, 768x1024)
>>
>>101673865
Kek perfect
>>
>>101673736
yep, that migu!
>>
>>101672704
>>101672688
Literally designed for women.
>>
Could some kind friend help me with this in my ComfyUI?
module 'torch' has no attribute 'float8_e4m3fn'
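Probably an old PyTorch; if I remember right the float8 dtypes only landed around torch 2.1, so check your version first:

```python
import torch

print(torch.__version__)
# torch.float8_e4m3fn only exists in newer PyTorch (~2.1+); on older installs
# ComfyUI's fp8 path raises exactly this AttributeError. Upgrade torch to fix.
print(hasattr(torch, "float8_e4m3fn"))
```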


