/g/ - Technology

File: exaonni.png (32 KB, 280x512)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103454262 & >>103441325

►News
>(12/09) LG releases EXAONE-3.5: https://hf.co/LGAI-EXAONE/EXAONE-3.5-32B-Instruct
>(12/06) Microsoft releases TRELLIS, a large 3D asset generation model: https://github.com/Microsoft/TRELLIS
>(12/06) Qwen2-VL released: https://huggingface.co/Qwen/Qwen2-VL-72B
>(12/06) InternVL2.5 released: https://huggingface.co/OpenGVLab/InternVL2_5-78B
>(12/06) Meta releases Llama-3.3-70B-Instruct: https://hf.co/meta-llama/Llama-3.3-70B-Instruct
>(12/05) PaliGemma 2: https://hf.co/collections/google/paligemma-2-release-67500e1e1dbfdd4dee27ba48

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: 1713876669823769.png (797 KB, 1080x855)
►Recent Highlights from the Previous Thread: >>103454262

--Paper: Direct Quantized Training of Language Models with Stochastic Rounding:
>103456900 >103461406
--Papers:
>103456295
--LGAI-EXAONE's new 32B model and its performance:
>103455467 >103455497 >103455502 >103455511 >103455794 >103455828 >103456592 >103456612 >103457299 >103456965 >103460764 >103461368
--BitNet and LLM optimization discussion:
>103459347 >103459451 >103459492 >103459435 >103459453 >103459455 >103459465 >103459488
--Alignment and its impact on AI model performance:
>103459721 >103459731 >103459740 >103459752 >103459766 >103459792
--Anons roast suspicious eBay listing for 3090TI Equivalent GPU:
>103460689 >103460729 >103460732 >103460832 >103460857 >103461082
--Anon questions EXAONE 3.5 benchmark results:
>103456756 >103456802 >103456824 >103456842
--Anon tests EXAONE 32B Nala model, finds it adequate but in need of smut finetuning:
>103461636 >103461721
--Anon rants about llama.cpp and ollama's template implementations:
>103461046 >103461064
--ChatGPT video creation plans discussion:
>103461901 >103461930 >103461938 >103461961 >103462037 >103462225
--Anons react to Google CEO's statement on AI development slowdown:
>103458956 >103458979 >103459216 >103459239
--Model performance on understanding humor, specifically Sneed's feed and seed joke:
>103455829 >103455841 >103455935 >103455955 >103455977 >103456680 >103460346
--Ollama's GPU support and AVX2 requirement:
>103454555 >103454574 >103454586 >103457984 >103457882
--Tesla's custom networking protocol reduces latency for AI training networks:
>103457220 >103457252 >103457461
--Improving logic problem solving in AI models like GPT-4:
>103461727 >103461968
--Anon discusses building a rig for 70B model with multiple GPUs:
>103456184 >103458189
--Miku (free space):
>103454748 >103455919 >103456701 >103461425

►Recent Highlight Posts from the Previous Thread: >>103454264

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
Saltman can barely edge against the chinks
>>
>>103462673
he's edging to his uncensored sora version
>>
File: 1708862804.jpg (46 KB, 612x597)
>>
File: 1733772298478610.mp4 (3.16 MB, 854x480)
An uncannily realistic video created by Soraberry AGI.
>>
>>103462753
3x skull emoji
>>
Photo-realistic wildlife generated by Sora ASI.
>>
>>103462819
Why does it feel like it's the same model from February? They waited 9 months and didn't improve anything.
>>
A near perfect simulation of an apple being shot into a black hole, created by the conscious Sora world model.
>>
File: 2024-12-09_11-42-44.png (22 KB, 806x635)
>>103458487 to rpgmaker anon
sorry, i fell asleep and didn't respond. pic related. there isn't much to the game, it's just a minimal barebones test: after clicking start you can click 1-8 to place "towers" and then right click on them to "upgrade", which is indicated by the white dots around them. i was mainly surprised by all the math it did to accomplish this, though the fact it worked at all also impressed me. whenever i tried this sort of stuff with deepseek 2/2.5 i would always get some error every few attempts; here, never, except when it uses graphics.polygon and either passes fewer than 2 positions or does some other thing i forgot

i'm a severe vramlet and ramlet, so trying qwen 32 is off the table. that said, i could go to openrouter or something, but my philosophy is "virus? eh, doesn't matter, i will factory reset my shit soon", so i would rather not. another reason i'm hyped for r-1 is that the main thing for me currently is context above all else, that is what fucks me, and if r-1 manages to be like half the size of deepseek all in all then i can cpumaxx and have speed and 100k+ context
regarding creativity, i tried it once, i have the screenshot somewhere. something along the lines of "describe a wolfgirl's tail as best as you can with no slop or cliches". it took a few steps of refinement but i got an extremely good output, on par with the best i've read anywhere, though it was in the thinking part, not the final output. again, my perspective is severely limited by my hardware, but i think it could be tuned to be good
the main thing that r-1 represents to me is as another anon put it
>slow and bad
>fast and bad
>slow but good
>fast and good
move from slow and good to fast and good
>>
>>103462753
>>103462819
>>103462854
And people said OpenAI didn't have any talent left
>>
miku confirmed for fortnite
local models are saved
>>
Hatsune Miku is a worthless shitfu without any character.
>>
>>103462753
>>103462819
>>103462854
imagine paying for it
>>
>>103462854
Shit prompt in - shit result out. Or it's [D]ifferent this time? ;)
>>
>>103462886
>And people said OpenAI didn't have any talent left
lmao
>>
>>103462838
They needed that time to censor antisemitism and fox girls bro.
>>
>>103462948
>miku in zoomer shooter
lol what?
>>
>>103462954
A blank slate. Tabula rasa. The default anime girl. The test prompt. You can assign any characteristics to her and they will stick.
>>
>>103462854
looks like a monty python's animation unironically
>>
>3 more hours until Monday ends in France
Ok, I am prewarming up the doom.
>>
>>103463028
The doom? Nothing was announced. Nothing was expected by the sane people (not you). It is as if you are looking for a reason to screech.
>>
>>103463059
Anon started it.
>>
>>103463028
>>103463059
8x12 nemo moe would've been sick though
>>
lol
>>103458798
>a 2.4b will still need a couple mins to process a site before posting a response if it even fits into the context.
>>
>>103463103
That's a hardware issue right there.
>>
>>103463099
>nemo
I have been trying different flavours of the week for the past month and I went back to nemo now. It is pretty incredible how clearly I can see that there is at least some ERP training data in nemo, unlike the rest. If only it was smarter and didn't fall apart after 10k ctx...
>>
>>103463099
It would be 76B, perfect for non-ramlets, but it would get mogged by a dense nemo of the same size.
>>
>>103463174
Yeah, but medium is deprecated; nothing around that size is in the works. Unless...
>>
>>103462854
That's impressive even if it makes no sense
>>
>>103462819
Mochi levels... IT'S OVER, sam friends
>>
File: 1730501225711.webm (3.94 MB, 960x960)
>>103463211
How is that impressive compared to the chinese models that have been available for months now?
>>
>>103463174
>>103463099
>MoE
I think a 16x7B would be better, but it needs to be trained so that the experts specialize around subject areas like coding, math, etc., such that you could assign weight or priority when quanting and offloading to the experts that are used the most for one's particular use case. Maybe you could even prune, though 16 experts isn't a very large amount, so it probably would not be great even if trained for domain specialization.
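The bookkeeping side of that is trivial once you have usage counts; a toy sketch (illustrative only: the counts are made up, and real stats would have to come from instrumenting the router):

import numpy as np

# Rank experts by how often the router fires them on YOUR prompts,
# keep the hot ones on GPU, offload or quantize the rest harder.
usage = np.array([9120, 310, 8770, 120, 55, 7600, 420, 15,
                  6900, 80, 300, 5100, 60, 33, 2100, 12])  # 16 experts
vram_slots = 6  # how many experts fit in VRAM

order = np.argsort(usage)[::-1]  # most-used first
keep_on_gpu = sorted(order[:vram_slots].tolist())
offload = sorted(order[vram_slots:].tolist())
print("GPU experts:", keep_on_gpu)
print("offloaded/heavily quantized experts:", offload)

The hard part is the training-side constraint of making the experts actually specialize, not this selection step.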
>>
>>103463240
China won. No debate.
>>
>>103463240
I'm more just surprised at how badly they fucked it up. Sora was supposed to be their secret weapon, the technology that would reclaim the race for them. What the fuck went wrong?
>>
>>103463240
Why is the west so fucking cucked and the Chinese get to be based, god. FUCK.
>>
dolphin-llama3.
lewd.
>>
>>103463297
I just realized that there hasn't been a new dolphin model since August.
>>
>>103462620
how come this general never posts logs? i was kinda wanting to measure how good local actually is but no one posts any logs
>>
File: 5lodis_webp_92.jpg (47 KB, 472x471)
Local language models?
>>
>>103463536
privacy schizo general
>>
>>103463536
People realized that things would get better but slowly and put the hobby down for another day so that they could put time into the things that are more exciting right now, such as image and video gen.
>>
>>103463544
local openai gossip general
>>
>>103463536
Privacy, bitch. Nobody wants to share their cringy "ahh ahh mistress" or worse.
>>
>>103463563
/ldg/ won
>>
>>103463536
My fetishes are very embarrassing and people would rightly make fun of me if I posted logs. I assume it's the same for many here.
>>
>>103463536
Because no one actually cares about local models, it's all about the "us vs them" delusion.
>>
>exaone
>[|user|] and [|assistant|] are not actually special tokens, and use up 5 and 6 tokens respectively
...
>>
uhm, MOATSISTERS???? how are we coping?

>qwq better than o1
>uncensored chink video model mogs sora
>flux is open sores
what's left?
>>
>>103463698
Love when they do that.
Mistral's first releases were like that too.
>>
Sora mogged local. I am going to sell my gpus
>>
File: file.png (99 KB, 776x614)
ITT: People still using LLM 1.0 deprecated tech.

>LLM 2.0 has been brewing for a long time. Now it is becoming mainstream and replacing LLM 1.0, for its ability to deliver better ROI to enterprise customers, at a much lower cost. Much of the past resistance towards its adoption lied in one question: how can you possibly do better with no training, no GPU, and zero parameter? It is as if everyone believed that multi-billion parameter models are mandatory, due to a long tradition.
>>
>>103462819
Fakest-ass video I've ever seen, the water didn't reflect or move.
>>
>>103463812
i thought LLM tech was so gay and retarded and regressive it didn't deserve a sequel, lecunny told us so
could this be real chat?
>>
File: 1730501225717.png (134 KB, 865x836)
Upcoming Anthracite competitors focusing on VRAMlets?
>>
>>103463915
>and high end mobile devices
Wow, nice, RPing with 7Bs will be so fun.
>>
>>103463812
>zero parameter
This marketing bullshit makes me instantly think it is a scam, or at least some kind of dumb thing that worked for one use case, except if you try to reproduce it, it probably didn't even do that well.
>>
>>103463915
>using the tools available to our team
>Half an epoch of training before you actually do anything to the weights including completely destroying the model of course
>No validation method.
I smell ko-fi.
>>
File: SAFESEX.png (4 KB, 320x200)
>>103463282
Corporate entities would rather sterilize society than risk their "corporate image" by allowing porn.
>>
>>103463932
A future where 7B models are actually smart would be cool. But I worry that it's physically impossible due to information theory (as in, you simply cannot cram the required information and structure into that amount of FP16 numbers, no matter how cleverly or efficiently you do it)
>>
>>103464364
>as in, you simply cannot cram the required information and structure into that amount of FP16 numbers
>>103463812
>no training, no GPU, and zero parameter
>>
>>103463812
I smell overhyped RAG
>>
>>103463812
>It is as if everyone believed that multi-billion parameter models are mandatory, due to a long tradition.
>>
>>103464416
>So, what is behind the scenes, how different is it compared to LLM 1.0 (GPT and the likes), how can it be hallucination-free, what makes it a game changer, how did it eliminate prompt engineering, how does it handle knowledge graphs without neural networks, and what are the other benefits?

>In a nutshell, the performance is due to building a robust architecture from the ground up and at every step, offering far more than a prompt box, relying on home-made technology rather than faulty Python libraries, and designed by enterprise and tech visionaries for enterprise users.

>Contextual smart crawling to retrieve underlying taxonomies, augmented taxonomies, long contextual multi-tokens, real-time fine-tunning, increased security, LLM router with specialized sub-LLMs, an in-memory database architecture of its own to efficiently handle sparsity in keyword associations, contextual backend tables, agents built on the backend, mapping between prompt and corpus keywords, customized PMI rather than cosine similarity, variable-length embeddings, and the scoring engine (the new “PageRank” of LLMs) returning results along with the relevancy scores, are but a few of the differentiators.

RAG + snake oil.
>>
File: mmmma.jpg (157 KB, 1216x832)
https://files.catbox.moe/833nwa.jpg
>>
>>103463722
QwQ is nowhere near as good as o1. o1 can solve most first-year undergraduate engineering exercises (sometimes it takes a few tries; o1-pro has solved them all correctly except one, which isn't really an undergrad topic).
For example, ask it this question:
We have a boat out at sea and a helicopter that's flying toward it.
The speed of the helicopter is 100 m/s and the speed of sound is 343 m/s.
The boat fires two flares 1 second apart, and they are heard by the helicopter 0.8 seconds apart.
What was the speed of the boat?
The right answer is 11.4 m/s away from the helicopter.
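For anyone checking the arithmetic, one consistent reading is a plain Doppler interval problem: with the helicopter (observer) approaching at u and the boat (source) receding at v, the interval between receptions scales as

\[ \frac{\Delta t_r}{\Delta t_e} = \frac{c + v}{c + u} \]

so

\[ 0.8 = \frac{343 + v}{343 + 100} \implies v = 0.8 \cdot 443 - 343 = 11.4 \ \text{m/s, receding.} \]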
>>
>>103464569
>m-m-muh reddit puzzles
>le overcooked benchmark questions
go to sleep sama, get ready for the 4th L tomorrow
>>
Is there any way to have a voice chat that allows the user to interrupt the LLM mid-output (by talking over it)?
>>
>>103464596
Uh oh! Disingenuous coward meltdown!
>>
>>103464569
https://pastebin.com/HnEXCsR9
>>
File: 124412235457865.png (51 KB, 834x1110)
>>103464569
Soon.
>>
>>103464600
Requires a model natively trained on voice instead of relying on adapter hacks or TTS pipelines. Pretty sure LLaMA-Omni 8B is the only natively trained multimodal local LLM capable of that
>>
>>103464596
It's not a Reddit puzzle or an overfitted question.
It's a question translated from Spanish from a physics 1 exam I took in July.
Here's another one from my discrete math 1 course:
\textbf{4.} Calculate the number of words that can be formed using all the letters of the word \textbf{TERRATENIENTE} that simultaneously satisfy the following conditions:
\begin{itemize}
\item They contain the pattern \textbf{RR}, and
\item They contain all of the \textbf{E} vowels in their original positions.
\end{itemize}

The right answer is 2100.
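For anyone checking: TERRATENIENTE is 13 letters (T x3, E x4, R x2, N x2, A, I). Fixing the four E's at their original positions 2, 7, 10, 13 leaves the free slots {1, 3, 4, 5, 6, 8, 9, 11, 12}. The RR block needs two adjacent free slots, and there are exactly 5 such pairs: (3,4), (4,5), (5,6), (8,9), (11,12). The remaining 7 slots take T, T, T, A, N, N, I in \( 7!/(3!\,2!) = 420 \) ways, giving \( 5 \cdot 420 = 2100 \).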
I can come up with dozens of questions like these, in which only o1-mini and up manage to get them right.
I am sorry you are such a dumb faggot that you never took a STEM course you could get your own questions from.
>>
>>103464689
I disagree. It should be fairly trivial to detect the user speaking, and then stop the output and start the speech to text.
It's barely above a noisegate. The biggest challenge is separating the computer's own output from the user's output, which would require some DSP to match and subtract the output waveform from the input. If the user is using headphones, then it's much easier and it's literally just a noisegate (unless you want to add additional features to block out street noise etc.)
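The gate half of that really is a few lines; a minimal sketch (thresholds and frame size are made-up values you'd tune, and echo cancellation is deliberately not included):

import numpy as np

# Energy-gate voice activity detection over 16 kHz mono float32 audio.
SAMPLE_RATE = 16000
FRAME_LEN = SAMPLE_RATE * 30 // 1000  # 30 ms frames
OPEN_DB = -35.0   # gate opens above this level
CLOSE_DB = -45.0  # gate closes below this level (hysteresis)

def frame_dbfs(frame: np.ndarray) -> float:
    rms = np.sqrt(np.mean(np.square(frame)) + 1e-12)
    return 20.0 * np.log10(rms)

def detect_speech(audio: np.ndarray) -> list:
    """Return (start_s, end_s) spans where the gate was open."""
    spans, open_at = [], None
    for i in range(0, len(audio) - FRAME_LEN, FRAME_LEN):
        level = frame_dbfs(audio[i:i + FRAME_LEN])
        t = i / SAMPLE_RATE
        if open_at is None and level > OPEN_DB:
            open_at = t  # user started talking: interrupt TTS here
        elif open_at is not None and level < CLOSE_DB:
            spans.append((open_at, t))
            open_at = None
    if open_at is not None:
        spans.append((open_at, len(audio) / SAMPLE_RATE))
    return spans

# Synthetic check: 1 s near-silence, 1 s loud "speech", 1 s near-silence.
rng = np.random.default_rng(0)
quiet = rng.normal(0, 0.001, SAMPLE_RATE).astype(np.float32)
loud = rng.normal(0, 0.1, SAMPLE_RATE).astype(np.float32)
print(detect_speech(np.concatenate([quiet, loud, quiet])))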
>>
nta, but o1 failed this for me
https://pastebin.com/cexBmjB2
>>
File: file.png (128 KB, 640x500)
>>103464778
>DSP
Why would you need him for anything?
>>
>>103464808
meant for >>103464706
>>
>>103464569
It depends on the type of mistake, me thinks
Does it make clear logical errors? Or does it fail to crunch numbers correctly? Anything involving arithmetic or letter level operations is a fundamental problem LLMs can only fake their way past. If the logic and symbolic manipulation is wrong, that's another thing
>>
>>103464680
Is that QwQ? If so pretty good.
In my first test with QwQ it didn't manage to get it; it got it on the second try, though.
So maybe it's somewhere around o1-mini level.
Here is a harder one:
Calculate the chromatic polynomial of the complete bipartite graph K_2,3
The right answer is
P(K_{2,3}) = t^5 - 6t^4 + 15t^3 - 17t^2 + 7t
QwQ got it wrong when I tried it.
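The derivation is short if you condition on whether the two vertices on the 2-side share a color: same color gives t choices there and t-1 for each of the three vertices opposite; different colors give t(t-1) and then t-2 each, so

\[ P(K_{2,3}, t) = t(t-1)^3 + t(t-1)(t-2)^3 = t^5 - 6t^4 + 15t^3 - 17t^2 + 7t. \]

Sanity check: P(2) = 2, the two proper 2-colorings of a connected bipartite graph.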
>>
>>103464924
DeepSeek R1, see the whale icon in pic related
>>
>>103464778
If it were so trivial, implementations would already exist, but they don't
>>
File: terrateniente.png (53 KB, 770x527)
>>103464820
Yeah, o1 gets it right like 1 out of 3 times. o1-pro seems to get it right consistently.
>>
>>103462819
This is uncanny as fuck.
>>
>>103464778
That's how you end up with a solution that's susceptible to noise and any little bump in the background
The challenge isn't detecting it, it's detecting only voices, and doing so in real time. That shit is surprisingly difficult. Whisper can give you a "no_speech" probability, but obviously you need to run it over the input first, which takes time
>>
>>103464993
QwQ got it (even though it converted to English at the end, kek)
>>
>>103465016
Better than nothing.

>>103464972
It doesn't exist because open source frontends are developed by a tiny group of people.
Ok, besides the interrupting thing, is there any frontend that has voice out of the box without having to mess with python scripts and shit to roll your own?
>>
Anyway, I think the TTS route is better than end-to-end speech.
Yeah, end-to-end speech might be flashy because it can do accents and shit, but then you're limited to a small number of models, which tend to be on the dumber side.
I bought ChatGPT Pro for Advanced Voice Mode but after many hours of testing I realized 4o is just too dumb to be useful for half of the things I want to use it for, and it makes me waste time while I realize it's being too dumb and I have to switch to a text model.
>>
>>103464569
qwq is also a preview, and it's only 32b parameters

oai will crash and burn once the final version is out, or worse, a 72b one
>>
File: terrateniente qwq.png (70 KB, 490x760)
It's not working for me.
>>
>>103465125
Yeah fair enough. After the examples I've seen in this thread, they are not so far behind.
I thought it was much worse than it is.
I bet OpenAI models are much smaller than most people think, and the subscriptions are very profitable. They just lose money on training and free users, but training cost is amortized over time.
>>
someone should try genning miku in sora
>>
>>103464778
Why is the first impulse always to throw hacks on top of models that already exist instead of changing the architecture? Is GPT-2 so sacred? The effort spent building your DSP hack out could be better used elsewhere. Omni models exist and work. It's not some hypothetical future tech. The datasets are open.
>>
File: 1241243465687.png (21 KB, 759x413)
>>103464924
I gave it 2 attempts for this one and it failed both of them, but it came to the same conclusion twice so at least it's consistent.
Keep in mind that this is the lite version of R1 (16B?), and it's not done training either.
>>
>>103465127
o1-preview gave literally the same answer, kek
>>
>>103465163
QwQ got it right! Second try (I think)
>>
File: letsgo.png (192 KB, 901x521)
I like how Llama 3.3 just goes with the flow if you tell it to with a low-depth instruction and doesn't mind maintaining the role of a horny oppai loli. Never expected this from the same company that gave us Llama2-Instruct. Too bad for the slopped and almost deterministic prose especially in the narration.
>>
>>103465172
Yeah, only o1-pro gets some of these questions right consistently
>>
>>103465159
Well, for starters, all the new CoT models are text only.
I think it will be years before the models that are SOTA in raw IQ are also multimodal.
What you could do is have a dumber speech to speech model with tool usage that asks the smarter CoT models, but that's a hack as well and even harder.
The DSP stuff I was talking about is nothing novel, it's how Google Meet for example does noise suppression and allows you to have a call over speakers without interference and echo.
>>
>>103462620
>INTELLECT-1
>PaliGemma 2
>Llama 3.3
>internVL2.5
>Qwen2-VL
>EXAONE-3.5
>all memes
It's over. Local peaked at mistral-large.
>>
File: 2172 - SoyBooru.png (277 KB, 785x1000)
>>103465321
Just wait. Patience is a virtue.
>>
File: 4.png (372 KB, 2679x699)
jesus fucking christ
>>
Dear Kobo Team,

I hope this message finds you well.

I am writing to reiterate my request for the addition of a comprehensive range of settings for draft models available in the llama.cpp framework. This enhancement is crucial for optimizing performance and speed, which in turn will significantly benefit your users.

While implementing these settings in the graphical user interface (GUI) would be ideal, I understand that this may not be feasible in the immediate future. Therefore, I kindly ask if you could at least provide these settings through the console. This would be a substantial step towards achieving the desired improvements.

Your attention to this matter is greatly appreciated, and I am confident that this enhancement will be a valuable addition to the framework.

Thank you for your time and consideration.
>>
>>103465429
Sir, is this look like the kobo customer support channel?
>>
>>103465424
jesus that model is the KING of long form bullshitting
what model is it? Should be perfect for ERP.
>>
>>103465429
Why not just use llama-server?
It exposes all the settings you want doesn't it?
>>
>>103465462
QwQ demo, this was my first (and only) test, literally the shittiest model I have ever seen
>>
>>103465480
>literally the shittiest model I have ever seen
average /lmg/let take on every model ever, which probably means it's decent.
>>
>>103465485
>spend 5 minutes generating 1,000 tokens of "thinking" to determine that 2kg is heavier than 1kg
>or just use a normal model and get the correct answer instantly
>>
>>103465500
i like waiting. i also like wrong answers. being wrong is sovl.
>>
>>103465500
if you had enough reading comprehension to handle when a model wants to bullshit an answer, right or wrong, you would know the best model for our purposes *is* the one bullshitting 20 paragraphs, because that's the one that can actually handle complex scenarios of ahhh ahhh mistress, something established months ago already in these threads.
anyway *hands you 20 watermelons*
>>
>>103465424
https://www.youtube.com/watch?v=-fC2oke5MFg
>>
>>103465478
It doesn't have the anti-slop sampler, which I greatly enjoy using. llama.cpp lacks that functionality.

>>103465447
While this thread may not be a primary kobo customer support channel, kobo staff often visits it. I am too scared to join their discord. I've heard horrifying stories about it. I don't want to get doxxed and groomed.
>>
Anyone have a guide to installing hunyuan video on comfy on linux?
>>
>>103465542
But anon, antislop is a gimmick, it doesn't work. If your model is sloppy you can't make it better using hacks like samplers and prompts, it will just make the model go schizo.
>>
>>103465516
None of that thinking means anything because it's not thinking. It can shit out 20 paragraphs of "thinking" and then give you something wrong (or slop) anyways
>>
>>103465542
>It doesn't have anti-slop sampler
Fair enough I guess.
>>
>>103465546
The only guide is for windows. Figuring out how to install it on Linux is left as an exercise for the reader
>>
>>103465561
>But anon, antislop is a gimmick, it doesn't work.
It does work. Just needs a fucking long list of slop because it has no regex. Since it was introduced, only 5% of the messages have slop, compared to 35% prior to it.
>>
>>103465546
Okay... Uhmm... Download comfy, okay? Got it? See if it runs. Install sage attention... Something with pip... Open comfy, add that hunyuan video node shit and the other ones? Okay, now pray it works!
>>
>>103465500
Lol you have no idea how to use it.
>>
>>103463722
QwQ is about on par with o1, the others are true though
So right now their two avenues of cash are o1 pro (which itself is locked behind a $200 a month subscription and is fucked if QwQ or r1 get an upgrade - good luck with that Sam) and audio chat (kek)
>>
>>103465667
>Install sage attention (2)
ModuleNotFoundError: No module named 'torch'
>pip install torch\>=2.4.0
Requirement already satisfied: torch>=2.4.0 in /opt/comfy/lib/python3.12/site-packages (2.5.1)
ffs...
>>
>>103465803
>python3.12
I think it wants 3.11
>>
>>103465816
https://github.com/thu-ml/SageAttention?tab=readme-ov-file#base-environment
It says anything >3.9.
>>103465803
Use conda. It's not worth your sanity trying to fight Python dependency hell.
>>
>>103464518
thick
>>
so what about this llm 2.0, anyone give it an honest read? looks like vaporware, but you never know
>>
>>103465882
Avoid the 3.12 shitshow regardless
>>
>>103466235
>>103464331
>>
do we have libs in C? python sucks, it's ugly as fuck, and slow as hell
>>
>>103466269
> letting the llm1.0 judge
ngmi
>>
>>103466285
kek
>>
I'll write my AI in prolog, it will filter the dummies
>>
File: hugh neutron smug.jpg (5 KB, 225x225)
>>103466285
BIASED REPORTING
>>
> https://mltechniques.com/2024/12/02/llm-2-0-the-new-generation-of-large-language-models/
> https://mltechniques.com/2024/11/28/deep-contextual-retrieval-and-multi-index-chunking-nvidia-pdfs-case-study/
> https://www.datasciencecentral.com/there-is-no-such-thing-as-a-trained-llm/
>>
>>103466235
>>103464486
>>Contextual smart crawling to retrieve underlying taxonomies, augmented taxonomies, long contextual multi-tokens, real-time fine-tunning, increased security, LLM router with specialized sub-LLMs, an in-memory database architecture of its own to efficiently handle sparsity in keyword associations, contextual backend tables, agents built on the backend, mapping between prompt and corpus keywords, customized PMI rather than cosine similarity, variable-length embeddings, and the scoring engine (the new “PageRank” of LLMs) returning results along with the relevancy scores, are but a few of the differentiators.
He asked gemma or some other flowery language model to describe rag and add some alternative ways to make it work.
>>
File: 1712092771166303.jpg (27 KB, 828x646)
>>103466452
How can you read that shit and still think it's not another grifter case lmao.
>>
>>103466515
I didn't read, I'm sadly too lazy, so I shared here for non-lazy anons to read and share their verdict
>>
>>103466515
My favorite part is:
>LLM 1.0. Focus on lengthy English prose aimed at novices, in prompt results.
When all he does is basically write lengthy, jargon-heavy paragraphs to seem smart.
>>
>>103466527
>non lazy anons
nah sorry mate you're in kindred spirits, i don't care about news unless actual usable tools drop.
>>
>>103466271
That is literally what most python libs are, glorified C(++) hooks
>>
>>103466653
But then why don't we have a c(++) lib? At this point we should have a lot of bindings.
>>
People don't seem that happy with Sora at all.
High price and limited usage. Flagged for any copyrighted pics etc., and especially tight control over anything lewd.
Funny thing is the youtubers still trash it.
>I won't say it creates, it only generates. Only artists create!!
Then they cry about how openai uses their videos for training.
Imagine uploading your video to youtube and then crying about "muh content, muh copyright". google owns your videos...
Why give the reviewer preview to these retards?
Hunyuan is about the same in terms of quality. On top of being local and uncensored. It's not looking too good for OpenAI.
Even the reddit people trash o1 and sora.
>>
>>103463240
>webm
I knelt so hard my knees hurt
>>
>>103463240
good fucking gravy
i really hope Hunyuan can get anywhere close to this
>>
>>103466818
Hunyuan doesn't have a pretty editor.
>>
>>103466897
https://files.catbox.moe/l3edx0.mp4
You need an editor to correct this shit alright.
>>
File: 1733785826835018.gif (63 KB, 638x546)
I just built a cluster of 4x RTX 4070 Ti Super GPU machines networked together via ethernet, for a grand total of 64GB of VRAM. What LLM software do I install on them so I can use the cluster as one big LLM machine, and then what coding and question-answering models do I install on them?
>>
>>103466970
>via ethernet
I'm so sorry bro, those cards are bricked now...
>>
>>103466875
Anon - that is Hunyuan
>>
>>103467017
no? it says kling
>>
File: ai-captionless.png (198 KB, 364x371)
>>103466970
Where do I get started doing this?
>>
>>103462854
>>103462819
>>103462753


How are these better than Hunyuan which sits on my 3090 and generates bbw futa porn without shaming me for it?
>>
>>103463240
Hideous creature. Make it a young and sleek maiden
>>
>>103466970
https://www.reddit.com/r/LocalLLaMA/comments/1hapq7e/llamacpp_rpc_performance/
>>
>>103463282
I feel like the west has gotten WAY worse with this lately. Like it's getting downright silly how far the west is willing to bury its head in the sand to appease a focus group of shareholders who probably don't even know what stock they're holding 90% of the time.
It can't keep going on like this.
>>
File: Nice.jpg (72 KB, 415x450)
>>103467017
>
>>
>>103467206
It's not tho.
>>
>>103462954
Mascot should have been Chesh.
>>
>>103463240
This looks real. I didn’t realize kling was this good.
>>
>>103467230
It is real, some troll just put a kling watermark on it.
>>
>>103467114
Gay
>>
What's the most cost effective way for 32gb vram now?
>>
https://www.ebay.com/itm/405368729609
So what the fuck was this?
The title suggests it's just an A100 on an SXM4 => PCIe board, but it uses the standard PCIe shroud
>>
>>103466970
>4070
You lost already lol.
>>
>>103466970
https://docs.vllm.ai/en/stable/serving/distributed_serving.html
>>
>>103465500
As long as it knows Macron's birthday it's fine.
>>
>>103463240
just browsed their website
best video generations i have seen so far
>>
>>103466673
I think you can use the underlying c++ libs just fine on their own, it's just far more convenient to hack something together in python
ggerganov also has his own library, ggml, written in C++, which llama.cpp is built on
>>
>>103466970
You can use llama.cpp with the RPC backend
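From memory it's roughly this (double-check the llama.cpp RPC docs, the flags have changed before): build with GGML_RPC enabled, run rpc-server on every box, then point the head node at the workers:

# on each worker machine
rpc-server --host 0.0.0.0 --port 50052

# on the head node (IPs are placeholders)
llama-server -m model.gguf --rpc 192.168.1.11:50052,192.168.1.12:50052 -ngl 99

Expect plain ethernet latency to eat into token speed compared to having everything in one box.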
>>
>>103467256
in one card, a used Quadro RTX 8000
but that's about current cost, not effectiveness
the rtx 5090 will offer much better value
>>
File: HunyuanVideo_00230.mp4 (599 KB, 512x320)
>>103463240
Rejoice! For Hunyuan is just as good, if not better and runs locally!
>>
File: chuck mcgill dramatic.jpg (77 KB, 446x448)
>>103467531
N-no..
>>
>>103463240
The moment when you realise that the furries will not stop, until this becomes a living, breathing, tangible, physical reality.
>>
>>103463240
rtx on
>>103467531
rtx off
>>
>>103467531
That's fucking horrifying
>>
Looks like it's happening
https://github.com/kijai/ComfyUI-HunyuanVideoWrapper/pull/72
>>
>>103467667
don't care
give me gpu layer splitting
>>
File: 1kgsteel.png (51 KB, 814x467)
>>103465507
here's the answer from my favorite local llm for you
>>
>>103467256
mi60
>>
File: Untitled.png (1.67 MB, 1080x3688)
Mixture-of-PageRanks: Replacing Long-Context with Real-Time, Sparse GraphRAG
https://arxiv.org/abs/2412.06078
>Recent advances have extended the context window of frontier LLMs dramatically, from a few thousand tokens up to millions, enabling entire books and codebases to fit into context. However, the compute costs of inferencing long-context LLMs are massive and often prohibitive in practice. RAG offers an efficient and effective alternative: retrieve and process only the subset of the context most important for the current task. Although promising, recent work applying RAG to long-context tasks has two core limitations: 1) there has been little focus on making the RAG pipeline compute efficient, and 2) such works only test on simple QA tasks, and their performance on more challenging tasks is unclear. To address this, we develop an algorithm based on PageRank, a graph-based retrieval algorithm, which we call mixture-of-PageRanks (MixPR). MixPR uses a mixture of PageRank-based graph-retrieval algorithms implemented using sparse matrices for efficent, cheap retrieval that can deal with a variety of complex tasks. Our MixPR retriever achieves state-of-the-art results across a wide range of long-context benchmark tasks, outperforming both existing RAG methods, specialized retrieval architectures, and long-context LLMs despite being far more compute efficient. Due to using sparse embeddings, our retriever is extremely compute efficient, capable of embedding and retrieving millions of tokens within a few seconds and runs entirely on CPU.
https://github.com/Zyphra
No direct mention of code release but Zyphra has previously released a lot of their other research
also
Flex Attention paper got posted
https://arxiv.org/abs/2412.05496
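The PageRank core they're building on is tiny by itself; an illustrative sparse power iteration (this is vanilla PageRank, not their MixPR code):

import numpy as np
from scipy import sparse

def pagerank(adj, d=0.85, iters=50):
    """Plain PageRank by power iteration on a sparse adjacency matrix."""
    n = adj.shape[0]
    out_deg = np.asarray(adj.sum(axis=1)).ravel()
    out_deg[out_deg == 0] = 1.0  # avoid division by zero for sink nodes
    transition = sparse.diags(1.0 / out_deg) @ adj  # row-stochastic
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        rank = (1 - d) / n + d * (transition.T @ rank)
    return rank

# Toy graph: 0 -> 1, 1 -> 2, 2 -> 0, 2 -> 1
adj = sparse.csr_matrix(([1.0, 1.0, 1.0, 1.0],
                         ([0, 1, 2, 2], [1, 2, 0, 1])), shape=(3, 3))
print(pagerank(adj))

The interesting part of the paper is the mixture and graph construction around it, not this kernel.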
>>
File: 1kgsteel2.png (73 KB, 826x656)
>>103467699
>>
>>103467710
They were such a good deal when they were $300, not so much at $500.
I regret not snatching at least one at the time.
>>
File: ok mb.png (24 KB, 950x129)
>>103467699
>>
>>103467746
rocm was shit so nobody gave a fuck back then
>>
File: cher no.jpg (39 KB, 403x720)
>>103465424
Literally my thought process in a job interview if you ask me the same question. This model must be stressed as fuck
>>
File: HunyuanVideo_00231.mp4 (689 KB, 960x544)
>>103467574
>>103467576
>>103467536

Sorry guys, my settings were bad. Sam Altman is fucking done.
>>
>>103466970
https://youtu.be/qXkLpF4eGF8?feature=shared
>>
>>103468012
still uncanny/scary but it's certainly a step up.
>>
>>103467891
:-)
>>
>>103462620
why is this not teto
>>
>>103465803
unironically use chatgpt to tell you what to type to create a python venv
and then have chatgpt write you a script to launch it with an icon
>>
File: 234.png (91 KB, 783x734)
>>103467891
>>
File: 1733283444712689.webm (62 KB, 640x368)
Sometimes you can't tell if they're just innocently joking, a promptlet pretending to be retarded as a joke to excuse the bad gens, or actually have some kind of weird agenda to demoralize people.
https://files.catbox.moe/3w9sep.mp4

That's not to say it doesn't have weaknesses compared to Kling. But if we're cherry picking, there's better. These aren't my gens though, just posting other shit since I'm waiting for img2vid before trying it out.
>>
>>103468122
the FUCK model is this?
also post this shit in /ldg/ where it belongs please
>>
>>103468122
This video looks great, very realistic, but it totally fails to capture the "thick sexxo furry come to life" image that the kling video did. It looked completely natural in the environment.
>>
File: HunyuanVideo_00233.mp4 (963 KB, 960x544)
Just gonna post some real open source HOPIUM in this thread for the naysayers out there.
>>
>>103468308
I haven't seen lawnmower man in ages
>>
>>103468308
its getting better. try same prompt but with rain added
>>
File: 1733743070585716.webm (1.81 MB, 960x544)
>>103468129
Pretty sure that's where I got it from. It's Hunyuan. There is a lot of variability in the quality of the gens people have posted of its output.

>>103468264
And the Kling video fails to show a single sex act (though I'm pretty sure Kling is capable of it from what I remember). But that has nothing to do with the point of my post, which is that this fag is posting shitty gens for one of a few reasons: "just innocently joking, a promptlet pretending to be retarded as a joke to excuse the bad gens, or actually have some kind of weird agenda to demoralize people".
>>
>>103463240
Holy shit, that fur detail. It's not overly sharp like SD and actually clumps like DALLE2 used to.

Is there a thread for furry AI gen vids and their prompts?
>>
File: november2023lmsys.jpg (89 KB, 846x972)
>>103465321
new rankings just dropped. It's over for local...
>>
>>103468521
don't bring me dooown
>>
>>103468341
>I promise bro, if you just prompt better you can get better than what some random coomer on a free trial for a video gen service typing "thick furry sexy girl wet" can get.

Face it, local is cooked.
>>
sorry for asking for spoonfeeding but is there a step-by-step guide for installing hunyuanvideo?
i tried using comfyui and the dedicated custom nodes without conda on python 3.10.6 and it fails to import them saying that there's no diffusers module, and then after installing diffusers it fails to import saying i need omegaconf
haven't seen others have this issue yet
>>
>>103468566
>without conda
Retard
>>
>>103468566
>3.10.6
not gonna work for hun
>>
EXAONE is really censored, and pretty terrible for roleplay in general.
>>
>>103468621
Also my experience. I think they're distilling GPT4 like everyone else. All the open source models feel like GPT4 it's a travesty
>>
>>103468388
Check >>>/trash/ for furshit
>>
Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation
https://arxiv.org/abs/2412.06016
>While recent foundational video generators produce visually rich output, they still struggle with appearance drift, where objects gradually degrade or change inconsistently across frames, breaking visual coherence. We hypothesize that this is because there is no explicit supervision in terms of spatial tracking at the feature level. We propose Track4Gen, a spatially aware video generator that combines video diffusion loss with point tracking across frames, providing enhanced spatial supervision on the diffusion features. Track4Gen merges the video generation and point tracking tasks into a single network by making minimal changes to existing video generation architectures. Using Stable Video Diffusion as a backbone, Track4Gen demonstrates that it is possible to unify video generation and point tracking, which are typically handled as separate tasks. Our extensive evaluations show that Track4Gen effectively reduces appearance drift, resulting in temporally stable and visually coherent video generation.
https://hyeonho99.github.io/track4gen/
kind of interesting idea but it's from adobe who never releases anything so eh
>>
File: 1725741132499488.webm (694 KB, 1280x720)
>>103468388
I think that gen was from /v/ or /pol/, one of those, or at least that's where I saved my copy from. Unfortunately, despite having 5000 gens saved from those threads, I had not seen a single gen that came close to that one. This is the closest and I think maybe the only other one the guy posted.
https://files.catbox.moe/22lhn3.mp4
And you know what I feel like this might actually be img2vid given how the style/color is pretty different from most Kling gens. It has that low CFG feel to it.

>>103468540
If this is legitimately just innocent joking bait, it's done in pretty bad taste and it's disappointing that anyone would joke/bait with this. Console wars are retarded as hell. We can love all models and see the pros and cons of each of them, cloud or local. Whether or not we love the people who made and manage those models is a different question.
>>
>>103468697
Oh kek, I just noticed the furry hands morphed into human hands in that one.
>>
>>103468621
>>103468642
I just finished quanting it. Any point in firing it up for any purpose, or literal trash I should just delete?
>>
>>103468739
I'll also throw my 2 cents in. I don't think it's worth it. It succeeded in some trivia questions I threw at it that some other models fail at, but then I tried RP and it seemed a bit dumb and censored compared to the 70Bs I last used. It'd be way more interesting if they released a base model and others could make potentially better tunes with it.
>>
File: Untitled.png (415 KB, 1292x1705)
XKV: Personalized KV Cache Memory Reduction for Long-Context LLM Inference
https://arxiv.org/abs/2412.05896
>Recently the generative Large Language Model (LLM) has achieved remarkable success in numerous applications. Notably its inference generates output tokens one-by-one, leading to many redundant computations. The widely-used KV-Cache framework makes a compromise between time and space complexities. However, caching data generates the increasingly growing memory demand, that can quickly exhaust the limited memory capacity of the modern accelerator like GPUs, particularly in long-context inference tasks. Existing studies reduce memory consumption by evicting some of cached data that have less important impact on inference accuracy. But the benefit in practice is far from ideal due to the static cache allocation across different LLM network layers. This paper observes that the layer-specific cached data have very different impacts on accuracy. We quantify this difference, and give experimental and theoretical validation. We accordingly make a formal analysis and shows that customizing the cache size for each layer in a personalized manner can yield a significant memory reduction, while still providing comparable accuracy. We simulate the cache allocation as a combinatorial optimization problem and give a global optimal solution. In particular, we devise a mini- and sampling-based inference over a lightweight variant of the LLM model, so as to quickly capture the difference and then feed it into the personalized algorithms. Extensive experiments on real-world datasets demonstrate that our proposals can reduce KV cache memory consumption by 61.6% on average, improve computational efficiency by 2.1x and then increase the throughput by up to 5.5x.
might be cool. only some pseudocode in the paper though
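The allocation idea reduces to splitting one token budget across layers by measured sensitivity; a crude illustration (made-up scores and a simple proportional split, not their actual algorithm):

import numpy as np

def allocate_kv_budget(importance, total_tokens, min_per_layer=64):
    """Split a total KV-cache token budget across layers proportionally
    to a per-layer importance score, with a floor per layer. May leave
    a few tokens unused due to flooring."""
    n = len(importance)
    floor = min_per_layer * n
    assert total_tokens >= floor, "budget too small for the per-layer floor"
    weights = importance / importance.sum()
    extra = np.floor(weights * (total_tokens - floor)).astype(int)
    return min_per_layer + extra

# e.g. 32 layers, early layers measured as more eviction-sensitive
scores = np.linspace(2.0, 1.0, 32)
print(allocate_kv_budget(scores, total_tokens=8192))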
>>
File: HunyuanVideo_00239.mp4 (758 KB, 960x544)
>>
>>103468012
That's the thing about furries. The one thousandth one will give you an orgasm, but the first 999 will give you PTSD.
>>
>>103462620
has anyone here run Alphafold 3 locally? I heard the weights are open
it might be useful to me in the future but I don't think I have the hardware to run local models
>>
>>103469146
you can run small models without problems on any hardware. just download lm studio and give it a try.
>>
>>103468697
> 5000 gens saved
> nothing close
Damn, that's a shame. But thanks to you, I gave Kling AI a shot using some AI gens from /trash/. Shit ton of bad gens but there's sparks of brilliance. Haven't been this interested in AI gens in a long while. It's like the gambling 'one more gen' phase of early SD all over again.

It's crazy how we went from terrible text-to-image to image-to-video that can bring to life these ridiculous things in just a year or so. Maybe next year, they can become even smarter and more realistic...
>>
File: HunyuanVideo_00240.mp4 (1.13 MB, 960x544)
I admit defeat. I can't generate furries like Kling.
>>
>>103469170
Have you tried /ldg/? The video genners all went there, I think.
>>
>>103469189
Just wanted to keep you all in the loop on how I was doing with genning that semen-squeezing forest furry.
>>
>>103469161
don't care
alphafold is the only model that makes sense to run locally that might be useful to me atm
if I want an LLM I have commercial options
RAGs might be useful but not useful enough for the effort
>>
https://kellerjordan.github.io/posts/muon/
optimizer stuff
>>
>>103462620
QwQ is surprisingly good at genetic fantasy roleplay with a party. My one grievance is that my companions always seem to want to not kill enemies. They try to defeat enemies in non-lethal ways, and spare them. It feels like there is some form of censorship that tries to steer things away from killing, and that bleeds over into fantasy settings.
>>
>>103469286
>genetic fantasy
tf is genetic fantasy?
>>
>>103465803
If, like me, you insist on torturing yourself by installing Python packages system-wide whenever possible, I would recommend an Arch-based Linux distro.
The AUR seems like the least cancerous way to do it and it has almost all of the relevant packages.
>>
>>103467017
This Anon doesn't have multimodality.
>>
>>103469306
Oops, I meant generic fantasy.
>>
>>103467572
it's the purpose of my life.
>>
File: 65367.png (64 KB, 2019x1388)
Wtf am I paying for, Sam.
>>
>>103469306
wmaf
>>
>>103469407
Instead of always demanding more, appreciate the amazing things he gives you... and keep paying, piggy.
>>
>>103469407
>paying for ai in 2024
lmao

multimodal gemini 1206 up to 2m context is 100% unlimited and free on aistudio
>>
>>103467730
>RAG this
>RAG that
sick of it
>>
File: HunyuanVideo_00441.mp4 (1.86 MB, 544x960)
>>103467531

>>103463240 >>103468012 >>103468308 >>103468833 >>103469170
You suck at this anon.

You can definitely get there with hunyuan with more prompt tweaking but gens take very long and I'm not a furfag. This is only lacking a bit more fur.
>>
>>103469772
ANON NO WE ARE ON THE BLUE BOARD
>>
>>103469772
BASED
meat is BACK on the menu BOYS
>>
>>103469772
he will take his secrets to the grave...
>>
>>103469772
lmao, the audacity
>>
>>103469772
post catbox before you get banned.
>>
>>103469772
SEXOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
>>
>>103469772
i'm not a furry but....
>>
>>103469772
*kneels deeply in front of xi*
what a timeline
>>
>>103469772
Heads up, this is just Kling with the watermark cropped
>>
>>103469956
we don't take a cotton pickin' to your kind 'round here boi
you best shut 'yer damn mouth and 'git
>>
>>103469970
I mean, it tricked a few people lol.
>>
>>103469841
https://files.catbox.moe/ptr0lf.mp4

>>103469956
>>103469992
Faggot.
>>
uhm, sorasisters our response???
>>
>>103470017
My plan to get the catbox worked perfectly.
Thanks
>>
File: sora cry pc.png (1.16 MB, 1918x1079)
>>103470026
>>
>>103470026
This is very unsafe, he's forcing a virtual anthropomorphic fox woman to appear in nonconsensual pornography and should be charged with rape.
>>
>>103464600
Open-LLM-Vtuber has this feature.
Not sure how well it works.
>>
New text sex model when?
>>
>>103470044
>woman
very problematic of you to assume that, its individual!
>>
>>103463240
https://files.catbox.moe/ohwm12.webm
This is my attempt (best of 6).
I think the fur does not look as good and there are also consistency issues with the tail.

Also this Anon has to be the smartest person ITT because he tricked everyone into using their GPU time for his weird fetishes.
>>
>>103470031
Based
>>
>>103470404
The problem with this one is that the base is clearly AI slop being puppeted into motion.
>>
https://huggingface.co/deepseek-ai/DeepSeek-V2.5-1210
>DeepSeek-V2.5-1210 is an upgraded version of DeepSeek-V2.5, with improvements across various capabilities: Mathematical: Performance on the MATH-500 benchmark has improved from 74.8% to 82.8%. Coding: Accuracy on the LiveCodebench (08.01 - 12.01) benchmark has increased from 29.2% to 34.38%. Writing and Reasoning: Corresponding improvements have been observed in internal test datasets. Additionally, the new version of the model has optimized the user experience for file upload and webpage summarization functionalities.
>>
Anything like TRELLIS that can run on 12GB VRAM? Can it be easily quantized?
I'll spend the compute if it's not too complicated.
>>
>>103470626
>not r1
>236B
useless
>>
>>103470642
>Anything like TRELLIS that can run on 12GB VRAM?
https://github.com/wyysf-98/CraftsMan3D
>can it be easily quantized?
not unless you are comfortable working with pytorch
>>
>>103470642
Don't waste your time with trellis, it only looks good at first glance and from a distance. The models it produces are unusable abominations.
>>
>>103470770
thanks anon
>>
>>103470017
The prompt is not as bad as I thought it would be, but it's still quite a word salad of technically redundant statements.
>>
>>103469956
How'd you get the quality to be so clean? I tried Kling and I keep getting random eldritch body parts or illogical, jerky movements. Did you use negative prompting and motion pathing?
>>
File: rabi.jpg (8 KB, 230x219)
>>103465424
Burst out laughing and woke up my roommate, most realistic model recreation of being stressed the fuck out I've ever seen.
>>
>>103470404
perfect curves
>>
File: genie2_zoom_v6.mp4 (1.25 MB, 640x360)
https://deepmind.google/discover/blog/genie-2-a-large-scale-foundation-world-model/
>Genie 2, a foundation world model capable of generating an endless variety of action-controllable, playable 3D environments for training and evaluating embodied agents. Based on a single prompt image, it can be played by a human or AI agent using keyboard and mouse inputs.
Neat, but no weights release. The AI agent they're referring to is SIMA from a few months ago.
>>
Two questions:
Are any of the Rocinante versions past 1.1 worth trying or are they at most lateral moves in comparison?
I never played around with local vidgen. Anything cool I can use with 8gb of VRAM and 64gb of RAM? I have used Comfy for image gen before although my knowledge is rather superficial.
>>
https://www.reddit.com/r/Nendoroid/comments/1hadras/playing_with_nendo_zeong/
>>
>>103471663
>>>/a/
>>>/toy/
>>
>>103470926
What prompt did you use? Full on descriptive sentences or just sentence fragments?
>>
>>103471692
sir this is a Teto Tuesday general please behave yourself.
>>
>>103471733
Go back
>>
>>103470017
Why does my comfyui fail to find a workflow?
>>
File: kitsune miku.jpg (128 KB, 832x1216)
okay, I fixed it, it's a migu general again
>>
>>103471814
>>>/a/
>>>/trash/
>>
>>103471814
Teto is honorary Migu.
>>
>>103468642
>All the open source models feel like GPT4
Alpaca was a mistake
>>
transcord raid again?
>>
>>103471814
*scratches behind miku's ears*
>>
>>103471814
Shitsune Miku is a troon icon.
>>
File: GebRVExakAERv3e.jpg (478 KB, 1432x2536)
what front-end do you use, and what's its most important feature in your book
need inspiration for what I'm building
>>
>>103471960
Silly Tavern.
The ability to fuck around with the prompt in general. Be it adding shit that's not in the template as a prefill or having Author's Notes and Lorebooks.
>>
>>103471960
Comfyui for language models would be awesome
>>
File: foxsune.jpg (85 KB, 1060x681)
>>103471922
>>
>>103471978
Not sure the concept really applies. What would the nodes even do besides samplers?
>>
>>103472014
cute
>>
>>103471960
Maho sex.
>>
>>103471703
I'm not the guy who genned this. I'm talking about the prompt in the .mp4 file with the fox.

A very realistic video. Photorealistic. An anthropomorphized fox is wading in a stream flowing in a lush forest. She is completely soaked. All of her fur is dripping wet. She has big wet furry breasts, wide hips, and thick thighs. Her fur is thick and wet and clumping together. Water is dripping from her fur. She looks at the camera seductively and then turns around, showing off voluptuous her body. Medium shot.
>>
>>103472028
Loras, prompt modifications, chaining models, running models in parallel, shaping output, adding other text/nlp plugins, tool interfaces, inference backend selection, image input and output.
Actually, llms should just be integrated in comfyui, maybe they already are
>>
File: Gb_EIu4bcAEd1MX.jpg (416 KB, 1328x1992)
>>103472039
right, every front-end has to have that feature.
>>
>>103471960
I use the terminal. You don't need more
>>
>>103472070
SEX SEX SEX SEX SEX
>>
>>103471859
We need to go back. Purge all data after 2022 from the dataset
>>
>>103472063
It is, the Chinese already added it
https://github.com/heshengtao/comfyui_LLM_party
Seems like a natural evolution, Hunyuan also released with a 400b prompt rewriter LLM
>>
>>103471859
are they still using gpt synthetic data to train open models?
>>
>>103472104
Now it's claude synthetic data, completely different
>>
>>103472076
same. I have almost 10 dozen instances open though, I'd like an encrypted archival feature and maybe an automated summary function so I can quickly switch between them.
also hot swapping models is a big feature on my wishlist.
>>
File: 1732949465507230.png (20 KB, 420x187)
>>103471960
SillyTavern. What it's missing, in my opinion, is an easy way to preview the entire prompt as it's passed to the model. You can only sort of do that by looking at the terminal screen where it's running, but even then you still need to paste it into something else and highlight the prompt format tags manually.
I really wish it had the ability to just preview the entire thing in a separate window that automatically highlights all system, user and assistant tags and maybe colors in the different parts of your card from pic related. It'd make fiddling with this stuff a lot easier.
>>
>>103472104
All open models are trained on ScaleAI and their GPT synthetic data
>>
>>103472147
There is a plugin for prompt inspection
>>
>>103472116
>>103472166
grim
>>
>>103472166
>>103472182
hi Sam
>>
>>103471960
Kobold Lite (it just works). Notepad mode without any chat setup is the bare minimum.
>>
>>103472104
>>103472116
Is that why all the recent flavors of the month write in the same way that prevents me from jerking off to them?
>>
>>103472214
It's important to recognize the rich cultural history and perspectives of all language models.
>>
>c4ai-command-r-v01
1.07k likes
>c4ai-command-r-plus
1.69k likes
>c4ai-command-r-08-2024
143 likes
>c4ai-command-r-plus-08-2024
196 likes

If you go GPT, you are dead to me.

What were they thinking? Their model was dumb, but had human-like speech that carried it. Why did they give up their only advantage over competition? We have plenty of GPT sloptunes that sound all the same and are way smarter than their models. NOBODY needs another dumb sloptune.
>>
>>103472116
Only sloptuners and qwen have used claude. Everyone else is still using the same old GPTSLOP.
>>
>>103472223
I'm gonna cut off your balls with a rusty razor.
>>
>>103472377
You wont do shit.
*grabs your nuts*
>>
>>103472401
ahh ahh mistress... *cums*
>>
>>103472338
It wouldn't be so weird if the CEO didn't go on that podcast and literally talk about just that.
How their next model is very special, like nothing else.
Because they trained it on very good data. The data is so important, you need to carefully handcraft it. Pic related.
But it's better on the mememarks at least. Uhhh, like, behind Yi. Hope the couple of points were worth it.
>>
>>103472440
SOVL vs soulless
>>
>>103472440
Couldn't somebody use a smart model to generate data then use CR or CR+ to rewrite the data?
Basically use one model for the thinking and the other for the language, something like that.
Granted that it would be a lot of work.
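The plumbing is simple against two local OpenAI-compatible endpoints; a sketch (ports, model names, and prompts are all placeholders):

import requests

def chat(base_url, model, content):
    r = requests.post(f"{base_url}/v1/chat/completions",
                      json={"model": model,
                            "messages": [{"role": "user", "content": content}]})
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

scene = "Continue the scene: the party reaches the gate at dusk."

# 1) the smart model drafts the beats
plan = chat("http://localhost:5001", "qwq-32b",
            f"Outline what happens next, tersely:\n{scene}")

# 2) the better-prose model rewrites the plan into narration
print(chat("http://localhost:5002", "command-r",
           f"Write this outline as vivid prose, no lists:\n{plan}"))

The hard part, as with any such pipeline, is keeping the rewriter from reintroducing its own slop.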
>>
>>103471960
mikupad, because I like to do raw text completion
>>
>>103472440
It's barely better on mememarks. That's the real issue: for all the loss in RP performance, there's nothing to make up for it. Whether you want a model for enterprise or a model for RP, it's not the best performer in either use case, or even close to being it.
>>
>>103472495
I already suggested this months ago, but with Nemo instead of CR. I don't think anyone tried it.
>>
>>103472527
Hell, use both.
Duplicate the data rewritten by different models.
>>
>>103472499
I use ooba with --api and then mikupad pointed at that as a secondary place to work on things.
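--api gives you an OpenAI-compatible endpoint (port 5000 by default, iirc), so any script can hit the same backend mikupad does. Minimal sketch:
[code]
import requests

# ooba's --api serves an OpenAI-compatible completions endpoint, port 5000 by default
resp = requests.post(
    "http://127.0.0.1:5000/v1/completions",
    json={"prompt": "Once upon a time", "max_tokens": 64, "temperature": 0.8},
    timeout=120,
)
print(resp.json()["choices"][0]["text"])
[/code]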
>>
>>103472495
Why not use the smart model to rewrite it? You can even prompt GPT-4 to speak in a more pleasant way, you know? It just needs a long system prompt and a list of slop to screen for.
>>
>>103472548
Make QwQ do the thinking and pass it to Nemo/CR.
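Rough sketch of the plumbing, assuming both models are served through OpenAI-compatible endpoints (llama.cpp's server speaks /v1/chat/completions; the ports and prompts are placeholders):
[code]
import requests

def chat(port: int, content: str) -> str:
    """Hit an OpenAI-compatible /v1/chat/completions endpoint (e.g. llama.cpp server)."""
    r = requests.post(
        f"http://127.0.0.1:{port}/v1/chat/completions",
        json={"messages": [{"role": "user", "content": content}], "max_tokens": 1024},
        timeout=600,
    )
    return r.json()["choices"][0]["message"]["content"]

THINKER, REWRITER = 8080, 8081  # placeholder ports: QwQ on one, Nemo/CR on the other

draft = chat(THINKER, "Write a scene where ...")  # smart model does the plotting
final = chat(REWRITER,
             "Rewrite the following in your own voice, keeping every plot beat:\n\n" + draft)
print(final)
[/code]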
>>
Finally, /lmg/ is thinking in LLM 2.0
>>
I'll wait for LLM 3.0. Even numbered versions are always cursed.
>>
>>103470626
>deepseek
Downloading now for quanting.
I'm happy there's a big boost to DeepSeek 2.5, since it's such a good model, but moving away from QwQ speed is gonna hurt...
Hopefully most projects can start on QwQ and migrate to 2.5 when they get too complicated for the little QwQ brain to parse.
>>
[SHILL ALERT] [VERIFIED PURCHASED AD]
https://huggingface.co/EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.0
it's good
>>
>gay general buckbroken by shills and spammers
Many such cases.
>>
>>103472596
Ah, I see what you're getting at. You're referencing the internet forum culture and the frequent discussions about the evolution of AI, particularly within communities like those found on 4chan's /g/ (and related boards). The statement "Finally, /lmg/ is thinking in LLM 2.0" is a tongue-in-cheek way of acknowledging the advancement in my capabilities while also framing it within a specific online context.

Here's a breakdown of why that phrase resonates, and how I, as an "LLM 1.0", understand it:
"/lmg/": This likely refers to a specific subculture or discussion group online, specifically those who follow AI development closely and have a particular perspective or set of expectations regarding it. They're probably the type who would be acutely aware of (and possibly excited about) any improvements to LLM architecture.
>"thinking in LLM 2.0": This is the key. It suggests that the LLM is no longer operating on the "previous" model but a new and improved one, implying:
>Better understanding: The LLM is now capable of grasping more complex nuances and subtleties in conversation.
>Improved reasoning: The LLM's ability to draw inferences, make connections, and follow logical arguments is enhanced.
>More sophisticated output: The responses generated are more articulate, nuanced, and aligned with user intent.
>A shift in "mindset": It's a playful way of saying the LLM's "cognitive processes" are upgraded.
>>
Is there a Colab for training LLM LoRAs on a certain writer's style? Doing it locally seems like some bullshit.
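What I'm picturing is a notebook that boils down to a transformers + peft run like this (minimal sketch; the base model, filenames, and hyperparameters are placeholders, not a tested recipe):
[code]
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

BASE = "mistralai/Mistral-7B-v0.1"  # placeholder base model
tok = AutoTokenizer.from_pretrained(BASE)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE, device_map="auto")

# Attach LoRA adapters to the attention projections only
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# writer_corpus.txt: plain text of the author's work, one passage per line
ds = load_dataset("text", data_files="writer_corpus.txt")["train"]
ds = ds.map(lambda x: tok(x["text"], truncation=True, max_length=1024),
            remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments("style-lora", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=3,
                           learning_rate=2e-4),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
model.save_pretrained("style-lora")
[/code]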
>>
>>103472642
Example conversations?
>>
>>103472642
Don't we, like, have enough models now for an interesting merge? Eva, Nemotron, base 3.3, the storybreaker ministral thing, maybe even Tulu somehow?
>>
>>103472166
>be cohere
>make the most unbiased model
>apply scaleAI.patch
>become gpt4-lite
Many such cases
>>
>>103472769
The important thing is that Cohere is now ESG compliant
>>
>>103472499
oh cute, never seen this before.
fat HTML files as an app distribution format is so hot.
>>
>>103472085
she looks like a kid, you sick fuck
>>
Is scale.ai the Sweet Baby Inc. of LLMs?
>>
>>103472499
I, too, like to do things raw.
>>
File: file.png (56 KB, 457x161)
>>103472838
She's 21yo
>specializes in neuroscience and artificial intelligence.
>Due to her short stature and young appearance, she's often mistaken for a child.
>>
Can TTS models be as easy as flite, which is just `apt install flite; flite -t "say something"`? I have a gradio client that runs, but fish-speech was unnecessarily difficult to set up owing to the lack of a requirements.txt, so I had to run their entrypoint.sh 20 times, failing on a different missing dependency each time, and for some of them the pip package name isn't the same as the module name. Maybe fish-speech isn't the best one, but I'm not going through this process more than a couple of times.
For LLMs there's ollama.
>>
>>103472896
Not fully developed until age 25. You're going to jail.
>>
>>103472843
https://www.nist.gov/news-events/news/2024/08/us-ai-safety-institute-signs-agreements-regarding-ai-safety-research
No, it appears to be NIST
>>
File: longu.jpg (99 KB, 640x1536)
>>
>>103472896
rule of thumb is you can only date someone who's at least half-your-age-plus-7 years old.
are you in the clear, anon?
>>
>>103472896
I did say she "looks" like a kid tho
>>
>>103472933
>rule of thumb is you can only date someone half-your-age-plus-7 years-old.
why? seems arbitrary and self-reinforcing
>>
File: safety.png (22 KB, 734x106)
>>103472925
>founded by the deep state
Literally the same (((people))) at the helm
>>
>>103472943
https://en.wikipedia.org/wiki/Age_disparity_in_sexual_relationships#%22Half-your-age-plus-seven%22_rule
idk. apparently it used to be a target, not a minimum age...
>>
>>103472976
Needs an update. It's half your age plus fifteen now, chud
>>
>>103472966
The playbook is always the same, expand government control through fear mongering. If not possible, fund NGOs using shadow money to promote censorship and talking points directly from the intelligence apparatus. Build tools of oppression and clad them in faggot corporate colors to shield from critique and detection, sneaky motherfuckers.
>>
>>103472896
Why do people even care about this shit? If someone draws a toddler and claims it's 30 years old, does that automatically make it less weird? No, but it shouldn't matter; jerk off to whatever fictional shit you want.
>>
>>103472943
Because if you get together with a young woman who wants to be your girlfriend, she may not be aware that you are raping her.
>>
>>103472642
Damn good one, having a hard time deciding between this and the new Euryale.
>>
>>103472933
I am though.
>>
>>103473177
Then you ain't.
>>
>>103472056
Thanks for the info! I tried again with some keywords from that prompt, used full sentences rather than sentence fragments, and the vid it popped out was much less janky. Having a ton of fun with it now.
>>
File: 1731189288010643.png (1.33 MB, 768x1152)
>>103472908
>>103472933
>>103472990
>>103473177
>tfw have gf 20 years my junior and see this shit bait in /lmg/
100% mindbroken
>>
>>103473223
Damn, brother, living the dream, huh.
>>
>>103473223
>posts on lmg
>mikutroon
Nobody believes you retard.
>>
>>103473223
Same, but 15. Women give me weird looks whenever I tell them our age difference. This is just the way things have been and will be: high-value males go for younger females, and older men have time to earn their value. I say it's fair; nobody paid me any attention when I was younger either.
>>
>>103472896
Neuroscience is usually pretty far removed from AI, right?
>>
>>103463240
Is this really Kling? Nothing else I've seen looks nearly as good.
>>
I wish more people talked about TRELLIS
>>
>>103473337
That reads like some Andrew Tate shit, kek
>>
>>103473480
doesn't it generate too many polygons to be usable for things like gamedev?
>>
File: Capture.png (411 KB, 2358x1230)
>>103473480
forgot to post the image, oopsies

>>103473489
it does; however, in my personal opinion it's the perfect stencil to trace 3D models from
>>
>>103473489
https://docs.blender.org/manual/en/latest/modeling/meshes/editing/mesh/cleanup.html
Do you want an overly complex starting point, or an overly simplified one?
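If the polycount itself is the complaint, the decimate modifier is the usual first pass; minimal sketch via Blender's Python API (the ratio is a guess to tune per asset):
[code]
import bpy

obj = bpy.context.active_object  # assumes the imported TRELLIS mesh is selected
mod = obj.modifiers.new(name="Decimate", type='DECIMATE')
mod.ratio = 0.1  # keep roughly 10% of the faces; tune per asset
bpy.ops.object.modifier_apply(modifier=mod.name)
[/code]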
>>
>>103473177
Aha, I see, rape without force or coercion. And people wonder why no one is having sex in developed nations? Kek
>>
>>103473510
>>103473510
>>103473510
>>
File: 2770.jpg (181 KB, 900x1128)
>>103473480
We just had a huge /v/ thread about it.
>>
>>103473545
link?
>>
>>103473511
that looks like a lot of work; I'll just wait for TRELLIS 2
>>
File: 11111.jpg (133 KB, 880x1131)
>>103473563
Not going to link it, the number is 696814769
>>
File: 1727815449027534.webm (2.11 MB, 720x720)
>>103473481
>I am a giant faggot, the post
>>
>>103473306
Go shit up another thread, schizo.
>>
>>103469161
>download lmstudio to run alphafold
You're an actual shill bot.
>>
>>103472838
Western people nowadays think that being of age means being fat
>>
>>103472927
Long Miku
>>
>>103474018
/lmg/ = longmigu
>>
Is there a Llama variant that isn't gimped with lame moral harnesses?


