/g/ - Technology


/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103153308 & >>103135641

►News
>(11/08) Sarashina2-8x70B, a Japan-trained LLM: https://hf.co/sbintuitions/sarashina2-8x70b
>(11/05) Hunyuan-Large released with 389B and 52B active: https://hf.co/tencent/Tencent-Hunyuan-Large
>(10/31) QTIP: Quantization with Trellises and Incoherence Processing: https://github.com/Cornell-RelaxML/qtip
>(10/31) Fish Agent V0.1 3B: Voice-to-Voice and TTS model: https://hf.co/fishaudio/fish-agent-v0.1-3b
>(10/31) Transluce open-sources AI investigation toolkit: https://github.com/TransluceAI/observatory

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png (embed)

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: saintmakise.jpg (236 KB, 1614x992)
Long live the queen of alive /lmg/.
>>
The Qwen guys released their paper, we'll finally see what secret sauce they used to make their 32b coder model so fucking good
https://arxiv.org/pdf/2409.12186
>>
File: 1708187138045027.png (747 KB, 1346x1996)
PSA: Petra/blackedmikuanon/kurisufag/AGPL-spammer/drevilanon/2nd-belief-anon/midjourneyfag/repair-quant-anon is from... SERBIA
https://archive.4plebs.org/pol/search/uid/QmNRftdq/page/1/
>>
>>103164618
Having a melty already?
>>
>>103164618
that's why you use memeflags on /pol/ kek
>>
>>103164618
as one (and only one) of those people I can confirm that this faggot is a legit schizo
>>
>>103164129
>ask for GPT slop
>receive GPT slop
>ask it to not do that
>receive GPT slop
AGI?
>>
File: 1727210335025020.jpg (190 KB, 841x1189)
migrate
>>103164659
>>103164659
>>103164659
>>
File: 9ze75m65ecp01.jpg (141 KB, 892x1316)
I still can't wrap my head around how mentally ill the usual baker is...
>>
File: v-Sy-Zqs_400x400.jpg (24 KB, 400x400)
Also other than triggering that sperg I am happy to do the public service of confusing the shit out of all the newfags.
>>
maybe we could make a neutral thread without any mascots?
>>
>maybe we can discuss local coombots without any troons?
jej, even zozzle
>>
>>103164798
>/g/
>no troons
You all brought it upon yourself.
>>
>>103164575
I like this OP image
Having a makise thread every now and then isn't that bad
>>
I will stay in this thread if Petra doesn't decide to be more of a nigger
>>
>>103164618
wtff!! he is based fun enjoyer! how horrific!!!
>>
>>103164618
Makes sense he uses the same images over and over again.
>>
File: 1731433107306893.png (95 KB, 1278x952)
The 'ick 'ecker added some things to his voice cloner.
>>
I literally use her for everything now.
>>
why did that guy split the thread?
>>
>>103165278
kurisufag is a notorious shitposter
>>
>>103165278
anime obsession and prolonged hrt intake take a big toll on your mental wellbeing.
>>
>>103164575
IM SO FUCKING CONFUSED WHICH ONE IS THE REAL THREAD
AAAAAAAAAAAARRRGGHHHH
>>
>>103164618
rent free
>>
>>103165310
and a ritualposting baker who has a meltdown over OP pictures and doxxes people is better?
>>
>>103164575
>kurisu OP
Yeah, this is the thread
>>
>>103165392
Nta but keep in mind he always samefags for optics; that alone already makes him a mentally ill schizo.
>>
>>103165373
How can you be so new
>>
>>103165455
I agree that mikubaker samefags for optics.
>>
>>103165455
Here it is >>103165466
>>
>>103165466
trvke
>>
File: 1722578820348464.jpg (396 KB, 1726x1726)
>>103164609
VGH she's such a gem
>>
>>103164575
I know you're a troll spammer who doesn't give a fuck about Kurisu but, damn, I hope the remake will be good. I love Steins;Gate.
https://youtu.be/dmmnx4VQmPU
>>
File: 1723986939536441.png (1.31 MB, 1024x1024)
>>103165502
>>>/a/
>>
>>103165535
kek, saved.
>>
>>103164575
the troon is back I see. odd he usually only pops up on a large release
>>
>>103165632
Uhm... xe is always here spamming anime pics and melting over non-miku OPs though
>>
finally, a good OP
>>
File: 1730169243049.jpg (677 KB, 3834x1000)
>>103165089
He looks like this BTW
>>
>>103165535
kek that's my gen I posted on /ldg/ a few months ago :v
>>
>>103164575
omg it kurisu
>>
For what it is worth this thread revitalized /lmg/ by forcing the ritual poster to samefag and pretend to have a discussion.
>>
File: file.png (263 KB, 1160x762)
>>
>>103167225
overfitted nothingburger
>>
still no ministral support...
>>
Thanks for the OP.
I got mistral7b on kobold and sillytavern working, my first local llm use. I checked the answers to a few questions against gpt4o and I was happy with the answers.
So, thanks to the rentry people.
>>103165502
I'm hoping for an anime remake to introduce zoomers to the series
>>
File: 1707000371361654.png (426 KB, 1718x1278)
>>
>>103167561
doesn't it already work fine with llamacpp?
>>
>>103167561
huh? FFTing seems to work and EXL2 works if you use the HF conversion to quant.
>>
>>103168591
>FFT
What? Also I would expect exl2 quants to be on hf but nope. ggoofs are obviously broken even with hf conversion.
>>
https://github.com/t41372/Open-LLM-VTuber/
I got this semi-working with whisper.cpp, Bark, and ollama with Llama3.2-vision (11B).
I say semi-working because it listens and generates one response then stops. Something in the front-end code isn't working; if I reload the tab I can get another response. I might investigate more later.
whisper.cpp works very well. Had to manually generate a CoreML model, which was a moderate pain; the scripts ggerganov and friends make always seem so half-baked, but they're actually building shit and it mostly works so I should stop complaining.
I chose Bark for TTS because it can do code-switching, but the responses take ~2 minutes to generate on my M1 Pro so it's not usable until they add GPU or CoreML support for mac. There's no issue open for it though so I'm not holding my breath.

The Live2D seems to work well with the lip sync, though I'm not sure yet if expressions are working.

There's also this fork https://github.com/ylxmf2005/LLM-Live2D-Desktop-Assitant
which seems to add a bunch of features and has a better Elaina-flavoured TTS. I haven't tried it at all.

Anyone else give it a whirl?
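For anyone curious, one turn of the listen -> think -> speak loop boils down to roughly this. Binary paths, model names, and the ollama endpoint shape here are my guesses at a typical setup, not the project's actual code:
[code]
import json
import subprocess
import urllib.request

def transcribe(wav_path: str) -> str:
    # whisper.cpp CLI; -nt drops timestamps so stdout is just the text
    out = subprocess.run(
        ["./whisper.cpp/main", "-m", "models/ggml-base.en.bin",
         "-f", wav_path, "-nt"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

def respond(text: str) -> str:
    # ollama's non-streaming generate endpoint returns {"response": "..."}
    body = json.dumps({"model": "llama3.2-vision:11b",
                       "prompt": text, "stream": False}).encode()
    req = urllib.request.Request("http://localhost:11434/api/generate",
                                 data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

reply = respond(transcribe("mic_capture.wav"))
print(reply)  # hand this to the TTS of your choice (Bark, sovits, ...)
[/code]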
>>
>>103168721
hi petra
>>
File: channels4_profile.jpg (138 KB, 900x900)
>>103168782
the penguin from nijisanji?
>>
good rape prompts for mistral large?
>>
>>103168625
FFT = Full finetuning
EXL2 = https://huggingface.co/lucyknada/prince-canuma_Ministral-8B-Instruct-2410-HF-exl2

I was using these and they worked
>>
>>103166540
great gen!
>>
$2,000 USD for a 64GB Mac mini...
anyone try Llama 3.2 Vision 90B yet?
>>
File: thinking.jpg (9 KB, 225x225)
>>103164575
>>103167339
What is the best micro model for writing creative text snippets and is licensed for commercial use?

I'm building a game and I want it to run an LLM to write descriptions of NPCs and objects based on stats.

Looking for maximum speed even on mid-range cards. I was considering Llama-3.2-1B but the license is restrictive.

Is there something like Mistral for 1B?
>>
File: kurigohan.jpg (659 KB, 1404x2048)
it's fine to bump this right..
>>
>>103168721
Man, a single component of these can be enough to make you work on it for weeks before getting proper and stable results. That's not the kind of project you can put together in an afternoon.
t. fixing sovits for three weeks rn
>>
>>103169708
Just use 3.2-1b to see if the idea is worth it. Once you know it works, worry about the license. If you can do with something much dumber but still fast that you could eventually train on a dataset for your task, look at
>https://huggingface.co/allenai/OLMoE-1B-7B-0924-Instruct
Or olmoe if ram is not an issue. Works just as fast. IBM also released much smaller moe models you could try, I don't remember their license. Apache probably.
https://huggingface.co/ibm-granite/granite-3.0-3b-a800m-instruct
https://huggingface.co/ibm-granite/granite-3.0-1b-a400m-instruct
You won't find a model that works 100% consistently, just good enough.
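To sanity-check the idea before worrying about licenses, something like this against a local OpenAI-compatible server works (llama-server and ollama both expose /v1/chat/completions); the URL, model name, and stat fields below are placeholders for whatever your setup uses:
[code]
import json
import urllib.request

def describe_npc(stats: dict) -> str:
    prompt = ("Write a two-sentence description of an NPC with these stats. "
              "No lists, no markdown.\n" + json.dumps(stats))
    body = json.dumps({
        "model": "llama-3.2-1b-instruct",  # placeholder model name
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 80,
        "temperature": 0.8,
    }).encode()
    req = urllib.request.Request(
        "http://localhost:8080/v1/chat/completions",
        data=body, headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

print(describe_npc({"class": "blacksmith", "str": 14, "cha": 7, "mood": "grumpy"}))
[/code]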
>>
>>103168914
It seems to work in the same way as ggufs. Ok at first, but once you fill up the context it becomes incoherent, so exl2 probably doesn't have the implementation for that SWA yet. I mean, if it did, then turboderp would probably make quants himself.
>>
File: Screenshot_286.png (56 KB, 1839x480)
https://huggingface.co/anthracite-org/magnum-v4-27b

I copy pasted the instruct template text into magnum.json and a portion of it is glowing red, and tavern does not see this .json file (the file for the context template is visible)

also, what's the sampler preset for magnum?
>>
File: m27.png (35 KB, 645x373)
>>103173081
chatml
>>
>>103173081
>>103173122 (cont)
Specifically about that error, the " needs to be escaped. They cannot be trusted with a fucking json file.
>>
File: Screenshot_287.png (847 B, 218x19)
>>103173166
Like this?

What sampler preset is recommended for magnum?
>>
>>103173352
nta probably just escaping the quotes so they're not interpreted as ending the value of system_prompt.
\"!\" and \"~\"
>>
>>103173352
Yeah. This: >>103173361
I don't use it, but don't worry about the presets. Start with everything neutralised and tune as you see fit. Experiment.
Or use
>Sampler visualizer: https://artefact2.github.io/llm-sampling
to get a more intuitive understanding of what they do.
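If you want the intuition without the site, here's a toy sketch of what top-p (nucleus) filtering does to a distribution; the numbers are made up:
[code]
import numpy as np

def top_p_filter(probs: np.ndarray, p: float = 0.9) -> np.ndarray:
    # keep the smallest set of highest-probability tokens whose mass reaches p
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1  # first index where cumulative mass >= p
    keep = order[:cutoff]
    out = np.zeros_like(probs)
    out[keep] = probs[keep]
    return out / out.sum()  # renormalize the survivors

probs = np.array([0.5, 0.2, 0.15, 0.1, 0.05])
print(top_p_filter(probs, p=0.8))  # only the top three tokens survive
[/code]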
>>
>>103173409
thank you
>>
Any good voice-based memes lately? The Star Wars and Richard Nixon shit was amazing.
>>
New claude slops in the new generation of erp sloptunes
>i'm not some common harlot/whore (in response to anything inappropriate, though recent sloptunes rarely ever deny you)
>don't you dare fucking stop
>make me scream
>make me yours/mark me as yours
>>
nakurisudashi
>>
why are there 2 threads?
>>
Is a local model with the intent of using it as a programming assistant actually worth it? Or are they all shit compared to the openai/anthropic alternatives?
>>
>>103174122
Test it yourself. Apparently, qwen coder 32b is pretty good. If not, go back to whatever you like most.
>>
>>103174122
they are shit and not worth it for anything other than story writing or cooming
>>
>>103174122
I use deepseek's online chat and it's not bad.
>>
>voice cloning still sucks
/lmg/ was mistake
>>
>>103174201
But they are shit and not worth it for story writing or cooming either...
>>
>>103174122
Qwen2.5 32B coder. It's 90% there with the best enterprise and can RP while coding.
>>
QRD on thread split?
>>
>>103174407
is it censored?
>>
>>103174430
No, unlike 2.5 72B chat was
>>
>>103174412
trannies throwing a fit, ignore
>>
>>103174412
Like other anon said.
Two tranny camps war for OP pic with their FOTM waifu of choice, happens every time OP makes non-miku thread.
>>
>>103174386
I mean... You aren't wrong.
>>
>>103174412
The same as always. The right thread is always the one with the recap btw.
>>
>>103174434
So is it actually good for cooming?
>>
>>103174553
t. mentally ill mikutranny
>>
>>103174589

>>103158694
>>
>>103174603
Local model ERP has taught me what purple prose is. And it has taught me that I absolutely hate it.
>>
>>103174623
Well that was with a system prompt telling it to be vivid / use all senses.
>>
>>103174646
I don't think prompts can do that much. And especially with the context filled up. It will always start reverting to the default model style, which is always purple prose poetic slop.
>>
File: Qwen32B.png (9 KB, 801x636)
Qwen2.5 32B coder one shot tetris for me btw.

https://files.catbox.moe/heo220.py
>>
>>103174742
>knows what tetris is
>knows how to code it
>doesn't know how to suck dick the way I want it
Current year dystopia personified
>>
>>103174700
They DO do that much if you have an even slightly competent model. Tell it to write in the style of a somewhat popular author and watch.
>>
>>103174763
Nothing gets me off like my waifu coding me games on the fly.
>>
>>103174810
I want to prompt "send nudes" and get omnimodal-generated nudes immediately without going through hoops of setting up a gymnastics pipeline
>>
okay, feed me people
how does one implement memory if you make anything? I've heard that people just tell the AI to make its own memory?
>>
>>103174833
We are only 2 years in, give it another.
>>
>>103174838
Have the model (or a different, smaller model) summarize whatever needs to be remembered, add it to an embeddings database, query the database for relevant information, inject it into the model's context when needed.
RAG, basically.
If none of that made sense, read on RAG. You'll have to code the stuff together or use something like langchain. It's not something that can be fed through posts.
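A bare-bones sketch of that loop, assuming sentence-transformers for the embeddings; the facts, model name, and top-k are placeholders:
[code]
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
memories = []  # summarized facts worth remembering
vectors = []   # their embeddings, kept in sync

def remember(fact: str) -> None:
    memories.append(fact)
    vectors.append(embedder.encode(fact, normalize_embeddings=True))

def recall(query: str, k: int = 3) -> list:
    if not memories:
        return []
    q = embedder.encode(query, normalize_embeddings=True)
    scores = np.stack(vectors) @ q  # cosine similarity, since vectors are normalized
    return [memories[i] for i in np.argsort(scores)[::-1][:k]]

remember("User's cat is named Miso.")
remember("User prefers concise answers.")
context = "\n".join(recall("what's my cat called?"))
# inject `context` into the system prompt before the next generation
[/code]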
>>
>>103174838
>how do you solve the AI gf problem
We wouldn't be posting here if it was already solved.
>>
>>103174929
that sounds basically like what I had in mind. Faster model to summarize things and save it, and then query it
though it all sounds so ugly. This project I am on right now already does this in plenty of parts, where instead of making algorithms in code it just asks the model. I guess this is the future now, huh?
>>
>>103174991
>though it all sounds so ugly
Yeah. It's as reliable as the models used. Never bothered to make something like it, but it'd still be interesting. Maybe one day...
>where instead of making algorithms in code it just asks the model. I guess this is the future now, huh?
If you meant what I think you did, that's not the case for me. I like programming. I take pride in figuring stuff out on my own, even if my implementation is less than optimal.
>>
>>103175149
Tesla apparently replaced much of its C++ code with just asking the AI for results, so I meant this is basically what the industry and everything is going to move towards
>>
>>103174929
I believe RAG is another grift that tries to sell an alternative to continuous learning. It's a dead end in the long term. But somehow everyone is shilling it, I attended an Nvidia seminar last month and they talked about RAG like it's the holy grail
>>
>>103175233
NTA
how does RAG differ from continuous learning?
>>
>>103175186
Maybe. I'd trust that statement much more from someone who *isn't* selling AI. There's people that often do long divisions or look for words in a physical dictionary. Some people repair their own cars, draw and play instruments. They're not having fun while programming. I do.
>>
>>103175233
>I believe RAG is another grift that tries to sell an alternative to continuous learning
I see it as the best thing we have *until* we get continuous learning, if we ever do. It cannot be a replacement for something we don't have.
>>
File: file.png (112 KB, 1258x526)
What a rebel model, what about my fucking python script?
>>
What's the smallest/fastest uncensored model that can summarize a 22k context long multi-part story? I tried Dolphin Nemo and it failed spectacularly, started inventing plots that didn't exist at all in every attempt. Dolphin finetunes have been good for me in the past but it does say "The base model has 128K context, and our finetuning used 8192 sequence length." So I'm not sure if that's the issue or is Nemo just too stupid for that, didn't try the normal instruct yet. I don't care much about roleplay flavor enhancers, but I'd prefer a decensored model in a way that causes as little brain damage as possible. Mistral Small Instruct seems to remember the story at first try with 80% accuracy (forgot one part)
>>
>>103175320
Check your email anon.
Look at how coding sensei does it btw. You need to tell it to give requested scripts / code in code blocks.
>>
>>103174767
>They DO do that much if you have an even slightly competent model.
I asked qwen coder to write in the style of an ERP forum user and to avoid purple prose and flowery language. I asked it to give me 3 different ways a character would talk. And after I saw all the shit I despise I told it:
>It is all so poetic...
And what I got in return is:
>Glad you like it! Now let's continue blah blah blah
At least I had a chuckle at how completely autistic the model is.
>>
File: 1653702138732.jpg (60 KB, 900x900)
What the fuck are you guys even saying? Is it even English? Half the words you use don't make any sense. I wonder if this is how normies felt about me talking about anime back in 2008.

You guys are weird. I would shove you in a locker if I could.
>>
>>103175683
>2008
Akira was at the end of the 80s and GitS came out before 2k. Oh.. you were in school in 2008? I see....
>>
>>103174742
Get it to make Tetris but with circles that can roll around if they are jostled by another circle landing nearby.
>>
Don't mind the retards >>103174929 >>103174962 they never read papers as always.
Here is your solution without summarizing: https://arxiv.org/abs/2409.05591
>>
>>103175683
The only thing you're shoving is groceries in my bag
Speaking of, you should probably get back to work
>>
File: only2.png (46 KB, 680x124)
>>103175759
I'm talking about things available now that you can do with any good-enough model. That one you linked requires training.
>>
>>103175852
You need training to make a memory model, not to use it. You feed it your dataset and link it to whatever model you want as a generator.
>>
>>103176056
NTA
are there available "memory models" that one can just use? If not, then one should just use RAG?
>>
>>103175500
Dolphin fine tunes have always been overhyped and pretty bad in my experience.
I still remember when people were praising their mistral 8x7b tune only to find out that the fine tune script everybody was using was broken.
Try the official instruct fine tune of nemo. It should be able to cope with 22k of text without much weirdness if you use greedy sampling, don't inject unnecessary instructions in the context, etc.
Failing that, deepseek in their website does pretty well with long, long texts.
>>
>>103176056
There's a million papers, with a million demos. That's all they are until they're taken seriously either by big model makers or inference software devs. The former doesn't guarantee it, less so the latter.
RAG, as clunky as it is, can be duct-taped together with any inference software that supports embeddings.
>>
>>103176228
>Dolphin fine tunes have always been over hyped and pretty bad in my experience.
You are talking about the famous AI researcher Eric Hartford who once said that frankenmerging l3 70B with itself makes it incredibly intelligent and humanlike. Or something like that.
>>
>>103164618
Based.
>>103164678
Ultra based.
>>103173457
>>103173457
>>103173457
Giga based.
>>
>>103174359
The devs for MaskGCT said they're going to include long form audio.
https://github.com/open-mmlab/Amphion/issues/290
>>
>>103176213
Yeah, they are there: https://huggingface.co/TommyChien. Read the paper to understand the difference.
>>103176439
RAG is dumb af as they explained in the same paper. It doesn't know what to retrieve from the memory, which leads to worse generation.
>>
>>103176228
So is it always going to be a choice between censored with good memory or decensored with bad memory? Is there no way to decensor a model without damaging it?
>>
>>103177154
Nemo-instruct isn't censored.
At least it never refused anything I asked of it.
But if you want a fine tune that seemingly didn't make the base model any dumber, try rocinante v1.1.
I can't really speak for models other than Nemo as that's about the largest thing I can comfortably run.
Actually, that's not true. Mixtral 8x7b instruct might actually be able to do what you need too.
Or CommandR, although I don't remember how big the context window on that one is. I tried it (at excruciatingly low speeds) and it was really good.
>>
>>103177034
>RAG is dumb af as they explained in the same paper.
Irrelevant to the original point and your first post in the chain. Just look how anon asked the question: >>103174838. Does it sound like he has any idea of what he's talking about? I gave him *an* answer to his question.
MemoRAG IS RAG.
>>
>>103177196
Sure? Whatever floats your boat, I guess.
>>
File: rag.png (97 KB, 857x433)
>>103177240
Same boat, mate. Same boat.
>>
Aside from just having long context, the next most proper way to do long-context memory would be to somehow divide the work into pieces/layers that can be processed in parallel (or sequentially) and then added together in a way where the entire context gets to affect the generation of the next token. Though I don't know if this is even possible. Searching through a web of information might be the only solution, but that has the risk of disregarding important nuances.
>>
It'll take a while for the dreamers and the hypemen to admit that ML isn't going to create a god, and some stocks will get hurt bad when it finally sinks in, but longer term it'll be good for a more realistic discourse around LLMs. A lot of the safety bullshit will dry up once people are forced to accept that they're going to cap out at "useful assistants" and can never become a competitor species.
>>
What's an entry-level development machine build for local models?

I'm considering 1 3060, upgradeable to 2 or more 3060s but would it be worth it if they're relatively cheaper or is it better to go for a better single GPU build?

What models? No clue yet.
>>
>>103164575
Why are LMs more woke than Chatgpt?
>>
>>103177573
Bad idea. Get used 3090s or other 24GB+ cards. You'll only have so many slots / motherboards / power connectors.
>>
>>103177597
This has only been true since around June when OpenAI started releasing looser finetunes and giving it a more entertaining personality
hopefully other labs will take cues from them
>>
File: cudadev.jpg (1.96 MB, 4000x3000)
>>103177573
Vague question. For reference, CUDA Dev has picrel monster.
You can develop inference software on just a CPU if the models you're working on are small enough. And you can train 100M models as well, but it'll still be slow and tedious. If you can buy a single gpu, buy a 3090.
You need to be more specific than "how to AI".
>>
>>103177560
Current models are limited by context and having to start every problem from scratch. As soon as infinite memory is solved, LLMs will be able to consider every detail in the world, generate new information and use it for vastly superior problem-solving compared to humans, scaling linearly with the compute you throw at it.
>>
>>103177632
Speaking of cuda dev, he posted in the real thread, btw: >>103177202
He's not as stupid as you, it seems; he can tell which one is the real one.
>>
>>103177679
You seem to be in love with that codemonkey, go and suck him off I guess
>>
>>103177632
>6x4090 on a single PSU
North America bros... why did we get such shitty electrical standards?
>>
>>103177736
>SilverStone has already done so with the HELA 2050; as its model number implies, it can deliver up to 2050 W of power with 230 V input. With 115 V input, it is capable of providing 1650 W since standard wall sockets cannot deliver more than 15 A.
haha
>>
>>103177608
>>103177632
I appreciate the advice, I know I should be more specific and I've already run Ollama on an i7 4790 cpu-only and it was meh.

I have a potential client that's the kind to jump on every bandwagon (guy bankrupted one of his companies moving to cloud) and now that he wants AI I'm trying to squeeze a dev box (3090/4090) out of him, I don't really do much AI on my day to day so if I end up building it out of pocket I want to go as cheap as possible.

From my knowledge of his different businesses it'll probably be OCR/computer vision or support chatbots but we haven't gone through the specifics.
>>
>>103177679
Wait. People can use multiple threads at the same time? No waaaaaaayyyy
>>
>>103177850
Get whatever you can get that allows you to upgrade the most if needed. A 12gb gpu will become clutter and wasted money if you need to go monster build. Buying a mobo with 4 ddr4 ram slots will be a waste if you need to go cpumaxxing.... you get the point. But not yet...
Practice vision stuff with small models like
>https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5
Or whatever you can already run on your setup. If you're going for support bot, just run llama-3.2-1b until you have a UI to show. Figure out what you can do with simple tools. Then, *once you know you can do it* and you tell him your realistic expectations, ask for a budget for a big build where you can develop more comfortably, safe in the knowledge that the product you make cannot possibly be worse than your demo.
He's a grifter. I hope you aren't.
>>
>>103177679
You really are mentally ill. And I am not even trying to insult you at this point.
>>
File: file.png (523 KB, 768x768)
>>
>>103178135
Yeah, there's zero value in adding AI to anything that guy does but might as well get something out of it.


