/g/ - Technology

File: 1759753823006.jpg (90 KB, 1200x982)
I have a quad 3090 setup with qwen code 30b, with 30k context. I send one message with the qwen code CLI telling it to review my files and explain the project as a test. It gets through 3 files before the context runs out. This was with maybe 500 lines of code total. What gives? A chatbot can go on forever, but the moment I try to use an agent it's pretty much worthless. I'm at the point of trying to set up context management and RAG to mimic even a fraction of what Gemini's app builder does. I don't want the cloud. I just want my LLM on my hardware. Hundreds of gigs of RAM and VRAM. I should be able to at least have it review some files. Is this shit just ass or am I the brainlet?
>>
first of all, 30k context is absolutely minuscule for what you're asking of it; much smaller models can remain coherent with much bigger contexts. second of all, the big chatbots have code that intervenes when the context fills up and summarizes what's happened so far. That's why after a while they still forget who said what and gaslight the shit out of you.
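A minimal sketch of the intervention described above: when the transcript nears the context budget, older turns get collapsed into a summary and only the most recent turns are kept verbatim. Here `summarize` is a hypothetical stand-in for a call back into the model itself, and the 4-chars-per-token heuristic is just a rough guess.

```python
def rough_tokens(text: str) -> int:
    # crude heuristic: ~4 characters per token for English text
    return max(1, len(text) // 4)

def compact_history(turns, limit, keep_recent=4,
                    summarize=lambda old: "[summary of %d turns]" % len(old)):
    """Collapse everything but the last `keep_recent` turns once over budget."""
    if sum(rough_tokens(t) for t in turns) <= limit or len(turns) <= keep_recent:
        return turns
    old, recent = turns[:-keep_recent], turns[-keep_recent:]
    return [summarize(old)] + recent
```

This is why the remote chatbots "go on forever" but also start misremembering: the old turns literally aren't there anymore, only a lossy summary of them.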
>>
>>107652113
I mean the context options go 16k, 30k, 60k, then 128k. Idk if 30k is really that small for reviewing a few cs files.
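30k goes faster than it sounds because an agent re-sends the system prompt plus the entire transcript, including every tool result, on every call. A back-of-the-envelope sketch (the token figures below are illustrative guesses, not measurements of qwen code):

```python
def agent_context_used(system_prompt_tokens, tokens_per_file, files_read):
    """Prompt size for the call made after `files_read` files: every
    earlier file's contents are still sitting in the transcript."""
    return system_prompt_tokens + files_read * tokens_per_file

# assumed: ~8k-token system/tool prompt, ~6k tokens per file once you
# count the file text, the tool-call wrapper, and the model's commentary
for n in range(1, 5):
    print(n, "files:", agent_context_used(8000, 6000, n))
```

With those guesses the 30k window dies right around file 3 or 4, which lines up with what OP is seeing.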
>>
>>107652098
>quad 3090
Ayo nigga gimme one you don't need all 4.
>>
>>107652098
yea this happens with legacy cards
>>
>>107652098
I guess you want to increase context to about what the remote LLMs would have.
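On 4x3090 (96 GB VRAM) a much bigger window is plausibly affordable: the KV cache costs roughly 2 (K and V) x layers x kv_heads x head_dim x bytes per element for every cached token. The layer/head numbers below are illustrative placeholders, not the actual qwen 30b config:

```python
def kv_cache_gb(ctx_tokens, layers=48, kv_heads=8, head_dim=128, bytes_per=2):
    # fp16 K and V vectors stored for every layer at every position
    per_token_bytes = 2 * layers * kv_heads * head_dim * bytes_per
    return ctx_tokens * per_token_bytes / 1024**3

print(kv_cache_gb(32768))   # → 6.0 (GB)
print(kv_cache_gb(131072))  # → 24.0 (GB)
```

So under these assumptions even a 128k cache fits alongside the weights with room to spare; the practical ceiling is whatever the model was trained to handle.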
>>
>>107652098
Try to see if you can find out how much of the context window it's using as it goes, and/or what it's putting into the context window. I know the first is possible since I've seen tools that show that stat, but I'm not sure how to do it exactly since I use proprietary tools internal to the company I work for.
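One way to get that stat locally: OpenAI-compatible servers (llama.cpp's llama-server, vLLM, and similar) return a `usage` object with each chat completion, which is enough to log how full the window is after every agent step. The endpoint URL is an assumption about a typical local setup.

```python
import json
import urllib.request

def context_report(response: dict, window: int) -> str:
    """Summarize the `usage` field of a chat-completion response."""
    u = response["usage"]
    used = u["prompt_tokens"] + u["completion_tokens"]
    return f"{used}/{window} tokens ({100 * used // window}% of window)"

def ask(prompt, url="http://localhost:8080/v1/chat/completions"):
    # assumed local server; point this at wherever your model is serving
    req = urllib.request.Request(
        url,
        data=json.dumps({"messages": [{"role": "user", "content": prompt}]}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# example with a canned response, since real numbers depend on your server:
fake = {"usage": {"prompt_tokens": 26000, "completion_tokens": 1500}}
print(context_report(fake, 30000))  # → 27500/30000 tokens (91% of window)
```

Printing that after every tool call would show exactly which step eats the window.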


