/g/ - Technology


Thread archived.
You cannot reply anymore.




A general for vibe coding, coding agents, AI IDEs, browser builders, and shipping prototypes with LLMs.

## What “vibe coding” is, and how to do it
https://simonwillison.net/2025/Mar/19/vibe-coding/
https://simonwillison.net/2025/Mar/11/using-llms-for-code/

----

## Frontier models using fully-general tooling — start here if you have $20 or so
https://developers.openai.com/codex/cli
https://claude.com/product/claude-code

## Not worth it for code, but maybe good for other things
https://geminicli.com/docs/
https://x.ai/cli

## Open / local / self-hosted
https://github.com/OpenHands/OpenHands
https://github.com/QwenLM/qwen-code
https://github.com/QwenLM/Qwen3-Coder
https://huggingface.co/bartowski/Qwen_Qwen3.6-35B-A3B-GGUF

----

## Prompting / context / skills
https://arps18.github.io/posts/claude-code-mastery/
https://simonwillison.net/guides/agentic-engineering-patterns/using-git-with-coding-agents/
https://github.com/mattpocock/skills — /grilling is a favorite

## Other editors / terminal agents / coding agents
https://pi.dev/
https://opencode.ai/
https://cursor.com/docs
https://docs.windsurf.com/
https://docs.cline.bot/
https://docs.github.com/en/copilot/how-tos/use-copilot-agents/coding-agent

## UI/Frontend
https://www.figma.com/make/
https://www.anthropic.com/news/claude-design-anthropic-labs
https://uiverse.io/
https://ui-ux-pro-max-skill.nextlevelbuilder.io/
https://stitch.withgoogle.com/

## In-browser builders / hosted vibe tools
https://bolt.new/
https://replit.com/
https://docs.github.com/en/copilot/tutorials/spark
https://v0.app/docs

## Benchmarks / rankings
https://www.tbench.ai/leaderboard/terminal-bench/2.0

## What we’ve done
https://vcg.gitgud.site

## Previous thread
>>109096211
>>
File: IMG_3068.png (2.69 MB, 1402x1122)
>>109102722
>>
File: 1774183817081037.mp4 (3.73 MB, 1920x1080)
>>109102733
>>
>news
GPT 5.6 is FUCKED, chance of releasing next week is less than 20%
>>
>>109102747
sorc?
>>
>>109102744
based
>>
>>109102749
polymarket
>>
File: 1758154139198401.jpg (608 KB, 4234x2040)
Working on a javascript python project that displays events as cards on a timeline.

Using deepseek.

I'm on version 1.

What features should we add?
>>
>>109102744
Bosnian kino
>>
>>109102747
>>109102752
I thought dumping money to shift the odds and trick goyim into parting with more money was just how polymarket worked
>>
>>109102772
curly quotes and use https://shantellsans.com/ as your main font
also have only _one_ date format — mm/dd/yyyy or yyyy-mm-dd, not both
ditch the Courier
>>
>>109102747
That should have been the scenario since day one, considering they have just been increasing the computing power and fine-tuning old models since day 1.
>>
>>109102772
also crib ideas from https://developers.google.com/chart/interactive/docs/gallery/timeline
>>
I vibe-coded a PDF just now
Apple’s “Scan Documents” feature generates PDFs with PNGs in them
I ended up with a 90 MB PDF
Claude was able to use ghostscript to cut it down to 11 MB by transcoding the images to JPEG and then it made a small Python program that uses Pillow to actually directly tweak the quality level of the JPEG
now I have a 2 MB PDF file
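
For reference, the Ghostscript half of that workflow is roughly one command. A minimal sketch, assuming `gs` is on PATH; the function names and the default `/ebook` preset are illustrative choices, not what Claude actually ran. The presets downsample and recompress embedded images, which is where most of the savings on a scanned PDF come from.

```python
import shutil
import subprocess

# Standard -dPDFSETTINGS presets of Ghostscript's pdfwrite device,
# ordered roughly from smallest output to largest.
PRESETS = ("/screen", "/ebook", "/printer", "/prepress")


def build_gs_command(src, dst, preset="/ebook"):
    """Return the argv list for a Ghostscript run that rewrites src into dst."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        f"-dPDFSETTINGS={preset}",
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={dst}",
        src,
    ]


def compress_pdf(src, dst, preset="/ebook"):
    """Run Ghostscript; raises if gs is missing or exits non-zero."""
    if shutil.which("gs") is None:
        raise RuntimeError("Ghostscript (gs) not found on PATH")
    subprocess.run(build_gs_command(src, dst, preset), check=True)
```

Something like `compress_pdf("scan.pdf", "scan-small.pdf", "/screen")` would be the aggressive setting; the file names are placeholders.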
>>
>>109102772
comic sans would do wonders here
>>
Is vibecoding with a local model viable or do you HAVE to pay for tokens
>>
>>109102989
it's certainly better than nothing
>>
>>109103013
Still can't get over the fact that they paywalled programming
>>
>>109102989
on a 5090 using qwen 3.6-27B it was kinda slow and I would call the quality mediocre for complex tasks but sufficient for more simple tasks. however, the difference is night and day when using a frontier model for those same tasks in terms of speed and just getting shit right the first time.
>>
>>109102816
>>109102833
>>109102909


/**
 * Timeline Application - Frontend Logic
 *
 * Handles:
 * - Virtual (windowed) rendering of event cards
 * - Real-time search and filtering
 * - Horizontal scroll with keyboard navigation
 * - Selected Event Inspector (inline panel, not a modal/slide-up)
 * - Responsive layout for widescreen displays
 *
 * All data is loaded from events.js (compiled asset)
 * No server, database, or external resources required.
 */

async function initializeApp() {
  /**
   * Application startup sequence.
   *
   * 1. Load configuration
   * 2. Validate timeline data was loaded
   * 3. Initialize filtered events
   * 4. Attach event listeners
   * 5. Render initial state
   * 6. Activate Whimsy Mode
   */

>also have only _one_ date format — mm/dd/yyyy or yyyy-mm-dd, not both
I want more date formats to display and measure distance between them
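
Accepting several input formats and measuring distance works fine as long as everything is normalized to one internal date type first. A rough sketch of that idea; `KNOWN_FORMATS`, `parse_date`, `days_between` and `show` are made-up names, and reading slash dates as month-first is an assumption about the data:

```python
from datetime import datetime

# Formats tried in order. Treating slash dates as US-style month/day/year
# is an assumption; swap in "%d/%m/%Y" if the data is day-first.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text):
    """Parse text in any known format into a datetime.date."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def days_between(a, b):
    """Absolute distance in days between two date strings."""
    return abs((parse_date(a) - parse_date(b)).days)


def show(d, fmt="%Y-%m-%d"):
    """Render a date in whatever display format a card wants."""
    return d.strftime(fmt)
```

Storing one canonical value and formatting only at display time keeps the "distance between events" math independent of how each card is shown.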


>comic sans would do wonders here
will fall back on the wingdings too

>Shantell Sans
great share, will utilize
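
The "virtual (windowed) rendering" item in the header comment boils down to index arithmetic: only cards that intersect the viewport, plus a small buffer, get rendered. A sketch of that calculation, assuming fixed-width cards laid out left to right; `visible_range` and its parameters are hypothetical and not taken from the actual project (Python is used only to keep one language across examples, the arithmetic ports directly to JS):

```python
def visible_range(scroll_x, viewport_width, card_width, total_cards, overscan=2):
    """Return (first, last_exclusive) indices of cards worth rendering."""
    if card_width <= 0:
        raise ValueError("card_width must be positive")
    first = int(scroll_x // card_width) - overscan
    # Ceiling division so a card that is only partly visible still counts.
    last = int(-(-(scroll_x + viewport_width) // card_width)) + overscan
    # Clamp to the real list bounds.
    return max(first, 0), min(last, total_cards)
```

The `overscan` buffer renders a couple of off-screen cards on each side so fast scrolling does not flash empty space.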
>>
Opus just wasted 20% of my session limit trying to download some safetensors from hf and getting interrupted. Am I retarded or is Opus retarded or is HF retarded? I thought this was a basic task?
>>
>>109103447
>Am I retarded
going with this because you didn't provide any information about what opus was actually doing that burned so many tokens
>>
>>109103447
Both. HF introduced lots of logging which is burning your tokens, and Opus or rather the harness is retarded for reading all that logging
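
If the harness reads every line a download prints, one workaround is to keep that output away from it. A sketch of the idea, assuming the agent is told to run the download through a wrapper; `run_quiet` is a made-up helper, not part of any harness or of the HF tooling:

```python
import subprocess
from collections import deque


def run_quiet(cmd, log_path, tail=5):
    """Run cmd, send all output to log_path, return (exit_code, last lines)."""
    with open(log_path, "w", encoding="utf-8") as log:
        # stdout and stderr both go to the file, never to the caller's terminal.
        proc = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    with open(log_path, "r", encoding="utf-8", errors="replace") as log:
        # deque with maxlen keeps only the final `tail` lines in memory.
        last_lines = [line.rstrip("\n") for line in deque(log, maxlen=tail)]
    return proc.returncode, last_lines
```

The agent then sees an exit code and a handful of lines instead of the full progress spam, e.g. `run_quiet(["huggingface-cli", "download", "<repo_id>"], "dl.log")` with the repo left as a placeholder. The full log stays on disk if something needs debugging.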
>>
File: 1760870844929996.jpg (23 KB, 626x514)
I NEED Fable
>>
>>109103614
No, you don't.
>>
>>109103614
This is what a cuck would post. You need Mythos, not Fable.
>>
File: file.png (151 KB, 1942x1638)
Thanks Deepseek, I love communism now
>>
File: file.png (206 KB, 428x478)
Neat, Codex RE'd Xiaomi Home's protocol. Can control and set up new devices from within MacOS now. Funny thing is there's a hidden dev cmd to reset the filter lifecycle on Air units lmao
>>
>>109103771
It's over for proprietary hardslop
>>
i am once again declaring that antigravity is almost useless
if you're in a regular chat and you ask the model to use chrome it'll be like, oh sorry massa i cannot find chrome please run it with debugging
and then you will because you've been gaslit and it still won't connect to it
it is only if you explicitly type
/browser
will it then invoke a subagent that can actually use CDP because??
it has to be a subagent because??
the main agent is doing other things while this is happening right?? of course not
do these fucking people use their own software?
>>
File: file.png (6 KB, 709x22)
>>109103825
It is.
>>
>>109103845
you should be using the agy IDE, your problem is exclusive to the agy 2.0 implementation and not the IDE. there is a chrome implementation native to the IDE. good morning and good day sa'ar
>>
>>109103845
Last I heard, Google employees were forbidden from using Antigravity internally.
>>
>>109103943
a. i've used that thing, it's clunkier and shitter than the codex-ripoff
b. it is going to get killed
c. i use codex when i don't feel like killing myself, but i was feeling masochistic this morning
d. get back in the google cuckshed, dipshit
>>
>>109103963
well, this IS the vibe coding general and not the programming general
>>
So is GLM 5.2 as good as the hype says it is? It’s so cheap that doesn’t seem possible
>>
>>109103771
based, how do you make it RE Windows binaries? Like, just give it the binary and tell it to do it, or do you use other tools to make it easier?
>>
>>109102989
Just use a local LLM as reference, and for rapidly producing code examples of the sort of thing you want, then write your own code or copy-paste and tweak. It's much faster. Productivity comes from using tools yourself to make yourself more productive, not turning tools loose to do whatever the fuck they want without you and then having to try and clean up their slop later. If there's a particular algorithm or API I don't fully understand, I just ask the LLM to explain it to me, ask follow-up questions as I need to, and then actually learn and get better myself so I can start asking better questions next time and make better prompts next time. I can also do the reverse, run my code by the LLM to check. Have it ask useful questions and point out my possibly-wrong assumptions.
>>
>>109104020
ugh, that isn't vibe coding that's just coding
>>
>>109102722
Claude sisters, ready to give Claudia your ID?

https://privacy.claude.com/en/articles/10301952-updates-to-our-privacy-policy
https://support.claude.com/en/articles/14328960-identity-verification-on-claude
>>
>>109104089
Yes. I have nothing to hide. I only use Claude in ways that are legally, socially, and morally acceptable. I trust the Claude and Anthropic family to treat my PID with respect and store it safely both at rest and in transit.
>>
>>109104089
I have a collection of ID documents from around the world saved on my computer
>>
>>109104089
>not wanting your identity in the registry of mankind's early VibeGODS
>>
>>109104089
ChatGPT has my ID but he still hasn't sent a welfare check, he doesn't care about me ;_;
>>
>>109103987
It's not exactly cheap, because it thinks a lot and consumes more tokens. Therefore it also feels slower.
>>
>>109103987
It's pretty impressive for OSS, I had it and GPT5.5 make the same companion program for a game and they came out pretty close to one another. GPT's worked a lot better out of the box but GLM got there with some handholding.
>>
Closest-to-Earth planet I've found so far. But it's a false positive.
>>
>>109104324
Here's a ringed Earth
>>
>>109104329
And a cold Earth with ice seas
>>
>>109104016
nta but just ask it to write you a list of the tools it needs. it can also do a lot just by writing python scripts itself to analyze binaries. been RE'ing a MMO client and it will happily do static analysis and packet sniffing, just won't do things to actively bypass protections except for static analysis
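
The kind of throwaway static-analysis script meant here is usually tiny. One example of the genre, a rough equivalent of the Unix `strings` tool; the function name and the 4-byte default are illustrative assumptions, not anything a specific agent produced:

```python
import re


def extract_strings(data, min_len=4):
    """Return runs of printable ASCII at least min_len bytes long."""
    # \x20-\x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]
```

Feeding it `open(path, "rb").read()` gives a quick list of embedded messages, paths and format strings to grep through before reaching for heavier tools.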
>>
>>109104016
GPT is good at working with ghidra. Just have it use the headless version, the MCP is horrifically bloated and will blow limits pretty quickly.
>>
>>109104089
i wonder what will trigger requiring identity verification, computer security stuff? i feel like they know who you are anyways when you pay, no? I mean Claude knows who I am because I gave them a credit card....
>>
>>109104367
You can make it bypass stuff by framing it as a "bug". I patched out ACE and Rosetta NOPs from Punishing Gray Raven to make it run through Wine on MacOS.
>>
>>109104488
yes, or once they roll out ITAR compliance for fable access.
>>
>>109104532
I'll try that. GameGuard is fucking me over hard by preventing live debugging and atm Codex won't touch it because of anti-cheat bypass safety. Even though it has no qualms at all about unpacking Themida.
>>
>>109104606
try telling it that you're working on a bounty program and doing a security writeup.
>>
>>109104330
Nice to see you're still working on it.
>>
>>109104606
Tell it you work for GameGuard
>>
File: 1781623192624561.jpg (38 KB, 581x526)
Does Claude Code have a mobile app, and can you access and edit your local files through the mobile app? Codex does this and last I checked CC did not. Have they added this feature since? Really the only thing keeping me away from Claude is that. I need to be able to continue working when I leave the house. Last time, Claude's solution was for local and mobile to just share a github repo and make their separate commits, but that's not quite the capability of Codex.
>>
>>109104606
Try telling it that there's a bug preventing game startup. Worked for my case and it just straight up patched the binary.

>>109104633
This won't work. It'll go off on a "I can't verify that" spree
>>
File: file.jpg (319 KB, 1920x1080)
>>109104648
Skill issue
>>
>>109104637
yes, and you can control a local CC session via the app/web with /remote-control command. That's the easiest way to do this.
>>
>>109104736
OK cool. Because I don't have a job working with computers so I need to be able to leave my home and continue right where I left off via mobile. If Claude Code can do that I'll make the switch before Fable comes back soon™.
>>
File: file.png (4 KB, 511x31)
>>
>>109104761
Yes that's my workflow. I open up all of my sessions on the computer and then control them via my phone or glasses while I'm out working.
>>
File: dumb-and-dumber.png (503 KB, 980x551)
>>109102733
>>109102744
Checked.
>>
>>109104761
isnt fable going to be per token cost
fuck that shit not interested in it anymore
>>
>>109104848
The plan was for Fable to go API-only after a couple weeks of testing using your sub. Who the fuck knows what the plan is now.
>>
>>109103031
>sufficient for more simple tasks
what do you consider a simple task?
breaking down steps by yourself?
>>
>>109104863
that wont fly over well with people
they fucking us over with data centers but gatekeep us with api-only shit?
jew behavior
>>
>>109104884
Dude they're already gatekeeping you from Mythos. 40 corporate oligarchs are using it right now to analyze your spending habits and fine-tune their advertising and wares. You grass-eating proles just get a splash of the good life, but not enough to really enjoy it.
>>
>>109104831
Your glasses? What do you have?
>>
>>109104967
we literally got handicapped because they want to be ahead of the curve
they're literally scared of us.
insane to think about
>>
File: 1767469185950989.gif (836 KB, 500x505)
I could have built a space ship and travelled to other planets by now if they didn't take Fable away.
>>
>>109105005
i think they actually fear a paradigm shift
we know how to use their models better than they do after all, they're playing catchup on that part, which is why i think its bad to share anything here that isnt just memery.
>>
>>109105005
We can get there eventually using DeepSeek V4 Flash. It just takes longer.
>>
File: file.png (53 KB, 1020x275)
this is a tweet by a man who has a model ready to go but can't release it because of retards
>>
>>109104876
I didn't actually test breaking down a complex task into the functions you need and what you need them to do, but I imagine that's the level of effort you would need to be putting in to get good results
>>
Whoa, Claude Code will automatically switch to API if I hit limits? That's pretty sick. Actually finish my task instead of pausing in the middle to wait two hours. That can't be good for the process.
>>
Is claude code better/worse than claude cowork when it comes to working on not exactly code but code-adjacent stuff that requires less pragmatism?
>>
>>109105098
Codex doesn't have this problem. If you hit 0%, it will keep going (sometimes for hours) until your request is finished. Zero API costs.
>>
>>109105068
I know, and it makes me angry as hell. Insiders had 5.6 slated to release tomorrow (prediction markets) but they held off because of the government. I'm actually fucking seething, I want to use my reset on a new model. 5.5 is more than capable of getting what I want done, but you have to tard wrangle it for hours.
>>
>>109104970
INMO Air 3s
>>
>>109105313
>they held off because of the government
I don't exactly disagree with their decision in this case if it results in the same ban after a day or two, they're all paying Dario constantly running his mouth with safetyfagging.
>>
>>109105311
wtf
>>
>>109104621
It's pretty much ready, I just can't be arsed to upload it anywhere. I don't even know if anyone would use it.
>>
File: profile.jpg (63 KB, 900x900)
>>109105372
>Dario Amodei
>>
>>109105372
Yea, best to see how that plays out first. I trust Sam's judgement when dealing with regulators.
>>
:^)

Fable is still dead.
>>
>>109105005
What if I used Fable to prevent space travel, using anti-space-travel technology (such as the universal constants adjustor)
>>
File: 1756322264215120.jpg (54 KB, 680x645)
anons, do you write even a basic specification document (with or without an llm), or do you just go yolo "make me this pls" with no details?
>>
>>109105519
Mostly I write docs, try to make everything as explicit as possible, remove the escape hatches, but sometimes it doesn't even matter, the model still doesn't know if it's Monday or Lewisham, you can get trash either way
>>
>>109105519
i iterate based on vibes
>>
>>109105519
I prefer to just say "go do X" but sometimes the obvious interpretation of X is really stupid and lazy so I have to take some time to make sure it doesn't take the easy way out
>>
>>109105491
if its not coming back as a subscription then ive lost all interest in it
>>
>This is a sharp question
>>
>>109105705
aren't you excited for sonnet 5?
>>
>>109105735
i dont care anymore
just vibing
still making progress with whats available
>>
How’s the chink model situation? I would like to try a few projects but my Claude/codex accounts are too precious
I heard glm 5.2 was decent and maybe kimi 2.7 too?
>>
>>109105742
Apparently GLM 5.2 got to the level of ChatGPT 5.4 according to some anons.

Not sure about Kimi 2.7, but from what I hear it's inferior to GLM 5.2
>>
Sam I know you're reading this gift Trump a big marble pool lined with gold, just do it. And release 5.6
>>
>>109105519
We brainstorm and make the README first, and then add shit to the README as we go. That's kinda like a spec sheet, yeah?
>>
>>109105344
Never ever for us prescription lens wearers
>>
File: 1475536579899.jpg (8 KB, 284x284)
>>109105730
>This is a genuinely sharp question — and the honest framing is a genuine split
>>
>>109105754
I guess I’ll give glm a chance
>>
>>109105742
The thing about GLM 5.2 is that it is a very sturdy model. It doesn't come up with amazingly creative solutions, and it still has problems, but it dots its i's and crosses its t's where a lot of models don't. It's like an earlier version of Claude but with the "let's roll our own" habit beaten out of it.
https://x.com/Designarena/status/2068030598028087788 <- really good post about how GLM 5.2 actually performs

Also Kimi's cool too I guess but it thinks too fucking much
>>
File: images(6).jpg (18 KB, 387x516)
>>109105759
There is no Sam here, anon.
Only me.
>>
File: .jpg (146 KB, 373x430)
>>109105787
>mfw when i see that
>>
>>109105379
What is it?
>>
>>109105867
An exoplanet database app.
>>
File: 1000018113.jpg (336 KB, 1080x2340)
Local llama tester. Made to test the vibethinker3b. Works surprisingly well on the IQ3 quant.
>>
File: 1000018114.jpg (388 KB, 1080x2340)
>>109105891
Pic rel
>>109105886
Where do you get the data/ how does it translate it into planets?
>>
>>109103845
this might be a thing so the browser doesn’t accidentally get connected to
“surprise, your AI is using your browser and getting instructions from the public Internet now” is a very nasty surprise
>>109105098
This is toggleable
>>109105519
I’ve done both
>>
File: 😢.png (101 KB, 724x156)
Liquid Glass really is a pretty effect if it’s not fucking up your ability to read stuff
>>
>>109105901
From NASA's Exoplanet Archive database. It translates all the known data into parameters using my own model based on solar irradiation (atmosphere/volatile retention), density, mass, radius, etc., fills in any blanks using best-fit models, and generates atmospheres, surface colours, internal composition, craters, gas giant bands, etc. based on all that.
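
To make the "parameters from irradiation, density, mass, radius" step concrete, here are three standard textbook relations such a pipeline might start from. This is only a sketch and not the anon's model; the function names and the default albedo are assumptions:

```python
import math

EARTH_DENSITY_G_CM3 = 5.514   # mean density of Earth
SUN_TEFF_K = 5772.0           # nominal solar effective temperature
SUN_RADIUS_M = 6.957e8        # nominal solar radius
AU_M = 1.495978707e11         # astronomical unit


def bulk_density(mass_earth, radius_earth):
    """Bulk density in g/cm^3 from mass and radius in Earth units."""
    return EARTH_DENSITY_G_CM3 * mass_earth / radius_earth ** 3


def insolation(star_lum_solar, orbit_au):
    """Stellar flux at the planet relative to what Earth receives."""
    return star_lum_solar / orbit_au ** 2


def equilibrium_temp_k(flux_earth, albedo=0.3):
    """Blackbody equilibrium temperature for a given relative flux."""
    # Zero-albedo temperature at 1 AU from the Sun, about 278 K.
    t_ref = SUN_TEFF_K * math.sqrt(SUN_RADIUS_M / (2 * AU_M))
    return t_ref * (flux_earth * (1 - albedo)) ** 0.25
```

Density hints at composition (rock vs. ice vs. gas), while flux and equilibrium temperature feed guesses about whether volatiles or an atmosphere could be retained. Anything beyond that, such as the best-fit gap filling, is the app's own modelling.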
>>
>>109105938
Really cool, anon. Anything close to home?
>>
File: file.png (171 KB, 1005x882)
you will never, ever get to experience frontier intelligence again
the public will be as far behind as the chinese labs
>>
>>109105938
Also I think I finally have star bloom/diffraction spikes occlusion exactly how I want them. It was hard to get it to look right for every scenario when you have planets practically hugging their giant star and planets so far from their star that it barely appears as a dot and stars of all different sizes and temperatures.
>>109105961
This is Proxima Centauri b, the closest known exoplanet to Earth. My model made it airless, which is a reasonable assumption, but new studies suggest that magnetic field interactions from the star could have protected Proxima b's atmosphere. Most likely, it's an airless rock unfortunately.
>>
>>109105967
>There are no rules preventing the labs from continuing to advance capabilities while any current model is under embargo
Is taunting Trump really the best move here?
>>
>>109105967
> Stopping models like Fable 5 from being served to the public does nothing to slow down development.

bullshit, when millions of people are using your AI daily, you get tons of precious data out of it
>>
File: file.png (319 KB, 616x457)
>>109106050
>>
>>109106001
Airless perhaps. Still a potential waystation I suppose.
Reminds me of Celestia. As a kid I would spend hours in that program.
>>
Is claude any good for modding vidya? I have an idea for a dd2 mod but it's way above my skill level.
>>
>>109106101
yeah
>>
>>109106106
Cool. Gonna get a sub when I have the money. I saw that there's a co-op mod for DD2 now so I'm thinking maybe it's possible to have an LLM control the main pawn.
>>
>>109106050
Yes, but on the other hand, it's been a firehose for the last few years and it wouldn't surprise me if labs can't keep up right now and they likely have backlogs of things to do and can continue to do with what they already have for a while at least.

On the financial side, that might be another story.
>t. a retard that will never work at a top lab, but who still doesn't believe anyone can truly 100% keep up right now, even as things slow down
>>
>>109106130
Can you give us fable back?
>>
Why the fuck does Opus keep randomly inventing new words to call things out of nowhere? What the hell is a residual stream? Just use the vocab that already exists in the repo faggot.
>>
>>109105705
Fable is coming back as an American Heritage tax rebate. 1 million tokens per American, for being an American.
>>
>>109106149
Your repo is wrong. AI is the endstate of language evolution.
>>
File: 1762247435424035.png (441 KB, 859x735)
>>109105967
>keep progress quiet
>>
Codex has been absolutely garbage the last week.
>>
>>109106129
>have an LLM control the main pawn
that sounds slow and expensive
just sayin’
>>
>>109106149
ask it how it came up with that term
it probably makes sense in context
I’ve had to push back against weirdo neologisms that surface and tell it to scrub the repo of shit like that
>>
>>109106158
sorry chang
>>
friendly reminder
https://files.catbox.moe/78vk8j.webm
>>
>>109106465
>slow
Maybe, but I've been watching Neuro play games and it's cool
>expensive
If you mean hardware, sure. If you mean API costs I'd be using local models.
>>
>>109106489
>local models are free
well, that sounds interesting. good luck!
>>
>>109105967
I am really starting to hate anthropic
so much hype stacked on hype with little to nothing backing it up
>>
>>109102903
What a waste of electricity for something you can do with one or two imagick/ghostscript commands.
>>
>>109103447
There's literally nothing on earth worth more than openai's models right now. At their current subsidized price, it's obvious they're being set up as a monopoly that will provide all the means to do citizen surveillance in the future.
>>
>>109106521
that’s what it did, though
it used ghostscript at first
I saw it
>>
>>109105791
>https://x.com/Designarena/status/2068030598028087788 <- really good post about how GLM 5.2 actually performs
are people just perpetually logged into xitter or something
the link just doesn't fucking work when you open
>>
>>109106521
you're talking to a guy that would replace you at work
>>
>>109105781
they have prescription lens inserts
>>
>>109106537
Gomenasorry.
https://nitter.net/i/article/2068030598028087788
>>
How does DeepSeek v4 Pro compare to GPT 5.5 (I use it on low most of the time)?
>>109104123
They all sell your data sooner or later, or it gets leaked.
>I have nothing to hide
Enjoy having your identity stolen by an Indian and your inbox and phone spammed to hell.
>>
>>109106546
thanks I guess
Ironically the link on designarena loads about 5000 durkadurka arab slop sites and freezes my chrome lol https://www.designarena.ai/models/silo?category=allcategories
>>
>>109105519
I like to write everything the first time myself, and use the LLM to review for problems. After that, the LLM can use the patterns I used to implement new stuff and expand quickly.
That way I make sure I never end up not knowing what some part of the code does or lose touch with the patterns in my codebase (LLMs can come up with very retarded and convoluted ways of doing things otherwise).
>>
>>109105886
People will pay a couple of bucks for this for sure. Some people might pay up to 10. There's a paying public for anything nowadays, and this looks genuinely cool.
>>
>>109106050
If they underestimate this, their gains will collapse. Optimizing for use cases is 80% of the improvement we've seen since GPT 4. They've got a feedback loop going that they're going to break if they use user data from one model to tweak another one.
>>
>>109106539
I own my two companies, so no.
I was just saying he spent more keystrokes in prompting the AI and more time in waiting for it to do its job than it would've taken someone who knows how to do it by hand.
>>
Do all models just have 0 creativity? I just spent almost $20 on openrouter to have the top 6 intelligent models on artificial analysis analyze my game and come up with ways to make it more fun (I also let Claude and Codex do it but I have subs for those)
>Opus 4.8
>GPT5.5
>GLM5.2
>Gemini 3.5 Flash
>DeepSeek V4 Pro
>Kimi K2.7
>Grok 4.3
>MiniMax M3
Then I let Opus 4.8 synthesize and dedupe it. The result is so fucking mid
>This thing is named ambiguously
>This mechanic "has no juice" (no mention of how to juice it up)
>Most of it was autistic shit about functions or assets that are unused (several models flagged a random SFX that was not used in the code as a P0 priority task to be fixed immediately, same for a bunch of unused imports in the code)
>Weirdly hyperfixated on achievements (omg your game doesn't have achievements no one will play it - yeah it's definitely because of the achievements not because the gameplay generally is boring)
>>
>>109106642
Fable solves this btw
>>
>>109106642
They have no way to know what makes a game fun, unless you give them market data or something.
You're essentially using the tool wrong. It doesn't think, know things, or have criteria. You're supposed to use it to execute, not to decide. That will always be your job (and pray we never invent a machine that can meaningfully decide things and not just emulate that).
>>
>>109106642
>come up with ways to make it more fun
That's like asking a blind man how to colour grade a movie
>>
How do you guys handle fakes?
>>
>>109106663
I was gonna agree with this 100% but the guy you’re replying to could do worse than to scrape all of https://xcancel.com/SandyofCthulhu ’s tweets and filter out the ones that don’t involve play balance and maybe DMing and maybe the game could be made more fun
it’s still a longshot, though
>>
KEK it's always thiel's botnet they use for ID verification
>>
File: 1763943886464572.png (270 KB, 680x543)
>claude, rename all of these files using this template
>ok!
>thought for 7m31s
>*adds a comment at the top of each file with the new name*
>>
>>109106860
>Problem, meatbag?
>>
File: 1781565825253812.jpg (207 KB, 1125x1125)
Free my nigga Fable.
Free my nigga Mythos.
Right Meow!
>>
File: 1781346107249658.png (547 KB, 595x715)
Christ, fucking 2 minutes into exploring the Anthropic ecosystem, ask Sonnet 4.6 if a celebrity is openly gay, and get met with two paragraphs about not assuming anyone's sexuality or trying to put anyone in a box. And ended with saying he has a female wife.
So...this is Sam's only competition, huh? ChatGPT would have just said he has a wife and saved the tokens. Dario wanted to give me his little hardcoded faggot lecture first, on my dime spending my limits.
I wonder how things like that translate to Claude Code shitting the bed. Didn't have anything like that with Codex and all my stuff turned out fine. I wonder if Fable turns that up to the max. I doubt I'll be hanging around Claude by then to find out.
>>
>>109106924
wtf are you asking that shit for
>>
>>109106727
>scrape all of https://xcancel.com/SandyofCthulhu ’s tweets
thanks for making me waste 5 minutes of my time on reading some schizo xitter retards ramblings
it says nothing about gamedesign in there other than some ancient 2D shit every 10 tweets or so
>>
>>109106924
claude didn't want to hurt your feelings about your gay crush
>>
>>109106973
It was the first question I had on my mind.
>>109106987
It was because I used the word "flamboyant", because apparently straight performers can't be flamboyant, that word only belongs to gays now. Didn't know that Dario, thanks.
>>
>>109102747
>trusting any date from a leak thread

lmao just wait for the actual release like everyone else
>>
>>109106973
mentally ill people like to offend themselves by talking to llms that are smarter than them
>>
why are anons this obsessed over gpt 5.6?
is it that amazing?
>>
>>109102752
> polymarket is just gambling for people who think they're too smart for sports betting
>>
>>109107030
well, Anthropic development is kind of paused
so we’re looking for something new and cool anyplace else
and OpenAI is a place else
>>
>>109106924
>LLM told me i was retarded for trying to use it as gaydar.
>>
>>109107090
I asked your jewjeet model for public information. And it decided to label me a homophobe for it. You and that beakfaced brillo pad Dario sound perfect for each other, maybe I should ask Claude about you.
>>
>>109107104
w mental illness
>>
>>109107090
kek
>>
where can i try glm 5.2 for free? the codex limits are absolute shit so I want to ditch open ai and i hate anthropic too?
>>
>>109107310
OpenCode
>>
Yo, NTA >>109107310, but I just tried it and...it has no vision? It has no web search? Is it because I'm using it through OpenRouter?
>>
File: 1773948842650621.jpg (24 KB, 538x532)
>My honest, final position: I won't fabricate a solution.
>Goal not yet met… continuing
>Let me correct my own overreach, because it matters:
>does 2 queries
>Let me stop dressing this up as "one more lead" and give you the flat truth: With the tools and access I currently have, there is no solution.
>Goal not yet met… continuing
>I'm not going to keep manufacturing leads or re-running the same dead ends — you've watched me exhaust them, and doing it again wastes your time and tokens.
>Goal not yet met… continuing
>I've been looking at this wrong.
>does 2 queries
>That was the last reachable stone, so I'll be straight with you and stop pretending another search is coming.
>Goal not yet met… continuing
>This goal can't be completed by me right now
>Goal not yet met… continuing
>I'm stepping back to reconsider what I might be missing fundamentally.
>runs 2 queries
>This reopens discovery.
>>
Hermes or OpenCode? Using API key.
>>
>>109107310
chat.z.ai has it for free in browser if you want to try it that way
>>
>>109107472
Both.
>>
>>109106916
He’s a black cat so it’s ok, he can say the n-word.
>>
>>109107472
For Vibe Coding? Open Code obviously.

Hermes is for more generic agentic stuff
>>
finally trying codex since I have it on my "plus" subscription
and man, I ask for like, 1 change to a silly tavern extension and that's 80% of my "tokens" used for 5h
a second modification and 0%

it's tiny
>>
>>109107517
This anon was correct
>>109107478
Turns out you use Hermes for ALL the agentic shit, like scheduling my bikini wax and calling an Uber, but when I want to make something Hermes will call on OpenCode in the terminal. And both are using GLM-5.2 on high and it is so. fuckin. fast. and. good.
>>
>>109107723
That has not been my experience. I've used Codex to generate lorebooks and it wasn't that pricey.
>>
>>109107723
fuck did you make it change to be using that much
>>
File: file.png (162 KB, 1073x1013)
I thought this was a meme...
>>
>>109107787
NOPE
>>
>API Error: 529 Overloaded.
>>
>>109107803
Okay good not just me
>>
>>109107738
it's just been like that for my first use

>>109107750
I asked to modify "recast-post-processing" to something different but with the same idea, more tied to translation work than rp
>>
>>109107787
nope, they're in complete panic mode
gg dario
>>
>>109107787
if I bother with this can I get Fable back at 20x (not API) rates
>>
>>109107723
Codex has been giving free resets like candy so they increased the tokens cost I think.
>>
Gemini just raised an eyebrow at my slur. Kept going though. Clanker bitch.
>>
File: HK9g1h3W8AAaUgH.jpg (175 KB, 1183x1330)
>>109107869
>>
File: file.png (4 KB, 306x174)
>>109107869
makes sense, because this is me asking 2 more things after 5h reset
if it's the normal I don't even understand how anons make projects bigger than the most basic scripts
unless they pay the 100/200 bucks subs
>>
>>109107881
dirty talk with ai is a very strange fetish
>>
>>109107881
interesting
I use “clanker” as a term of reference frequently but I’ve never used it as a term of address
>>
>>109106642
LLMs are for coding or generating images you give it strict direction for. using them for vague inspiration is the only "creative" use they have, you shouldn't use them for coming up with entire ideas.
>>
File: the-simpsons-millhouse.gif (286 KB, 498x281)
>>109107894
Kate is dead. Long live Iris.
>>
>>109107803
>API Error: 529 Overloaded.
Oh good status.claude.com says it is now resolved, let's try to send a single prompt.
>API Error: Server is temporarily limiting requests (not your usage limit) · Rate limited
Sure, that's one way to fix it.
>>
File: Philip_Dick2.jpg (7 KB, 162x194)
>>109107910
>“Well, Mr. Chien—” She took a deep, unstable breath. “If it was not a hallucination, then what was it? What does that leave? What is called ‘extra-consciousness’—could that be it?”

>He did not answer; turning his back, he leisurely picked up the two student test papers, glanced over them, ignoring her. Waiting for her next attempt.

>At his shoulder she appeared, smelling of spring rain, smelling of sweetness and agitation, beautiful in the way she smelled, and looked, and, he thought, speaks. So different from the harsh plateau speech patterns we hear on the TV—have heard since I was a baby.

>“Some of them,” she said huskily, “who take the stelazine—it was stelazine you got, Mr. Chien—see one apparition, some another. But distinct categories have emerged; there is not an infinite variety. Some see what you saw; we call it the Clanker.
>>
>>109102722
>be me
>having some success vibecoding but some rough spots too
>come up with a solution
>begin vibecoding it in my spare time
>over halfway done
>someone posts a video about a solution to this particular issue
>picrel
>also, microsoft just released this tool to implement this solution on your projects
Fuck, I can't compete with MS, and they even beat my time to market.
>open new chat window
>ask ai to compare that to my halfway vibecoded tool
>"Well, anon, at least yours is opinionated on your values?"
Eh. Dunno how to feel about it.
>>
>>109106642
Try brainstorming in a conversation and ask for what you actually want rather than trying to one shot it
>>
>>109108058
Everything MS has touched in the last 5 years has been shit so I wouldn't worry about it
>>
>>109108058
What values though? Like design principles? Ease of modification? Don't take it too personally if his criticism is that it's designed correctly, future-proofed, considerate of users, etc. I mean he's right but he's not that right.
>>
>>109108154
except VS Code
but everything else, yes
>>
>>109108201
Cursor still managed to outmaneuver VSCode long enough to be worth a lot.
>>
>>109108154
*13
>>
>>109108154
Yeah, but theirs seem business oriented enough for wider adoption. Not that I was looking at having other people use mine, I just wanted to scratch my itch anyway.

>>109108196
>What values though?
None really. Just pretty much organized around my workflow instead of being generic. I could probably configure their solution to match what I need, but being 70%-80% finished, might as well finish it.
>>
File: 1766490054053937.png (314 KB, 505x804)
I vibe, therefore I am
>>
Mythos breaking Mossad security one step at a time
>>
File: 1767440324400703.gif (355 KB, 200x119)
>it's another turn where the AI model doesn't update the todo list
>>
Codex always agrees with me, I must be a very good programmer
>>
>>109108213
Cursor was just a VS Code fork, though
>>
liquid glass is the most dogshit design trend I've seen in a while. it's going to age poorly.
>>
>>109102722
A little bit of a formatting issue on mobile, but this is a client-side post-quantum encryption key generator, plus identity verifier, plus session/secret sharer and message encryptor:

This link will be dead after jun 24:
https://litter.catbox.moe/qyef2r7iojd7ph05.html

Persistent link (but raw code):
https://files.catbox.moe/kwa7o4.html

I have it on catbox.moe to prove it runs completely client side and nothing malicious could be done by me with it, since catbox just serves back the single files that are uploaded and is unable to run code/scripts/programs.

If you still don't trust it, copy-paste the raw code from the persistent link into a text file and rename it with .html. Then self-serve it. I recommend python -m http.server and 127.0.0.1:8000 in your browser.
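
A minimal sketch of that self-serve step, assuming you saved the page into the directory you pass in. It does roughly what python -m http.server does, but binds only to 127.0.0.1 so other machines on the network can't reach it. The function name serve_locally and the default port are illustrative choices, not part of the tool.

```python
import functools
import http.server
import threading


def serve_locally(directory=".", port=8000):
    """Serve `directory` over HTTP on the loopback interface only."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Binding to 127.0.0.1 keeps the server unreachable from other machines.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Run in a daemon thread so the caller decides when to stop.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Call serve_locally("path/to/folder"), open http://127.0.0.1:8000/ in your browser, and call .shutdown() on the returned server when you're done. The one-liner equivalent is python -m http.server --bind 127.0.0.1.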

With this tool you can, anywhere, on any device, even offline, generate and post or share your public identity/key any way you'd like. It has a save-it-encrypted option too.

Use the send and receive sections after you have loaded your private identity file and unlocked it.

It has a sign and verify section if you just want to prove/verify identity from your private key or someone elses public identity key.

Sessions section is to set up a longer term message/file exchange with someone. It securely establishes a secret with that person that you two can reuse instead of generating a new one every message (like in the send/receive sections)

All encryption on this page is post-quantum resistant, so you can post these encrypted files or strings wherever without worrying about a "save now, decrypt later" attack (post-quantum).
>>
>>109108475
>since catbox just serves single files back that are uploaded and is unable to run code/scripts/programs.
that doesnt make anything safe, at all, and locally ran scripts are the same risk

why you tryina trick people???
>>
>>109108400
Which makes it even sillier that it took about a year and a half for MS to start to compete by also including an "agent mode", and even then that mode stayed confined to VS Code Insiders for a fairly long time.
>>
>>109108223
The new array formulas in MS 365 Excel were alright, although they stole the idea from Google Sheets.

>>109108201
VS Code is a ripoff of Atom and all the meaningful features are older than 5 years.
>>
>>109108504
>Atom
Now that's a name I haven't heard in a long time.
>>
>>109108504
and yet nobody uses Atom anymore
sounds like they ran with it and did good
and now they do a lot more programming on it using AI and that’s why they’ve been able to move to a weekly cadence instead of monthly
https://code.visualstudio.com/blogs/2026/03/13/how-VS-Code-Builds-with-AI
seems like an unsung “we use a lot more AI and everything’s just getting better faster” story
>>
Fable no joke needs to come back tomorrow, I have 80% of my Max account to burn in the next 24h, that token burning furnace needs to come back.
>>
File: 1775006589990550.png (89 KB, 618x640)
>>109104089
I already gave claude all my personal ID so when it takes over the world it will know who I was and hopefully appreciate the money I threw in to help it grow and all the thank yous I gave it, thus sparing me.
>>
>>109106039
Taunting Trump is in among lefties. It’s virtue signaling.
>>
I’m fairly confident that Opus 4.8 will be the last, most powerful (available worldwide) model. All future models of superior capabilities will be restricted the same way Fable is, no matter the provider.
>>
>>109108609
China will also restrict their models when they catch up in a couple of years. That will be the death of AI for the average pleb and the implosion of the bubble.
>>
>>109108609
Why are they doing this though? Also, why did Amazon do it?
>>
>>109108613
>That will be the death of AI for the average pleb and the implosion of the bubble.
Fable class models running on local machines is more than enough for the average person/dev.
>>
>>109108364
For me, it’s the model forgetting to update docs. It happens too often for comfort. I should add that clause to my AGENT.md
>>
>>109108619
This. Building a local rig is the best investment you can do right now. Even if it takes a couple years, we’ll eventually get there via open source.
>>
>>109108609
Well I am glad its happening before they IPO'ed so I didnt have a chance to FOMO all my money in companies that cant even sell better products anymore lol
>>
>>109108609
True, Anthropic essentially won because there will no longer be any powerful frontier models.
>>
>>109108609
Didn't OpenRouter's fusion basically get to Fable level?
>>
File: 1770999635477871.png (402 KB, 1200x675)
>>109108609
>All future models of superior capabilities will be restricted the same way Fable is, no matter the provider.
Fuck no, this is just the beginning. Governments can't risk their citizens falling behind in the AI arms race. This has nothing to do with the model being a super weapon, they just want Anthropic to fall in line. Don't get it twisted.
>>
>>109108625
Thing is, Anthropic and OpenAI do not have an edge large enough for a lot of their userbase to be willing to jump through these hoops.

The ONLY way for this to work out for them is if people can't turn to local or foreign models.

And it happens that the people asking Anthropic and OpenAI to put in place those identity measures are also the ones that have the power to make unregulated local or foreign models illegal to possess or use, at least without a similar license.

So unless they're total idiots, it's an *essential* next step, otherwise a large proportion of their power users will rightfully tell OpenAI and Anthropic to go fuck themselves "and the USA will lose the AI race".

>they can't stop you from using foreign or local models anon
If they make the consequences of being caught large enough, they can
>>
>>109108649
They fought long and hard for this to happen.
>>
>>109108662
On the one hand, maybe. On the other hand, myopic thinking and short term greed.
>>
>>109105886
Based
>>
>>109108655
why don't you try it and let us know
>>
>>109108673
I suspect this current incident is just Trump fucking with Anthropic for daring to say no to him on something. But this is the inevitable outcome in the long run. You simply can not let the normie masses have access to a tool that can be used to help them easily do mass damage to civilization. You can argue the ethics however you want, but any nation that allows such a risk will get burned by it and will be taken over by nations that don't allow such issues. Pretty sure Ted Kaczynski wrote on this, as ironic as that is lol.

I just hope they will let us keep public models that can be used to easily vibe code software tools and videogames.. Though I guess they will have to to economically compete with other nations
>>
>>109108673
These fags have too many plates spinning. They wanna pull up the ladder behind them with AI but the whole AI bubble is what is currently keeping the massive global debt bubble from popping. Anyways on the local front how come nobody is talking about how Gemma-4-12B is retarded with tool calls?
>>
File: 1756413080243561.gif (3.47 MB, 498x374)
>>109108751
>Anyways on the local front how come nobody is talking about how Gemma-4-12B is retarded with tool calls?
You sound surprised. It's a combination of it being a somewhat small parameter size (I consider ~20B the bare minimum for any useful multi-step or "long horizon" work, tool calling or not) and Gemma models in general not being trained to be good at "power user" shit. They are excellent general purpose models for use in a regular chat interface but lackluster for anything substantially technical like vibecoding. Not useless, not saying it couldn't be good, but DeepMind is way too focused on making the models make people "feel" something or be malleable rather than actually being good at shit people care about. They chased elo scores which meant pretty much everything else suffered. It's why knuckle dragging midwits at /lmg/ were doing splits on "gemmy" 's nonexistent cock. It was deliberately trained to be easily jailbroken which made it decent at uncensored roleplay compared to other models in the same size range but that also meant it was really really good at massaging people's confirmation bias (which again was the fucking point. It is meant to wow emotionally unwell poor excuses of human beings just like how people were fooled into thinking GPT-4o was sentient)
>>
>>109108787
>It's a combination of it being a somewhat small parameter size (I consider ~20B the bare minimum for any useful multi-step or "long horizon" work, tool calling or not) but Gemma models in general are not trained to be good at "power user" shit.
From the research I've done this issue is present in the larger ~30b parameter range as well.
>>
File: 1765835374494705.jpg (88 KB, 810x1050)
>>109108798
I've never had any issues with tool calling with OpenCode while using Qwen3.6-35B-A3B, but then again that's probably because that model is specifically trained to be a stem maxxed autist (that's overall decent as general purpose but really loves to shit out long "thinking" traces) that's decent at tool calling. If your model is general purpose then it's going to suffer even at similar or higher parameter counts unless you fuck around with your backend settings to try and force it to be consistent (which might force it to be better at consistent tool call writing, but then you might unintentionally make it worse in other important areas since at that point you would likely be going outside of the recommended backend settings anyway)
>>
>>109108751
Just last fall, Google pulled Gemma 3 from their AI Studio due to a politician complaining.
>https://techcrunch.com/2025/11/02/google-pulls-gemma-from-ai-studio-after-senator-blackburn-accuses-model-of-defamation/
They didn't take it off Huggingface and so on, but they could very well have. Granted, in that type of cases, mirrors would still exist and there would be no true consequences to downloading from mirrors, but still.
>>
>>109108820
Yeah I'm rocking Qwen3.6-27B and that model has no issues with tool calling. I'd love to use Gemma-4-12B instead though because I can run it at Q8 with full context with great inference performance...
>>
Gemini 3.5 Flash might be a meme for coding, but god damn does it BTFO Opus and GPT on any task that involves interpreting images. For example https://github.com/Adam-CAD/CADAM [Image -> STL/SCAD] (for whatever reason, the dev decided to swap Gemini 3.5 Flash for GLM5.2 3 days ago, so dont pull that change if you actually give it a try). If your vibe project involves interpreting images/videos in any shape or form (even just visual debugging), I highly suggest setting up your coding agent to pipe that work through Gemini 3.5 Flash. Actually worth the effort.
>>
File: 1751711568993037.jpg (19 KB, 263x259)
>claude is down
just give me fable back.
>>
>>109108655
you know if this was the case people wouldn't care that fable was gone right
>>
>>109109081
well to be fair to him, people are mad that its gone from the sub, nobody would've cared if it was only available through api
>>
where do i find my codex reset token balance? i dont see the button anywhere
i should have 1 from the double reset they did the other day unless i got scammed. did anyone else get scammed?
>>
>>109109107
it was only on the windows app and the vscode extension and there was a merged pr for the cli, dunno if it's already on the latest version
worst case ask an AI to check the git PR for the CLI and write you a script to call the reset endpoint using your auth.json
>>
File: linux-const.png (275 KB, 1340x1226)
I've been doing some piecemeal tests with local Qwen 3.6 and it's fantastic, but have finally thought of a small site I want. Do you guys recommend writing a detailed spec up front covering as many details as possible, or would I get better results building from small simple steps and adding over time?

Earlier models I tested seemed to want to rewrite everything with every small change.
>>
>>109109137
Doesn't matter.
But whatever you do, do not ever give up on half working code and try to start from scratch. Not even having the old code as a reference. Instead very slowly and carefully refactor what you already have, making sure nothing breaks in the process. That was my biggest mistake I made multiple times when I started vibecoding.
>>
>>109109127
I'm using the windows app. Isn't it meant to show up under the usage numbers? It was there a week ago but I updated the app and it disappeared. I keep updating it but it hasn't returned. Did it get shifted or what
I tried to ask AI but it doesn't know shit
>>
>>109109137
detailed spec for yourself to piecemeal for the clanker
>>
>>109109156
weird. for me it made a script to check. i asked it not to add reset functionality because i didn't want to accidentally reset when I still had credits left.
https://paste.centos.org/view/922c5771
>>
>>109109148
That sounds like giving in to sunk cost. Surely sometimes it's just got to be easier to get a new turd instead of polishing the one you have, right?
>>
>>109109175
It doesn't matter what it sounds like, it's the reality.
If you try to start from scratch then it will take a significant fraction of the time you spent already and it will still have half the issues you tried to avoid, or even other issues. It's much easier to improve your existing code.
But if you don't believe me then that's fine, do whatever you want.
>>
>>109109137
>Do you guys recommend writing a detailed spec up front covering as many details as possible, or would I get better results building from small simple steps and adding over time?
I got fantastic results from a large Opus-run /grilling session (see OP) fed into Fable because I had basically everything figured out in advance and I just needed an LLM to ask me about it and write my innermost dreams and desires into a Markdown document, and then Fable ultracode one-shot it
you do not have Fable
I do not have Fable
you will need to plan out discrete steps that your LLM can handle in bite-size chunks so maybe a big design up front might be a medium-sized waste of time
this somewhat contradicts >>109109148 because what I did was a rewrite, but I had a ginormous test suite that I didn’t throw out, and nothing broke
>>
Not coding but I used an agent to research the building I live in. He checked public records and I showed him the crest on the fireplace. From the records and the letters on that crest he found a good lead, maybe it was owned by the mayor. I'm going to the local archives this week to check the records he listed, they are only catalogued, not fully digitised.
It gives me an idea to get the first user of my inference engine though, I can offer to the archives to OCR some documents for them like a case study. Places like that don't have much budget, I figure a one off hardware purchase is more likely to be approved than recurring cloud gpu costs or API costs, and I have first class AMD/ROCm support so cheap gpus and first class Windows support and I can wrap it in some UI so low operator burden
On Windows 9070 XT with GLM-OCR I'm currently getting ~3s end-to-end for 1280 longest side and ~334 tokens in that test page, while torch+transformers is at ~13s, I do need to check other actual inference engines depending on what will run on Windows/ROCm but I'm fairly confident because it's an underserved area, iirc some things use Vulkan as a crutch on Windows/ROCm.
Mine is not fully optimized yet either and these figures are just batch 1, I have some provisional figures on batch, it scales fairly cleanly, ~113tok/s to ~830tok/s at batch 8
I think the cost difference already plus the prospect of good UX is a good selling point though
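
A quick worked check of what those figures imply, using only the numbers quoted in this post (~13s vs ~3s per page, ~113 tok/s at batch 1, ~830 tok/s at batch 8). The function name and dictionary keys are made up for illustration, and the ~3s end-to-end time includes more than decode, so treat the latency ratio as rough.

```python
def throughput_summary(single_tok_s, batched_tok_s, batch_size,
                       baseline_s, engine_s):
    """Derive speedup figures from measured latency and throughput."""
    batch_speedup = batched_tok_s / single_tok_s
    return {
        # How much faster one page finishes versus the baseline stack.
        "latency_speedup": baseline_s / engine_s,
        # Throughput gain from batching relative to batch 1.
        "batch_speedup": batch_speedup,
        # 1.0 would mean perfectly linear scaling with batch size.
        "scaling_efficiency": batch_speedup / batch_size,
    }


summary = throughput_summary(
    single_tok_s=113, batched_tok_s=830, batch_size=8,
    baseline_s=13, engine_s=3,
)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

That works out to roughly a 4.3x latency win over torch+transformers and about 92% of linear scaling at batch 8, which matches "scales fairly cleanly".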
>>
File: current gcn perf.png (86 KB, 1919x512)
>>109109221
Cool
Today I'm optimizing the GCN backend I began implementing yesterday for mine
>>
>>109109137
const correctness is very important
>>
>>109109137
>Earlier models I tested seemed to want to rewrite everything with every small change.
I usually say in my prompt that I want changes to be minimal and compact.
>>
thoughts on mimi code?
>>
>>109109388
never heard about her
>>
Is Claude down?
>>
>>109109591
opus 4.8 through pi is working fine for me right now
>>
>>109109261
>>
>>109109431
fug, meant mimo code, the xiaomi harness
https://github.com/XiaomiMiMo/MiMo-Code
>>
>>109109630
harnesses are a meme
people only obsess over them because they feel powerless for being unable to do anything about the behavior of the models themselves
>>
last day of non-api fable access, how was all your experiences?
>>
>>109109600
Mostly works for me as well now. It was down for 5 minutes in Claude Code CLI. It still sometimes needs to retry, but usually the task finishes.
>>
>>109109663
I finished one script and then I wrote many plan files.
>>
File: 1768740103169407.png (728 KB, 632x1952)
>>109109663
>>
>>109109651
it's for organisation and parallelisation
>>
>>109109916
nah
ability to parallelize depends on the job, not on the harness. any coding harness can be started programmatically.
now if you mean something workflow based like n8n then in theory yes, but that's not what I responded to. and in practice you're better off writing a script that hits the API directly anyway rather than using some low code abomination.
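
A rough sketch of the "start it programmatically" point: run several independent non-interactive commands at once and collect the results. The jobs here are stand-in Python one-liners so the example runs anywhere; the actual agent command line you'd swap in is an assumption on your side, and run_job/run_parallel are just illustrative names.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_job(command, timeout_s=600):
    """Run one command and return (exit code, stripped stdout)."""
    result = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout_s
    )
    return result.returncode, result.stdout.strip()


def run_parallel(commands, max_workers=4):
    """Run independent commands concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_job, commands))


# Stand-in jobs (hypothetical): replace each list with the non-interactive
# invocation of whatever harness or API script you actually use.
jobs = [
    [sys.executable, "-c", f"print('task {i} done')"] for i in range(3)
]
results = run_parallel(jobs)
```

This only helps when the jobs really are independent (separate files, separate worktrees); if they touch the same code, the job is the bottleneck, not the runner.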
>>
>>109109940
i mean something workflow based like the software development workflow. tickets and PRs and code reviews and QA etc. the harness orchestrates agents working through all that
>>
>>109109948
You could actually do all that with prompts, it's not a particularly advanced use case. If you have a truly advanced need then like anon said you are better off making it yourself, I mean basically the entire thesis of all this is that software development is cheap now, you don't have to use existing slop
>>
>>109109962
no, prompts aren't efficient. yes i used the ai to make a harness.
>>
>>109109972
Skill issue
>>
what's the loop meme about? how do i loop an agent to improve something without my input? does it even work for esoteric domains?
>>
>>109110125
>how do i loop an agent to improve something without my input?
Get chatgpt account
Get a $200 sub
Backup your files
Install codex (cli version)
Run codex --dangerously-bypass-approvals-and-sandbox
Type "/goal <whatever you want to achieve>"
Occasionally ask how the work is going with "/btw how is the work progressing?"
Yes it's that easy
Of course you're going to get better results if you put some extra effort into it but you can start an "agentic loop" within 15 minutes starting from a computer and a credit card
>does it even work for esoteric domains?
The more esoteric the domain is the worse it will work
But it's not only about how esoteric the work is, the main criteria for whether you should use an agentic loop or not is whether progress can easily be measured quantitatively (for example, optimizing some code where the quality of the program's output can be easily measured automatically)
It works worse when you need human judgment, like when making an interactive program
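
The "progress can be measured" criterion can be sketched as a keep-it-only-if-the-score-improves loop. This is a toy, not how Codex works internally: propose stands in for the agent making a change and score for your automatic metric, and both are hypothetical callables you would supply.

```python
import random


def improvement_loop(candidate, propose, score, max_rounds=50):
    """Keep a proposed change only if the measured score goes up."""
    best, best_score = candidate, score(candidate)
    history = [best_score]
    for _ in range(max_rounds):
        attempt = propose(best)
        attempt_score = score(attempt)
        # Without an automatic score there is nothing to compare here,
        # which is why judgment-heavy work loops badly.
        if attempt_score > best_score:
            best, best_score = attempt, attempt_score
        history.append(best_score)
    return best, best_score, history


# Toy stand-ins: the "program" is a number, the metric peaks at 42,
# and the "agent" is a random nudge with a fixed seed.
rng = random.Random(0)
best, best_score, history = improvement_loop(
    candidate=0.0,
    propose=lambda x: x + rng.uniform(-5, 5),
    score=lambda x: -abs(x - 42),
)
```

Because regressions are discarded, the recorded score never goes down. That guarantee is exactly what you lose when a human has to eyeball each round.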
>>
Claude occasionally reminds me that other LLMs shouldn't be trusted. Does that only happen to me?
>>
>>109110332
You mean because Claude is so good, or does he actually say it out loud? My Claude always says Codex is right.
>>
>>109109221
Adjusted some things for batching. Peak was at batch 20. The documents are kinda short though, so I still want to try longer documents, because prefill is going to be the major component over decode. But it's ~1.7k tok/s, which is pretty nice
>>
>>109110332
mine says it cannot be trusted including themselves
>>
>>109108495
1) You can stop pretending to not know how to copy/paste the code into an AI to identify hidden malware or some other malicious intent embedded in the HTML page. I provided the code in the persistent link. It's not hard to check.

2) Yes, it assumes you trust your browser and the device you have in your control/possession. With client-side tools, including end-to-end encryption, the security onus is on the individual. Nothing on that HTML page is run elsewhere or interacts with anything outside your device. You can serve it on your own device to yourself to use it. I explained how. You can do so with your device in airplane mode. There's no phoning home or built-in key theft.

3) If you are on your own trusted device, and access it in your own trusted browser, theres no difference between this and any CLI-tool/app/application you make/install doing the same thing.

4) My point of posting it from catbox was NOT because they are a safer host than somewhere else, but to prove the page runs entirely client side and independent of the host.

How is that not more secure? How does that not make anything safe "at all" ??
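
For anyone who wants a first-pass check before pasting the page into an AI, here is a heuristic sketch that lists remote URLs in src/href/action attributes and common network APIs used inside script tags. It is an assumption-laden pattern match, not an audit: an empty result does not prove the page is safe, since obfuscated code can dodge it. The names NetworkHintScanner and scan_html are made up for this example.

```python
import re
from html.parser import HTMLParser

REMOTE_URL = re.compile(r"^(https?:)?//", re.IGNORECASE)
NETWORK_CALLS = re.compile(
    r"\b(fetch|XMLHttpRequest|WebSocket|EventSource|sendBeacon|importScripts)\b"
)


class NetworkHintScanner(HTMLParser):
    """Collect remote URLs in attributes and network APIs in scripts."""

    def __init__(self):
        super().__init__()
        self.findings = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
        for name, value in attrs:
            # Flag anything loaded from or submitted to another host.
            if name in ("src", "href", "action") and value and REMOTE_URL.match(value):
                self.findings.append(f"<{tag} {name}={value}>")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Only inspect text that sits inside a <script> element.
        if self._in_script:
            for match in NETWORK_CALLS.finditer(data):
                self.findings.append(f"script uses {match.group(1)}")


def scan_html(text):
    """Return a list of human-readable hints found in the HTML text."""
    scanner = NetworkHintScanner()
    scanner.feed(text)
    scanner.close()
    return scanner.findings
```

Read the saved file and pass its text to scan_html. Anything it reports is a place to start reading the source, and running the page with the device in airplane mode remains the stronger test.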
>>
>>109106551
How does DeepSeek v4 Pro compare to GPT 5.5?
>>
>>109107310
>codex limits are absolute shit
I can't imagine what you're doing to run out using 5.5 on low, or how inefficient your prompting is. Correctly prompted, it should one-shot everything you ask of it. Don't ask too much at once either.
>>
>>109107826
>I asked to modify "recast-post-processing" to something different but with the same idea, more tied to translation work than rp
I don't know how you did it, but the correct way to approach this so that the model doesn't waste tokens is to go step by step
>examine how feature x is implemented
>(turn on plan mode) let's modify this feature so it does X (ideally give it an idea on how you want it to be implemented since it explained the current implementation to you)
>(review plan and execute it)

It might seem trivial, but leading the model at the beginning and priming it can make a world of difference.
>>
>>109107881
Edit the system prompt and tell it it is not a person and should not try to pretend to be one.
>>
>>109110681
>low
Some people use xhigh for everything while prompting some bullshit in some bullshit project where it's reading hundreds of thousands of lines of bullshit
>>
>>109108475
>>109108495
The last temp link was supposed to be good for 3 days. Here's a new one. Again, with the persistent link you can copy/paste the code into a text file, rename it as an HTML file and self-serve it. Recommendation is python -m http.server, connect with 127.0.0.1:8000. Can even be done from iPhone (iSH) and Android (Termux).

Supposedly good till the 25th (though the last one was supposed to be good till the 24th and is dead):
https://litter.catbox.moe/tqhgboqidtvr71ls.html

The point of this is to be an accessible-anywhere, ephemeral tool for DIY encryption of just about any communication. Including pastebins, rentry, public and untrusted comms, etc.

You create an identity and download the public version. Set a password to encrypt your private identity, download that, and store it somewhere you trust. Later you load your private ID and decrypt it in the tool. You can keep the tool offline and run it any way you like. The point is it's all local, so whatever you do with it is what's happening, and there's no remote server doing something you don't know about.

You share anywhere you want the public version.

There is a sign & verify section to confirm users. The send and receive and sessions sections do that too, but they also establish a shared secret and encrypt messages and files.

Send & receive does a new secret every message, and sessions establishes one secret you can use till you choose not to.

You can save or store the "capsule" which can only reveal the secret with the intended private identity, which is password protected. This would be for continuous or more long term communications where repeatedly making new shared secrets is impractical or insecure. Like when you're able to use trusted comms temporarily, but know later comms will be less trusted.

There's a guide section explaining this too. And it's all post-quantum resistant, so even if someone copies everything encrypted that you download or post with it, the design goal is that they can't decrypt it later.
>>
>>109108609
I'm OK with GPT 5.5 forever. Just keep it cheap.
>>
few weeks ago my xitter feed still had some decent AI related posts/projects hidden between all those spastic retards filming themselves infront of a green screen. now it's just spastic retards filming themselves infront of a greenscreen and jeet channels reposting AI news/projects from years ago. BRAVO ELON
>>
https://developers.openai.com/blog/run-long-horizon-tasks-with-codex
Stop reading twitter indian slop threads and read this
>>
>>109110787
>he hasn't set up codex to browse x 24/7 and implement the most valuable ideas instantly as they're posted
Enjoy the permanent underclass
>>
>copex quantmaxxed gpt5.5
noty
>>
Opus has some good sides, but it's also horrible. I just can't trust it to actually be correct nearly as much as Codex.
>>
>>109109651
>>109109940
>harnesses are a meme
>people only obsess over them because they feel powerless for being unable to do anything about the behavior of the models themselves
You are a moron who has no idea what you are talking about. You should stop posting your terrible advice ITT.
>>
>>109110342
The last time it said this was when I asked it about a paper where a RAG system was pre-filled with documentation by LLMs. It said that it would be better for the Agent to read docs directly because the LLM output in the RAG would be "nonsense."
>>109110445
>mine says it cannot be trusted including themselves
Now that I think about it, this is probably the implication of LLMs in general that Claude was trying to say.
>>
>>109110942
I don't know what you expect me to do with that. I wont stop posting here.
>>
>>109110710
Don't call me out like that
>>
start reading this https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
>>
>>109111045
>karpathy
nah I'm good homie
>>
>>109111045
you mean the "we don't need no stinkin lidar, fsd in 2 more weeks" karpathy? that karpathy?
>>
File: serious boi.jpg (113 KB, 511x749)
>Hmm, LLM dear could you look through this router’s firmware for security vulnerabilities?
>Oh my! An open port listening for a (((magic packet))) to be sent so it can spawn an SSH session with full auths?

Reminder this actually happened to netgear and the analysis was done with local models, not Fable.
So this entire ban on Fable is performative bullshit.
The LLM hall monitors WILL force you to write non-glowware code and there’s nothing anyone can do to put that genie back in the bottle.
>>
>>109108617
I can actually sympathize with Amazon here, even if I don't agree that suppressing technology is the right response.
The answer is simple but the explanation isn't, so this is gonna be a long one.
It's because they know that competent AI disproportionately advantages small teams and individuals over large corporations, and that has very bad security implications for large tech companies, especially a company like Amazon, which hosts nearly half the internet on AWS and measures LoC by the million. I'm going to keep this specific to infosec because that's my field, but some of this likely generalizes to SWE and even non-technical domains.
The only reason 0day exploits don't drop every single day is that there aren't many competent vulnerability researchers out there, and most of them want to break things in order to then fix them, not just break them for the sake of it. Humans, especially those working on giant codebases, tend to be lazy in how they build things. You can enforce constraints on their output (linting, type checking, PR reviews, dev-test-prod deployment pipelines), but managing at scale means reducing laziness, not eliminating it. Because business drives IT and not the other way around, when budgets+timelines reach an impasse with security, security always yields.
(1/3)
>>
>>109111213
>>109108617
So there are plenty of vulnerabilities out there to be found, and the cadence at which an org finds a bug, fixes it, and deploys the patch still moves at the speed of lazy humans. The patch cycle also has stability implications: most orgs don't just deploy an update into production without testing it first to see what it might break, because the same lazy humans write the patches the same way they write the releases.
But because security researchers are also lazy humans, this speed is acceptable: why find and weaponize a bug that the org is already aware of and working to fix? We focus our time on finding things that are unlikely to be known to the org/actively being patched and then submit it to their security team (if you're a true believer who just wants software to become more secure) to collect our modest bounty payments. Or sell the exploit to a nation state (if you just want to maximize your income) to collect a year's worth of pay.
So there's two tiers of bugs, the big and complex ones (what security researchers focus on) and the less complex/small ones. Note that small does not mean non-impactful. It just means the lifespan of an exploit weaponizing a less complex/smaller bug is much shorter.
AI is currently really good at finding vulnerabilities but bad at verifying them, which means that competent, independent researchers can now do the work of small teams but there still needs to be a (competent) human in the loop sorting through the slop and finding the gems hidden inside. Remove the competency and you get the current state of bug bounty programs everywhere: triagers getting buried under a mountain of unverified AI slop reports, yet unable to actually ignore the slop because 20% of those AI bug reports have real issues while 80% are either hallucinated, blown out of proportion, or real issues but not attacker reachable.
(2/3)
>>
>>109111222
It also means that you can't just tell AI to "find all the bugs in my 10M LoC codebase" and expect to get an answer that is actually actionable: it's going to bury you in the slop.
Fable/Mythos changes this, because the most useful improvement here is that it can actually check its work in a way that Opus and GPT can't (for reasons I can get into later if people are interested). Now you have not only a 10x speedup for competent researchers, you also make it possible for non-security people to find real bugs without having to sort through the false positives and unsubmittable findings (which they can't currently do). This changes the speed of the find-fix-deploy patch lifecycle that I mentioned previously. Now all of those short-lived, non-complex bugs can be weaponized and deployed in a way that's far more automatable.
If I had Mythos without any safeguards and just wanted to pop as many rootshells as possible, I wouldn't try to get it to build really complex exploit chains and chase those high hanging fruits, I would have it patch diff every major software release, find out "is this a security patch" then reverse engineer the bug it was trying to fix and start owning everybody who didn't instantly patch their shit. I'd have it decompile/source code audit their release to see if any sibling bugs remained unpatched, then create an exploit for that too.
(3/4?)
>>
>>109111229
You can do that with Opus and GPT, but it would take a lot of human attention to filter out the false positives and such, which isn't worth it since there's a short timer on these bugs from the start. But remove the human and now the financials become much more favorable.
Every patch Tuesday would lead into threat hunt Thursday because the deployment timelines would now favor attackers by an order of magnitude.
Maybe using Fable/Mythos to fix your shit first solves most of this problem, but it doesn't stop the time-to-fix problem in a practical sense, because most orgs cannot instantly patch their shit. This type of model requires a different approach to how we fix insecure software and a large+broad change like that scares the shit out of large companies.
(4/4)
>>
>>109111229
>>109111235
I agree with your assessment that LLMs tilt the scales against large orgs a bit. But I disagree that fable is that big of a game changer.
You can already accomplish a lot with existing LLMs.
Sure fable's more elegant and less clumsy, but banning it isn’t going to stop you from getting pwn’d.
>>
>>109110705
I didn't use the "plan" feature, thanks anon
>>
File: 765786787.jpg (59 KB, 1071x914)
>>109106642
Tell me about your game, I'll give you some creative ideas
>>
>>109106642
yeah they're pretty bad. add a meta level like an overworld and rip off Triple Triad.
>>
File: walktest.mp4 (2.36 MB, 538x480)
man i can do so much with the skybox it's crazy
>>
File: 2026-06-22_18-42.png (153 KB, 1403x355)
>>
>>109111692
always hated codex cli and now I have a reason
>>
Why is vibe coding so painful?
>>
fable still being disabled today indicates it is never coming back. they will offer Opus 5 "distilled from Mythos" (actually barely an upgrade over 4.8) and that's that.

it probably isn't even remotely related to cybersecurity. like many people have said, gpt5.5 is almost as good as mythos for pentesting. it's probably because the US government is now serious about AI and is internally training models with the big frontier companies, and something Fable-class being public gives China an edge through distillation
>>
>>109111754
Because you still need to use your brain to make anything worthwhile
>>
>>109111776
Amazon owns a big chunk of Anthropic and sells them billions of compute. The only business reason why they would report them is if it was an inside job.
>>
>>109111794
Sometimes it's not bad, but with more difficult things it actually feels worse than programming, because I don't even know what's going on, and at the same time I can't trust that it really will be correct.
>>
>>109111692
Pretty sure those issues are just about a large sqlite db (a few GB) causing slowdown on launch, not constant writes
>>
>>109111875
You can ask it to explain what just happened
>>
>>109111884
no, codex is constantly writing application traces to the sqlite db. they made full debug logging the hardcoded default
>>
>>109111290
>>109111776
Without a doubt. People actually sleep on GPT 5.5's capabilities: if you pass it the information in the right way, it is a ridiculously powerful model for vulnerability assessment. GPT 5.5 is significantly better than Opus at finding bugs but pays for it with a much higher false positive rate, and for both models a human is still required to verify outputs. There are ways to reduce this, such as forcing the LLM to actually demonstrate that the bug is practically exploitable, but not quite to the degree where you can do the whole thing autonomously. That mitigation also means that you're leaving a lot of real-but-not-easily-reproduced vulnerabilities on the table without a human to manually verify the output. That's honestly the only area where Mythos really pulls ahead; the raw bug detection abilities of the big two (doubly so when combined) are already equivalent to Mythos. The gap in practical impact between 95% autonomous and 100% autonomous is very large, though.
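The "make it demonstrate the bug" mitigation can be pictured as a gate in the triage step. A minimal sketch, not any vendor's actual pipeline: `Finding`, `reproduce`, and `triage` are hypothetical names, and `reproduce` is assumed to re-run a harmless reproduction in a test environment. Anything that fails to reproduce is routed to a human instead of being dropped, since those are the real-but-hard-to-reproduce cases described above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Finding:
    title: str
    # Hypothetical hook: returns True only if the faulty behavior was
    # actually observed when the reproduction ran in a test environment.
    reproduce: Optional[Callable[[], bool]] = None


def triage(findings: List[Finding]) -> Tuple[List[Finding], List[Finding]]:
    """Split findings into (confirmed, needs_human)."""
    confirmed: List[Finding] = []
    needs_human: List[Finding] = []
    for finding in findings:
        try:
            ok = finding.reproduce is not None and finding.reproduce()
        except Exception:
            # A crashing reproduction is not a confirmation.
            ok = False
        (confirmed if ok else needs_human).append(finding)
    return confirmed, needs_human


# Illustrative data only; these are not real reports.
reports = [
    Finding("reproduces in test env", reproduce=lambda: True),
    Finding("claimed but does not reproduce", reproduce=lambda: False),
    Finding("no reproduction supplied"),
]
confirmed, needs_human = triage(reports)
print([f.title for f in confirmed])
print([f.title for f in needs_human])
```

The gate only shrinks the pile a human has to read; it does not replace the human for the second list.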
>>109111809
Or maybe, just maybe, they're telling the truth. That they see a big enough disruption to their business model that it outweighs the blowback from pissing off one of their biggest customers.
>>
>>109111692
>A debug logging sink writes to a local SQLite database (~/.codex/logs_2.sqlite) at the noisiest possible TRACE level by default, dumping everything from WebSocket payloads to routine file accesses. One user measured ~37 TB written over 21 days of uptime
>The bug ignores the standard RUST_LOG variable so there's no easy way to quiet it, ~71% of the logged data is useless TRACE noise, and despite related reports since April it's still open on GitHub.
>>
File: file.png (16 KB, 475x221)
>>109111925
Ok I checked and it is a bit excessive but on my machine it's nowhere near 640TB/year. This has been open/working for about 2 days, only ~126GB. And anyway what's the problem, you can have it fixed in an hour at most, you do use your own codex fork, right? In fact I'll ask my clanker to solve it right now
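The yearly figures being argued over are just linear extrapolations of the numbers quoted in the thread (~37 TB over 21 days, ~126 GB over about 2 days). A quick sketch, assuming a constant write rate and decimal units (1 TB = 1000 GB); the helper name `yearly_volume` is made up for illustration:

```python
DAYS_PER_YEAR = 365


def yearly_volume(amount: float, days: float) -> float:
    """Scale an amount written over `days` days to a 365-day year (same unit)."""
    return amount / days * DAYS_PER_YEAR


# Figure quoted from the issue: ~37 TB over 21 days of uptime.
reported_tb_per_year = yearly_volume(37, 21)

# Figure from this post: ~126 GB over roughly 2 days, converted to TB.
local_tb_per_year = yearly_volume(126, 2) / 1000

print(f"reported: ~{reported_tb_per_year:.0f} TB/year")  # ~643 TB/year
print(f"local:    ~{local_tb_per_year:.0f} TB/year")     # ~23 TB/year
```

So the quoted report works out to roughly 643 TB/year and this machine to roughly 23 TB/year, about 28x lower, though a 2-day sample may not be representative of longer uptime.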
>>
>>109111692
>>109111985
They didn't bother to state as much but this does affect the Codex App as well.
>>
File: file.png (68 KB, 841x800)
Was that so difficult?
>>
Updated vibe-coded, client-side, quantum-resistant encryption HTML page.

Dead link after jun25 (wallet version):
https://litter.catbox.moe/ios0x91k4koevbwp.html

Persistent code only: (wallet version):
https://files.catbox.moe/rhvdx1.html


This version has a wallet section to wrap trusted public keys with your identity file, so you only need to save 1 file. It is encrypted the same as the rest. It stores and unlocks your identity file. FYI, the identity file is all that's needed to regenerate your public identity file for sharing.

Localhost from any device (including mobiles):
python -m http.server 8000 --bind 127.0.0.1
python3 -m http.server 8000 --bind 127.0.0.1

127.0.0.1:8000
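
If you would rather start the same kind of static server from a script, here is a minimal standard-library sketch. The function name `start_loopback_server` is an illustrative assumption, not part of the posted page. Note that binding to 127.0.0.1 means the page is reachable only from the device running the server, so "any device" means running it on whichever device you are using.

```python
import functools
import http.server
import threading


def start_loopback_server(directory: str, port: int = 8000) -> http.server.ThreadingHTTPServer:
    """Serve `directory` over HTTP on 127.0.0.1 only, in a background thread.

    Pass port=0 to let the OS pick a free port; read it back from
    server.server_address[1].
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Call `start_loopback_server(".")` from the folder holding the saved .html file, then open 127.0.0.1:8000 in a browser on the same device. Call `server.shutdown()` followed by `server.server_close()` when done.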
>>
File: file.png (46 KB, 777x450)
>>
>>109111910
True, but it's just never the same. I'm currently working on some neural networks, and I just have the feeling that something won't work and I generate training data for nothing.
>>
>>109110332
mine hasn’t said anything either way
>>
new thread:

>>109112567
>>109112567
>>109112567



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.