I've been planning to write a C compiler called "Cephyr" for two years now. It has gone through iteration after iteration. If you look at my GitHub (/chubek) you'll see several of them, abandoned.
That was until I was introduced to the OCaml CIL library, specifically the Goblint-CIL fork of it.
And, given my affinity for shoving Lua up the ass of every program I plan on writing (see chubak.neocities.org/bluwren-dossier, my extensible static analyzer), I'm now planning on generating several dossiers on Cephyr, given my 'extend with Lua' scheme.
I'm currently working on the proposal + the prompt for the dossier:
https://pastebin.com/trEu542i
Tell me what you think. I've gotten much better at generating dossiers. For example, I've been creating "ultimate documentation" for libraries I wanna use. They include the official documentation (for OCaml, I use `odoc` and Pandoc to convert it to plaintext) and the source code, plus other things. This is one for Shexp (S-expression-based shell execution for OCaml):
https://pastebin.com/UgS3QeJb
Sonnet 4.5 remains the *best* tool for generating dossiers, but I've been experimenting with Kimi 2.2.
>>107610194
Failed to mention: I feed this documentation to the LLM for factual code generation.
If you want an LLM to write correct code, give it documentation. For the libraries, the language, the algorithm, etc.
For example, I have a text file called "~/aleph/txt-literature/literature-on-concurrency-and-garbage-collection.txt". It contains several books and papers on the titular subjects. I use `pdftotext` to convert them to text, concatenate them, etc. This is the Fish function that does most of the work:
# Defined in /home/chubakpdp11/.config/fish/functions/convdir2txt.fish @ line 1
function convdir2txt
    set -l ncat 0
    set -l nprod 1
    for f in *
        set -l bnm "$HOME/$argv[1]-$nprod.txt"
        ebook2txt $f $bnm
        set ncat (math $ncat + 1)
        if test (math $ncat % $argv[2]) -eq 0
            set nprod (math $nprod + 1)
        end
    end
end
PS: I got a tetromino in the retarded new captcha! Wish I'd snapped a screenshot...
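For anyone who doesn't read Fish, the batching logic of `convdir2txt` can be sketched in Python. This is only a rough equivalent: `plan_batches` and its parameter names are made up for illustration, and it does not call `ebook2txt` (that helper isn't shown in the post, so what it does with the output file is left out). It only works out which numbered output file each input would be sent to.

```python
from pathlib import Path


def plan_batches(files, prefix, per_batch, out_dir=Path.home()):
    """Pair each input file with the numbered .txt file it would go to.

    Mirrors the counters in the Fish function: the output number starts
    at 1 and goes up by one after every `per_batch` inputs.
    """
    if per_batch < 1:
        raise ValueError("per_batch must be at least 1")
    plan = []
    for index, name in enumerate(files):
        batch_no = index // per_batch + 1  # same role as $nprod
        plan.append((name, Path(out_dir) / f"{prefix}-{batch_no}.txt"))
    return plan


if __name__ == "__main__":
    # Hypothetical inputs; the real function iterates over `*` in the cwd.
    for src, dst in plan_batches(["a.pdf", "b.epub", "c.pdf"], "lit", 2):
        print(src, "->", dst)
```

With a batch size of 2, the first two files map to `lit-1.txt` and the third to `lit-2.txt`, which is what `convdir2txt lit 2` would pass to `ebook2txt` as the second argument.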
If you wanna *quickly* write a C compiler, heed my advice:
- Download the Fraser & Hanson lcc book and the source code for lcc, feed them to several LLMs, ruminate, think, understand;
- Fire up a language with Tree-Sitter bindings, use its C grammar to parse the language;
- Apply the lcc pipeline, *make it your own*;
The only thing that would be in your way is `lburg`, which is the "Code Generator Generator" for lcc, based on the BURS algorithm.
https://dl.acm.org/doi/pdf/10.1145/131080.131089
https://dl.acm.org/doi/pdf/10.1145/203095.203098
But unless you're writing your C compiler in C (which is retarded, don't use C for anything but embedded code), you're fine, because most languages support pattern-matching. So just use the pattern-matching facilities of that language to implement BURS, instead of using `lburg`.
Done, you have now written your respectably optimizing C compiler.
You know, it would be a journey to extract Knuth's LR C parser from CWEB's source code.
Anyone up for it?
>>107610194
Use case? Just a neet with too much time on his hands? Open up GCC and copy their implementation.
>>107610405
"Use case" meme, uurgh.
As it says in the proposal, it's pedagogical.
Also, when implementing a C compiler, the most important thing you *don't* wanna do is copy GCC. Massive piece of shit source code. I like their JIT though, and I plan on using it in my ECMAScript implementation.
Also, I have a job, in program safety. It does not pay much, but it's at least PLT-heavy and, most importantly, the furthest thing from webshit.
I'm thinking about writing a paper on the use of polyhedral analysis in UAF detection.
>>107610194
LLM psychosis
>>107610510
I use LLMs to expand my knowledge and learn new things about complex subjects that people go to school for. Some people use LLMs to generate HTML templates, and these are people whose job it is to write HTML templates.
>>107610530
No, you use a sycophantic autocomplete engine as a feedback loop in your delusions of grandeur
>>107610543
Funny thing about 'sycophantic', because one of them (Grok) just shat all over one of my ideas:
https://gapgpt.app/share/ea2ef490-624b-419c-838f-24f1ab1cb6ed
Just face it, anon, LLMs are tools that are getting better over time. I apologize if your sysadmin shell scripting job is taken over by them, and you are no longer "The Big Cheese" because you can write two lines of ECMAScript. But dem's the breaks.
Use them to learn new, much more complex subjects. Something beyond webshit.