/g/ - Technology


Human language is too vague and open to interpretation to work precisely with LLMs.
Should we create a new language to be more precise? A new type of code? What would it look like?
>>
>>108570709
Idris.
>>
File: file.png (195 KB, 1025x971)
i actually agree, but also you reminded me of this incredibly interesting story, if anyone has a little time to read it: https://www.newyorker.com/magazine/2012/12/24/utopian-for-beginners
summary in pic
>>
>>108570709
Markdown
>>
>>108570809
yeah i remember this, john quijada didn't deserve to be the one to have this great idea. he kind of worked on ithkuil ploddingly and noncommittally anyway, and it's basically dead now
>>
>>108570809
Actually a very interesting story. I didn't know about Ithkuil. But that's the kind of concept I was thinking about for an LLM language.

I asked my LLM how this request would be translated to Ithkuil:
> Generate a picture 500x400px 72dpi of a mountain with a forest at the bottom and snow on the tip

Apparently, in Ithkuil logic, it would be encoded as:
> I request that you intentionally cause the creation of a visual representation with specified dimensional constraints, depicting a mountain such that forest occupies its lower region and snow its upper extremity.

For a final result:
> Ašţal-rrüļ 500x400px-72dpi ţļëiţröu
> mraļtëxţuřaň äskaļļiţřöu
> žňolţuřaň elwëiţřöu

If this language were to become the dominant LLM language, I can only imagine the gatekeeping by the people who are skilled at it lmao
>>
>>108570933
llms are bad at conlangs for obvious reasons, so it's probably not right about that
>>
File: 1659376944990960.png (222 KB, 700x700)
>>108570709
that's called programming humor, and retarded subhumans don't understand it either and will get angry if you tell them exactly HOW to do something
>>
>>108571289
Human language on its own is retarded. That's why we have code, which is closer to maths.
>>
>>108570709
Take http://unison-lang.org/, https://github.com/unclebob/AIR-J, https://gtoolkit.com/, https://github.com/HigherOrderCO/Kind and https://hazel.org/, put them in a blender, mix thoroughly, drink. Maybe season with Odersky's latest publication on using effects to sandbox side effects.

TL;DR: the language's native representation will be hash-consed, dependently typed terms that can be bidirectionally reprojected into any visualisation form (a concise annotated AST for LLMs; text, a graph, or whatever bespoke UI best fits the given domain for a human. Writing plane control software? How about a flight sim built into your program to test it). And the LLM (or the human) can't write wrong code with it, because all edits are semantics-preserving transformations of the program, and dependent types are a proof it doesn't do whatever the fuck it wants.

I would write one if I was smart enough, but alas.
>>
File: 1767004690394713.png (1.17 MB, 1920x1080)
>>108570709
>Human language is too vague and open to interpretation to work precisely with LLMs.
No it isn't. You're a useless halfwit who got filtered.
>>
>>108571825
It is. That's why all software is written in code derived from maths, and not in full sentences in our monkey language.
And that's why you're stuck doing AI-slop image gen thinking you're in Blade Runner.
>>
>>108571909
You made this post to sound smart. You can't even name a use case for this excuse of yours to reinvent the wheel.
>>
>>108570709
Yeah it's called CODE
retard
>>
>>108571955
I'll use words you dumb monkey prompter will understand:
> Make it bigger
> No less big
> Okay fair enough
> Now put that element at the bottom
> No, I mean give it some space
> try at 240px from bottom?
> Don't remove that background, put it back
> ...
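In principle, the whole back-and-forth above collapses into a single declarative spec that says everything once, unambiguously. A hypothetical sketch (every field name here is invented for illustration):

```python
# Hypothetical: the vague dialogue above as one declarative layout spec.
# Field names ("scale", "anchor", "background") are made up, not any
# real tool's schema.

spec = {
    "element": "logo",
    "scale": 1.5,                                    # "make it bigger" -> exact factor
    "anchor": {"edge": "bottom", "offset_px": 240},  # "try at 240px from bottom?"
    "background": "keep",                            # "put it back"
}

def describe(s):
    """Render the spec as one unambiguous instruction string."""
    a = s["anchor"]
    return (f"scale {s['element']} by {s['scale']}, "
            f"anchor {a['offset_px']}px from {a['edge']}, "
            f"background: {s['background']}")
```

One deterministic request, zero "no, less big" round trips.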
>>
>>108571289
>carton of milk
>gets a jug
>>
>>108571975
Why don't you implement this magical fix of yours yourself then since you're so smart
>>
>>108571983
You don't need to be smart to see that our language is a barrier to working efficiently with LLMs.

No one sane would operate a hammer with words. You just pick it up and hit with the force you judge necessary.
The most accurate way we've found to communicate with machines is code, so that's the logical next step when using AI as a tool.
>>
>>108572015
You don't even understand what you're talking about. When you talk to an LLM, the LLM does not actually see the text you input. Your text gets converted into tokens (numerical representations), and that's what the model "sees" before predicting what "should" come next. So if you want more efficiency you talk to the model in pure token form, which means we should be figuring out ways to more effectively turn the input into tokens, or just use pure tokens directly. So why the flying fuck would you suggest we talk to the models in code next, which in itself is untokenized data? The biggest bottleneck for speed is memory throughput anyway, plus the inherent limitations of the transformer architecture, not whatever schizo shower-thoughts nonsense you're talking about.
>>
>>108572047
> So if you want more efficiency you talk to the model in pure token form, which means we should be figuring out ways to more effectively turn the input into pure tokens or just use pure tokens.
You've read the thread, GG. So what would it look like, in a hypothetical 100% deterministic model?
>>
>>108572130
That already exists. It's called tokenization. You can't expect people to interact with their computers purely in binary or machine code, even though on paper that would technically be the most efficient way to do ANYTHING on a computer. The equivalent here would be interacting with the model in pure tokens. I'm sure you understand why that's not practical currently and never will be. If we want these things to be faster, you have to make more efficient hardware and possibly invent more efficient architectures or improve the existing ones.
>>
>>108572185
There's a middle ground between writing binary and natural language. Compare:
> "align item 50px 25px"
with
> "Put that element at top 50px and right 25px of the border"

There's too much useless noise and potential misinterpretation. It must be eliminated; it can't stay like that.
This is the prehistoric age of interaction. In 10 years we'll look back and have a good laugh.
I can't imagine that we'll interact with AI without some form of framework to standardize our requests for serious tasks.
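The "middle ground" above is basically a micro-DSL. A minimal sketch, assuming an invented grammar (the `align <target> <top>px <right>px` syntax and the output fields are hypothetical, not any existing tool):

```python
import re

# Hypothetical micro-syntax for the "middle ground": a terse command
# like "align item 50px 25px" parsed into an explicit, unambiguous op.

CMD = re.compile(r"^align\s+(\w+)\s+(\d+)px\s+(\d+)px$")

def parse(cmd):
    m = CMD.match(cmd.strip())
    if not m:
        raise ValueError(f"unrecognized command: {cmd!r}")
    name, top, right = m.groups()
    return {"op": "align", "target": name,
            "top_px": int(top), "right_px": int(right)}

op = parse("align item 50px 25px")
# every field is explicit: no "some space", no "a bit lower"
```

Anything that doesn't match the grammar is rejected outright instead of being guessed at, which is exactly the standardization the post is asking for.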
>>
>>108572312
How the fuck are we going to implement that then, Mr. Shower Thoughts? For this thought experiment to work, the average user would have to understand how LLM tokenization works in the first place, or we would have to somehow invent a way to instantaneously convert thought or speech into the most efficient message the model would need to receive.
>>
File: sddefault.jpg (20 KB, 640x480)
>>108572407
90% of developers don't give a shit about binary or hex logic; they just use a bearable language that bridges the gap.
I don't see why that wouldn't be possible here too.

Or maybe we are forever stuck with the current state, always having to ask it politely to behave.
>>
>>108572503
Let's backtrack a bit. What specific current implementation and user experience with LLMs do you dislike? Do you dislike the fact that whenever you send in a prompt, the inference engine has to actually process that shit and it takes time?
>>
>>108570709
>Should we create a new language to be more precise
It's called code lmao.
Language as spoken by humans will ALWAYS have ambiguity. People have tried and failed to create natural-language programming languages. The same would apply in reverse to spoken language.
>>
>>108572513
A simple example here:
>>108571975
This should be translated into some sort of code/framework/device, whatever. Something precise.
Talking to it like we do, with useless words like "no", "bigger", "try", "okay", "thanks", makes zero sense in terms of production. It just adds randomness, errors, misinterpretation. Our language isn't adapted to this.
>>
>>108572539
So your frustration is that sometimes the LLMs will fuck shit up with tool calling or very specific tasks. You propose that if we bridge the gap between words/text and the actual tokenization, that will somehow make them perform better? What kind of tools have you been using that led you to think the English language is shit for these kinds of tools?


