/g/ - Technology






File: Chat-GPT-logo.png (37 KB, 768x404)
Why is AI so useless at data processing? I ask it to do some fuzzy matching on two lists and give me the confidence score of how close the match is, and for about 5% of the data it ends up matching to the wrong item while giving a 100% confidence score, when there's a perfect match that it ignored. This sort of thing is one of the only real productive use cases I can imagine for AI, and it fails miserably at it.
>>
>>107613076
What model? Is it a reasoning model, if so what kind of reasoning budget are you using? Are you running each item separately or trying to batch them?
>>
>>107613076
it's a scam
>>
>>107613107
I don't know what any of that shit is. I'm just writing prompts at chatgpt.com. The specific thing I was trying to do was this: I have two lists of movie titles with their release years, but with slight variations in spelling and punctuation in some of the titles and some of the years are off by +/- 1. So I asked it to map one list to the other, and it ended up mapping some to the wrong movie, even when there was an exact match with title and year in both lists. I've had similar results when trying to do other kinds of data sets too. It can never give me an accurate result no matter what kind of data processing I'm trying to do and it ends up being just a complete waste of time because I have to manually go through the entire list to find the mistakes.
>>
>>107613168
>I'm just writing prompts at chatgpt.com
I had a feeling that was the case. That's the least powerful and least effective way to use LLMs.
If you can program, write a script to do this. Experiment with different ways to present the data to the model. Given lists A and B, I would probably make len(B) calls where each call includes the full list A plus one entry from B. Ideally cache the prompt after the last entry of A to save on cost. Then do something about any leftover entries.
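
A minimal sketch of that prompt layout, assuming each list is a sequence of (title, year) tuples. `build_prompts` and the prompt wording are made up for illustration, and no API call is made here because the post doesn't name a client library. The point is only that every call shares one identical prefix (all of list A) and differs in a short suffix (one entry from B), which is the shape prompt caching needs.

```python
def build_prompts(list_a, list_b):
    """Return one (shared_prefix, per_entry_suffix) pair per entry of B."""
    # The prefix is byte-for-byte identical across calls, so a provider-side
    # prompt cache (if one is available) can reuse it. Only the suffix changes.
    numbered_a = "\n".join(
        f"{i}: {title} ({year})" for i, (title, year) in enumerate(list_a)
    )
    prefix = (
        "Candidates (list A):\n"
        f"{numbered_a}\n\n"
        "Reply with the index of the candidate that is the same movie as the "
        "query, or NONE if no candidate matches.\n"
    )
    return [(prefix, f"Query: {title} ({year})") for title, year in list_b]


# Hypothetical sample data
list_a = [("Alien", 1979), ("Aliens", 1986)]
list_b = [("Alien", 1979), ("Aliens", 1987)]
prompts = build_prompts(list_a, list_b)  # len(B) prompts, one per entry of B
```

Any entry that comes back as NONE, or any candidate index that gets picked twice, goes into the "leftover entries" pile for a second pass.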

If you can't program then you could try vibecoding it with something like Gemini.
>>
>>107613076
you're asking a language model to do math and score data
>>
>>107613213
It would take less time for me to just manually go through my list of 7000 entries than to experiment with how to program something like this. I could just filter for the non-perfect-matching rows in Excel and only have a few hundred entries to fix manually. The only reason to use AI was to save me the time of having to do manual work, and it fails at that.
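
For reference, the "filter out the perfect matches first" step is only a few lines if the lists are exported as (title, year) rows. This is a sketch under that assumption; `split_exact` is a hypothetical name, and "perfect" here means the title string and year are both identical.

```python
from collections import Counter


def split_exact(list_a, list_b):
    """Pair rows that are identical in both lists; return the rest for review."""
    available = Counter(list_b)  # how many unclaimed copies of each row B has
    matched, leftover_a = [], []
    for row in list_a:
        if available[row] > 0:
            available[row] -= 1
            matched.append(row)
        else:
            leftover_a.append(row)
    leftover_b = list(available.elements())  # rows of B that were never claimed
    return matched, leftover_a, leftover_b


# Hypothetical sample data
list_a = [("Alien", 1979), ("Se7en", 1995)]
list_b = [("Seven", 1995), ("Alien", 1979)]
matched, leftover_a, leftover_b = split_exact(list_a, list_b)
```

Only `leftover_a` and `leftover_b` then need fuzzy matching or a manual look, and an exact pair can never be mapped to the wrong movie.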
>>
>>107613076
Useless task
>>
I agree with OP
"AI" should be able to do this by now.
>>
>>107613271
AI absolutely can do this.
The consumer-facing website chatgpt dot com apparently cannot do this, which makes sense because it's limited in the tools it can access and is optimised to aggressively save on tokens and to use the cheapest model it can.
Using a different client and forcing use of a high-end reasoning model might be enough.
>>
>>107613076
Ask it to write a python program that does the task and to execute it on your data and it will always succeed.
Skill issue.

But don't worry, language models will soon learn how to do this by themselves, in the background, without even telling you it's happening and then they can do tasks everyone thought would be impossible for LLMs.
>>
>>107613454
They already do. OP just used the free demo that's not intended to actually be useful for anything.
>>
>>107613239
>The only reason to use AI was to save me the time of having to do manual work and it fails at that.
Skill issue gramps. You're a promptlet and you don't know how to use LLM effectively, plain and simple. Git gud.
>>
>>107613076
Just ask it to give you a python script to do that task. Most certainly it will nail it on the first try.
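
Roughly the kind of script that could come back, as a sketch rather than anything a model actually produced in this thread. It assumes (title, year) tuples, uses the standard library's `difflib.SequenceMatcher` for title similarity, and the 0.8 threshold and 0.95 off-by-one-year penalty are arbitrary illustrative values.

```python
from difflib import SequenceMatcher


def score(a, b):
    """Similarity in [0, 1]; 0 when the years differ by more than one."""
    (title_a, year_a), (title_b, year_b) = a, b
    if abs(year_a - year_b) > 1:
        return 0.0
    ratio = SequenceMatcher(None, title_a.casefold(), title_b.casefold()).ratio()
    # Penalise an off-by-one year so an exact title+year row scores highest.
    return ratio if year_a == year_b else ratio * 0.95


def match(list_a, list_b, threshold=0.8):
    """Greedy one-to-one matching: accept the highest-scoring pairs first."""
    pairs = sorted(
        ((score(a, b), i, j)
         for i, a in enumerate(list_a)
         for j, b in enumerate(list_b)),
        reverse=True,
    )
    used_a, used_b, matches = set(), set(), []
    for s, i, j in pairs:
        if s < threshold:
            break  # everything after this is below the cut-off
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            matches.append((list_a[i], list_b[j], round(s, 3)))
    return matches


# Hypothetical sample data
list_a = [("Alien", 1979), ("Aliens", 1986)]
list_b = [("Aliens", 1987), ("Alien", 1979)]
matches = match(list_a, list_b)
```

Because the score is computed rather than guessed, an exact match always scores 1.0 and gets claimed first. The all-pairs comparison is quadratic, so on 7000 entries it is better run only on whatever is left after the exact matches are removed.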
>>
>>107613168
Ohhh I see now you're just stupid or 12 years old.
>>
>>107613076
Are you making it guess the probabilities? You're supposed to make it write code retard??
>>
>>107613076
>give me a confidence score
NIGGER ITS NOT THINKING
ITS JUST MATCHING ONE WORD AFTER ANOTHER BASED ON STATISTICAL LIKELIHOOD
It's fucking 2026 almost
>>
Lol at the elitist retards ITT
That's actually not an easy task for LLM

The problem is that you have a bag of problems to solve:
1) the first line of list A might have a movie with a misspelled title but the correct year, and list B the opposite
2) two movies can have the same name but different years, and they really are two different movies
3) sometimes the misspellings are just abbreviations
4) every time there is a mismatch you should track both titles on the side so that you can rematch them

You should really try to simplify the problem before feeding this bag of hair to the LLM
I would start with List A, forget the year, only the titles.
I would ask the LLM to check whether every movie in this list is an existing movie, and whether it has an alternative title or a more correct spelling. Do the same for list B.
Do some line-by-line manual scaffolding and reordering with Excel.
At this point you should have the movie titles in lockstep, and you would have a better chance asking the LLM to check the year (don't ask for probabilities, just ask for the correct year). After that, go to another LLM and ask it to verify the years again. Any mismatch you should check by hand.
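
The mechanical part of that simplification (titles first, years second, mismatches set aside) can be sketched without an LLM. This assumes (title, year) tuples; `normalize` and `group_by_title` are hypothetical helpers, and the normalization only folds case, accents and punctuation. It won't resolve abbreviations or alternative titles, which is where the LLM pass described above would come in.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(title):
    """Fold case, accents and punctuation so spelling variants share one key."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.casefold())
    return text.strip()


def group_by_title(list_a, list_b):
    """Map each normalized title to (years seen in A, years seen in B)."""
    groups = defaultdict(lambda: ([], []))
    for title, year in list_a:
        groups[normalize(title)][0].append(year)
    for title, year in list_b:
        groups[normalize(title)][1].append(year)
    return dict(groups)


# Hypothetical sample data
list_a = [("Dr. Strangelove", 1964), ("Amélie", 2001), ("Solaris", 1972)]
list_b = [("Dr Strangelove", 1964), ("Amelie", 2001), ("Solaris", 2002)]
groups = group_by_title(list_a, list_b)

# Titles whose years disagree, or that exist on one side only, are set aside.
needs_review = {
    key: years for key, years in groups.items()
    if sorted(years[0]) != sorted(years[1])
}
```

Here the two Solaris rows land in `needs_review` because the title agrees but the years don't. That is problem 2 from the list: it could be a wrong year or two different movies, so it gets checked by hand instead of being silently matched.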
>>
File: Timeline 22.mp4 (1.69 MB, 720x576)
>>107613119
Dunno, I love it desu. Absolutely no coding experience and it took me four hours to make a 4chan app for iOS just telling Gemini what i want, copy pasting where it told me to in Xcode, and having it change bits here and there that I don't like.

>can download individual images/videos or the whole thread's images to Files in a thread-specific folder
>full mp4/webm support with audio
>can favourite threads and come back to them
>can switch between list and grid view for boards

If I worked in coding I would be shitting myself right now
>>
>>107614531
That’s actually insane
>>
>>107613213
lotta cope for pointing out how next word prediction isn't great at basic things anon
>>
>>107614531
It's really great you could do it. And, yes it's impressive if you have never coded (it's a very very very simple program tho). But if you were working in coding (and not a jeet or a useless codemonkey), you can take advantage of it too. I work in industrial automation and I'm very happy using LLMs in my workflow.
>>
>>107614531
they are shitting themselves, everyone uses AI and wonder where the axe will fall next. Their entire future pivots on how well they are able to leverage AI to be 10x more productive than the retard shmuck that can't use AI well
>>
>>107614531
You made a crappy front end. Why do you feel so proud? You ever run into the acronym WYSIWYG? Congratulations AI has almost caught up to Geocities.


