/g/ - Is there an ai agent I can use to search my entire - Technology

Anonymous
05/20/26(Wed)01:28:26 No.108863866 Archived

File: 1769315636523.jpg (65 KB, 960x895)

Is there an ai agent I can use to search my entire meme folder and find me the meme/screencap with the text in the image I want? I have like 160,000 meme images

Anonymous
05/20/26(Wed)06:19:48 No.108864944

yes this blog steps you through it

https://danielvanstrien.xyz/posts/2024/11/local-vision-language-model-lm-studio.html

Anonymous
05/20/26(Wed)06:33:50 No.108865023

I'm thinking of a straightforward program that's just a window or sliding pane on the desktop where you can just ask "find the image with X and Y" and it'll open it for you in Explorer.
It would just plug in some cheap-to-run Gemma model with vision, index the folders you want, and do nothing else.

I might actually vibe code something like that if it's possible and it doesn't exist.

>>108864944
I skimmed over it, reads like he's just trying to sort files through the LLM, not actually searching through them.

Anonymous
05/20/26(Wed)06:40:22 No.108865065

>>108863866
Excire, but it's paid

Anonymous
05/20/26(Wed)06:42:35 No.108865078

>>108865023
>he's just trying to sort files through the LLM, not actually searching through them
i want you to have a critical think here about how each of these tasks is achieved and see if you can perhaps spot the reason this method can achieve the requested aim

Anonymous
05/20/26(Wed)06:49:16 No.108865120

>>108865023
>model with vision
Too expensive for what it is, unless you want a description of the visuals in the image, not just text recognition.
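For the text-recognition-only case, here's a minimal sketch of what the search side could look like. It assumes the text has already been pulled out of each image by some OCR tool (that step is not shown), and the file names and strings below are made up for illustration. It just builds an inverted index from words to files and intersects the sets for the query words.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Lowercase and keep runs of letters/digits as words.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(ocr_text_by_file):
    # Map each word to the set of files whose extracted text contains it.
    index = defaultdict(set)
    for path, text in ocr_text_by_file.items():
        for word in tokenize(text):
            index[word].add(path)
    return index


def search(index, query):
    # Return the files that contain every word of the query (AND search).
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical OCR output: path -> text found in the image.
ocr_text_by_file = {
    "memes/frog1.jpg": "Article 13 compliant frog",
    "memes/cap2.png": "install gentoo",
    "memes/frog3.jpg": "rare frog do not steal",
}

index = build_index(ocr_text_by_file)
print(search(index, "compliant frog"))
```

Exact word matching like this breaks on OCR typos, so a real version would probably want fuzzy matching on top. The index itself is cheap to build and query even at 160,000 files.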
>>108863866
Probably nothing you can just use right away, I can bet money that with the current state of things (security, vibe slop, bloat, 3-letter fags of all sorts, unemployed scammers etc) it is easier to do it yourself.
Will take like 10 minutes to install Docker and the Zed editor, disable telemetry, create a devcontainer config and spin up a container, where you grant all access to Claude Code and prompt it to do the thing.
If you're too lazy to do that, then maybe you don't really need it.

Anonymous
05/20/26(Wed)06:50:23 No.108865125

>>108863866
Do you really want AI training off of your rare pepes?

Anonymous
05/20/26(Wed)08:40:12 No.108865701

r5y7u8olp'[

Anonymous
05/20/26(Wed)08:52:25 No.108865753

>>108863866
I don't know, whenever I need to post a Pepe I just search it on iFunny or knowyourmeme. If I were you I'd try googling "ai image describer site:github.com" and see what's what.

Anonymous
05/20/26(Wed)11:13:32 No.108866611

>>108863866
>Article 13 compliant frog
hahahahhhAHAH

Anonymous
05/20/26(Wed)12:17:27 No.108866968

Yes, use an embedding model (I think CLIP large could work). It basically transforms images and text into very big vectors in the same space, so if you convert a text query to a vector, the stored image vectors most similar to it will be the best-matching images. You'd need to run each image through CLIP and store the embedding for each, though; with that many images it'd need at least a few hours of processing. I use it for an 8k image folder and it works well.
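A rough sketch of the search step, assuming every image has already been run through the embedding model and its vector stored. The vectors here are tiny made-up numbers, not real CLIP output, and `top_matches` is just a hypothetical helper name. In practice the query vector would come from the same model's text encoder.

```python
import math


def cosine_similarity(a, b):
    # Cosine of the angle between two vectors (assumes neither is all zeros).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_matches(query_vec, image_vecs, k=3):
    # Score every stored image vector against the query, best first.
    scored = [
        (cosine_similarity(query_vec, vec), path)
        for path, vec in image_vecs.items()
    ]
    scored.sort(reverse=True)
    return [path for _, path in scored[:k]]


# Made-up 3-dimensional stand-ins for stored image embeddings.
image_vecs = {
    "frog.jpg": [0.9, 0.1, 0.0],
    "cat.png": [0.1, 0.9, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}

# Made-up stand-in for the embedded text query.
query_vec = [0.8, 0.2, 0.1]

print(top_matches(query_vec, image_vecs, k=2))
```

A plain loop like this is fine for a few thousand images. At 160,000 you'd probably want to keep the vectors in a matrix and do the scoring in one batched operation, but the idea is the same.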

Anonymous
05/20/26(Wed)12:24:31 No.108867010

put all images in one big pdf file