Google Translate is now LLM-based
>>109003199 Finally
it's extremely prone to hallucinations
>>109003199 Don't they have a "cross-attention" architecture specifically for translating? Just make that better instead of prompting gemini flash
Same with search. I often search for song lyrics, and unless I'm very clear in the request that these are lyrics, it gives me advice instead of information about the song. It's fascinating to see trillion-dollar companies fumbling so much. On the one hand it's magical that this is even possible, on the other, it's clear that they're all scrambling. Both genius and amateur hour.
>>109003310 yandex is now better with song lyrics. i can even search the exact title of a youtube video and it will show me "related" or "recommended" videos first
>>109003199 Always was. You are obviously young. You used to be able to make it say creepy phrases by feeding it gibberish. The transformer architecture was invented at Google, and machine translation was its first benchmark.
>>109003332 the underlying architecture has changed. i've been feeding everything from valid sentences, to slang, to utter bullshit and mojibake into translate for years. something has changed about the way it hallucinates. it's much less literal. it clearly engages in chatbot behaviors, such as em-dashes, or even talking about the text in some rare cases instead of attempting to translate it.
Are LLM translations actually good now? Can you take foreign subtitles and translate them to English and actually get a good result?
>>109003199 It's been for a while. I knew it was for sure when translations stopped being too literal, it's a subtle shift.
>>109003199
>esta hermoso
>si
>te cacho
you're illiterate
google translate is a function x -> emdash for all x
>>109003199 Yes, and apart from some pathological cases, it seems to work better for it. It's been performing way better at translating casual posts that contain idioms, slang, and misspellings.
>>109003310 Google is completely broken now. I search verbatim text from reasonably popular sites and it can't find shit other search engines find easily.
>>109003199 the transformer was literally invented for machine translation at google. it's been like a year since they started using a smaller repurposed gemini
>>109003382 i think so. i couldn't make sense of a certain piece of text when i used yandex translate, so i plugged it into a chatbot and it gave a good translation which was pretty accurate and didn't read like a literal translation
>>109003802 Google's moat is its infrastructure (and the fact that now other services like cloudflare gatekeep most of the internet against bots) but if they break their service too much, someone else can always take the lead in search. Seems unthinkable now, but those are things that happen if a company loses its way too much. The bigger question would be whether most people prefer regular search or those AI summaries.
I should look into that AI manga translator again. Surely they have local LLMs working by now...
>>109004055 Yes, and GPT-image-2 or Nano Banana 2 can likely make the translation fit in the pages in an acceptable way. It is still recommended to first have a better model translate and just use those for text insertion though.
>>109004128 The tool already does text extraction and typesetting well enough, the only thing missing was decent translation. I was hoping LLMs would do better. The tool probably can't feed images to the translation model though and Japanese relies a lot on context so idk if it'd be good.
>>109004151 Gemini is more than likely a better translator than Google Translate. Feed the page to Gemini Pro.
>>109004151 >>109004162 To continue, most frontier LLMs have vision now. Some local ones do too. You'll want to give good instructions, but you don't even need to extract the text beforehand. If you ask a current LLM to OCR a full page of text, it would likely give subpar results, but for the amount of text that's on a manga page it should be more than able to do that perfectly fine.
>>109004170 That's nice to know. The tool I mentioned (manga-image-translator) hasn't been updated in a year, so it's not likely to have gained any significant features since I last used it. I would still prefer something fully automatic and local though so I guess I'll just wait lol.
>>109004190 It is pretty automatic. Here it is with a random image from https://global.discourse-cdn.com/wanikanicommunity/original/4X/d/f/3/df3ad7cc55b6398b6c1b5da53580d946e734e0e7.jpeg You'd have to build a pipeline and work on the prompt a bit so that it doesn't feel the need to preface with anything, etc., but it seems to work for me.
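For what it's worth, the "build a pipeline" part can be sketched in a few lines. This is only a rough sketch under stated assumptions, not how any particular tool does it: `ask_model` is a hypothetical stand-in for whatever vision-capable model you end up calling (local or hosted), and the prompt wording and the list of preface phrases are guesses that would need tuning.

```python
import re

# Hypothetical prompt; the exact wording is an assumption and will need tuning.
PROMPT = (
    "Translate the Japanese text on this manga page into English. "
    "Output one line per speech bubble and nothing else."
)

# Phrases chat models commonly open with. This list is illustrative, not exhaustive.
PREFACE = re.compile(
    r"^(sure|certainly|okay|of course|here is|here's|here are)\b",
    re.IGNORECASE,
)


def strip_preface(reply: str) -> list[str]:
    """Drop blank lines and any leading chatty lines from a model reply."""
    lines = [ln.strip() for ln in reply.splitlines() if ln.strip()]
    while lines and PREFACE.match(lines[0]):
        lines.pop(0)
    return lines


def translate_page(image_bytes: bytes, ask_model) -> list[str]:
    """Send one page to the model and return only the translated lines.

    ask_model is a hypothetical callable (prompt, image_bytes) -> str that
    wraps whichever model API or local runtime is actually in use.
    """
    reply = ask_model(PROMPT, image_bytes)
    return strip_preface(reply)
```

Only leading lines are filtered, so a chatty opener gets dropped while the bubble lines survive. A translated first line that itself starts with "Sure" or "Okay" would be eaten too, so treat the phrase list as a starting point.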
>>109004190 >>109004224 Source image
>>109004190 >>109004224 Translation
>>109004231 pretty dogshit. western ai hates rtl comics. they do not understand manga and it is very hard to make them understand.
>>109004229 >>109004231
>見えないのなら | これではどうだ?
>If you cannot see it... How about *this*? | If you cannot see it
Perfect for gorgeous looks, can push asap.
>>109004235 >>109004237 Might be. I don't speak moon runes. The image models are no good for translation I guess, so what was said here >>109004170 about it being better to have a smarter model translate first seems to stand. I still think they're neat for text insertion. The SHWACK and the DA might be dumb, but it's still cool.
>>109004235 >>109004237 Here it is again, asking the model to translate it first and only insert the translation after. Don't know if it's better or worse but it's different.
>>109004248 You can tell from >>109004237 that the translation doesn't make much sense. As a matter of fact it should be
>見えないのなら/If you cannot see it
>これではどうだ?/How about *this*?
The reason it fucks up here, I happen to know, is that it hates when comics are rtl, not ltr.
>>109004276 same fundamental fuck up: it does the lines ltr instead of rtl. I was trying to coerce gemma4 to give me the correct panel order for a manga page and it couldn't. A larger model should be able to do it but I can see it struggling here.
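The ordering problem described here can at least be handled outside the model if you already have bounding boxes for the bubbles (say, from a text detector). A minimal sketch, assuming each bubble comes with pixel coordinates: group bubbles into rough rows by their top edge, then read each row right to left. The `Bubble` class, the `row_tolerance` default, and the coordinates in the example are all made up for illustration, and pages with irregular panels would need proper panel detection first.

```python
from dataclasses import dataclass


@dataclass
class Bubble:
    text: str
    x: float  # left edge, pixels
    y: float  # top edge, pixels
    w: float  # width, pixels
    h: float  # height, pixels


def reading_order(bubbles, row_tolerance=40.0):
    """Order bubbles top to bottom, and right to left within each row.

    Bubbles whose top edge is within row_tolerance pixels of the first
    bubble in the current row are treated as being on the same row.
    """
    rows = []
    for b in sorted(bubbles, key=lambda b: b.y):
        if rows and abs(b.y - rows[-1][0].y) <= row_tolerance:
            rows[-1].append(b)
        else:
            rows.append([b])
    ordered = []
    for row in rows:
        # Rightmost bubble first: sort by right edge, descending.
        ordered.extend(sorted(row, key=lambda b: b.x + b.w, reverse=True))
    return ordered


# Hypothetical layout: two bubbles side by side in the top row.
page = [
    Bubble("これではどうだ?", x=100, y=50, w=80, h=200),
    Bubble("見えないのなら", x=400, y=60, w=80, h=200),
]
for bubble in reading_order(page):
    print(bubble.text)
```

With the bubbles numbered in this order, the model (or the translation step) can be handed the text already sequenced instead of being trusted to work out rtl on its own.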
>>109004055 https://github.com/zyddnys/manga-image-translator This one? I could never get it working myself.
>>109003382 even with bullshit models you can get pretty ok translations, much, much, much better than something like tesseract. i've translated a couple of german and french academic texts with gemmatranslate:12b and they seem pretty good but still require a lot of manual review
>>109003254 with this we can finally understand wingdings
>>109004589 I got it to work after downloading 15GB of libraries, but since it cannot handle context a lot of the translation ended up nonsensical. Then I updated python and it stopped working.
Speaking of gemma
>>109003199 So Yandex translate it is.
>>109004738 Gemma 4 thought the Canadian penny "was retired" in 2034 lol
>>109003199 I thought it always was LLM-based?
>>109007309
>>109007210 It depends on what you mean by always. It used to be hardcoded rules, then moved to statistical language models, then moved to non-GPT-type large language models, and now it has moved to "LLMs" (GPT-style large language models).
>>109004738 But is it really?
>>109003199 Wait, is it LLM-based for all languages right now? I want to test some Gaelic which is usually a little weird when translated, is it gonna be worse?
>>109011605 Seems about as bad as it used to be. I'll run a few more tests.
>>109003199 isn't this better? The ai can at least guess at context when before the machine didn't even try
>>109011605 It takes things very literally.
>>109011605 >>109011763 It seems to have forgotten some word meanings and will need much more data before reaching the same quality, hallucinating some other meaning in the meantime.
>>109003199 Are you guys using the translate.google site or some interface baked into Google search? Or is it just a skill issue on my part?
>>109011801 just the default translate.google.com one for the gaelic. try this one >>109007320 and see what you get.
>>109011801 It is an instruction-following model. They've obviously put in good safeguards, but it still leaks through.
>>109011828 kek
>>109011605 this one would have tripped me up when i was learning gaelic. To go "siar" on work is to go back over it/double check it.
>>109011828 Remarkable. If I add the "so", it does your thing, but if I remove it, it does this. If I keep adding like "I told you something" it keeps working like yours. Indaresting.
>>109011824 For years we could throw shit at it and it would translate. There was a meme some time ago about poopen sharten farten or whatever.
>>109011864
>For years we could throw shit at it and it would translate.
I remember it just carrying a word over into the translation unchanged if it didn't know the word. Like in picrel here it would take the fake words "cuireatháineacht" and "ghlabhan" and translate it to "english" as "A cuireatháineacht is in the ghlabhan". Now instead it seems to try to force some meaning into it.
>>109011888 Not a single word here is real. I guess to some extent it's so fun to play with. This is kind of like making hitler sounds, transcribing them, then throwing them in here to see what they could theoretically mean.
>>109011888 Checked
>"A cuireatháineacht is in the ghlabhan"
Oh, I see. I never noticed this phenomenon before, so I can't comment on it, because I'd usually throw in full nonsense sentences (which it would autodetect as some random language and translate) or fully real sentences. I don't remember trying mixed ones like that.
>>109011904
>Not a single word here is real
It always worked like this. It usually guessed some random language and gave a plausible-sounding translation for it. I remember very similar stuff years ago with the fake-Chinese gibberish examples.
>>109011904 picrel is from 2025, jun 6. https://archive.4plebs.org/pol/search/ocr/poopen%20sharten/page/5/
>>109003199 it will ignore important details in the translation if it feels like it now, terrifying
>>109011914 I may be mistaken then, or it could have just been Gaelic, but I remember a lot of scenarios where it would just not translate things. It still does that if it thinks you've selected the wrong language, it seems.
>>109004628 Type NYC 911 in wingdings, you will understand
>>109011951 I think it may have just been gaelic since arabic (chinese didn't do this) has given a translation for this, even while it suggests Irish.
>>109012011 forgot the pic
Another thing I don't remember it having is guessing intent and incomplete words.
>>109012054
>can't even enjoy a ho without some other man cucking
>>109003254 Old Google Translate only started doing that about 10 years ago. I remember a thread being made about Google Translate randomly hallucinating infomercial babble out of thin air.
Is there a way to make it take Rust and convert it to C? Maybe some language is more likely to produce C or it could be prompted to do so?
>>109003199 who cares about your thirdie language
>>109012199 This. English is the language of Brahmans.
>>109012346
>>109003382 The real strat is to also have some surface-level understanding of the language (e.g. including vs omitting ですよ) to complement reading the subtitles.
>>109003382 Translation is one of the few things LLMs are actually good at. The big issue is that they can hallucinate.