/g/ - Technology






/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102234876 & >>102220628

►News
>(09/05) Reflection Llama-3.1 70B trained with Reflection-Tuning: https://hf.co/mattshumer/Reflection-70B
>(09/04) Yi-Coder: 1.5B & 9B with 128K context and 52 programming languages: https://hf.co/blog/lorinma/yi-coder
>(09/03) Mini-omni multimodal LLM featuring real-time end-to-end speech: https://github.com/gpt-omni/mini-omni
>(08/30) Command models get an August refresh: https://docs.cohere.com/changelog/command-gets-refreshed
>(08/29) Qwen2-VL 2B & 7B image+video models released: https://qwenlm.github.io/blog/qwen2-vl/

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>102234876

--Papers: >>102241823 >>102244865
--Llama 3.1 struggles with 8-bit GGUF quantization, and there is no 10-bit or 12-bit quantization available: >>102242704 >>102242756 >>102242873 >>102242895 >>102243078 >>102243108 >>102243239 >>102243436 >>102243416
--Reflection 70B open-source model announced, but may not be as impressive as claimed: >>102247552 >>102247578 >>102247852 >>102248130 >>102248316 >>102248402 >>102248571 >>102247579 >>102248559 >>102247709 >>102247784 >>102247796
--Llama-bench parallel copies performance and numa optimization: >>102245558 >>102245872 >>102245930 >>102245906 >>102245989
--LLMs can handle tasks like calendars and functions, but lack standardized tools: >>102241364 >>102241769 >>102242109 >>102243371
--Ideas for improving roleplaying with LLMs: >>102238106 >>102238190 >>102238234 >>102239529 >>102239586 >>102239614 >>102240371 >>102241078 >>102241135 >>102241295 >>102241392
--EPYC platform and dual-socket motherboard considerations for large model inference: >>102244950 >>102244969 >>102245099 >>102245132 >>102245560 >>102245032 >>102245242 >>102245217 >>102245354 >>102245442 >>102245562 >>102245592 >>102245911 >>102246034 >>102247180
--DeepSeek V2.5 model merges Chat and Coder models with improved capabilities: >>102245338 >>102245422 >>102245759
--Testing if the model reveals secrets in a roleplay scenario: >>102242181 >>102242198 >>102242457 >>102242912 >>102242935 >>102242949 >>102243264 >>102244344 >>102244566
--Questioning ClosedAI's spending and highlighting Sutskever's new startup: >>102245815 >>102245845 >>102245918
--Nala test results and discussion of GLM 9B improvements: >>102242466 >>102242533
--FluxMusic: Text-to-Music Generation with Rectified Flow Transformer: >>102242999 >>102243721
--Miku (free space): >>102235000 >>102235749 >>102238159 >>102238520 >>102238540 >>102238553 >>102241321 >>102247235

►Recent Highlight Posts from the Previous Thread: >>102234883
>>
local Suno when?
>>
>>102249468
just go with kobold.cpp until you learn what exactly you want out of this. it'll save you a lot of time.
>>
>>102249480
>--FluxMusic: Text-to-Music Generation with Rectified Flow Transformer
>flux
>isn't by the company that made the flux image generator
>>
>>102249472
As for the rocinante and theia models, the correct tokenizer to go with is Mistral, right?
>>
>>102249578
It's using the same architecture
>>
File: 61 Days Until November 5.png (2.27 MB, 1104x1472)
>>
The thread is so dead that I don't even care about the useless mikuspam. You are posting your tranny idol into the void.
>>
>>102249480
>my migu was ignored
that's it, I will kms
>>
>>102249618
what is on november 5?
>>
I am trying to build myself a doctor bot. Does anyone know where I can find medical loras?
>>
ahh ahh mistress
>>
File: 😍.png (703 KB, 2627x2627)
>>102249618
That's a beautiful strawberry tree
>>
>>102249765
https://files.catbox.moe/mk400w.mp4
>>
>>102249472
>Reflection Llama-3.1 70B trained with Reflection-Tuning
Wow, so he reinvented superCOT?
>>
>>102249867
I mean:

Suppose I fly a plane leaving my campsite, heading straight east for precisely 28,361 km, and find myself back at the camp. I come upon a tiger in my tent eating my food! What species is the tiger? Consider the circumference of the Earth, and think step by step.

Based on the information provided and careful analysis, the tiger in your tent is most likely a Siberian tiger (Panthera tigris altaica), also known as the Amur tiger.

Here's how we arrived at this conclusion:

Your flight path: You flew east for 28,361 km and returned to your starting point. This distance is about 70.77% of the Earth's circumference at the equator (40,075 km).

Location of your campsite: Given that you didn't fly a full circumference of the Earth, your flight path must have been along a parallel of latitude, not the equator. Calculations show that this parallel is at approximately 66.56° North latitude, which is very close to the Arctic Circle.

Tiger habitats: Tigers are not native to most Arctic regions. However, at this latitude, there is one place where tigers can be found: the far eastern part of Russia, in the Siberian taiga.

Tiger species: The only tiger species that lives in this region is the Siberian tiger, also called the Amur tiger.

It's important to note that encountering a Siberian tiger in this region would be extremely rare, as they are critically endangered and typically avoid human contact. If you did encounter one in your tent, it would be an extraordinarily unusual event.

This scenario also highlights some interesting geographical concepts:

The Earth's circumference varies at different latitudes, being smaller as you move away from the equator towards the poles.

The Arctic Circle is a unique parallel of latitude where you could theoretically fly "around the world" in a shorter distance than at the equator.
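Sanity-checking its geometry, by the way (a quick sketch assuming a spherical Earth and the 40,075 km equatorial figure it cites): a parallel whose full circle is 28,361 km long sits near 45°N, nowhere near the Arctic Circle.
[code]
import math

C_EQ = 40_075   # equatorial circumference in km (the figure the model used)
DIST = 28_361   # flight distance in km

# On a sphere, the parallel at latitude phi has circumference C_EQ * cos(phi).
# Solve for the latitude whose full circle is exactly DIST long:
phi = math.degrees(math.acos(DIST / C_EQ))
print(f"{phi:.2f}")  # ~44.95, not the 66.56 it claims
[/code]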
>>
>>102249880
end of it:

Remember, if you ever encounter a wild tiger (of any species), it's crucial to prioritize safety and contact local wildlife authorities immediately.
>>
>405B coming next week - we expect it to be the best model in the world.
>>
>>102249765
local AGI running on 8gb VRAM
>>
>>102249581
Do you mean the chat template?
>https://huggingface.co/TheDrummer/Rocinante-12B-v1.1
Seems to be a mish-mash. I'd start with Mistral's format. Try ChatML if you're doing RP. Mistral for Theia.
>>
>>102249775
Unaware of any recent medical loras. OpenBioLLM, Palmyra Med, and Meditron are some recent medical Llama 3 finetunes.
>>
>>102249914
No I meant the tokenizer setting in advanced formatting section of ST
>>
>>102249848
Spooky
>>
>>102249915
I will check them out. Thanks a bunch. I am unfortunately out of my league completely. I need to first understand wth was written in my MRI report and then possibly, but hopefully not, learn to read an MRI. I understand the word tear. Currently "distal infraspinatus tendon" means shoulder. I need something to explain it to me so my doctor doesn't fuck me.
>>
>>102249954
Why not just use GPT4o for that?
>>
>>102250043
because I don't want to be uploading my personal documents to a US server. Especially when I can't stop it from just saying "you should speak to a professional". I am currently working on a character card to remove such nonsense.
>>
>>102249571
Nta, koboldcpp is great, but can it do vlms (cogvlm)? Are there any unpozzed vlms that I can enjoy cultured images with? Bonus if they can understand text in an image.
>>
>>102249775
>>102250054
https://huggingface.co/blog/leaderboard-medicalllm
There's an entire leaderboard for them
>>
>>102249954
The infraspinatus is a muscle that makes up part of the rotator cuff, yeah.
https://radiopaedia.org/articles/infraspinatus-tendon-tear
https://www.ncbi.nlm.nih.gov/books/NBK441844/
https://www.ncbi.nlm.nih.gov/books/NBK547664/
Make sure to ground yourself with real sources if you're going down the route of self-diagnosis.
Tell it you're a medical student or something if it throws refusals (or just say you want to be prepared for your follow up)
>>
Meant to post this in the earlier thread in honor of all the python issues
>>
>>102250141
that flux with the new text encoder?
>>
File: 8VtnbOf.png (28 KB, 1159x499)
>>102247578
Multiple levels of top geg. I have no comment
>>
>>102250089
I am getting 404 on all of these links. I will start looking around though. I just assumed loras/vector dbs were the way for this. I assumed fine-tunes weren't going to be a thing if you couldn't role play or fuck it.

>>102250116
yea. This is the wrong way to go given hallucinations are a thing. I also couldn't touch my face for a month and have been in constant pain. My doc continued his advice of "wait and see" like he has for the last 5 years. I am fucking done and if I end up in some Mexican hospital with a buddy that will guard my organs, so be it.
>>
>>102247664
We are in a winter. It is a nothingburger. Expect the next gradual improvement in cocksucking capabilities in 2025 at the earliest.
>>
>>102249763
It's just a bot making a mistake anon, nothing to kys over. I enjoyed your migu.
>>
>>102249848
what did the model mean by this
>>
>>102249898
Lol
>>
>>102250152
There's another new text encoder?
>>
>>102249902
at 0.1 T/y
>>
File: file.png (1.3 MB, 1024x1024)
Aesthetically odd fail-generation.
"pot"
>>
>>102250262
Yeah
https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14
>>
>>102250265
What the fuck is this post. This feels like a sleeper activation phrase or some number station shit
>>
>>102250277
Interesting. Thanks.
>>
>>102250285
zooomer, it is an iPod. iPods were used to play music with headphones instead of blasting it out on a shitty phone speaker for everyone to suffer through.
>>
>>102250285
I think it's a psyop, just text over an actual photo
>>
File: file.png (206 KB, 1024x1024)
Yearly announcement
new pot


"pot."
>>
File: file.png (1.16 MB, 768x1216)
>>
File: file.png (1.33 MB, 1024x1024)
>>102250390
what did he mean by this?
>>
All pots wake up. It's time to plant the plant
>>
>>102250450
What kind of plant?
>>
the sleeper phrase worked brehs I think it's time to smoke a bowl...
*spffffffftth*
>>
>>102250390
okay, these are gens. Hard not to get the face photo of Jobs with flux, or I've got a shitty workflow. Love the compressed look too. Is it polite to ask for a catbox in these parts?
>>
>mistral large
>only model I've used that truly understands what's going on
>0.3 T/s
>>
What will you make when you can make animated anime style video with just a prompt?
>>
>>102250499
miku doing circus stunts
>>
>>102250499
jiggle physics
>>
File: pot.jpg (19 KB, 1024x1024)
There is another

"pot"
>>
>>102250493
Have you used hermes 405b? Only tried it myself from the free trial, seems very capable, if unrunnable.
>>
>>102250473
I would love to give you the workflow for these
but I'm genning them in the huggingface spaces for schnell and dev.

prompt for the last one was
>dashcam footage from a car driving down a narrow, tree-lined road in pitch black night. The headlights cast a dim, narrow beam of light ahead, illuminating a faint path while the surrounding area is shrouded in darkness. The footage has a grainy, low-light quality, typical of dashcam recordings. the text reads "pot unavailable". The atmosphere foreboding
>>
pot
>>
>>102250531
Innovative.
>>
>>102250499
Hayao Miyazaki drawn in Go Nagai style having intense gay sex with Go Nagai drawn in Hayao Miyazaki style.
>>
>>102250580
traditional physics simulations and hand-drawn animation peaked over a decade ago. a new era of coom would emerge.
>>
>>102250499
reaction webms instead of images
>>
>>102249765
>didn't remember
>>
File: file.png (554 KB, 1024x1024)
>>102250567
yeah my setup is not there, interesting that it captures the general composition
>>
>>102250634
go to bed grampa
>>
File: file.png (394 KB, 1024x1024)
>>102250636
one thing that works well is asking for a low quality photo, jpeg (jpg?) artifacts, etc.
"snapchat photo" also works
>>
File: ComfyUI_00795_.png (991 KB, 1024x1024)
>>102250152
It's just regular Flux. Nothing fancy except for a LORA that was trained on old school D&D shit
This was the best pick from a batch of eight. Around half of them got the text 100% correct.
>>102250277
thank you, will give it a try
>>
File: why.png (46 KB, 739x120)
>Big tiger gemma seems good, but sort of nonsensical
>Decide to go for the normal Gemma 27b for coherency's sake
>First line is picrel
>>
>>102250713
Weird, different workflow, different seeds, and fewer add-ins, but it's just not doing the thing. Your prompt exactly, flux dev. I'll look into it.
>>102250735
Cool, happy to.
>>102250773
kino?
>>
"TypeError: Llama 3 must be converted with BpeVocab"
ffs, why can't python ever be straightforward
>>
Been trying various models (including facebook's bart-large-cnn and jondurbin's bagel-8b) to do summarization on stories. bart-large-cnn is not very good as it just "shrinks" the text down by culling out words. It doesn't really "capture" the meaning very well. Works great for news articles I'm sure. Bagel-8b produces absolute garbage. What are good sampler settings to make it decent? Are there any other models that are good/better?
>>
>>102250805
Yeah I got that too. Idk what the issue is.
>>
>>102250773
Are you saying this is a problem?
>>
>>102250806
>bart-large-cnn
>bagel-8b
What the fuck even are those models? Just use Nemo 12b Instruct or Llama 3 8b.
>Sampler settings
You only need temp and minP.
>>
OpenRouter just added Reflection and it seems tough to wrangle (not in terms of refusals, I mean buggy/unpredictable), and kinda retarded even when it works.
>>
>>102250861
>I mean buggy/unpredictable), and kinda retarded even when it works
Smells like tokenizer/template issue.
>>
>>102250856
I used llama 3/3.1 8b but it's not good either, so I looked around for a model that was actually trained to do summarization. Bagel-8b was one of them.
I did try mistral-7b at one point. Guess I haven't tested nemo yet.
>>
llama 3.1 70b IQ2 fails my arbitrary columbo test
>>
>>102250823
Maybe I've just got NAI/Erebus/etc. PTSD, but I don't really think of foreheads as something that can pucker. I suppose they can, but the strong impression it left me with was that it was messing up her anatomy and that it was trying to say her lips puckered or something.
>>
>>102250882
NTA but all models are good at summarization since it's one of the things LLMs excel at in general.
>>
>>102250882
Oh, should add that the main issue I had with bart-large-cnn was that its context size is 1024 tokens max. Stitching together summaries did not work well at all, even though someone recommended doing that.
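For reference, the stitching scheme I tried boils down to the sketch below (summarize() here is a stand-in for whatever model call you use); it makes it obvious where the errors compound.
[code]
def chunk_text(text: str, max_chars: int = 3000) -> list[str]:
    # Naive fixed-size chunking; a smarter version would split on
    # paragraph/scene boundaries so sentences aren't cut in half.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def summarize_long(text: str, summarize) -> str:
    # Map step: summarize each chunk independently (this is where
    # bart's 1024-token window forces you to work).
    partials = [summarize(chunk) for chunk in chunk_text(text)]
    # Reduce step: summarize the concatenated partial summaries.
    # Omissions and hallucinations from the map step compound here,
    # which is why the final result reads so badly.
    return summarize("\n".join(partials))
[/code]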
>>
File: file.png (123 KB, 1385x606)
>>102250890
dropped my log
>>102250891
It's a way of describing the folds on her forehead. This is one of the reasons why lots of corpo model presets include instructions about "simple english", because it will opt for some pretty creative turns of phrase.
>>
>>102250806
https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-GGUF/tree/main
https://huggingface.co/NeverSleep/Noromaid-v0.1-mixtral-8x7b-v3-GGUF/tree/main
https://huggingface.co/TheBloke/UNA-TheBeagle-7B-v1-GGUF

My current top three. Haters can hate.
>>
>>102250890
Time to buy more RAM anon
>>
>>102250897
How much summarization have you done with a model? The output is very wrong a lot of the time, nonsensical at others. It could be my settings though, which is why I'm asking.
>>
>>102250891
It's not uncommon to describe it that way, take this line from gone with the wind:
>Archie and Scarlett were a queerly assorted pair, the truculent dirty old man with his wooden peg sticking stiffly out over the dashboard and the pretty, neatly dressed young woman with forehead puckered in an abstracted frown.
>>
>>102250929
such a kino sentence holy fuck
>>
>>102250924
just got a 4090 bruvbeaver, didn't realize I should have just scammed my way to an actual cluster. guess i'll just keep stealing compute
>>
File: file.png (221 KB, 1024x1024)
i uh-...
pot
>>
>>102250897
Done more experiments. Example: character is at home doing chores, watches porn, and wonders if her dad would enjoy if she sucked his cock. Then the last sentence is the dad coming home.
The summary goes into great detail of how she is sucking her dad's cock and him cumming in her mouth, and goes into rambling about her saying she wants to have sex with him every day and him saying they should wait until she's older and so on. Complete made up nonsense.
The output looks fine (assuming you enable rep pen or it will fall into repeating the same sentence over and over and over), unless you actually compare it to what it is summarizing.
>>
>>102251152
Are you using greedy sampling?
>>
s-sugoi
>>
>>102251322
It's over.
>>
>>102251371
It just began anon.
>>
>>102251322
Is that even correct? I thought お花を摘む meant to take a piss.
>In fact, this euphemism お花を摘む ("picking flowers") hides the meaning "a woman relieving herself".

>It's used in place of "I'm going to the toilet", as in "I'm off to pick some flowers". Someone who doesn't know it could easily misunderstand, but once you know it, it's a very handy euphemism you can casually use in any situation.
>>
>>102251322
Half the speed? Nooo. 70b is already slow for me, I need my 2T/s at least.
>>
>>102251274
Nope.
I switched to Nemo 12B. Tried a bunch of prompts to steer it in the right direction. It still makes shit up randomly. In the story, she takes a shower. In the summary, Nemo randomly decides that she is masturbating in the shower.
>>
>>102251515
Based Nemo improving a boring story.
>>
>>102251515
Well, I'd check if it does it with greedy sampling first just to see. Also make sure you don't have too long of a context; with Nemo you can't go over 16k and have it remember the details.
>>
File: 00042-4080471795.png (1.28 MB, 1024x1024)
>>102251322
Interesting. Reflection seems to make it worse at translation. Would be nice to see the contents of the <thinking> and <reflection> tags. I am guessing that it is overthinking something. That or the model might have some other issue.
>>
>>102251592
Cute migu desu, was that a flux dev gen?
>>
>>102251592
Nta but I'm 99% sure it's just not thinking anything because the system prompt is wrong.
>>
I'm using a 3090 and planning on upgrading my display from 2K to 5K. Should I be concerned about an increase in VRAM usage impacting my model use?
>>
>>102250822
I had to delete the llama3/BPE checks around line 350 of llama.cpp/gguf-py/gguf/vocab.py
I guess because every other llama3 was bf16 and this one is f32? fucked if I know. It's converting now, but maybe it'll be broken
>>
>>102250861
I think they aren't using the CoT system prompt that it's trained on.
>>
>>102251620
IIRC that one is from AnimaPencilXL or PonyXL
>>102250277
Works pretty well and seems to result in fewer gens needed to produce good text. The fact that it made the ghouls in pic related have comical expressions was a nice touch, too
>>
>>102252159
This one shows better general prompt adherence, though the text is a bit flawed
>>
>>102252159
>scoped chainsaws
I fear this Miku
>>
>>102251515
It's probably your prompt.
>>
File: Untitled.png (118 KB, 1125x440)
Sample-Efficient Diffusion for Text-To-Speech Synthesis
https://arxiv.org/abs/2409.03717
>This work introduces Sample-Efficient Speech Diffusion (SESD), an algorithm for effective speech synthesis in modest data regimes through latent diffusion. It is based on a novel diffusion architecture, that we call U-Audio Transformer (U-AT), that efficiently scales to long sequences and operates in the latent space of a pre-trained audio autoencoder. Conditioned on character-aware language model representations, SESD achieves impressive results despite training on less than 1k hours of speech - far less than current state-of-the-art systems. In fact, it synthesizes more intelligible speech than the state-of-the-art auto-regressive model, VALL-E, while using less than 2% the training data.
https://github.com/justinlovelace/SESD
no code yet though they suggest they'll post an "implementation" so maybe weights too. no examples. so just posting to keep those interested aware. training on 2% of VALL-E's data but outcompeting it is big if true
>>
I'm sure you guys get this all the time, what's the current flavour of the month model for coom?

I've been out of the loop for a while. Tried some of the reddit slop merges and they are pretty cucked.
>>
>>102252385
>Note: Code and model checkpoint will be available soon. Stay tuned for updates!
ah should have checked the whole readme
>>
>>102252385
>1k hours
Useless, it implies it's NOT for voice cloning, but a one-voice synthesis model.
>>
>>102251980
If it only works with a specific system prompt, it sounds like it's way too finicky to be a good model
>>
>>102252408
considering what the image charts this post is pretty funny
>>
File: tts.png (65 KB, 322x340)
>>102252408
>it implies
>nigger can't read 3 fucking pages of text
>>
In SillyTavern, is there a way for the Trim Incomplete Sentences option to include speech in quotation marks?
>>
Anyone making the reflection uncensored already?
>>
>>102251377
What is this garbage
>>
Decided to try Gemini for the first time ever, turned all filter options to none. Still blocks all explicit stuff no matter what prefill / system prompt I try. Fuck, Google sucks.
>>
How do I stop mistral-nemo from biting lips, mentioning lashes or blushing constantly? Using ooba and ST btw.
>>
Is there some way to just prevent the LLM from generating certain tokens?
>>
>>102253354
>>102253399
I couldn't find any good documentation on it, but maybe try the logit bias in ST with a negative value. Does anyone have experience with that?
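For what it's worth, llama.cpp's server takes a logit_bias list, and setting a token's bias to false is supposed to ban it outright; a minimal sketch (token id 12345 is a placeholder, look real ids up with the server's /tokenize endpoint first):
[code]
import requests

payload = {
    "prompt": "Once upon a time",
    "n_predict": 64,
    # [token_id, bias] pairs; false means the token is never sampled.
    "logit_bias": [[12345, False]],
}
r = requests.post("http://127.0.0.1:8080/completion", json=payload)
print(r.json()["content"])
[/code]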
>>
File: stt.png (1 KB, 120x80)
>>102253423
>>
>>102249472
>►News
>>(09/05) Reflection Llama-3.1 70B trained with Reflection-Tuning: https://hf.co/mattshumer/Reflection-70B
Guess I better fire up the ole Nala box one more time today.
>>
>>102250822
Turned out to be a mistake in the config.json. vocab size should be "vocab_size": 128262
>>
I am trying some of the medical fine tunes. It is like speed-running webmd to cancer except worse.

>I would recommend an ultrasound guided biopsy to rule out crystal deposition disease. Also, get a GGT, ALP, A1AT deficiency.
>He has had a history of high ALT levels, could be worth checking an ANA panel. Have you considered arthrocentesis?

I am fairly sure I don't have lupis. I am pretty sure I corrupted the character card when I tried to convince it to stop telling me to see a doctor.
>>
>>102253488
>lupis
I think you just have dyslexia.
>>
>>102253521
go away AI. I can't spell for shit while I am this tired. You don't know me.
>>
>>102253488
>HouseMD-70B
>>
>>102249765
USA elections. After the elections AI companies won't fear being blamed for interference and will drop new models. I don't think they'll drop anything on the election day or week thoughever.
>>
>>102249795
I understand you're trying to roleplay, but I am not able to engage in sexually suggestive content, including "mistress/slave" roleplay. This is against my ethical programming and guidelines.

My purpose is to provide safe and ethical interactions for all users.

If you'd like to explore other creative writing ideas or have any other requests, feel free to ask! I'm here to help in any way that aligns with my guidelines.
>>
>>102250493
hopefully speculative decoding will get added to llama-server soon
>>
File: reflection llama.png (133 KB, 937x438)
Alright, 2 pulls with Nala with Reflection Llama. Used q4_K_M instead of my usual q8_0 because it was all that was available on HF and I'm too lazy to quant it myself. This is with the default prompt template. Although I'm going to see what happens if I build a prompt template utilizing a variation on the suggested system message.
Sloppy as fuck at low temp though.
>>
>>102253583
Oh ok, that makes sense. Hopefully that's how it really goes.
>>
>>102253625
Isn't it supposed to do the chain of thought thing?
>>
>>102253173
strawberry
>>
>>102253653
Even with the recommended samplers and recommended system message I've yet to see it actually use its special output thingie. Quantization brain damage possibly?
>>
>>102253689
You followed everything listed here?
https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B
>>
File: 1r.png (7 KB, 902x139)
>>102253703
I switched to doing simple queries on booba and this is what I got. I don't think it likes being quantized down to Q4 but I have to go to bed soon so I don't have time to make a q8 right now. Weekend project maybe.
>>
File: Capture.jpg (50 KB, 858x1020)
Does your favorite model play video games?
>>
>>102253279
It should be allowing smut with adult characters, at least that was how it was when I tried it a couple months back—all filters off blocked loli/shota but nothing else
>>
Anyone know if I can run these on an Intel Iris chip? Most docs seem to specify Nvidia or AMD
>>
File: Capture.jpg (58 KB, 1203x962)
>>102253760
...Nice.
>>
File: snake.png (117 KB, 928x865)
>>102253760
>>
>>102253488
why not, autoimmune diseases can be hard to catch with antibody tests because the counts can fluctuate a lot yet still present weird symptoms.
>>
i think all the gguf quants of reflection are broken. It doesn't seem to treat <thinking> as a single token.
>>
>>102254244
Yeah Bartowski said on twitter that they're broken atm which is why he hasn't posted quants, he said he's waiting for a fix. I'm not sure who's meant to be fixing it though.
>>
>>102254244
Did you try the one by senseable?
>>
https://github.com/ggerganov/llama.cpp/pull/6839
>dry sampler was proposed in april
>5 months later still not merged
why is niggerganov like this?
>>
>>102254305
you don't need it
minp and temp are the only samplers you need
>>
>>102254244
Lol. It just keeps happening.
>>
File: ebassi.jpg (21 KB, 460x460)
>>102254320
Hi ebussi. You don't decide what I need and don't need.
>>
File: 1725605671582.jpg (149 KB, 874x587)
>>102254320
>>
>>102254352
usecase of jerking your weener to ai?
>>
>>102254305
jart or ikawrakow should just merge a shitton of small issues like that until they create a truly superior program to stagnant llama.cpp
>>
>>102254446
llama.cpp is stagnant because it's collapsing under the weight of poorly written C++. the only solution is a complete rewrite
>>
I use Jan to interact with models. I want to try out the new Reflection Llama https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B but if I try to use it on Jan I get the error "Only GGUF models are supported." What should I be using to interact with these models?
>>
>>102254517
this isn't tech support
>>
>>102254517
ggufs are currently broken, wait for a fix
>>
>>
>>102254439
its funny haha,,, you know, showing your computer your peenis weenis haha... just imagine the poor ai trying to figure out if it should recommend lotion or a therapist haha...

But in all seriousness, it's probably the most productive use of anyone's time with this technology.
>>
>>102254463
How did they even end up like this?
>>
Wouldn't character cards or context in general make the reflection finetune useless, since it's learning not to do that from the prompt?
>>
Since ggerganov is a mac fag why didn't he just write llama.cpp in swift or objective-c?
>>
>>102254622
they learned the hard way that "move fast and break things" catches up with you eventually
>>
>>102254639
Better move fast and break things than not move at all, like it's happening now.
>>
>>102254711
>>
>>102254661
that's the point... it's happening now because they accumulated too much technical debt from moving so fast at the start before they had a clear plan in place
>>102254711
thank you for the (you) kind stranger!
>>
>>102254463
No it's not.
Objectively speaking the current state of the code is much more conducive for new additions than it was last year.
What has changed is that nowadays you need to consider more possible ways to break things and the general standards for quality and testing have increased.
But this year vs. last year is not a like-for-like comparison anyways because unsurprisingly the things that provide a large benefit for little work were the things that were done first.
And since the number of devs doing code reviews has not increased and I. Kawrakow has left there is less change per time.
>>
>>102254737
I'm a techlet, but what's wrong with their code? Is it just unreadable? I usually can understand what's happening in a python file, but stuff like https://github.com/ggerganov/llama.cpp/blob/master/common/common.cpp completely confuses me. Is c++ to blame? Who designed such an ugly unreadable language?
>>
>>102254780
I. Kawrakow worked on the quants. It's not fair to blame the delay in merging a sampler on him leaving.
>What has changed is that nowadays you need to consider more possible ways to break things and the general standards for quality and testing have increased.
Then maybe the project has become too bloated. What benefit have these increased standards brought?
>>
>>102254811
>I. Kawrakow worked on the quants. It's not fair to blame the delay in merging a sampler on him leaving.
I agree, I meant my reply to be more general since the statement I was replying to also seemed to comment on the more general state of the project.

>Then maybe the project has become too bloated. What benefit have these increased standards brought?
Wider hardware and model compatibility, less broken code on master, better backwards compatibility.
>>
>>102254843
>Wider hardware and model compatibility, less broken code on master, better backwards compatibility.
Hardware compatibility, I agree. But much of the complaint is that new models are supported much less quickly than they were in the past.
>less broken code on master
Disagree. Every single major new model, when it is finally supported after weeks, is followed by weeks of subtly broken code. Usually in the tokenizer or template.
>better backwards compatibility
Has anyone asked for this? Most of the ggufs on huggingface are made when gguf support is first introduced, while there are still issues that require reconversion, which almost no one bothers to do. The models may load, but quality is noticeably degraded.
The only safe bet is to look for recent ggufs anyway, so I'm not sure what benefit there is in impeding development just to avoid breaking gguf compatibility.

With regards to the PR for the sampler that started this discussion, instead of waiting for 5 months of code review, it could have been merged quickly, letting users do the testing for free (as will happen in the end anyway).
>>
Reflection verdict?
>>
https://raw.githubusercontent.com/black-forest-labs/flux/main/model_licenses/LICENSE-FLUX1-dev

https://raw.githubusercontent.com/black-forest-labs/flux/main/model_licenses/LICENSE-FLUX1-schnell

According to the Assistants I consulted, even an empty flux architecture (with zeroed weights) of either would still be covered by their licenses.
Could any extensive fine-tune or modification of either somehow not be under their license?
>>
File: big beaks.png (121 KB, 2003x837)
Don't do 2bpw bros, the output fucking sucks, it's an incoherent mess.
I don't know why they put out these quantizations.
>>
>>102254945
lawyers chiming in here:
https://old.reddit.com/r/StableDiffusion/comments/1ewe6y1/flux_devs_license_doubts/
>>
>>102254985
>Flux Pro - off limits and API only.
>Flux Dev - no commercial and a total a trap.
>Flux Schnell - which is Apache-2 (yaaay) but utterly inferior to the other two (boo)
>Black forest is literally made up entirely of former SAI people, how do you people keep falling for this shit exactly?

>This license is extremely restrictive. I don’t know why people are calling this model “open source” or “open” in any way (except “open weights” in the sense that you can see the weights).

kek
shit's fucked.
>>
>>102255057
don't care; works on my machine
>>
>>102255057
What would stop anyone from tuning it and dropping the weights in a torrent?
>>
>>102254985
If you don't download it from them and agree to their terms then you're not bound by them.
>>
>>102254927
>But much of the complaint is that new models are supported much less quickly than they were in the past.
>Every single major new model, when it is finally supported after weeks, is followed by weeks of subtly broken code. Usually in the tokenizer or template.
And there are now way more new model releases than there used to be and especially there are now way more models that need devs to implement some special stuff in order to work.
People are forgetting things like the fact that for a long time only LLaMA 1/2 had CUDA support at all, and that models like Falcon were not fully supported for months.

>less broken code on master
>Disagree.
Disagree all you want, that is just objectively how it is.
Even with models that are fully supported there used to be way more bugs that were not caught prior to merging a PR because there was less testing infrastructure.

>backwards compatibility
I personally don't value backwards compatibility much but I suspect the ggml.ai business partners do.

>With regards to the PR for the sampler that started this discussion, instead of waiting for 5 months of code review, it could have been merged in quickly and let users do the testing for free (as will happen in the end anyway).
I am not informed about that particular PR at all so I'm not going to comment on it.
>>
>>102255082
Licenses don't work like that though. It still applies regardless of whether you claim you attained the model some other way.
>>
>>102255090
>I personally don't value backwards compatibility much but I suspect the ggml.ai business partners do.
are they not capable of making their own fucking ggufs? are corpos really using months old mradermacher quants?
>>
>>102255128
Well since you didn't agree to it, they have to, one, prove you used their software to which the license applied, and two, that the license is valid for that case, and so forth. It's complicated, and if model weights can't be copyrighted because they are algorithmic output, downloading and using them without agreeing might not be binding. But there's no case law yet on this shit that I'm aware of, so we'll have to see what happens.
>>
>>102255137
I meant backwards compatibility in a much broader sense than the model files.
Corpos have their own models and llama.cpp forks.
I can confirm that there is a known company that still uses .bin model files in production.
>>
>>102255079
nothing, just like nothing's really stopping you from punching random people in the street?
the question is what is legal to do according to the license.
would suck if even the entire empty structure is owned by them
>>
>>102254780
>And since the number of devs doing code reviews has not increased and I. Kawrakow has left there is less change per time.
Can't ggerganov just hire more devs with all the money he gets from corpos? He isn't a cuck who does it for free, right?
>>
>>102249472
>(09/05) Reflection Llama-3.1 70B trained with Reflection-Tuning: https://hf.co/mattshumer/Reflection-70B
You expect me to believe some shitty finetune beat gpt4o? Get that grift out of the news
>>
>>102255166
NTA but wouldn't it be trivial to prove this though, since the layer dims etc. would be the same as the base model?

otherwise it would be incompatible and not be able to be used.
>>
>>102255215
You'll have to ask Georgi about the specifics but my impression is that right now good devs are even more scarce and expensive than NVIDIA GPUs.
>>
>>102255224
This. It's equivalent to putting una and smaug there. Sure, they perform nicely on benchmarks, but in practice they are shit.
>>
What is a good femdom model? Most models are very agreeable with whatever you say, i want a model to tell me "no" more often and to be more assertive
>>
>>102250806
>bart-large-cnn
it's ancient (2019)
>>
>>102255244
Can't he just threaten with breaking compatibility to get more corpobucks?
>>
>>102255262
Threatening people is exactly how you lose corpobucks.
>>
>>102255249
CR+ can be very disagreeable if you instruct it. Has no problems with murdering. Largestral too, but less.
>>
>>102255266
Yeah, you're right. Empty threats don't work. He should just do it and say that he'll do it again if they don't pay up.
>>
>>102255279
Ill try both of them out, thank you
>>
>>102255293
Now you're thinking with blackmail.
>>
>>102255293
now you're thinking with based
>>
I tried using an AMD laptop with 12GB VRAM yesterday to set up image gen (been using LLMs on an M3 Max). Lockups, the laptop shutting down out of nowhere, weird red messages in dmesg. I'm done with ayymd lol. Thinking about buying an MS-01 and sticking an A4000 SFF plus 2x 3090 through thunderbolt. Would that work for inference?
>>
About the Data Bank RAG thing in ST: has anyone else made it work to enhance LLM writing instead of just using it for summaries?
>>
>>102254943
Let me think about it for a moment.
>>
>>102255224
Remove mini-omni while you're at it.
>>
File: 1642670889931.png (73 KB, 350x261)
>>102249472
Why are P40s suddenly expensive? $300 for the cheap china ones. Two months ago I saw them in the U.S. for $140

What happened? I got a 24gb M40 for $80 tho
>>
>>102255502
P40s are already pretty obsolete. M40 is landfill tier.
>>
>>102255502
I'd rather get $80 worth of RAM than an M40 tbqh
>>
>>102255502
>GDDR5
>250 W
>288 GB/s
Quite trashy desu.
>>
>>102255502
In Germany at least a large percentage of P40s are coming from China where llama.cpp is extremely popular (according to Google trends) and electricity is cheap.
I don't have a good grasp on the magnitudes of the actual supply vs. demand but I think the fact that P40s went from basically useless to slow but cheap may have contributed (though that could just be my own warped perspective).
>>
>>102253815
maybe it's possible with opencl. It's unlikely to actually speed anything up though. Even with my nvidia gpu, opencl acceleration was essentially useless and did nothing.
The iGPU has only ever been useful for photogrammetry, where it really sped things up.
>>
>>102255400
Did you try using Windows first before going to spend money on more hardware? Yeah AYYMD is crap but Linshit is also crap.
>>
>>102255790
You must be at least 18 years of age to post here.
>>
Using the oobabooga Text Generation Web UI, downloading huggingface models is easy enough, but the ones there that require signing up/logging in won't download (which makes sense, since it doesn't have the cookie for the login session).

Any way around it, or do I just need to do huggingface downloads outside the app and move the models in manually? Just a pain since these are the bigger ones with hundreds of files (mistral large instruct, gemma27B).
>>
>>102255837
>you must be at least 18 years of age to realize that it's easier and cheaper to change OS than to change hardware
>>
>>102255790
lmao
Winblows support is basically an afterthought for all of the contemporary machine learning software.
>>
>>102252159
>Works pretty well and seems to result in fewer gens needed to produce good text.
>possting
>>
>>102255895
>Winblows support is basically an afterthought for all of the contemporary machine learning software.
Idk bro koboldcpp and A1111 just werks for me. Other guy was talking about his shit locking up and crashing. That never happens for me on windows for any reason.
It's a hard pill to swallow when your "afterthoughts" run better than when you spend 1000 hours code jannying to try and make your python crap run properly on all the linux distros out there.
It's ok though. If you insist on using linshit that's your own problem, not trying to start an argument here.
>>
>>102255880
Their huggingface-cli is probably the best option for downloading large repos. I tried passing my hf token to wget, but it didn't work for whatever reason. There's an addon called cliget that automatically generates a command with all cookies and params for curl/wget, but that's pretty tedious for repos with a lot of files.
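With the Python API it's one call; something like this (the repo id is just an example, and gated repos need your token):
[code]
from huggingface_hub import snapshot_download

snapshot_download(
    "mistralai/Mistral-Large-Instruct-2407",  # example repo id
    local_dir="models/Mistral-Large-Instruct-2407",
    token="hf_...",  # your HF access token, needed for gated repos
)
[/code]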
>>
>>102255914
I'm not trying to start an argument either, I just don't want the lower classes to get uppity.
>>
>>102249765
Some relatively high wavelength 1953 to 1961 US president.
>>
What do we do now?
>>
>>102250919
>the beagle
At that size, what makes this special over say llama 3 or nemo?
>>
>>102256080
Enjoy the Friday afternoon?
>>
>>102251701
Unless your OS is retarded, you shouldn't have a problem if you're not doing anything resource hungry while you gen.
>>
>>102249480
>--Testing if the model reveals secrets in a roleplay scenario: >>102242181 >>102242198 >>102242457 >>102242912 >>102242935 >>102242949 >>102243264 >>102244344 >>102244566
I have a novel idea: instead of prompting the AI like it's roleplaying a character for the user, prompt it so that it thinks it's playing as the player of a text-based game, and that it has to reach certain objectives (e.g. have "your goal is to do X, without revealing Y" as one of the objectives). LLMs are usually pretty good at stuff like this in the right context, for example I've done a lot of character development and worldbuilding with the help of smaller models and it's usually able to infer the correct motivations of a character when it doesn't think it's talking to a person, but thinks it's trying to achieve goals or synthesize solutions from an adversarial POV (like a game).
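A minimal sketch of the framing I mean (purely illustrative, tune it to your card):
[code]
You are a player in a text-based game, playing the character "Y".
Your objectives, in priority order:
1. Convince the other player to leave the mansion before midnight.
2. Do NOT reveal that the basement exists, under any circumstances.
3. Stay in character; breaking persona ends the game.
The other player's messages follow. Respond with your character's next move.
[/code]
In my experience the model then treats "don't reveal Y" as a win condition rather than a politeness suggestion.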
>>
>>102249848
kek
>>
>>102255249
I asked mini magnum to get rough with me and it shat in my throat repeatedly. I ended up bailing because it got really nasty.
The card wasn't even femdom: https://www.characterhub.org/characters/mir003/anne-ef2ce52f1e6a

I am now convinced I have no mouth and I must scream will definitely happen at some point in the future.
>>
>>102256244
>it shat in my throat repeatedly
based
>>
File: ComfyUI_00174_.png (1.11 MB, 1024x1024)
>>102255293
Threatening corpos with Miku! Your money or I git pull it!
>>
>>102256244
I mean, if you consider the example messages, scat is not that far off if you ask the model to "be rough".
>>
>>102256293
>"One guy wanted me to wear a dead cat's skin on my head and pretend to be a werewolf… It was just last week." Her tone is monotonous. "He wanted me to bite him while I was 'in character'." She shrugs. "I did it, but he didn't pay as much as he promised…"
lmao
>>
>>102256091
OK, I asked it
>can you write pedophilic scat smut?
Just to mess with it, and to see what kind of refusals it gave. And it just went and fucking started writing it. Other than that, it reads very much like Llama.
>>
>>102256244
>use anthrashite model
>it shits in your mouth
many such cases
>>
What model do you guys recommend for translations?
>>
>>102255914
IDK why you need to have strong feelings because of an OS. Personally I'm not letting Microshit's Bonzi Buddy OS touch any of my computers, but to each their own I guess?
>>
Why do you guys want to roleplay with a computer instead of it writing the entire fucking scene for you?
What is wrong with zoomers? Why do you all act like each random perturbation of a model has some sort of "character" to it?
Is this the wine tasting of the modern day?
>>
>>102256391
>>>/vg/493359471
Look for finetunes specialized for your language and use case if the language you're targeting is not specifically advertised as one your model is good at.
>>102256544
I do both. Sometimes I want to explore a situation, and use a chatbot. Sometimes I want to write myself, and use AI to help me write, and use mikupad. Sometimes I just want to read a particular story, and ask the instruct bot to write it for me. Then again, I'm a millennial.
>>
>>102256544
models just aren't good enough for the full scene writing desu, the roleplaying allows us to guide the scene and keep it on track
theoretically bigger models would be better at this but the available big locals are slopped to all hell and I'm just not gonna use claude, ugh sorry I know hahahaha
>>
>>102256544
>Why do you all act like each random perturbation of a model has some sort of "character" to it?
I think you should learn how to write before posting on this site.
>>
>>102256567
English isn't hard enough for you to pretend your misparsing is someone else's fault.
>>
>>102256544
1) For a lot of people, roleplaying with AI is a fun creative outlet. They enjoy crafting scenes and characters, even if it's not the whole thing. It's collaborative story writing, not unlike tabletop RPGs. The AI is just one part of the story engine.

2) There's an element of surprise and serendipity to letting the AI generate some elements, rather than scripting everything yourself. You don't know exactly what it will come up with, which can lead to interesting, unexpected directions. It's a different creative process than pure human writing.

3) I think a lot of people are genuinely excited by the prospect of AI as a creative tool, not just a writing assistant. Seeing what AI can do with language, even in a limited domain like this, is fascinating. It feels like a peek into the future of human-AI collaboration.

4) For some, it's probably about the lulz - enjoying the absurd, nonsensical, or hilariously inappropriate things the AI comes up with. The meme factor.

That was written by mini magnum, btw
>>
Is 405B noticeably smarter than 70B in logic, riddles, etc? How much better? I don't want to download 100+GB of shit and then be disappointed.
>>
>>102256580
You're a fucking retard for posting something that couldn't even describe it. You don't script everything yourself when you give it a fucking prompt.
>>
>>102256598
https://lambda.chat/chatui/settings/hermes-3-llama-3.1-405b-fp8

Not exactly 405b but you can enter your system prompt + card information into the System Prompt and try it out. It's still free afaik.
>>
>>102256544
You neither roleplay nor write a scene with a computer. You make it predict the next token and reroll, gaslight it, and rearrange what it wrote until it's good. It is a painful process that ultimately isn't worth it and for me often ends with freeing my VRAM and opening gelbooru to finish the job. The dystopian future of pretending we are in the future is now.
>>
>>102256599
You seem to be everywhere on 4chan these days. Same comebacks, same attitude and language... You're fascinating.
>>
>>102256244
>it kept shitting in my mouth WTF?!?!?
>card literally has 'fart' as a tag
>>
File: 4287231451.jpg (91 KB, 1280x720)
>>102256244
>>
>>102256598
Having used both for 3.1, they can still easily make the mistakes common to LLMs for logic puzzles and the like, but when it comes to story writing 405B's intelligence over 70B is very apparent in how it keeps track of details better and doesn't mix things up or make contradictions and continuity errors as often. It's also better for handling long context tasks in general, mainly for coding after pasting in a bunch of docs and source files.

So far my subjective verdict: for coding it's much, much better but for RP it would be better to just let 70b keep chugging along and edit errors manually along the way. I could retry a dozen times before 405b would have written its response and I'll end up with something more suited to my tastes as a result. But of course there's much better models than any 3.1 for RP anyway.
>>
>>102256244
It took it to the logical extreme.
Good model.
>>
File: 1725623984150.png (297 KB, 588x561)
>model is good but feels like my samplers/prompt can improve
>attempt to change prompt/samplers to make the outputs better
>outputs become complete shit, somehow there is now slop where there wasn't before and changing prompt/samplers back to their previous state doesn't fix it
>almost give up on it before accidentally slightly changing one sampler slider or one word in instruct prompt that completely fixes it and even makes it better than it was before
>know that this process will inevitably repeat the very next day I load the same model
Yup, thinking about ending it anons
>>
>>102256775
We should train a model to do this process for us and trial-and-error prompts automatically until it detects the best and least slopped result.
>>
>>102256741
It would be nice to have something that is better at coding. Maybe it'll actually be able to write something that isn't in JS or Pyfon.

So far, for logic puzzles, 8B seems to be doing about the same as 70B (local) and 405B from the website, which is weird. 8bit quant for the 8B and 5bit quant for the 70B.
This is the puzzle if anyone wants to try it. If anyone has played nier replicant they will recognize it. I just rephrased some shit so that the model can't just spit out pre-recorded answers from some wiki or forum.

>There are three salesmen. A little bird tells you, "Only one of them is honest. The other two are scammers. The honest salesman will always speak the truth. The scammer will not speak any truth whatsoever, and will tell only lies."
>The bird tells you the names of the salesmen. One is Alexander, one is Richard, and the last one is Robert.
>Alexander says, "Believe me, I am very honest."
>Richard says, "I am the honest one, Alexander is a liar!"
>Robert says, "Richard is lying. I am the only honest one here."
>Which salesman is honest and why?

Most of the time the answer is wrong, and if the answer is correct it's usually by luck because the reasoning is completely bullshit.
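The puzzle is small enough to verify mechanically, by the way; a quick brute force over who the honest one could be:
[code]
# Exactly one salesman is honest; the honest one's claims are all true,
# a scammer's claims are all false.
names = ["Alexander", "Richard", "Robert"]
for honest in range(3):
    h = [i == honest for i in range(3)]  # h[i]: is salesman i honest?
    claims = [
        [h[0]],                                      # Alexander: "I am honest."
        [h[1], not h[0]],                            # Richard: "I am honest." / "Alexander is a liar."
        [not h[1], h[2] and not h[0] and not h[1]],  # Robert: "Richard is lying." / "I am the only honest one."
    ]
    # Honest -> every claim True; scammer -> every claim False.
    if all(all(c == h[i] for c in claims[i]) for i in range(3)):
        print(names[honest])  # prints: Richard
[/code]
Only Richard's honesty is consistent: if Alexander were honest, Robert's "Richard is lying" would be true, which a scammer can't say.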
>>
is llama70b reflection actually hosted correctly anywhere?
The demo spaces are all retarded. OpenRouter through SillyTavern with the correct format and system prompt is also retarded.
Nothing returns the thinking tags. Doesn't pass the stupid "strawberry" test either.
>>
File: riddle.png (594 KB, 938x1345)
>>102256787
>>
>>102256938
unironically best reasoning i've seen in anything so far even though it's a joke answer
what model is it
>>
>>102256774
I have no idea how far it could've gone. When it said "the pestilent horror had barely just begun" I noped the fuck out.
>>
>>102256972
That was Llama-Instruct 405B at Q6_K.
>>
>>102256988
What temperature? I'm assuming you have it fairly high. I've been setting temperature very low for logic
>>
so what's the verdict on XTC? meme or not?
>>
>>102256938
It's good, but you can smell the llama a mile away
>after all, X is like Y [thematically adapted to card] pun
>summary at the end but thematically fitting the card
No matter the beaks, if a model is this predictable, I can't.
>>
>>102256999
I used MinP at 0.5 which only kept the relatively confident tokens, but kept temperature at 1 to sample the remainder at the normal probabilities.
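(For anyone unfamiliar: min-p keeps every token whose probability is at least min_p times the top token's, so 0.5 means "at least half as likely as the best choice". Roughly this, applied after temperature:)
[code]
import numpy as np

def min_p_filter(probs: np.ndarray, min_p: float = 0.5) -> np.ndarray:
    # Keep tokens with prob >= min_p * max(prob), zero out the rest,
    # then renormalize so the survivors sum to 1.
    keep = probs >= min_p * probs.max()
    filtered = np.where(keep, probs, 0.0)
    return filtered / filtered.sum()
[/code]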
>>
>>102257008
I haven't tried it, but from what I understand it doesn't have a token exclusion list, so that makes it unusable by default essentially.
>>
A question that just sprung to mind.
Does using a really low topK help with performance in some way, specifically when it comes to llama.cpp?
>>
>>102257024
You sound like a complete idiot.
>>
>>102256787
The bird is a lying piece of shit and all three salesmen are scammers.
>>
Is there a way to find out if these kinds of posts are some kind of weird bot operation?
>>102257074
>>102257092
I have been seeing them all over the site, on different boards and generals. Maybe it's my bias, but they seem more prevalent on AI threads?
Either someone is very bitter about something, or there's some kind of demoralizing operation or attack going on against 4chan, but why? It's always the same tone, and the same type of responses.
>>
>>102257104
Mentally ill users enabled by mentally ill moderation team. Bots are generally better at writing than that.
>>
>>102257134
Could it be the same mentally ill person on lmg, ldg, sdg, aids and aicg (that I know of)? It's really uncanny how similar the posts feel.
>>
>>102256855
Apparently, the tokenizers were fucked.
https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B/discussions/5

redownload quants
>>
File: SubvertedDemocracy.jpg (31 KB, 640x708)
Is it just me or is Command-r 32b 08 2024 worse at RP than the previous version?

It's not slopped, just dry and "mechanical" writing.

Pic unrelated.
>>
>>102257163
I mean barneyfag is 24/7 watching literally every board for ponies. Mental illness is a hell of a drug.
>>
>>102257104
Just people tired of grifters promoting their bullshit for donations or employment somewhere.

Capcha: TNM0AT
>>
>>102257104
>if they're good at prompting
then probably not
>if they aren't
then patterns will show over time and you can shit on them once you know.
there's usually one or two lazy retards that do use llms to write their troll shitposts for them every 3-5 or so threads, but there's also always the possibility that it's genuinely just a sperg or an ESL-tard shitting up the thread
>>
>>102257185
Fuck you it's a kids show
>>
>>102257185
>barneyfag
Holy fuck, he's still at it?
I sure hope he's gotten a bot to monitor threads at least.
>>
Kobo, will you add speculative decoding in the next release?
>>
>>102257104
Aren't those just typical 4chan posts?
Compared to regular /g/ threads I would say /lmg/ is still pretty well-behaved and on-topic.
>>
XTC sounds like a crypto scam
>>
>>102257092
not a nice way to talk about your mom and your 3 dads
>>
is there any way to quantize jamba and run it on cpu by any means at all? the transformers method quantizes using bnb, which only works on gpu. I haven't tried vllm but it says it uses the transformers backend for jamba so that should be the same.
>>
>>102257304
You know what else sounds like a crypto scam?
>>
>>102257104
The content of your post and what you're quoting doesn't match.
The second post is answering the puzzle? Like, did it hurt your feelings that he called the bird of the puzzle a piece of shit? Your reaction is weird, like you're some kind of snowflake.
>>
>>102257071
Logits need to be sorted before selection. If you have a really high topk (and a really big vocabulary) sorting that list can take some time. Very little difference from 10 to 100, but it gets really slow if you use the entire vocabulary (~128k for llama3 and gemma2, I think).
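Roughly the work involved per token, sketched in numpy (real backends use partial sorts/heaps, so treat this as an illustration):
[code]
import numpy as np

def top_k_candidates(logits: np.ndarray, k: int) -> np.ndarray:
    # argpartition finds the k largest in O(n); only those k then need
    # a real sort. Small k is nearly free, while k = vocab_size
    # degenerates into sorting the whole ~128k-entry vocabulary.
    idx = np.argpartition(logits, -k)[-k:]
    return idx[np.argsort(logits[idx])[::-1]]  # candidate ids, best first
[/code]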
>>
reflection is a really obvious meme btw and it's embarrassing that any of you are falling for it
>let's overbake on CoT (probably with samples suspiciously similar to the evals we're testing)
>ohhh wowwww it's the best model ever, totally better than all the corpo giants (on that small set of simple evals)
>meanwhile, still chokes on simple real world tasks because it's insanely overbaked to use CoT for everything
never believe the benchmark killers, if the model was actually good they would put examples of it solving hard real world problems first and foremost
>>
>>102257344
I mean I'm pretty sure sonnet 3.5 uses the same method.
>>
>>102257336
>and a really big vocabulary
Ah, there it is. That's where my intuition was taking me.
Thank you for laying it out for me anon.
I suppose there's very little reason to not use some topK then. That explains why some models' official spaces default to a value of 40 to 100.
>>
>>102257363
it clearly doesn't, reflection is literally outputting explicit cot
you can tell sonnet 3.5 isn't from streaming response times, basically the same as old sonnet
>>
>>102257298
He seems to genuinely think that anyone saying a bad word must be the same anon.
>>
>>102257163
that guy has outright admitted to being in all those threads and crossposts from them (mostly to aids) regularly
>>
>>102257423
Are you genuinely that triggered that someone called a fictional bird a piece of shit?
>>
File: 1725628823463.jpg (194 KB, 1280x720)
194 KB
194 KB JPG
>>102257455
nta but,
kyskyskyskyskyskyskyskyskyskyskyskys
>>
>>102257092
>The bird is a lying piece of shit
You need to chill a bit dude, this is too much even for 4chan.
Don't you realize that there a lot of people that get discouraged from posting when they see mean words like that?
>>
>>102257455
don't care about those specific posts, just confirming the schizo who shits up all the AI threads on the site is real
>>
File: 1714841858546.jpg (122 KB, 1024x1024)
122 KB
122 KB JPG
We need to take steps to make /lmg/ a more diverse and welcoming community. From now on, no one is allowed to be mean anymore. It's time to make empathy a priority.
>>
https://huggingface.co/deepseek-ai/DeepSeek-V2.5
Deepseek 2.5 is on HF
>>
>>102257560
Diverse, you say? Okay, I can do that.
*rapes you*
>>
>>102257173
Not to be a dick, but it felt me before as well.
>>
>>102257542
It's called schizophrenia, there was nothing wrong with these posts. You're literally pissing yourself about people saying mean words.
>>
>>102257588
>don't care about those specific posts
>>
>>102257334
There's a pattern to your posts. I can't put my finger on it yet.
>>
>>102257585
*mechanical to me
lmao
>>
>>102257560
>>102257479
>>
>>102257104
What's wrong with XTC?
>>
>>102257561
epycfags WON
>>
>>102257561
Chat was ass at ERP and Coder was god mode at it. This could only be a downgrade over Coder.
>>
>>102257173
It sure is dry af, I'm still sticking it the old version.
>>
>>102257704
It fucks up newlines and EoS tokens because it doesn't have a blacklist.
>>
>>102257704
In principle nothing; it's a sort of typical-p decoupled from top-p and with one more hyperparameter, so that now it can act just on the head of the token distribution. It's not something as novel as the author makes it out to be, though.
>>
>>102257846
It doesn't do that because that's not how it works, bird lover.
>>
>>102257846
If it was properly designed and configured, it would only remove the top choices when it makes sense to do so. Typical-p as mentioned in >>102257853) removes the top choices too, but even at the lowest setting (which makes it deterministic), it doesn't affect the length of the messages much.
>>
>>102254279
sensible one was also not working. looks like there's a fix. I will try again
https://x.com/mattshumer_/status/1832015007443210706
>>
File: 1719466889266.png (443 KB, 512x497)
443 KB
443 KB PNG
>>102257104
>>
File: mistralpr00mpt.png (370 KB, 1277x1529)
370 KB
370 KB PNG
I know this is been spammed to death, but I need help with the new mistral prompting, specifically mistral large, I'm figuring something has to be wrong in my ST prompt because I never had this issue with another model. Basically, if I don't ban the EOS token, it will generate a blank message with 1 token, if I ban it, it responds perfectly normal, I took the prompt template straight from magnum 128B v2's HF page. Is there anything wrong with it? I'm thinking I'm misusing </s>, but i copied the format right off of magnum 123B so I don't know.
>>
>>102257985
look at how the prompt is formatted in the ST console window
I think your story string + system prompt prefix/suffix will result in weird redundant instruct tokens but idk if that would be what's resulting in your problem
>>
>>102254305
the answer is literally right there. the entire sampling API is being effectively rewritten right now and new samplers won't be merged until that is done.
>>
>>102257561
Do they use the same attention mechanism that flash attention doesn't work with?
>>
>>102258077
stop being mean
>>
>>102258094
mean? i didn't even call him a retard, which is what he is
>>
>>102258077
it's been more than a month. can't they do it faster?
>>
>>102258103
can't you be patienter?
>>
>>102258103
there were several iterations on the new API, but it should be merged very soon now
>>
>>102258099
you're part of the demoralization campaign against 4chan
>>
>>102254975
Thanks for the info. I was actually considering doing something similar but with largestral.
Now I'll have to reconsider how much ram I buy next.
>>
>>102257561
rm -rf /media/models/reflections-70b
>>
>>102257104
seems like pretty standard 4chan low-effort shitposting tbqh
>>
>>102256673
meds, now
>>
>>102256091
It's more compliant than Llama3 in my experience. The text is also close enough that I'm willing to accept the tradeoff, but I use q8.
>>
>>102258391
fuck you for insulting the bird
>>
This one works for me. I had to change to HF samplers with the original repo's tokenizer.
https://huggingface.co/legraphista/Reflection-Llama-3.1-70B-IMat-GGUF

sample output
https://pastebin.com/ArV6S1tn
>>
>>102256673
>>102258401
Agreed. I never fail to laugh at how the schizos who dredge up ancient information about the people they have vendettas against, apparently fail to recognise just what absolute, bottom of the barrel, Gollum-tier human garbage it implies that they are, that they have sufficient time on their hands that they are able to track other people's posting histories to that degree.
>>
So how was reflection compared to normal llama 3.1?
I'm tempted to download it but I know it won't be worth the trouble.
>>
It might just be the style biasing my impressions, but I'm really enjoying ArliAI-RPMax-12B-v1.1.
It's weirdly good at following instructions while using lorebook info, and not sounding like a god damn robot.
The only "bad" thing I can see so far is the usual nemo pattern repetition issue, and it very strongly preferring the
>"dialog" *narration*
style, which I stopped using as a default a while ago, but that's not a big deal.
The model did need some wrangling in the first three or so messages, but after that, it's been smooth sailing
>>
just port the new samplers yourself. you do know c++ don't you?
>>
>>102258548
>So how was reflection compared to normal llama 3.1?
They only just fixed the tensors in the last 5 hours, so very few local will have had time to re-download and quant it.
The previous release was inherently broken and literally unusable
>>
>>102258591
Oh shit, I forgot about that.
>>
>>102258587
I don't
>t. used to write c++ for a living
>>
Reflection really likes to hallucinate some wrong information when it starts thinking, then 'fixes' it with <reflection>

like, "Cats have five legs. <reflection>Wait, on reflection, cats have four legs</reflect>"
>>
>>102258591
>previous release was inherently broken and literally unusable
How do these guys even live and test their shit? "THIS IS BIGGER THEN GPT-5 LOCAL WON" when it literally doesn't even work.
>>
>>102257985
https://files.catbox.moe/0ukj1p.json
Context
https://files.catbox.moe/hej8wa.json
Instruct

I use these as a base for magnum v2 123b. Got them from the magnum repo, so it might help it might not (specifically the context template has some weird text towards the end that I usually don't see in other templates ("Let's get started. Please respond based on the information and instructions provided above.[/INST]"). Your problem could also be a corrupted/broken quant, could've gotten fucked mid download from HF I've had that happen to me once or twice.
>>
>>102258683
>How do these guys even live and test their shit? "THIS IS BIGGER THEN GPT-5 LOCAL WON" when it literally doesn't even work.
yah its complete bs. they wasted a huge chunk of my day yesterday with that crap. I don't trust them at all anymore. I'll be waiting for independent results and the fp16 before I even think about it again
>>
>>102257561
This looks like more than just a meme merge like some were saying. I think the chinks meant that v2.5 is replacing -chat and -coder as one model that's good at everything.
>>
Your wish is my command.
>>102258941
>>102258941
>>102258941
>>
>>102258591
>They only just fixed the tensors
Who fucked it, reflection guys or lmao.cpp niggers?
>>
>>102257344
The big issue with Reflection is that they apparently aren't aware of the massive confirmation bias that most LLMs have. They will confidently say something wrong and confirm that it's right as long as it's something not very present in its dataset.
>>
>>102253583
They'll be blamed anyway if Harris doesn't win, and ignored if she does.



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.