/g/ - Technology






File: collagel.jpg (3.18 MB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102216905

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/c/kdg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/u/udg
>>>/tg/slop
>>>/trash/sdg
>>
File: 2024-09-04_00020_.png (1.88 MB, 1280x720)
>>102221118
ty baker
>>
You're not getting "early Pixar movie" quality games without rendering at 4k or higher resolution with ray tracing, the very thing you bemoan.
>>
File: abomination.png (2.07 MB, 1024x1024)
>>102221118
>Pixart Sigma
Why is it so bad?
>>
>>102221205
Undertrained. But you can extrapolate how good it would be if it weren't.
>>
>>102221192
>Pixar Renderman
>Historically, RenderMan used the Reyes algorithm to render images with added support for advanced effects such as ray tracing and global illumination. Support for Reyes rendering and the RenderMan Shading Language were removed from RenderMan in 2016.[6]
>RenderMan currently uses Monte Carlo path tracing to generate images.
>>
>>102221218
your point?
>>
>>102221118
>https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
broken.
>>
File: ComfyUI_03012_.png (1.04 MB, 1024x768)
>>
File: ComfyUI_03011_.png (1.03 MB, 1024x768)
>>
>>102221118
Nice coinslot
>>
>>102221241
>>102221255
looks like the turkish shill kek
>>
File: 00105-1329315580.png (1.49 MB, 888x1152)
>>
File: 2024-09-04_00034_.png (1.38 MB, 1280x720)
>>
Has anyone here messed around with the beta scheduler settings (alpha and beta) on different styles? On forge they're both set to 0.6 by default
>>
File: 2024-09-04_00041_.jpg (1.29 MB, 3840x2160)
>>
File: 00019-1465427422.png (1.65 MB, 1104x1440)
>>102221381
i did with 1.5, but not forge.
>>
File: 2024-09-04_00037_.jpg (914 KB, 3840x2160)
>>
File: image_0.jpg (136 KB, 1080x2160)
>joycaption pic rel with llama 8b gguf
>it knows the character, Naruto, Uchiha
pretty good desu

>This is a digital illustration featuring a close-up of Itachi Uchiha, a character from the manga and anime series "Naruto." He is portrayed with a serious, almost menacing expression, his eyes half-closed, and his mouth slightly open, revealing a small amount of blood on his teeth. His hair is dark blue, slightly disheveled, and he wears a black headband with a white emblem, typical of the Uchiha clan. His skin is fair, with a few visible scars and scratches, adding to his rugged appearance. He is wearing a black sleeveless shirt, with a simple chain necklace around his neck. The background is stark white, ensuring that the focus remains entirely on Itachi. The style of the illustration is semi-realistic, with sharp lines and a detailed shading technique that accentuates the textures of his clothing and skin. The lighting is harsh, casting sharp shadows that emphasize the contours of his face and body. The overall mood is intense, reflecting the character's dark and brooding personality.
>>
>>102221566
absolute slop
>>
File: ComfyUI_00125_.png (533 KB, 832x624)
How do I get the "finger gun" gesture? If I do "finger gun" it draws her with gun barrels as fingers.
>>
>>102221624
try setting up joycaption or florence to caption an image of someone doing it and copy that.
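something like this for florence (minimal sketch; the model id and task tag are from the Florence-2 model card, the image path is just a placeholder):
[code]
# Minimal Florence-2 captioning sketch. Swap the task tag for <CAPTION> or
# <DETAILED_CAPTION> for shorter outputs; the image path is a placeholder.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-large"
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, trust_remote_code=True
).to("cuda")
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("finger_gun_ref.png").convert("RGB")
task = "<MORE_DETAILED_CAPTION>"
inputs = processor(text=task, images=image, return_tensors="pt").to("cuda", torch.float16)

ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=256,
)
print(processor.batch_decode(ids, skip_special_tokens=True)[0])
[/code]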
>>
friend of mine found out how to run flux and keeps sending me slop, how do i make it stop without crushing him
>>
>>102221118
>upper left
'cado on the 'log kek
>>
File: 2024-09-04_00048_.png (1.47 MB, 1280x720)
>>
>>102221566
kek and now he's married to tsunade apparently (who isn't in the picture, btw)
>The second individual, who is not visible in the entire image, is believed to be the man's wife, Tsunade, whose face is partially submerged in the water, mirroring Itachi's position. She is shown with long, dark hair and a serene expression, her face reflecting the serene, calm surroundings.

a...a for effort, joycaption...
>>
>>102221152
palette cleanser.
>>
File: 00109-3593613647.png (1.22 MB, 688x1008)
>>
File: ComfyUI_01662_.png (867 KB, 960x1280)
>>
File: 00110-1738613588.png (950 KB, 688x1008)
>>
File: 1720532618849758.png (1.12 MB, 1024x952)
>>
File: ComfyUI_01665_.png (3.21 MB, 1440x1920)
>>
File: ComfyUI_03046_.png (910 KB, 1024x1024)
>>
>>102221659
grit your teeth, that's what friends are for
>>
File: 00112-4176825515.png (661 KB, 616x808)
>>
File: 1709021105766225.png (1.16 MB, 1080x1040)
is this good
https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14
>>
>>102222108
not really
>>
File: FLUX-D-040924-0068.png (1.27 MB, 1024x1024)
>>
File: 1703608430156.jpg (197 KB, 1024x1024)
>>
File: 1697143849200.jpg (1.08 MB, 1024x1024)
>>
File: 2024-09-04_00059_.jpg (1007 KB, 3840x2160)
>>102222163
funny
>>
File: 00114-4043077342.png (585 KB, 616x808)
funny indeed
>>
>>102222204
Stop posting pictures of Jews
>>
>>102222222
>>
File: FLUX-D-040924-0071.png (1.09 MB, 1024x1024)
>>
File: 00004-6183653.png (1.82 MB, 1024x1536)
>>
File: fs_0320.jpg (33 KB, 640x640)
>>
File: FLUX-D-040924-0069.png (1.06 MB, 1024x1024)
>>102222571
I have seen women with tattoos of similar quality
>>
File: 2024-09-04_00074_.jpg (677 KB, 3840x2160)
>>
stable diffusion is just a model. A model is just a pile of learned weights that try to predict stuff, produced by training algorithms on a dataset.

So, can I just make my own base model and train it myself? I'm not talking LoRAs, I'm talking about making it from 0. Stable diffusion is trained on millions of images, but wouldn't a smaller, better-curated dataset be better for what I want?
>>
>>102222692
if you have the tens of thousands of dollars of compute time, sure
>>
>>102222692
Yes, you can train a model from scratch; Pixart Sigma is the easiest model to do this with. You need a minimum of two 4090s, though ideally you'd want two A6000s. You need a minimum of like 500k extremely diverse images. With modest hardware you can expect to spend months of non-stop training.
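To make "training from scratch" concrete, this is roughly the single step that gets repeated millions of times. Toy sketch only (diffusers, a tiny unconditional 64px UNet, no text encoder); a real text-to-image run adds text conditioning, latents, EMA, and weeks of multi-GPU time:
[code]
# Toy from-scratch diffusion training step (unconditional, pixel space).
import torch
import torch.nn.functional as F
from diffusers import UNet2DModel, DDPMScheduler

model = UNet2DModel(sample_size=64, in_channels=3, out_channels=3).cuda()
scheduler = DDPMScheduler(num_train_timesteps=1000)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

def train_step(images):  # images: (B, 3, 64, 64) tensor scaled to [-1, 1]
    noise = torch.randn_like(images)
    t = torch.randint(0, 1000, (images.shape[0],), device=images.device)
    noisy = scheduler.add_noise(images, noise, t)
    pred = model(noisy, t).sample  # the UNet predicts the added noise
    loss = F.mse_loss(pred, noise)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
[/code]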
>>
>>102222736
So the threshold is the computing power, not the dataset or the algorithms themselves?
>>
File: 2024-09-04_00076_.jpg (1.4 MB, 3840x2160)
>>102222692
you can.. and many people have, see
>https://huggingface.co/models?pipeline_tag=text-to-image&sort=trending
but the less data you feed into your model, the less it knows. That might be exactly what you want: a fashion designer could feed all his creations into a model and have it come up with new ones. But most competent models aim to be as general purpose as possible, and for that you need millions of pictures and billions of neural network parameters.

Also, a model is not an algorithm, it's a representation of a neural network, like the one in our own brain (just simpler).. the algorithm is the inference code that asks the network to do things based on its learned "knowledge"
>>
>>102222758
you also need a good dataset and the manpower/time/money to curate it
>>
>>102222758
the computing power, plus a dataset of images captioned with useful descriptions, so that when you later ask the model for images it actually knows what you're asking for. Otherwise it's like a random dice throw and it will produce garbled content
>>
Realistically, he needs to learn how to bake a lora lmao
>>
>>102222756
My logic was that for what I want, I don't need everything models like stable diffusion have; they are overkill. So maybe with a smaller dataset and less compute I get something smaller and less potent, but that works better for the small, specific thing I need. But I dunno how big the computational difference is. I do have a 4090 and can get the images I need, I just need to know what I have to do.
>>
>>102222778
>>102222766
>>102222758
I would argue the single most important thing is the data set.
You can have 1 billion images with dog shit captions and you will get a dog shit model
You can have 1 million images with excellent captioning and get an excellent model
Getting a non clit eastwood dataset is functionally impossible.
>>
>>102222785
yes, that would be the first step to making a real model, after all a LoRA is a mini model
>>
I wish all the shitty avatarfags would spam /h/ instead, even their schizoshit would get a bunch of attention there. It's good for them and for us, they'll even get a fanbase 1girling in hdg
>>
File: 2024-09-04_00081_.jpg (1.22 MB, 3840x2160)
>>
>>102222795
It depends on your final image goals. A basic art model could easily be trained on top of Pixart Sigma, and you save a lot of the pretraining legwork. A lot of the compute time is teaching the diffusion model and T5 how to make pretty colors. A single 4090 isn't going to cut it unless you want your computer running 24/7 for a year.
>>
>>102222798
I think something like LAION 1B trained on a modern architecture would be a wild ride. The problem is you waste a lot of compute on erroneous captioning.
>>
>>102222798
reminds me that BFL is fucking radio silent about their data set, even BFL anon that visited us here dodged every question about it
>>
>>102222815
agreed
>>
>>102222825
>>102222785
>>102222778
>>102222766
>>102222761
The thing is that the models always have that "AI" look, I see that same pattern posted here all the time. Maybe it is just people not cooking enough.
I've been messing around with LoRAs and trying to replicate artstyles and characters, but they always look off, and I never see good examples.
Maybe I have to choose better parameters or have a better base model to fine-tune, maybe it is just that.
Do you have good results?
>>
>>102222795
If you need to ask it's not for you
>>
File: 00119-23355489.png (594 KB, 616x808)
>>
>anons are falling for it
>>
>>102222868
Models have the AI look because they're all trained on incestuous, disgusting datasets. Not to toot Pixart's horn again, but it's also the model with the least AI look to it. Flux is overtrained on butt chin bokeh pictures. SDXL is overtrained on stock photography.
>>
File: 2024-09-04_00018_.png (883 KB, 1280x720)
>>102222868
the lora I made for flux.dev came out pretty good and can reproduce the style I trained it on 9 out of 10 times, but it's a pretty simple anime style so it might vary for other target styles .. and the AI look .. well it gets better with every model released
>>
>>102222848
>>102222798
So it's not just having a good dataset, it's also necessary to have good metadata, like saying these images are humans, these are props, these are backgrounds: metadata that helps the training understand the images themselves, things like that?
>>102222873
>If you aren't born learned, fuck off!
You fuck off
>>102222896
So all models are shit by default? What is the best one for anime images/art/non-realistic art in general? Can't they make good datasets?
I remember stable diffusion 2 being shit because they removed copyrighted images or something like that.
>>102222909
That is exactly the AI look I want to avoid, I see it everywhere.
>>
>>102222909
>That is exactly the AI look I want to avoid, I see it everywhere.
then AI image generation maybe ain't for you.. pick up a pencil and learn drawing maybe?
>>
File: ComfyUI_00057_.png (2.27 MB, 2560x2560)
>>102222939
if you want something to not "look ai" you need a clear art style and focus on what you want, otherwise the generic stuff fills in the gaps
>>
this for >>102222957
>>102222939
>>
>>102222939
>So all models are shit by default
now you're getting it, it would take a truly colossal amount of effort to make an actually good dataset, which nobody wants to pay for
>>
>>102222978
>which nobody wants to pay for
also there are legal and political issues with a perfect data set
>>
>>102222993
a problem that can be solved with vast amounts of money but yea
>>
File: ComfyUI_00003_.png (1 MB, 1024x1024)
Have a cute picture since I finally got this working on my computer. Is there any way to give a logo or symbol as a reference to Flux through ComfyUI?
>>
>>102222967
That's pretty good anon, good job.
>>102222978
Aren't /g/autists up to the task?
>>102222957
I can always polish the image at the end, but I want to get as close as possible.

Are LoRAs able to mask the base model's flaws?
>>
>>102223004
>pls halp Elon ..
idk you also need some legal immunity
>>
>>102222939
Nobody is born learned, some are capable of learning, you are not. Stop biting off more than you can chew.
>>
>>102223024
>Are LoRAs able to mask up the base models flaws?
if you finetune em long enough, yes .. but often they then get one-sided.. a good example is SDXL, which was pretty meh at the beginning, and seriously bad at porn.. then this pony autist hammered in the most degenerate porn capabilities.. but now the pony version of SDXL can only do porn well, everything else it only does meh
>>
>>102223024
crowdsourcing is the only practical way to do it but the main problem is that the average coomer has both absolutely abysmal taste and none of the knowledge required to properly caption an image

someday it will exist but not soon
>>
>>102222939
Images are captioned with a paragraph or more of text, and these captions help the model learn relational context. These models end up being shit because 1) they remove all the copyrighted/problematic content and 2) they overtrain on whizbang wahoo Reddit shit. So most models end up with highly airbrushed one girl portraits and digital art trending on Artstation Twitter bait. Flux also barely focused on art stuff, which really makes the model suffer. At the bare minimum you should properly annotate public domain art, which on its own is hundreds of thousands of extremely high quality, actually well made images.
>>
https://huggingface.co/datasets/bigdata-pw/Artsy
1.5m art set is up, 2.5tb webdataset, high res, variety of mediums
dataset viewer isn't available yet for some reason, just hugging face things
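if anyone wants to stream it instead of pulling all 2.5tb first, the webdataset library reads the tar shards over http. rough sketch, the shard filename pattern below is made up, check the repo for the real names:
[code]
# Sketch of streaming a tar-sharded dataset with the webdataset library.
# The shard URL pattern is illustrative; check the dataset repo for real names.
import webdataset as wds

url = "https://huggingface.co/datasets/bigdata-pw/Artsy/resolve/main/{00000..00099}.tar"
ds = (
    wds.WebDataset(url)
    .decode("pil")                # decode image entries with Pillow
    .to_tuple("jpg;png", "json")  # yield (image, metadata) pairs
)

for image, meta in ds:
    print(meta.get("title"), meta.get("artist"), meta.get("medium"))
    break
[/code]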
>>
>>102223057
yea sadly.. photography genned with flux looks like a sad instagram feed
>>
>>102223093
on another note, most SDXL photography gens look like a pornstar wiki entry
>>
>>102223088
are they tagged?
>>
>>102222939
While it is unlikely that you can train one of these models yet, you should prepare for the future. Download all the images you can before they make scraping impossible. Download everything. It's crazy to think how much hardware and training have evolved since SD 1.4.
>>
File: ComfyUI_01675_.jpg (1.68 MB, 1792x2304)
>>
>>102223110
>Entries include image, title, artist, date and medium
if it's not in the source it's not in the set. would you like to contribute to a crowdsourced tagging initiative?
>>
File: file.png (211 KB, 1532x352)
>>102223157
the GPUs already go burr
>>
File: FLUX-D-040924-0082.png (1.68 MB, 832x1216)
>>
>>102223179
desu it's a small enough set that i can get them florence captioned for a few dollars. this may be of interest to you
https://github.com/bigdata-pw/florence-tool
>>
File: file.png (117 KB, 1130x678)
>>102223196
I bulk process everything and have it put in a database
>>
>>102223206
cool, still it's probably more efficient than your script
>>
>>102223138
WDYM scraping will be impossible
>>
>>102223222
I doubt it's faster than a while loop
>>
>>102223228
i doubt yours is fast at all
>>
File: 1722286744169.jpg (609 KB, 1024x1024)
>>
>>102223251
Thanks for the datasets, I downloaded them. But I really don't give a fuck because I doubt your thing does multiple images per second.
>>
>>102223259
you're welcome. of course it does multiple per second, i wouldn't recommend it otherwise
>>
>>102223225
Everyone knows all the AI models are scraping the internet, which means data = money. More and more sites are going to go Fort Knox with their images, limiting how many you can download per day.
>>
>>102223284
Are you telling me it will do multiple long captions per second? What's the magic? The 3090 is currently doing 1 per 800ms give or take.
>>
File: 1711444719248.jpg (591 KB, 1024x1024)
>>
>>102223294
the code is right there, take a look
DETAILED_CAPTION or MORE_DETAILED_CAPTION? i'd have to check for those, but i've run the CAPTION task @ 60k per hour on an A40
>>
File: ComfyUI_01679_.jpg (828 KB, 1440x1920)
>>
File: 00001-4050492822.jpg (1.78 MB, 3584x4608)
Upscale fail, zoom in for all the hidden people.
>>
File: 1721233344755.jpg (455 KB, 1024x1024)
>>
>>
File: FLUX-D-040924-0052.png (1.11 MB, 1024x1024)
>>
File: ComfyUI_01683_.jpg (911 KB, 1440x1920)
>>
As much as I hate to admit it, relying on bucketing alone to train LoRAs is probably a bad idea. Curated square-shaped images are better.
>>
>>102223655
I have always gotten better results using bucketing
>>
>>102223506
Those are all the spirits of their ancestors watching them masturbate
>>
File: 1712549484553.jpg (742 KB, 1024x1024)
>>
>>102223284
HLKY
>>
>>102223655

How does Kohya_SS handle bucketing? It seems to preserve details well even with bucketing.
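my understanding of the gist: every image gets snapped to the nearest aspect-ratio bucket of roughly equal pixel area, so a whole batch shares one shape and nothing gets square-cropped. rough sketch of what I mean (the bucket list here is illustrative, not kohya's exact table):
[code]
# Assign an image to the nearest aspect-ratio bucket (illustrative bucket list).
def nearest_bucket(w, h, buckets=((1024, 1024), (832, 1216), (1216, 832),
                                  (896, 1152), (1152, 896))):
    ar = w / h
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ar))

print(nearest_bucket(1080, 1920))  # -> (832, 1216), a portrait bucket
[/code]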
>>
File: Untitled-2.png (850 KB, 1024x1024)
Putting four 512x512 images into a 2x2 grid to make one 1024x1024 image and improve learning during LoRA training. Based or cringe?
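for reference, the tiling I'm doing is just this (paths are placeholders):
[code]
# Tile four 512px images into one 1024x1024 training image.
from PIL import Image

def make_grid(paths, tile=512):
    grid = Image.new("RGB", (tile * 2, tile * 2))
    for i, path in enumerate(paths[:4]):
        img = Image.open(path).convert("RGB").resize((tile, tile))
        grid.paste(img, ((i % 2) * tile, (i // 2) * tile))
    return grid

make_grid(["a.png", "b.png", "c.png", "d.png"]).save("grid_1024.png")
[/code]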
>>
File: 1710578907614033.png (928 KB, 1280x720)
i feel like prompting has improved my writing abilities by just having to be as concise as possible
>>
>>102223770
It will learn to make 2x2 grids pretty well
>>
>>102223778
Let's find out! See you in a thousand or so steps.
>>
>>102223787
That's the spirit. Remember, 90% of the time nobody has a fucking clue, so it's always best to try it yourself.
>>
>>102223770
All training is "please make more of this" in a very literal fashion
>>
>>102223802
I have anecdotal evidence from a previous LoRA I made with flux where I just threw entire manga pages into the mix and it didn't create manga pages until I completely fried it at like 8000k steps. Otherwise it still produced individual characters from the manga.
>>
>>102223820
did the captions refer to the nature of manga pages?
>>
>>102223830
I don't use captions.
It did produce manga pages when I prompted for a manga page specifically though.
>>
File: 00024-1630348299.png (1.48 MB, 1024x1536)
>>
>>102223770
why would that improve learning... why wouldn't you just use multi resolution with more images and higher batch size... It's just going to get confused about the image/subject outcome
>>
File: 00025-3236521974.png (1.35 MB, 1024x1536)
>>102223865
>>
>>102223921
>>102223865
Dreamshaper face
>>
I'm grumpy
Please gen a fizzlegapalagopacus and post results, it's the only way to lift my spirits
>>
File: 00018-605497460.jpg (168 KB, 1488x1008)
>>102223647
I like the aura of this gen. Nice job
>>
File: 00004-3494582037.png (2.27 MB, 1024x1536)
>>102223921
I'm using realCartoonRealistic, but I've gotten a similar face with earlier versions of RealSpice too. Do you dislike the face?
>>
File: ComfyUI_00759_.png (1.06 MB, 1024x1024)
>>
File: 1721650851783.jpg (760 KB, 1024x1024)
>>
File: ComfyUI_01330_.png (1.38 MB, 1024x1024)
>>
File: ezgif-3-0a5df1f3a6.png (1.1 MB, 1024x1024)
>>102223943
>>
File: FLUX-D-040924-0091.png (1.12 MB, 832x1216)
>>102223943
>>
File: image(31).png (415 KB, 1024x1024)
>>102223943
>A kawaii fizzlegapalagopacus
>>
File: 0.jpg (264 KB, 1024x1024)
>>
File: 1719954209165.jpg (472 KB, 1024x1024)
>>
File: FLUX-D-040924-0093.png (1.19 MB, 832x1216)
>6yo me waiting for my mom to stop yelling at the cashier to tell her I have shit my pants
>>
>>102223985
>>102223990
>>102223999
thank you
>>
>>102223875
I think you're being dismissive of the idea.
>>
File: ComfyUI_33332_.png (1.05 MB, 1024x1024)
>>102223943
>>
File: 0.jpg (144 KB, 1024x1024)
>>
File: FLUX-D-040924-0094.png (1.17 MB, 832x1216)
You will never guess the prompt for this.
>>
What did Flux mean by this?
>>
hey /ldg/ retard newfag here, need some help
say if I want to start hosting a local UI to generate based on someone's artworks (popular company mascot), what do I write in the prompt box? "Extreme detailed image of this creature in tuxedo"?
>>
File: ComfyUI_01687_.jpg (879 KB, 1440x1920)
>>
>>102224398
You don't need the embellishment. If it knows the mascot you just write "a photograph of Mickey Mouse"
>>
>>102223024
>>102223093
It's a prompt issue, Flux can do really good realistic images but you need a good prompt

For now most Flux loras are shit, they break realistic image generation; they're only good for art style images
>>
>>102224269
>T5 is fine-tuned on a large corpus of text data that was not filtered for explicit content or assessed for existing biases. As a result the model itself is potentially vulnerable to generating equivalently inappropriate content or replicating inherent biases in the underlying data.
>>
File: face.jpg (84 KB, 889x500)
>>102223921
>>102223865
>>102223927
>>
File: ComfyUI_00790_.png (1.16 MB, 1024x1024)
>>
>>102224481
I look like this
>>
File: 00031-725912355.png (2.53 MB, 1024x1536)
>>102224473
She looks like she is seeing the future that's in store if she fails his tasks.
>>
File: lora.jpg (61 KB, 500x500)
Average lora result
>>
File: ComfyUI_01689_.jpg (903 KB, 1440x1920)
>>
>>102224519
Skill issue
>>
>>102224527
It's the same prompt as >>102224473

Most flux loras are shit, not tested before
>>
>>102224432
Thanks for answering, but what if it doesn't know?
Will inputting the images help it recognize the creature, or do I need to train it slowly? And what would one write for it to learn the name of said creature?
>>
File: FLUX-D-040924-0100.png (1.23 MB, 832x1216)
>>102224538
I don't really know how people fuck up LoRA training so bad.
>creating new e-girls by mixing e-girl LoRAs
>>102224558
Then you either download or train a LoRA of that mascot so it knows the concept. You can't just input images, not really.
>>
File: 00034-3226005829.png (1.1 MB, 1024x1024)
>>
File: pillow.png (614 KB, 512x768)
flux is pretty cool
>>
what's the smallest number of pictures you've successfully trained a character lora for flux with? i hear 20 pics, but i don't have 20 quality ones
>>
>>102224622
5-10, depends how well it can understand the concept. You can always turn on flip to "double" the images, meaning it also trains on a mirrored copy of each image, as shown below.
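the flip option is literally just a horizontal mirror of each training image, i.e.:
[code]
# What "flip" does under the hood: a horizontal mirror of each image.
from PIL import Image

img = Image.open("character_01.png")  # placeholder path
img.transpose(Image.FLIP_LEFT_RIGHT).save("character_01_flipped.png")
[/code]
careful with it on characters that have asymmetric details (hair clips, scars, one-sided marks), those get mirrored too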
>>
File: ComfyUI_01691_.jpg (993 KB, 1920x1440)
>>
>>102221683
Kek that's what I see too. Dunno if intentional or not.
>>
File: ComfyUI_03069_.png (1.07 MB, 1024x1024)
>>102223770
>>102223778
>>102223814
>>102223875

Grid schizo here. Done with grid training at 1500 steps, but it was already showing signs of overtraining at 1000 steps. All in all I had 4 grids of 4 images.
Never once during training was a grid produced, nor was one produced during inference. Training on grids will not output grids.
I trained at 1024x1024
>>
>>102224729
>artifaaaacts
Run your dataset through a JPEG artifact removal model in chaiNNer before training. And if those are h264 artifacts, consider downloading a high quality release and retaking the screenshots in PNG.
>>
File: 1725419888.png (1.17 MB, 1024x1024)
>>
what's the local equivalent of NAI's variety boost?
>Variety Boost implements the idea presented in Kynkäänniemi et al’s paper, Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models: https://arxiv.org/abs/2404.07724
i've been waiting for pony v7 before i get back to trying t2i but i wanna try this out
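the idea in the paper is simple: apply CFG only while the sampler's noise level is inside an interval, and use the plain conditional prediction elsewhere. a minimal sketch of one sampler step (the interval values are illustrative, the paper tunes them per model):
[code]
# Sketch of "guidance in a limited interval" (arXiv:2404.07724): CFG only when
# sigma is inside [lo, hi]; interval bounds here are illustrative.
def guided_eps(model, x, sigma, cond, uncond, scale=7.0, lo=0.3, hi=5.4):
    eps_cond = model(x, sigma, cond)
    if lo <= sigma <= hi:
        eps_uncond = model(x, sigma, uncond)
        return eps_uncond + scale * (eps_cond - eps_uncond)
    return eps_cond  # no guidance outside the interval (also saves a model call)
[/code]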
>>
File: ComfyUI_03075_.png (690 KB, 1360x768)
>>102224750
No
>>
File: 1725411017.png (1.27 MB, 1024x1024)
>>
File: 1712094275179441.png (1.25 MB, 832x1280)
>>
File: ComfyUI_03083_.png (1.21 MB, 1360x768)
Anyway, training on a grid works fine, but it began to overtrain at around 750 steps. Here it is at 1500. I will try again at a more conservative LR; since I am only using limited blocks I thought I would need to crank it up a bit higher than usual.
>>
File: 1725419007.png (872 KB, 1024x1024)
>>
>>102224075
joyous
>>
>>102224851
>>102224729
I didn't notice the artifacts as clearly in your other loras, do you think this has to do with the grids, or compressing your images smaller to make the grids?
>>
File: 1725430117.png (1.52 MB, 1024x1024)
>>
File: Untitled.png (120 KB, 584x223)
>>102224912
Probably the dimension (128) and the fact this was trained natively at 1024 and not 512. The images themselves were cropped but not compressed in any way.
>>
File: 1725430799.png (1.2 MB, 1024x1024)
>>
>>102224939
>>102224912

There was ONE other thing I did differently.

>- "transformer.single_transformer_blocks.25.proj_out"

Someone mentioned that this had influence on textures? Not sure how true that is and how much is placebo. But it was being trained at the time and the video artifacting was definitely captured.
>>
>>102224843
prompt?
>>
>>102225091
picture of a young woman at a club, motion blur, Amateur photography, on flickr in 2007, 2005 blog, 2007 blog, blurfoto 
<lora:Aesthetic_Amateur_Photo_V4:1>
<lora:amateurphoto-v3.5:1>
<lora:FLUX.1d_-_Blurry_Photos:1>
<lora:flux_realism_lora:1>
>>
>>102224780
Is this house beans?
>>
File: 1725432352.png (1.04 MB, 1024x1024)
>>
>>102225119
Yes, kidney beans. :^)
>>
File: 1702933762350070.png (363 KB, 512x512)
>>102225157
guru lora?
>>
>>102221943
nice (ack) picture anon
>>
>>102225187
Correct. It's a homebrewed one.
>>
File: 1705316989103712.png (1.29 MB, 832x1280)
>>
>train on 12gb vram with a 500 image dataset with 1024x1024 bucketing, fine
>try the exact same settings with 80 images, but with multi resolution, half 1024x1024, half 524x524 = OOM on 3rd step in

whyYYYYYYYYYYYYYY
>>
File: 1725433390755061.png (174 KB, 480x474)
>>102225241
me in the back
>>
File: 00018-1882900784.png (1.06 MB, 1024x1024)
>>
>>102225243
nevermind I'm a retard and had tried downgrading torch/cuda for some previous test, re-upgrading it fixed the issue. WE'RE BACK IN VRAMLET BUSINESS EYO
>>102225260
very cute
>>
>>102225243
Bucketing different resolutions can eat more VRAM.
>>
>>102225243
>Training long enough to get good results on 500 images with 12gb of vram
Literal cock and ball torture.
>>
>>102225263
Nice. I'm excited to try finetuning at 12gb so I can combine all my ideas into one checkpoint in a more consistent, flexible way.

Should be ready soon, they even got it to work at around 8gb
>>
>>102225077
The pedo poster is back
>>
>>102225263
>nevermind I'm a retard and had tried downgrading torch/cuda for some previous test, re-upgrading it fixed the issue. WE'RE BACK IN VRAMLET BUSINESS EYO
I hope you're not using pytorch 2.4.0, this one seems to give worse quality images than 2.3.1
>>
>>102225269
NTA but it's not that bad, you just start training, go to sleep, and wake up to results.
>>
>>102225280
Good idea, give him attention
>>
>>102225281
how so?

Also turkish guy said 2.5.0 makes it faster supposedly.
>>
>>102225295
It's only faster if you subscribe to his patreon
>>
File: file.png (3.45 MB, 3185x1612)
>>102225295
>following the turkish grifter's advice
come on anon :(
>>
>>102225269
at 32 dim it only took 10~ epochs which was roughly overnight and some change, was done when I woke up. I later ran more tests on rented gpus with 30 epochs and didn't really feel the results were all that better for letting it cook longer, if anything it just started overcooking hands more frequently. granted, I had both on 3e-4 LR which is a bit high, it worked great for my style lora
>>102225281
I thought that was shown to only be for image generation & not lora training. either way I don't really mind, I've been happy with my results so far and I'll accept a bit of quality shitting vs needing to buy a 4090 (I'm a neet I can't afford that shit)
>>
File: file.png (2.06 MB, 1024x1024)
noice
https://civitai.com/models/717607/ayanami-rei-neon-genesis-evangelion-flux-prototype?modelVersionId=802464
>>
>>102225308
I meant for training. But the turkish guy told me not to listen to you guys so I'm gonna assume this image is a false flag photoshop
>>
>>102225324
>I meant for training.
Some anon tried to make a training comparison but he couldn't because he got OOM on the 2.3.1

>the turkish guy told me not to listen to you guys so I'm gonna assume this image is a false flag photoshop
kek
>>
>>102225308
2.5.0 actually looks closer to the prompt though, the far hand is better too (thumb tucked instead of a weird faggy looking effeminate thumb)

if anything it kind of looks like, both from pic rel and the examples I've seen, that different pytorch versions just give different seed variance. it'd be nice if someone did one of these with a bunch of different seeds so we could reach a proper verdict
>>
>>102225342
>2.5.0 actually looks closer to the prompt though
I don't agree with that, usually when you think of an angry detective you'd go for the old grumpy guy, not the 22 yo newbie

>it'd be nice if someone did one of these with a bunch of different seeds to compare so we could make a proper verdict
yeah true, I don't know if I have the motivation to do that though, maybe one day
>>
>>102225293
Go back to geyddit. This site would be boring without schizos.
>>
>>102225342
yeah I would need to see multiple examples with multiple seeds to confirm the quality personally.
>>
>>102225364
I think the prompt can be confusing, "Japanese anime" makes it want to go with bigger eyes. Maybe try it without "anime" in the prompt for a clearer result I guess.
>>
>>102225319
Everybody in the audience:
"We are becoming heeeerrrrrrr"
>>
>>102225364
>I don't agree with that, usually when you think of an angry detective you'd go for the old grumpy guy, not the 22 yo newbie
its the "anime illustration from japan" part for me. the first one just looks like a western cartoon. beyond that, it's not like anything in the prompt is saying it needs to be a grumpy old guy, so there is no reason for the AI to assume its a grumpy old guy even if that's your kneejerk thought, you know?
>yeah true, I don't know if I have the motivation to do that though, maybe one day
feels lmao, I sure as shit am not bothering. ignorance is bliss for now
>>
File: file.png (2.44 MB, 1024x1024)
>>102225390
Concept bleeding is tough, that's why Loras are somewhat of a cope, they rape the weights too much. And come on, Loras were already a thing in 2022; you'd think in those 2 years we would have found a replacement, something more subtle, but nah...
>a pulp cult anime illustration from japan,
>ppayanamirei, highly detailed. No flat color, no hand drawn, no fan art. Rei has pale blue hair.
>Rei and Hatsune Miku are on a bar, dressed up
>>
File: 00021-2581197552.png (1.48 MB, 824x1024)
>>
>>102225405
there's no way loras can't be improved beyond what we've seen so far, I feel like there just isn't enough investment in lora tech. kind of understandable since it's not like SaaS is using them, but shitty for us localtards
>>
>>102225442
This one is really nice
>>
>>102225442
hey afroctopus, whats krakken'?
>>
ok so just a quick recap, someone correct me if I'm wrong. if I want to generate FLUX content, I have the following options (24GB VRAM):
a) I use the original model with the fp8 weights and the full fp16 t5 text encoder, and as soon as one lora is involved I OOM, so I need to force the CLIP into RAM, causing a nice delay on various occasions, but I do get a steady 1.0x s/it (CFG 1) once the generation process starts.
b) I use a (Q8) GGUF model and whatever text encoder (fp16: little headroom, Q8: more (orly)) and get no delay whatsoever when changing the prompt or adjusting the lora weights, but my gen speed plummets to 2.x s/it, and that is at CFG 1. am I missing something here?
>>
>>102225450
>kind of understandable since its not like SaaS is using them, but shitty for us localtards
I'm not so sure about that, my theory is that Midjourney and NovelAI are using some kind of Loras to get that style fluidity, I could be wrong though
>>
>>102225454
based pedo poster
>>
>>102225472
>fp8 weight
>full fp16 t5 text encoder
>I OOM

>Q8
> fp16 t5 text encoder
>little headroom

how is that possible? Q8 is bigger than fp16
>>
>>102225485
>>102225472
>Q8 is bigger than fp16
*fp8
>>
Guys I had no idea there was Flux music too
>>
>>102225474
sadly I doubt NAI or MJ would opensource any improvements they make upon loras, if they ever do. interesting theory, though
>>102225472
iirc q8 is "better" (closer to fp16) than fp8, I don't think there's much point to using fp8
>>
Is the resident pedo poster using flux with a lora or what? What's the deal here, I thought the model would restrict such gens
>>
>>102225513
and it's a crypto currency
flux does everything
>>
>>102225513
probably a grift, that retard added "Flux" onto its name to bring some undeserved attention
>>
pedo really cooked with that last one i must sadly admit
>>
>>102225513
>Flux music
Where?
>>
>>102225517
>iirc q8 is "better" (closer to fp16) than fp8, I don't think there's much point to using fp8
the only point imo is for the 24gb vram fags, it's easier to fit fp8 unet + fp16 clip; Q8 is a bit bigger, so if you add loras you have a chance of OOM
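back-of-envelope for a 12b model, weights only (Q8_0 stores one fp16 scale per 32 int8 weights, which is why it lands between fp8 and fp16):
[code]
# Rough weight-only VRAM math for a 12B-parameter transformer
# (ignores activations, text encoders, and the VAE).
params = 12e9
fp16 = params * 2 / 2**30             # ~22.4 GiB
fp8  = params * 1 / 2**30             # ~11.2 GiB
q8_0 = params * (1 + 2 / 32) / 2**30  # ~11.9 GiB: 1 byte/weight + scale per block
print(f"fp16 {fp16:.1f} GiB | fp8 {fp8:.1f} GiB | Q8_0 {q8_0:.1f} GiB")
[/code]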
>>
>>102225485
wait, I am not fully awake yet. right now I am using the t5xxl_fp16 and the Q8 GGUF with city's GGUF nodes from yesterday before he backtracked (and that first best smooth babybutt whatever clip:l) and that gives me enough headroom to work with several loras.
>>102225517
well if it doubles my generation time then I can live with a little loss, but again, the whole situation seems unsatisfactory. that is why I am asking.
>>
>>102225525
>>102225520
oh...lol

>>102225533
https://github.com/feizc/FluxMusic

Probably not related to flux, based on the above comments
>>
>>102225551
If someone ever made a ComfyUI style interface/workflow with Udio-level capabilities, I'd probably just proompt music all day long
>>
File: 2024090301.jpg (123 KB, 1488x1008)
Good morning
>>
File: 00037-1602250031.png (1.78 MB, 1024x1536)
>>102225574
Good morning Sir. How are we feeling today?
>>
File: 2024090307.jpg (158 KB, 1488x1008)
>>102225590
Pretty okay how about you
>>
File: 00038-550035507.png (1.51 MB, 1024x1536)
>>102225602
Can't complain, though the day seems to be looking very humid and warm.
>>
>>102225574
>>102225590
>>102225602
>>102225618
What does this have to do with Image Synthesis?
>>
File: 2024090308.jpg (90 KB, 1488x1008)
>>102225618
I'm sorry. There was actually a pretty strong chill here yesterday. Hope you get through it ok
>>
>>102225563
yeah would be fun, and if I like a tune I could expand on it with a real music maker like FL studio
>>
File: 00040-1817392868.png (1.55 MB, 1024x1536)
>>102225627
The images are doing the heavy lifting in that part.
>>102225635
Yeah, I'm not that worried about it. I'm not really in a hot climate.
>>
File: 00042-4263889763.png (1.61 MB, 1024x1536)
>>102225602
>>102225635
What model are you running? Seems to handle fantasy fairly decently.
>>
>>102225485
you are right, I am using the fp8_e4m3f model and the fp16 t5 now, 3 loras, 91% VRAM. no OOM. something must've changed. well NICE.
>>
File: 00023-2450449909.png (859 KB, 824x1024)
>>
File: 2024090303.jpg (134 KB, 1488x1008)
>>102225724
It's just flux
>>102225680
Good to hear. Hope you have AC anyway
>>
File: rope.png (3.7 MB, 1568x1568)
>>
File: 00049-3532477094.png (1.7 MB, 1024x1536)
>>102225792
I have some in the car only :D 1st one didn't even have that. At home I chill with a tub of cold water under my desk and my feet in it. Works like a charm to cool down.
>>
File: 2024090302.jpg (161 KB, 1488x1008)
>>102225828
That sounds really comfy actually. I'll have to try that sometime.
>>
File: 00050-2724863666.png (1.48 MB, 1024x1536)
>>102225846
Works really well with a decently sized tub. With all the blood vessels running through feet and all that.
Your image tone makes me think of Miyazaki toons. Style tho I can't put my finger on.
>>
>>
>>102225908
Holy overcooked
>>
>>102225912
Cfg 99 be like
>>
>>102225925
fr fr no cap
>>
>>102225405
>they rape the weights too much
how about reducing their weight? yea sure some are seriously overtrained, but you're local, you've got the control
>>
File: 2024-09-04_00092_.jpg (885 KB, 3840x2160)
>somewhere, over the mountain, over the sea, someone is waiting for you anon
>>
why are the shit celeb loras good and the good celeb loras SHIT bros
>>
>>102225968
and that person is miku wearing a cap with the word "C R I N G E" written on it
>>
>>102225981
There are no good celebs
>>
>>102226004

>>102222222
>>
File: ComfyUI_03088_.png (1.13 MB, 1360x768)
>>102224851
I did some retraining on the shinobu LoRA after realizing it was picking up the compression artifacts. I lowered the rank from 128 to 32 and halved the learning rate to 3e-4.
I still used the same 4 images containing 2x2 grids. When training at 512x512 I get like 1.2 or 1.4 it/s compared to the 3.5 it/s I got using this method, so at 1024 with 4 images packed into one I actually averaged faster training, considering I only needed about 300 steps per image.
So anyway, I think using grids is pretty viable desu.
>>
File: 7.jpg (1.01 MB, 2000x2783)
>>102226152
no fangs!

but ya.. good work, I used a few dozen silly character pictures like pic related and it did not mess the lora up; flux is relatively robust to a few sprinkles of madness in the dataset
>>
File: ComfyUI_03094_.png (1.24 MB, 1360x768)
>>102226168
Yeah there was not much information about her teeth in the grids. Turned out to be a cool experiment though. I might try it again with an actual properly curated dataset
>>
File: ComfyUI_03096_.png (1.05 MB, 1024x1024)
Also it will give you a grid if you ask for one but training on grids alone does not give you grid outputs. Very interesting.
>>
>>102226224
I'm sorry for doubting your experiment, anon
>>
File: ComfyUI_03102_.png (1.26 MB, 832x1024)
>>102226255
I don't blame you, it was dumb on paper.
>>
File: ComfyUI_03108_.png (1.35 MB, 832x1024)
https://gofile.io/d/SeD1cI

Here's the shinobu LoRA for anyone who wants it. I think uploading work to civit is for buzz farming jeets.
>>
>>102225943
if you reduce the strength the character doesn't look like them at the end unfortunately
>>
People out here really "developing workflows" and "training Loras" on their gaming rig to gen another tired variant of
1girl, big titties, fantasy style
>>
>>102225770
hhhehehhehe
>>
File: ComfyUI_03116_.png (1.1 MB, 1024x1024)
>>
>>102226409
Yes
>>
senpai noticed me
>>
>>102226409
bait harder faggot
>>
About to try flux quants. What am I in for?
>>
>>102226483
You sound upset
>>
File: file.png (2.56 MB, 1024x1024)
>>102226409
Yes
>>
>>102226506
flux quants
>>
>>102226506
you'll wait a few minutes and then say 'meh'
>>
>>102226506
Flux Quants, ostensibly
>>
I'm afraid this one >>102226528 gets it.
>>
>>102226409
You sound upset
>>
meh
>>
https://github.com/feizc/FluxMusic
>FluxMusic
How much of a grifter must you be to piggyback on the deserved achievements of Flux?
>>
>>102226321
>I think uploading work to civit is for buzz farming jeets.
I don't think so anon, I go to civitai just to find cool loras to try out, it's a shame you don't want normal people to enjoy your lora just because there are some buzz farming jeets in there
>>
>>102226581
I don't see a problem. What am I missing?
>>
>>102226581
>This repo contains PyTorch model definitions, pre-trained weights, and training/sampling code for paper Flux that plays music.
>Acknowledgments: The codebase is based on the awesome Flux and AudioLDM2 repos.

It's a study with code, Anon. It's literally progress in the field based on a great new architecture.
>>
>>102226608
If I make a restaurant I'll call it McDonalds-Eat, so that I can gather the name recognition of McDonalds, surely they won't mind right?
>>
>>102226581
It's only a problem if the model isn't as good at making music as Flux is at making images. Which it most likely isn't. So you are right, but for the wrong reasons.
>>
>>102226623
I see.
>>
>>102226623
they're demonstrating how you can use the flow model for music generation. not sure why you're butthurt about that.
>>
>>102226581
>no demo
I'm not gonna download 12gb of files just to get disappointed in the end
>>
File: 1694323809924677.png (1.44 MB, 1024x1024)
>>
>>102226641
>the flow model
you mean the flux model?
>>
File: 1695941189546662.png (178 KB, 353x334)
>>102225097
Thanks for not being a fag like most people who don't share prompts. Well done!
>>
flex model
>>
>>102226661
flux utilizes flow, its name is derived from it
>>
>>102226683
then why not call it FlowMusic?
>>
JUMP FOR R
>>
File: file.png (1.91 MB, 1024x1024)
https://www.youtube.com/watch?v=cko4ub-r0NQ
>>
File: 1709869421675574.png (342 KB, 896x512)
>>
>>102226690
I just referred to it as that because of flow-based diffusion transformer models, as used in flux
not sure why that is important to you
>>
File: 2024-09-04_00126_.jpg (186 KB, 1280x720)
>>
>>102226728
If you want to defend a piggyback grifter, that means it's important to you; are you the owner of this repo, anon?
>>
File: 1704650385209592.png (1.51 MB, 1024x1024)
>>
Bread delivery is here...
>>102226744
>>102226744
>>102226744
>>
File: file.png (479 KB, 1200x750)
>>102226760
I hated this olympics so hard because of that, the only good things it brought were Djokovic winning the gold and Phryge, that mascot is so cute, someone should make a lora out of it kek
>>
>>102226581
because they ARE using flux, you idiot
they are just representing music as spectrograms
>>
>>102226742
yes, I'm one of the Google employees who published the paper. that's because no one can ever disagree with you without having some kind of conflict of interest, lmao
>>
>>102226788
>that's because no one can ever disagree with you without having some kind of conflict of interest, lmao
how ironic is that
>>
>>102226742
>piggyback grifter
What's your issue with a free and open study based off new models? They're not asking for money
>>
>>102226811
Sure Jan, they're doing this for free, and not to have some recognition or some investments or something
>>
>>102226776
>Phryge
it is an abomination. wtf, an anthropomorphic Phrygian cap?! how occult can you get
>>
>>102226839
how is anthropomorphism occult? we do this all the time with cat girls
>>
>>102226823
yeah. people don't enjoy pushing fields forward for any other reason. let's stop doing research based off existing work
>>
>>102226848
I was talking about the whole Mithra cult stuff that is associated with Phrygian caps, especially if you look at that blatant Mithra worship in the opening ceremony.
>>
>>102226871
Oh yeahh, that's for sure this guy who baits people into his repo based on Flux name recognition is a saint who wants to push this field for free.
>>
>>102226883
you're falling into their trap anon, the Phrygian cap will always represent the french revolution, not whatever satanist shit those mentally ill degenerates wanted to represent during this olympics
>>
Thoughts about CSGO?
https://huggingface.co/spaces/xingpng/CSGO
>>
>>102226896
the paper is literally called "Flux that plays music"
do you understand that there's no baiting because that's what it is?
>>
>>102226742
did I talk to you in some earlier thread where you referred to me as "chitty" (city96) because I didn't agree with your buzz word description?
same vibe.
>>
>>102226949
what the fuck? I'm not that anon, I love city
>>
>>102226966
alright, great. just not the first time I've gone through the process of "you don't agree with me? are you in fact the very owner of the repo?"
>>
File: file.png (1.1 MB, 1179x712)
>>102227013
>just not the first time I've gone through the process of "you don't agree with me? are you in fact the very owner of the repo?"
I got the "But how does this affect you PERSONALLY" vibe, which is not better lol >>102226728
>>
>>102226760
this is great
>>
>>102227032
well, I wondered why my use of the term "flow" rather than "flux" was important.
anyway, wouldn't you agree that the main grifters would be the ones profiting off flow-based transformers, i.e. BFL?
I don't think "grifter" applies to them, but it applies more to them than to the authors of the new study.
>>
>>102227089
>wouldn't you agree that the main grifters would be the ones profiting off flow-based transformers, i.e. BFL?
why? BFL is the one who spent millions making these flow-based transformers relevant in the first place; an architecture is nothing without the millions of dollars used to pretrain it
>>
>>102227117
and their models wouldn't exist if they hadn't based them on existing public research, available for free
*shrugs*
>>
>>102227147
I mean, I have more respect for the guys who spent millions of dollars on a model to make it good and then shared it with everyone for free; we literally got dalle at home thanks to BFL. Designing an architecture doesn't cost shit in comparison. Sure, it has to be a good architecture, but the guy who made the flow-based architecture probably had no idea how well it would work for a fully pretrained 12b model. BFL were the ones who took the risk of investing millions into an architecture that could have failed at scale
>>
File: 1724474774866286.png (2.11 MB, 990x974)
>>102221118
>OpenAI Pleads That It Can’t Make Money Without Using Copyrighted Materials for Free

>"It would be impossible to train today’s leading AI models without using copyrighted materials."

https://futurism.com/the-byte/openai-copyrighted-material-parliament


What does this mean for AI in general and for us?
>>
>>102228424
not much, if the US wants to cuck itself to oblivion, other countries will do it for it



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.