/g/ - Technology




File: 91_ab_76417.jpg (150 KB, 1136x768)
Previous /sdg/ thread : >>100370923

>Beginner UI local install
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io

>Local install
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI (Node-based): https://rentry.org/comfyui
AMD GPU: https://rentry.org/sdg-link#amd-gpu
Intel GPU: https://rentry.org/sdg-link#intel-gpu

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Auto1111 forks
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
Anapnoe UX: https://github.com/anapnoe/stable-diffusion-webui-ux
Vladmandic: https://github.com/vladmandic/automatic

>Run cloud hosted instance
https://rentry.org/sdg-link#run-cloud-hosted-instance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
Inpainting: https://huggingface.co/spaces/fffiloni/stable-diffusion-inpainting
pixart: https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma

>Models, LoRAs & embeddings
https://civitai.com
https://huggingface.co
https://rentry.org/embeddings

>Animation
https://rentry.org/AnimAnon
https://rentry.org/AnimAnon-AnimDiff
https://rentry.org/AnimAnon-Deforum

>SDXL info & download
https://rentry.org/sdg-link#sdxl

>Index of guides and other tools
https://codeberg.org/tekakutli/neuralnomicon
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>View and submit GPU performance data
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html

>Share image prompt info
4chan removes prompt info from images, share them with the following guide/site...
https://rentry.org/hdgcb
https://catbox.moe

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg

Official: discord.gg/stablediffusion
>>
be happy now, i wont post the news with the pastebin because im nice :)
>>
>>100375549
say sike right now
say it
>>
>>100375549
don't post the news at all. thanks
>>
I'm tired of these shit tier OP images
Why do we get 5 good OPs a week at max
>>
My first impressions of Sigma were kinda bad because I had no idea how to prompt the thing. Now I have a little more experience with it. It's really good.
My prompt format is pretty much this:

>[Style], [year], [artist name], [subject description and location of subject], [other subject description and location], [setting or environment], [additional things I add to the prompt to further refine the style (detailed, flat, perspective etc)]
And it works pretty well.
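
A rough sketch of that slot order as a tiny prompt builder, in case anyone wants to template it. The class name, field names, and the example values are all made up for illustration; nothing here is part of Sigma or any UI.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SigmaPrompt:
    """Hypothetical helper that joins prompt slots in the order described above."""
    style: str
    year: str
    artist: str
    subjects: List[str]          # each entry: subject description plus its location
    setting: str
    refinements: List[str] = field(default_factory=list)  # e.g. detailed, flat, perspective

    def render(self) -> str:
        # Keep the fixed slot order and silently skip any slot left empty.
        parts = [self.style, self.year, self.artist,
                 *self.subjects, self.setting, *self.refinements]
        return ", ".join(p.strip() for p in parts if p and p.strip())


if __name__ == "__main__":
    prompt = SigmaPrompt(
        style="oil painting",
        year="1890",
        artist="",  # left blank on purpose, gets skipped
        subjects=["a fisherman in the foreground"],
        setting="stormy harbor",
        refinements=["detailed"],
    )
    print(prompt.render())
```

Leaving a slot empty just drops it, so the same template works with or without an artist name.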
>>
under no circumstances is anyone to post the pastebin
>>
>>100375549
based schizo, i love the whole pastebin autism lmao
>>
>mfw Resource news

05/07/2024

>CCDM: Continuous Conditional Diffusion Models for Image Generation
https://github.com/UBCDingXin/CCDM

>MediaPipe Hand Crop Fix
https://github.com/sign-language-processing/mediapipe-hand-crop-fix

>LGTM: Local-to-Global Text-Driven Human Motion Diffusion Model
https://github.com/L-Sun/LGTM

>AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding
https://github.com/X-LANCE/AniTalker

>DVMSR: Distillated Vision Mamba for Efficient Super-Resolution
https://github.com/nathan66666/DVMSR

>ImageInWords: Unlocking Hyper-Detailed Image Descriptions
https://google.github.io/imageinwords/

>MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model
https://dai-wenxun.github.io/MotionLCM-page/

>comfy-cli: Command Line Interface for Managing ComfyUI
https://github.com/yoland68/comfy-cli

>Performance Profiling Report (Forge/A1111/ComfyUI)
https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/716

>ComfyUI-Video-Editing-X-Attention
https://github.com/chaojie/ComfyUI-Video-Editing-X-Attention

>AM-RADIO: Reduce All Domains Into One
https://github.com/NVlabs/RADIO

05/06/2024

>Detector-Free Structure from Motion
https://zju3dv.github.io/DetectorFreeSfM/

05/05/2024

>ComfyUI Prompt Quill
https://github.com/osi1880vr/prompt_quill_comfyui

>Efficient Implementation of Kolmogorov-Arnold Network [KAN]
https://github.com/Blealtan/efficient-kan

>controlnetXL_line2color
https://huggingface.co/kataragi/controlnetXL_line2color

05/04/2024

>PuLID now supported in sd-webui-controlnet!
https://github.com/Mikubill/sd-webui-controlnet/discussions/2841

>ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars
https://github.com/3DTopia/ThemeStation

05/03/2024

>Virtuoso Nodes: Set of nodes to give Photoshop-like functionality within ComfyUI.
https://github.com/chrisfreilich/virtuoso-nodes
>>
>>100375619
Sigma is trained on captions generated by Share-Captioner/LLaVA, and it has a context window of 300 tokens, which is almost 4 times larger than SD 1.5/2/XL/3
You'll do better once you embrace the boomer prompt style
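
For the "almost 4 times" figure, the arithmetic works out roughly like this. The 300 comes from the post above; the 77 is an assumption on my part (the usual CLIP text encoder limit in SD 1.5/XL), so treat it as a ballpark check and not a spec.

```python
SIGMA_MAX_TOKENS = 300  # figure quoted in the post above
CLIP_MAX_TOKENS = 77    # assumption: typical CLIP text-encoder limit for SD 1.5/XL


def context_ratio(large: int, small: int) -> float:
    """How many times bigger one token window is than another."""
    return large / small


if __name__ == "__main__":
    ratio = context_ratio(SIGMA_MAX_TOKENS, CLIP_MAX_TOKENS)
    print(f"Sigma window is about {ratio:.1f}x the CLIP window")
```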
>>
>mfw Research news

05/07/2024

>Generated Contents Enrichment
https://arxiv.org/abs/2405.03650

>Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
https://arxiv.org/abs/2405.03520

>Gaussian Splatting: 3D Reconstruction and Novel View Synthesis, a Review
https://arxiv.org/abs/2405.03417

>Animate Your Thoughts: Decoupled Reconstruction of Dynamic Natural Vision from Slow Brain Activity
https://arxiv.org/abs/2405.03280

>Mind the Gap Between Synthetic and Real: Utilizing Transfer Learning to Probe the Boundaries of Stable Diffusion Generated Data
https://arxiv.org/abs/2405.03243

>Adapting Dual-encoder Vision-language Models for Paraphrased Retrieval
https://arxiv.org/abs/2405.03190

>Video Diffusion Models: A Survey
https://arxiv.org/abs/2405.03150

>SketchGPT: Autoregressive Modeling for Sketch Generation and Recognition
https://arxiv.org/abs/2405.03099

>Matten: Video Generation with Mamba-Attention
https://arxiv.org/abs/2405.03025

>Paintings and Drawings Aesthetics Assessment with Rich Attributes for Various Artistic Categories
https://arxiv.org/abs/2405.02982

>VectorPainter: A Novel Approach to Stylized Vector Graphics Synthesis with Vectorized Strokes
https://arxiv.org/abs/2405.02962

>iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval
https://arxiv.org/abs/2405.02951

>MVIP-NeRF: Multi-view 3D Inpainting on NeRF Scenes via Diffusion Prior
https://arxiv.org/abs/2405.02859

>Stable Diffusion Dataset Generation for Downstream Classification Tasks
https://arxiv.org/abs/2405.02698

>Enhancing Social Media Post Popularity Prediction with Visual Content
https://arxiv.org/abs/2405.02367

>Efficient Text-driven Motion Generation via Latent Consistency Training
https://arxiv.org/abs/2405.02791

>Adapting to Distribution Shift by Visual Domain Prompt Generation
https://arxiv.org/abs/2405.02797

>U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers
https://arxiv.org/abs/2405.02730
>>
How much VRAM / RAM does Pixart Sigma use?
>>
>>100375652
I'm snug in bed right now, can you try: "a woman is showing her derriere proudly." I used LLaVA to caption a LoRA once and that's how it captioned all the ass mirror selfies.
>>
File: ComfyUI_temp_kvoce_00129_.png (1.29 MB, 1024x1024)
>>100375678
If you load T5 on CPU: 20GB RAM, ~3-6GB VRAM
>>
>>100375678
24/64 but that's overkill
>>
So this is what mr Zhang was up to?
https://huggingface.co/spaces/lllyasviel/IC-Light
https://github.com/lllyasviel/IC-Light
He let forge die for this? Oh nononono
>>
File: PASigma_03091_.png (1.4 MB, 1344x768)
>>100375678
3GB VRAM 20GB RAM
>>
>good OP image gets posted
>thread moves
>shit tier OP gets posted by faggot spammer that doesn't contribute to thread after
>more trolling
>shit thread movement
I'm tired
>>
>>100375699
>fixes pony's biggest problem
Looks good to me now Cumfy will try to put it in his project and not expect anyone to bitch at him because it's only okay when he does it
>>
>>100375699
>IC-Light
>IC
>"I see"
wow he's so witty. how does he do it?
>>
>>100375699
why didn't the thread schizo include this in their news [shit]post
>>
>>100375743
>>fixes pony's biggest problem
afaik this isn't a model that fixes pony biggest problem, that being a complete and absolute lack of control over the image style, because the autistic author encrypted/hashed artist names, I hope he and his family dies in a house fire or something
>>
File: cu9at4u97f361.png (580 KB, 4914x3930)
What's your best realism hand fixing methods? As a drawfag anime or cartoon hands are a quick fix for me but realistic or even semi realistic hands are too much of a bother to paint and even after a lot of work it looks uncanny. Regular inpainting never fucking works.
>>
>>100375699
that looks fun
>>
>>100375761
because nigbo can't do anything right
>>
File: 00000-3659716232.jpg (241 KB, 1640x1304)
Morning
>>
>>100375770
Skill issue, the biggest problem was backgrounds. Now that Illy is back he can integrate it into forge or the B team can finish their performance investigation and port it to A1111
>>
Alright sigma boys, I've got dual GPU training working, offloaded the text encoder to cuda:1 so now you can train a lot faster on the primary GPU.
>>
>its so sad you talk about d*b*
>hes not even here
>
>>
>>100375810
Oh he's here all right, probably just woke up
>>
>>100375810
Debo is here chronically.
>>
So is the best way to do animation to generate the image first and then animate it in a separate engine? I don't need much animation, just like 2 to 4 second clips, akin to those 1950s style movie trailers on YouTube. I want to make something like that for my current D&D campaign, which is really gay but I want to surprise my players since it's almost over.
>>
>>100375827
Ani would like to help you but he's made zero progress in over a year and is now in hiding
>>
>>100375827
Animation is roughly limited to gimmicky animated still frames.
>>
>>100375818
thats it, the next threads i will post your news WITH the pastebin :) look forward to it
>>
>>
>>100375699
>his last name is actually Zhang
kek
>>
File: PASigma_03102_.png (1.91 MB, 1344x768)
>>
>>100375852
phallic towers
distant human walking away
trail

trifecta of sdg recurrent motifs
>>
>>100375761
He's biased which is why people filter his news. He sits on discord and slurps both Cumfy and Ani because he's a fucking loser. He wastes hours of his life reading useless shit and improving at a snail's pace. It's why he's envious of posters that started the same time as him.
>>
>>100375908
1girl
penis
solo
scenery
>>
File: 161956_00001_.png (1.38 MB, 1368x760)
"Custers last thoughts"
>>
File: sigup_0003.jpg (2.31 MB, 6400x2560)
>>
>>
If you put default in your prompts you're not even trying
>>
>>100375847
That's okay, I'm okay with what I see in those YouTube videos

I also suck at prompts. This is a particularly bad example, but I did pic related on NightCafe with "Redhead alchemist woman in Renaissance garb wielding rapier as demons surround her in a deep rocky gully"

Should I be learning weighting more or phrasing more to get better results?
>>
>>100376016
I usually put futa in my prompt what are you talking about?
>>
File: 00294.png (2.58 MB, 1432x1840)
>>
>>100376055
It's hard to know, you have to play with the parameters because the AI has wonky interpretations of what motion should be, could be everything from the fires moving to the girl slowly swinging her arm
>>
>>100375845
thats not true, i was told he is the bestest animator of this universe and the next two, who will make anime a reality
>>
>>100376056
But enough about your childhood molestation
>>
File: PixArt-upscale-flow_2.png (3.88 MB, 2689x1536)
upscaling pixart from 512 -> 2048, though I kinda doubt it's necessary, rather than just starting immediately at 2048. Just playing around with workflows
>>
>>100376209
>I kinda doubt it's necessary
I feel the same. It sucks that it seems, if you want something larger than 1024, it HAS to be in the 2048 range. Anything in-between and the model shits itself.
>>
>>100376183
Can anyone tell me a single thing Ani has done beside node wiggling and minor code edits for Cumfy?
>>
>>100376266
he and cumfart made big cum cums once
>>
>>
>>100376266
single-handedly brought down all of midjourney's infrastructure for half a day
>>
what is this talk about AI processors?
will SD on cpu be viable or what or are they launching pci accelerators?
news is presented in such a way like everybody already knows about it
>>
>>100376284
>>100376295
But what about animation
Also it's cute Emad jumps ship after Midjourney called SAI out for the hack
>>
>>100376316
it'll probably be the halfway solution between CPU and GPU and targeted for phones and devices like Alexa
>>
>>100376266
he has the balls to ask his coworker "hey bro, can i use your adobe session to show my online friends that im soon not working in this shithole with you anymore?"
>>
>>100376316
it will all be for enterprise use only you WILL pay for the SaaS inference
>>
>>100376335
sounds like crap
>>
>>100376352
It'll be cheaper than a GPU
>>
>>100376332
he made like 10 """animations""" and was so proud that he reposted them daily for months
>>
File: PASigma_03130_.png (1.42 MB, 1344x768)
>>
meds, all of you, now
>>
>>100376359
and won't have 12gb of vram

>>100376347
I refuse to touch AI that isn't fully local
>>
File: PASigma_03135_.png (1.58 MB, 1344x768)
>>
>>100376404
I mean it will have to if they want it for local models which is the whole point
>>
File: ComfyUI_PixArt_00043_.png (2.91 MB, 1536x1536)
>>100376246
you can get around that by using tiles and setting the tile size to whatever size model you're using
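
The tile idea can be sketched as plain box math: cut the target canvas into tiles that are each exactly the model's native size, with the last row/column pushed flush against the edge. This is only an illustration under my own assumptions (the function name, the overlap parameter, and the 1024 native size are hypothetical); real tiled upscale nodes also handle seam blending, which is not shown.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def tile_boxes(width: int, height: int, tile: int, overlap: int = 0) -> List[Box]:
    """Cover a width x height canvas with tile-sized boxes, row by row."""
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("tile must be positive and overlap must be in [0, tile)")
    step = tile - overlap

    def starts(size: int) -> List[int]:
        # A canvas no bigger than one tile needs a single tile at offset 0.
        if size <= tile:
            return [0]
        offsets = list(range(0, size - tile, step))
        offsets.append(size - tile)  # final tile sits flush with the far edge
        return offsets

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


if __name__ == "__main__":
    # Hypothetical case: 1536x1536 target, model trained at 1024.
    for box in tile_boxes(1536, 1536, 1024):
        print(box)
```

With an in-between size like 1536 the tiles simply overlap more, so each pass still sees a resolution the model is comfortable with.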
>>
>>100376336
It was him
>>100376360
I remember all the wild shit he was saying
>I'm saving animation
Same animations for over a year
>nobody but me knows what they are doing
While continuerevolution runs circles around him
>I'm an important figure in the thread
Tries to control OP and inject himself as authority even though multiple anons run laps over him
>I'm being paid 6 figures to make loras
Struggles to make his homo Lora and has been caught failing and asking for help months after getting fired from his job
>pony is shit and can't do sfw and my made up job we couldn't do it in front of a anime industry head
Now uses pony full time
It wouldn't be so bad if he owned up to his faggotry

It's painful to watch
>>
>>100376464
he was also samefagging A LOT, especially to anons also doing animations, my guess is he felt threatened by them
>>
File: 2451814471.jpg (901 KB, 4000x4000)
>>
>>100376428
Oh that's a smart idea, thanks anon. Will try later.
>>
>>100376495
Of course he is he's also mentally ill, look at him try to report the forge dev over nothing. He doesn't even have the balls to admit he was wrong and will still double down
>>
File: sigxl_0008.jpg (337 KB, 3200x1280)
>>
Donald Trump
>>
sigmachads WWA?
>>
>>100376464
Him saying the pony score tags weren't needed was weird too
>>
File: PASigma_03158_.png (1.32 MB, 1344x768)
>>
>>100376464
for me the public homosexual ERPing was the worst
like do what you want with each others butt, why do you blog about it on /g/ instead of your discord?
>>
>>100376574
It was funny how mad he got when ran corrected him
>>100376599
It's pathetic, more pathetic than the depression posting
>>
y'all sound like catty middle-aged women
>>
>>100376629
For real lmao
>>
>>100376629
absolutely and they wouldn't have it any other way
>>
>>100376629
Yes because a faggot that uses discord friends to protect himself and slide narratives that can be easily found in the archive should not get off the hook after he continues his faggot behavior once he feels comfortable.
>>
File: PASigma_03177_.png (1.12 MB, 1344x768)
>>
>>100376629
>y'all sound like catty middle-aged women
Half of 'em are twenty years ( and acceptance of their inner self ) away from being so.
>>
Please stop posting about yourself in third person you are derailing the thread
>>
File: PASigma_03193_.png (1.23 MB, 1344x768)
>>
>>100376651
Love, I got it bad for you
I saved the best I have for you
You sometimes make me sad and blue
>wouldn't have it any other way
Love, my aim is straight and true
...
https://www.youtube.com/watch?v=7pL9vdpSvnY
>>
i dont give a shit about the old royalty i wanna know who METTEYA is
>>
File: PASigma_00271_.png (1.34 MB, 1024x1024)
A bunch of no-generates around today. Always complaining
>>
>>100376765
a talented sigmarian
>>
File: deza_00023_.png (2.74 MB, 2016x1152)
gm

>>100376760
2010 was 30 years ago
>>
>>100376768
very cool anon once known as interior anon!
>>
>>100376781
thread schizo
>>
>>100376781
I bet your mom wished she drowned you
>>
File: PASigma_00254_.png (1.06 MB, 1024x1024)
>>100376781
24-10=?
>>
File: deza_00024_.png (2.75 MB, 2016x1152)
>>100376768
today, I will request that you break out of your box (literally) and gen some rectangle aspects

>>100376794
>>100376796
gm. what are you genning today?

>>100376798
30
>>
got him.
the absolute fool.
>>
>>100376781
>2010 was 30 years ago
sometimes it feels like this and anything before 2000 feels like ancient history, as if it was in last millennium
>>
>>100376811
i would tell you but then you would know which regular i am
>>
File: 2451814472.jpg (1.21 MB, 4000x4000)
>>
File: PASigma_00321_.png (1.3 MB, 1024x1024)
>>100376811
Request denied. Try submitting it again next quarter.
>>
>nogen shit stir about yourself in third person
>two hours later post a gen like nothing happened
>receive a hundred thousand (you)s
Is it that simple?
>>
>>100376832
rainy girl, i fucking knew it
>>
File: PASigma_00323_.png (1.21 MB, 1024x1024)
>>100376852
>two hours later post a gen like nothing happened
It's that simple.
>>
File: deza_00026_.png (2.77 MB, 2016x1152)
>>100376825
everything was simultaneously yesterday and a century ago

>>100376838
I dig this

>>100376850
>get boxed
fugg

>>100376852
the reason they nogen for attention is because their gens get no attention.
>>
>>100376852
Write it up and I'll put it in the pastebin
>>
File: PASigma_00324_.png (870 KB, 1024x1024)
Bald woman for balance
>>
i miss the split threads
one thread for d*b* and his samefagging
one thread for everyone else
>>
its kinda funny when he replies to eight posts and seven of them are yourself nogen posting
>>
File: 0-AFH080262024.jpg (134 KB, 1288x1288)
>>
File: PASigma_00325_.png (1.45 MB, 1024x1024)
>>
File: PASigma_00328_.png (281 KB, 1024x1024)
Experience the art of the sigma bump
>>
Pixart Sigma finetunes better start rolling out soon or else I'm going to get real bored.
>>
>>100376878
You still seethe at ran to this day and he posted gens
>>
File: 2451814473.jpg (1.64 MB, 4000x4000)
>>100376878
Tks, I been experimenting with a few sdxl models and upscalers to see how well they work with chibi prompts, trying to move on a little from counterfeit. I dig your witch too btw
>>
File: PASigma_00330_.png (427 KB, 1024x1024)
>>100376963
https://civitai.com/models/435669?modelVersionId=493577
>>
>>100376964
Hi ran
>>
>posts his "news"
>1.5 hours later
>good morning! whats going on?
>>
>>100376979
Yeah that one's getting boring already.
>>
>>100376825
>masteronic
fellow britfag or...?
>>
File: deza_00027_.png (2.84 MB, 2016x1152)
>>100376971
>trying to move on a little from counterfeit
you mean counterfeitXL for the base gen or counterfeit1.5 for upscale?
>>
>>100376979
feels wrong to call that a finetune, at least how it was described to anon was as a test to see if finetuning worked at all
yeah its a finetune but lets be real and call it a test so anon doesnt start merging it or some dumb shit
>>
File: PASigma_00335_.png (231 KB, 1024x1024)
>>100376999
Lack of gens shows the problem is on the creativity side
>>
whatever just wake me up when more fine tunes come out
>>
File: PASigma_00336_.png (425 KB, 1024x1024)
>>100377020
Keep moving the goal post
>>
>>100377044
im literally on your side idiot i love using sigma
jesus the antisigma crowd has almost buckbroken you which is surprising considering how long youve been posting here
>>
File: PASigma_03226_.png (1.1 MB, 1344x768)
>>
>>100377059
he knows he's a shill and interprets almost everything as negative feedback
>>
I didn't know sigmachads were also so ornery
Reminds me of someone specific...
>>
File: 2451814474.jpg (1.29 MB, 4000x4000)
>>100377018
>you mean counterfeitXL for the base gen or counterfeit1.5 for upscale?
counterfeit3.0 and upscale on sd1.5 ye, I started with counterfeit and loved the colors but fighting that model every single time to get it to stick with my prompt is tiring sometimes
>>
File: PASigma_03232_.png (1.27 MB, 1344x768)
>>
>>100377067
>interprets almost everything as negative feedback
why is this so common with sdg posters kek
>>
File: sigxl_0024.jpg (1.04 MB, 4096x4096)
>>
File: 0.jpg (144 KB, 1024x1024)
>>
File: PASigma_00339_.png (1.28 MB, 1024x1024)
>>100377062
Spooky Haiti vibes

>>100377020
>>100377041
>>100377059
>>100377067
>>100377074
L
>>
File: PASigma_03234_.png (1.46 MB, 1344x768)
>>100376765
Just some guy who makes pictures when he should be working :3

With sigma it's like writing a Dwarf Fortress RP, I'm addicted
>>
File: deza_00029_.png (2.69 MB, 2016x1152)
>>100377079
>but fighting that model every single time to get it to stick with my prompt is tiring sometimes
I know exactly what you mean. I used to use counterfeit a ton cuz it has such a great aesthetic but it involves way too much cat herding, esp with XL. I wonder if PAG could help it much..
>>
>>100377111
yeah its pretty great i just want to skip forward in time to when more fleshed out tunes are around
dont tell anyone else i said that though or else theyll think im some sort of anti-fun anon or moving the goalposts or some shit
>>
File: ComfyUI_PixArt_00049_.jpg (1.47 MB, 2048x2048)
>>
File: PASigma_03244_.png (876 KB, 1344x768)
>>100377129
same. sigma needs anime, full body, feet, and coomer material for sure
>>
File: deza_00030_.png (2.73 MB, 2016x1152)
>>100377111
>>100377152
these are dope
>>
>>
File: 00000-610461859.png (1.79 MB, 1328x992)
>>100377119
>I know exactly what you mean. I used to use counterfeit a ton cuz it has such a great aesthetic but it involves way too much cat herding, esp with XL.
I'll probably go back to counterfeit on sd1.5 eventually if there is a prompt I really wanna try out, for now I got a different model I got bullied into trying out for sd 1.5 kek
> I wonder if PAG could help it much..
Oof you got me there, I got no idea how to use PAG
>>
>>100377152
>anime, full body, feet,
i think it needs more of everything, no? considering the sigmachads talking about how its a little underbaked (for a good reason) and it obviously doesn't have the same knowledge as XL
(im not anti sigma btw fuck you)
>>
File: PASigma_00343_.png (1.18 MB, 1024x1024)
>>100377129
Wanting for future != moaning about present. This I support
>>
>>100377108
Sigma esque

>>100377270
Glad you understand the difference, ornerykun
>>
>>100376765
Its debo and you know it
>>
>>100377293
nah, his gens are too shite to be him.
>>
>>100375699
That's pretty good.
>>
I love Sigma, but I don't think 0.6B is enough.
>>
>one year later faggots use debo's name to bait
He's still the worst in this general and a certified faggot that posted BBC porn not even two weeks ago
>>
>>100377336
>posted BBC porn not even two weeks ago
Link the post or I call bullshit
>>
>>
File: PASigma_00349_.png (1.64 MB, 1024x1024)
>>100377282
>ornerykun
Fitting, thank you
>>
File: PASigma_00351_.png (1.28 MB, 1024x1024)
>>100377336
>>100377328
>>100377313
No gens always complaining
>>
File: Artwork_01085_.png (1.67 MB, 1024x1024)
>>100377249
Not really, what I posted are the glaring holes I've found. I was schizo dumping Playground 2.5 pics (which mogs the hell out of XL base) in previous threads and Sigma is just so much more clean, comprehensive, and creative than any other model I have.
>>
File: PASigma_00352_.png (1.23 MB, 1024x1024)
Sigma has put us in quite the pickle
>>
>>100377396
i probably need to explore latent space with it a bit more. to me at least it seems limited in its range but i could also just still be learning all the ins and outs of the model. i am a retard but i can feel the difference between its param count and XL for example
idk im just a nogen
>>
>>100375699
lollll, the tool that killed comfy, no custom node will save his ass now
>>
File: PASigma_00357_.png (2 MB, 1024x1024)
It doesn't stop giving exactly what you put.. how do you put it down!?!?!?
>>
>>100377336
>>100377344

>>100228644
>>
Fuck you for making me walk on eggshells when talking about sigma now why does this always happen
>>
File: 1025Custom300230.jpg (182 KB, 1592x1160)
>>
File: PASigma_00361_.png (1 MB, 1024x1024)
>>100377493
What do you think making gens is? Also did you remember to boomer prompt like you're in English class?
>>100377564
Just be positive?
>>
>>100377564
>he says in a stable diffusion thread
>>
>Californian NEET is also a faggot
Next you're going to tell me ran isn't black
>>
>>
>>100377574
>boomer prompt like you're in English class?
yeah i use a mix of both that and lists of descriptors
ive only used it for a few days and im not entirely entirely bored of it yet - at least going back to XL felt like going back in time

motherfucker i am positive or at least neutral! sigma is the future but you're crazy if you think its 100% absolutely perfect
>>
File: sigxl_0017.jpg (402 KB, 1792x2304)
>>
File: PASigma_00366_.png (1.13 MB, 1024x1024)
>>100377649
Sigma is leagues above anything SD as far as comprehension. That's it. Lots of good nuggets between gens, but it's in no way perfect. Sigma's just the start, and shitting all over the present doesn't give one taste.
>>
File: PASigma_00365_.png (1.03 MB, 1024x1024)
*good taste
>>
>>100377696
>shitting all over the present doesn't give one taste
do not forget how to tell the difference between that and genuine complaints
>>
>>100377696
cute
>>
File: PASigma_00370_.png (1.18 MB, 1024x1024)
>>
File: PASigma_00368_.png (910 KB, 1024x1024)
>>100377726
Sorry they didn't cut the crust off for you. It's a research project blowing an established American company out of the water (in terms of real control of output).
>>
very cool
>>
>if a feature isn't as simple as "tick this box to be blown away" 90% of the userbase will complain and say it doesn't work
We live in a society lol
>>
>>100377848
Grateful that I am just smart enough to understand how to get T5 working
>>
>>100377848
That's actually the approach you should take if you want your product or business to be successful, if something is too hard for the average end user to use they won't use it at all.
>>
File: PASigma_00381_.png (1.12 MB, 1024x1024)
>>100377848
Nailed it. 1click, 1file, 1girl
>>
They killed this thread
>>
>>
>>100377960
Yeah, it's pretty much been 90% low effort Pixart/Sigma garbage
>>
Morning anons
>>100377774
>>100377826
Cute
>>
File: PASigma_00386_.png (1.38 MB, 1024x1024)
>>100378012
Good morning. Those were for you!
>>
>>100378000
retard
>>
>>100378077
Cope
>>
>>100377358
This model is seriously undertrained.
>>
File: PASigma_00388_.png (994 KB, 1024x1024)
>>100378103
Okay, do something about it. Just gonna seethe?
>>
sigma bros SHOULD upscale their shit but its pretty telling how lowres pixart gens are better than 95% of xl gens
>>
File: c.jpg (554 KB, 1970x1444)
>>100376287
cute
>>
>>100378122
Umad bro? Why u so mad?
>>
File: 2451814475.jpg (1.12 MB, 4000x4000)
>>100378150
Yours is so cute too, lots of nice poses
>>100378012
Good morning.
>>
File: PASigma_00392_.png (934 KB, 1024x1024)
>>100378227
L
>>
File: PASigma_03351_.png (1.73 MB, 1344x768)
>>
>>100378065
fresh baked bread still warm with a bit of butter and some salt is such a great thing
your gens make me hungry, well done
>>
dead general
>>
How good is pixart sigma at western artists? I might try it if it's good.
>>
File: PASigma_00398_.png (662 KB, 1024x1024)
>>100378444
Specific artists it's kinda shit at, but worth a try.
>>
sd3 when :(
>>
>>100378488
When it's clawed out of their cold dead hands
>>
File: PASigma_03355_.png (1.73 MB, 1344x768)
>>100378134
image size limit silly boy
>>
File: 00092-2024-05-07NYdMmBGH.png (3.33 MB, 2048x2480)
>>
>>100378488
when they are finished it will be so "safe" that it cant even do basic rectangles anymore because edges can be dangerous
>>
>>100378513
>>100378465
>not synthographers
>>
>>100378513
you dont know how2convert to jpg?
>>
File: 2451814476.jpg (1.21 MB, 4000x4000)
>>
File: sigup_0001.jpg (3.28 MB, 7168x9216)
bigger
>>
>>100378554
there we go
>>
File: image(22).jpg (541 KB, 1970x1576)
>>100378301
> lots of nice poses
agreed, often it's a lot more like this
>>
File: PASigma_03368_.png (1.6 MB, 1344x768)
>>100378549
convert your bitching into gens first
>>
>>100378577
kek that story is how my day often goes
end me
>>
>>100378592
you should also be called ornerykun kek
>>
File: deza_00031_.png (2.71 MB, 2016x1152)
>>100378604
including the beer with a straw? thats always my favorite part of the day
>>
File: PASigma_00404_.png (1.3 MB, 1024x1024)
>>100378419
Om nom

>>100378554
You right. I'm not letting additional time for upscale get in the way of exploring the latent space though. There's new lands across these oceans that contain real contrast and control without a LoRA.
>>
>>100378648
no straw, just beer...
>>
File: 20240508_143233.jpg (286 KB, 1134x1126)
>>100375528
>R34: community run archive of all coomer artists
>Gelbooru: community run archive of mostly weeb artists

Is there an archive of just.... How do I say.... Normal art? For example if I want to find and download a big archive of Van Gogh's Art, are there dedicated websites for that? I want to train a style Lora in a style similar to pic rel.


https://twitter.com/solisolsoli/status/1788237434477768867?t=0geQnHAqotb0Mid38awwfQ&s=19
>>
>>100378690
Museum websites, Wikipedia
No, you won't have tags
>>
>>100378690
wikiart.org but its nowhere close to what you're looking for
>>
>>100378690
>>100378728
also smithsonian open access, they have an api if i remember correctly
https://www.si.edu/search/collection-images?edan_q=&edan_fq=media_usage%3ACC0&oa=1
no idea why this hasnt been used for imggen yet its kind of insane
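
For anyone who wants to poke at that API, here is a rough sketch that only builds a search URL and makes no network call. The endpoint path and the parameter names (q, rows, start, api_key) are written from memory of the Open Access docs and should be treated as assumptions to verify; the key is a placeholder.

```python
from urllib.parse import urlencode

# Assumption: endpoint recalled from the Smithsonian Open Access docs; verify before use.
BASE_URL = "https://api.si.edu/openaccess/api/v1.0/search"


def build_search_url(query, api_key, rows=10, start=0):
    """Return a search URL for the given query; nothing is fetched here."""
    params = {"q": query, "rows": rows, "start": start, "api_key": api_key}
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # "YOUR_KEY" is a placeholder, not a real key.
    print(build_search_url("pottery", "YOUR_KEY"))
```

Paging through results would just mean bumping start by rows on each request, assuming the parameters behave as recalled.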
>>
File: sigup_0003.jpg (3.16 MB, 8192x8192)
3.16 MB
3.16 MB JPG
>>
>>100378766
With more players this will not be an issue, SAI is run by skunk pussies
>>
>PIxart retards think shit talking SAI makes Pixart look good
hint: it actually doesn't
>>
>>100378766
>millions of high quality, museum quality images
>creative commons license
>api access
The perfect imggen dataset unironically
>>
>>100378815
Your enjoyment of SD should not be affected by someone using Pixart. Take a deep breath and leave 4chan.
>>
>>100378829
Too bad it takes 50 seconds to download each image because they run the website on Steve's home computer.
>>
>>100378848
Correct. The woes of free beer.
Worth it though for what you get.
>>
>>100378832
Nothing I said relates to enjoyment of SD.

Are you ESL? I'll wager you're a third worlder paid to shill this shit on here. In which case, you should be the one leaving 4chan. Ideally forever.
>>
>>100378863
It would unironically be faster scraping search engine images.
>>
>>100378865
Boo hoo someone was mean to SAI the company that consistently and reliably overpromises, underdelivers over schedule who, as we speak, are gutting the last model they'll ever deliver.
>>
>>100378874
Would you get the same quality / kinds of images?
>>
File: PASigma_00411_.png (1.63 MB, 1024x1024)
1.63 MB
1.63 MB PNG
>>100378766
NTA but ty anon!
>>
>>100378815
you're really gonna spend another whole ass day baiting?
>>
File: PASigma_03385_.png (1.42 MB, 1344x768)
1.42 MB
1.42 MB PNG
>>
>>100378895
Yes, especially since what the Smithsonian has are archival giga scans really meant for true archivists, not some guy who wants to look at pretty pictures.
>>
>>100378920
It's master baiting or mastur..
>>
>>100378886
Yeah, you're free to criticize SAI. The point I was making was that shitting on your competition does not make you look good. Same thing Comfy was doing when he was shitting on SD/WebUI which actually made him look bad.
>>
File: c.jpg (577 KB, 1970x1576)
577 KB
577 KB JPG
>>100378488
Not sure. Hopefully at some point?

Sigma is nice tho. I'm not even sure that medium term it's not NICER than SD3. People had to finetune all SD checkpoints so far for aesthetics and NSFW anyhow. Just try that for now?

>>100378604
Sounds like a fairly productive day? Eat some good food and hang out socially or do something even more useful (not necessarily for money) at times, it'll be surely fine.
>>
>>100378874
Also, SOA is pretty descriptive with their images.
>>100378927
I'm a guy that wants to look at pretty pictures and I browse it all the time. Maybe I'm misunderstanding what you're saying / don't understand how SOA is worse than scraping from a random search engine.
>>
>>100378935
Seems like a reaction to the fags who suck SAI's dick to me, at the start they were just posting Pixart images until fags started crying and acting like asses about it. Now it's just gloating. Don't be a bad sport next time, faggot. And don't defend a retarded company.
>>
>>100378966
>And don't defend a retarded company.
I've never once defended SAI no here nor will I ever. These Pixart shills are actaully worse than comfyfags and it's not even close.
>>
File: PASigma_00413_.png (1.46 MB, 832x1216)
1.46 MB
1.46 MB PNG
A visual description of the no-gen contributions
>>
>>100378962
Last time I checked what they have on their website are mainly extremely high quality images for the purposes of preservation and archiving and not for the enjoyment of the viewing public, meaning, in part they intend to capture blemishes and damage on camera. They're not really good for training unless your intention is to have a more clinical, archival aesthetic.
>>
File: PASigma_03395_.png (1.49 MB, 1344x768)
1.49 MB
1.49 MB PNG
>>
File: 00047-877306265.png (2.36 MB, 1824x1248)
2.36 MB
2.36 MB PNG
>>
>>100379016
>unless your intention is to have a more clinical, archival aesthetic.
That is exactly my intention. Now all I need to do is figure out sigma training and I won't have to keep posting SOA here in the hopes that someone else picks it up.
>>
>>100378999
Oh, so you're a concern troll and one of the reasons why they're dabbing so hard, GOOD JOB ANON
Keep your fucking mouth shut instead of throwing a hissy fit because your favorite toy got replaced by something better. They wouldn't be talking shit if you weren't being a massive fag.
>>
>more clinical, archival aesthetic.
I can't think of a model that does this desu. Would be really unique methinks.
>>
>>100379034
>The Pixart shill screeches and cries at others
Keep flooding these threads with your low effort, washed out garbage and samefag compliment every other Pixart gen. It's literally fucking pathetic. On par with ani/comfy's public homosexual erping on here.
>>
File: 00050-356606462.png (2.96 MB, 1824x1248)
2.96 MB
2.96 MB PNG
>>
File: Sigma.jpg (706 KB, 2048x2048)
706 KB
706 KB JPG
>>100379048
Hmm but wouldn't that just be for 2d printed objects?

With 3D there's not that much of an archival aesthetic, right? It's in the end probably the RAW photo which could be color graded any way and whatever smaller jpg came from the camera or RAW processor.
>>
File: PASigma_00425_.png (929 KB, 832x1216)
929 KB
929 KB PNG
>>100379068
L
>>
>>100379032
Just follow their instructions for installing and training (skip the toy dataset part)

The dataset requirements are obtuse but basically in the working directory they expect your json file to be saved as working_dir/InternData/data_info.json

data_info.json is this format:
{
height: image_height,
width: image_width,
ratio: width / height,
path: path_to_image,
prompt: caption,
sharegpt4v: caption,
}


who the fuck knows the difference between prompt and sharegpt4v

The rest is in the training configs they reference

The final gotcha is you need to run tensorboard and point it at working_dir/logs if you want to see your running samples
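
A minimal sketch of writing that file, using only the field names and the path given above. It assumes the file holds a list of such records and that ratio is width / height as described in this thread; the helper names are made up for illustration and are not part of the PixArt repo.

```python
import json
from pathlib import Path


def make_entry(path, width, height, prompt, caption):
    """Build one record with the fields listed in the post (hypothetical helper)."""
    return {
        "height": height,
        "width": width,
        # Assumption from the thread: ratio = width / height. Check the toy dataset.
        "ratio": width / height,
        "path": path,
        "prompt": prompt,
        "sharegpt4v": caption,
    }


def write_data_info(entries, working_dir):
    """Write the records to working_dir/InternData/data_info.json and return the path."""
    out_dir = Path(working_dir) / "InternData"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "data_info.json"
    # Assumption: the file is a JSON list of records.
    out_path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return out_path
```

Image width and height are passed in by hand here to keep the sketch dependency-free; in practice you would read them from the image files.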
>>
>people join the thread just to be angry at each other
what do they mean by this?
>>
>>100379032
They did publish checkpoint and lora training code:
https://github.com/PixArt-alpha/PixArt-sigma

But it has a few gotchas. Hope people integrate it into a UI soon.
>>
File: PASigma_00406_.png (1.45 MB, 1024x1024)
1.45 MB
1.45 MB PNG
>>100379032
>>100379151
Oh hi. prompt is the real SD-style prompt used to gen the image; sharegpt4v is the caption from the captioning model, written boomer-style. real_prompt_ratio in the config controls how often prompt shows up: 0 for off, 1 for always, and anything in between!

https://github.com/PixArt-alpha/PixArt-sigma/blob/master/configs/pixart_sigma_config/PixArt_sigma_xl2_img1024_internalms.py#L41
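
A rough sketch of what that setting amounts to, going only by the description above: per sample, use prompt with probability real_prompt_ratio, otherwise use sharegpt4v. The function name is hypothetical and this is not copied from the repo's data loader.

```python
import random


def pick_caption(record, real_prompt_ratio, rng=random):
    """Return record["prompt"] with probability real_prompt_ratio, else record["sharegpt4v"]."""
    # rng.random() is in [0.0, 1.0), so 0 never picks prompt and 1 always does.
    if rng.random() < real_prompt_ratio:
        return record["prompt"]
    return record["sharegpt4v"]


if __name__ == "__main__":
    sample = {"prompt": "red car, desert", "sharegpt4v": "A red car drives through a desert."}
    print(pick_caption(sample, 0.7))
```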
>>
>>100379129
>3D
SOA has tons of non-2D objects, for example research scans of plants, pottery, costumes, cutlery, glass art, metalworks... Briefly browsing the site gives a good glimpse of the sheer breadth of objects they have images of. You can also sort by date range, type of object... just browsing the types of objects they have is pretty mental.

>>100379151
Yeah when I tried last I kept running into python dep problems. I got close enough to want to keep at it, but it's definitely something you have to work at and not "one click 2 train".

Bet you didn't think a nogen like me would be so interested in conversation and not trolling, no? :P
>>
File: 00053-3556924421.png (2.83 MB, 1728x1344)
2.83 MB
2.83 MB PNG
>>
>>100379200
Ratio is definitely the width / height looking at the toy dataset json.

{"height": 896, "width": 1344, "ratio": 1.5,
>>
 Albums (bound) & books (513)
Archival materials (5,712)
Borders (ornament areas) (1,144)
Button (449)
Ceiling papers (549)
Ceramics (objects) (1,424)
Certified plate proofs (2,029)
Clastotypes (582)
Containers (512)
Correspondence (1,741)
Costume accessories (1,575)
Costume (1,692)
Cotypes (1,265)
Covers & letters (1,653)
Cutlery (669)
Decorative arts (8,361)
Design drawings (827)
Drawings (16,696)
Education and Outreach collections (4,090)
Embroidery (visual works) (2,455)
Exterior views (918)
Figures (representations) (3,220)
Fill papers (1,902)
Flags (541)
Folk art (455)
Friezes (ornamental areas) (792)
Furniture (995)
Glassware (497)
Graphic arts (3,139)
Holotypes (33,950)
Interior views (1,065)
Isolectotypes (2,429)
Isosyntypes (5,766)
Isotypes (52,798)
Jewelry (1,850)
Lace (needlework) (1,876)
Lectotypes (2,257)
Lithographs (779)
Living botanical specimens (3,889)
Metalwork (1,233)
Models (1,429)
Ornaments (2,051)
Paintings (6,199)
Paratypes (2,761)
Patents (1,082)
Photographs (14,035)
Portraits (977)
Postage stamps (1,400)
Prints (9,928)
Samplers (embroidery) (882)
Sculpture (1,411)
Seascapes (540)
Silhouettes (2,232)
Sketchbook folio (651)
Studies (visual works) (448)
Syntypes (12,360)
Tax stamps (495)
Taxonomic type specimens (142,476)
Textiles (5,951)
Tile wall facing (427)
Tile (1,753)
Trimmings (625)
Vessels (containers) (502)
Wall coverings (4,333)
Wall facing (1,033)


is what they list on the site
>>
>>100379200
> prompt is real prompt SD-style to gen the image, sharegpt4v is the caption model prompt in boomer
have you figured out if it's wise to use something like wd14-vit-v3 for the former and something like llava-1.6 for the latter to get better captions? or will this just cause trouble?
>>
>>100379216
If you're on Windows you either need to use WSL or install the Triton windows whl that's floating around. And as always you have to uninstall torch and install the correct cuda torch.
>>
>>100379249
nta
I'm using moondream2 for prompt, and WD for sharegpt4v prompt, with real prompt ratio at 0.7
>>
File: PASigma_00429_.png (1.67 MB, 832x1216)
1.67 MB
1.67 MB PNG
>>100379236
Indeed

>>100379249
I think people are only training on one or the other right now. T5 seems to love boomer, but be careful you don't train in hallucinations from the VLLM. They used one in the Sigma paper that's available too
>>
>>100379249
If you were feeling fancy you'd do a short, taggish prompt and a long prompt both from llava 1.6
>>
File: 00057-1575327029.png (1.14 MB, 896x1152)
1.14 MB
1.14 MB PNG
>>
>>100379277
good to know

honestly maybe moondream2 is already adequate, it just doesn't please my human interpretation as much, with relatively more misidentifications vs llava 1.6 and other better models.

>>100379290
that's where the choices come from

the long boomer prompt from llava 1.6 is one of the best possible, and so is the short, taggish, pretty precise prompt from wd14 [vit] v3

>>100379288
>They used one in the Sigma paper that's available too
IIRC that one relatively sucked by current standards in my own evaluations completely independent from using it with Sigma
>>
>>100379339
My suggestion would be you'd have llava 1.6 do a short, taggish description of the image followed by a long boomer description of the image. Another is you take the tags from wd then feed it to llava asking it to describe using those tags
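
A small sketch of the second idea: take the WD tags and build the text request you would send to a captioning model along with the image. The wording of the request and the function name are only illustrative; how you actually call llava depends on your setup and is not shown.

```python
def build_caption_request(tags):
    """Build an instruction asking for a short taggish line plus a long description."""
    tag_list = ", ".join(tags)
    return (
        "First give a short, tag-style description of the image, "
        "then a long natural-language description. "
        f"Use these tags as hints where they match the image: {tag_list}"
    )


if __name__ == "__main__":
    print(build_caption_request(["red car", "desert", "rainbow"]))
```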
>>
>>100379256
I'm on linux. I don't specifically recall what the last error I encountered was but when I'm at my home machine later today I will try and figure it out (probably)
>>
>>100379359
Why not aim to have the model learn both? The WD tags are quite excellent and so is the llava description.

Seems like the better plan to me?
>>
File: 00062-1870817315.png (2.93 MB, 1344x1728)
2.93 MB
2.93 MB PNG
>>
File: PASigma_00431_.png (1.23 MB, 832x1216)
1.23 MB
1.23 MB PNG
>>100379368
Pro tip: take the gradio demo Dockerfile and make /bin/bash your entrypoint. Dependencies solved. Volume map your output/ directory to include models in pretrained_models and don't miss your dataset either.

Would need to additionally install tmux/screen and forward tensorboard ports in the Dockerfile for maximum comfort

Om nom
>>
>>100379405
WD tags don't work on everything
>>
>>100375652
>generated captions
ah, so that's why it feels like shit to prompt and reminds me of Dall-E
>>
>>100379427
They more or less do tho? Don't take the low confidence ones. And you DO have the llava descriptions anyhow.
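
Dropping the low confidence tags could look something like this. It assumes the tagger hands you a tag-to-score mapping; the 0.35 cutoff is an arbitrary example value, not a recommendation from any tagger's docs.

```python
def filter_tags(tag_scores, threshold=0.35):
    """Keep tags scoring at or above threshold, highest first, joined into a prompt string."""
    kept = [(tag, score) for tag, score in tag_scores.items() if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return ", ".join(tag for tag, _ in kept)


if __name__ == "__main__":
    scores = {"sword": 0.41, "1girl": 0.98, "wings": 0.12}
    print(filter_tags(scores))
```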
>>
>>100379437
I don't see you writing captions. And the alternative is shitty "photo of dog" alt tags.
>>
>>100379437
they probably don't have the money to pay nigerians to write captions of images, so they get the best next thing

or would you like more Clit Eastwood?
>>
File: PASigma_00433_.png (1.23 MB, 832x1216)
1.23 MB
1.23 MB PNG
>>100379437
Didn't get the boomer prompting memo?
>>
>>100379437
>generated captions
Got very much better now vs. last year.

IMO we can worry about human correction finetunes later and just use good models now for still a very good result?

People shit on auto-generated captions too much, they're probably better than purely human-written ones - only worse than human-corrected ones ATM.
>>
auto generated captions are pretty annoying, takes the soul outta imagen!

>>100379426
ty orneryanon!
>>
>>100375528
How do I do this with Ryzen? It never fucking works right.
>>
>>100379527
What's funny is it's a miracle SD 1.5 even worked in the first place given the state of its captions. Boomer captions are *way* better than before because it gives the model more points to learn from and more or less proved that even mid auto captions give the model enough opportunity to learn basic word rendering.
>>
File: mermaid_0003f.jpg (1.26 MB, 2048x2048)
1.26 MB
1.26 MB JPG
>>100379549
Auto generated captions are pretty handy, though you will need to manually correct them afterwards
>>
>>100379559
You literally just replied to a paste of tutorials by asking nothing specific
>>
>>100379570
Do you think I tried these tutorials and commented that? They are shit.
>>
File: PASigma_00438_.png (1.01 MB, 832x1216)
1.01 MB
1.01 MB PNG
>>100379549
Excited for new options!
>>
>>100379559
Ryzen CPU? Shouldn't be different unless you also have an AMD GPU, in which it's over for you.
>>
File: 00069-446788373.png (1.28 MB, 896x1152)
1.28 MB
1.28 MB PNG
Has anyone tried the incantations extension? Seems pretty cool.
>>
>>100379559
I heard there's a tool called Google that can answer questions immediately without requiring other people to help you.
>>
>>100379606
>which it's over for you.
TY
>>100379613
Yeah, I'll let the Jews spy on me.
>>
>>100379486
paying nigerians would be even worse. What we need is more and better scraped captions supplemented with maybe just a little bit of this kind of tagging. And we need the porn added back in, and we need SD to be able to actually diffuse completely from the starting latent even with low cfg. In this respect Sigma is actually better I think

>>100379491
the original coinage of "boomer prompting" was created (by someone else) to describe my prompts. then again, it meant something different back then. What you call boomer prompting I would sooner call autism prompting, and it stinks. Of course it's been what I'm using to prompt Dall-E and Pixart, because I'm not retarded, I always prompt based on the training data. But learning to prompt AI captions is gay as fuck, whereas learning to prompt SD1.x is an art form and a lot of fun

>>100379527
It got "better" in the same way finetunes like Juggernaut were "better" and Dall-E 3 was "better", ie, actually worse but you can autismprompt and get "what you asked for", so if you asked said nigerians (or the retards in this thread) to rate how well it was performing you would get better feedback

>>100379549
this
>>
>>100379626
And you're about to download and run python code? LMAO HOLY SHIT
>>
>>100379590
No I dont because you asked how to do something that is CPU brand agnostic by asking about CPU brands
>>
I'd rather feed all my prompts into an autogenerating thing and have it caption images based on my own style of proompting. Is that possible?
>>
>>100379633
>boomer prompting
>>
File: 00086-2537338876.png (1.41 MB, 1024x1024)
1.41 MB
1.41 MB PNG
Why is SDXL so shit
>realistic picture of a man force-feeding eggs to a sad woman using his hands.
>>
>>100379633
Boomer prompting works on SD 1.5 although back then we just called it Facebook blogging. You get some fun results if you write as if you're writing a caption for a Facebook post like "Woo wee here is my new 2014 Mustang, I love its red color it really makes my wife Edna look great today in her perky little shirt"
>>
>>100379658
Interrogator models
>>
>>100379638
>>100379653
You manage it just fine, how hard can it be?
>>
>>100379668
>realistic picture
take out realistic
also make sure to put all non photography mediums in the negs but yeah its kinda shit sometimes
>>
>>100379658
Yes, llava 1.6 is smart, it's an LLM, that means you can give it examples of how it should write captions and it will write captions in that style.
>>
>>100379682
I'm not scared of da juice and I don't need my hand held. You're fucked.
>>
File: deza_00032_.png (2.71 MB, 2016x1152)
2.71 MB
2.71 MB PNG
>>100379608
>incantations extension
whats this?

>>100379662
thats not what boomerprompting is. thats natural language prompting. boomerprompting is excessively verbose nlp, bordering on absurdity
>>
>>100379714
Why is it that you have the dumbest fucking takes? Boomer prompting is what you'd expect your grandma to ask the kind Stable Diffusion machine to make, they write prompts like they're talking to a person.
>>
I miss NAI
The leaking of that model was the biggest event in Stable Diffusion history. It's the reason SD 1.x models became so popular in the first place.
>>
>>100379668
>force-feeding eggs to a sad woman using his hands.
sigmachads is this possible?
>>
>>100379714
>>incantations extension
>whats this?
nta but it's PAG for auto1111 https://github.com/v0xie/sd-webui-incantations

have you tried the comfy equivalent?
>>
>>100379633
>ie, actually worse
No, it got better in the sense it's more accurate / more detailed.
>>
What is this sigma stuff people have been talking about?
>>
File: 00000-1179452513.png (3.14 MB, 1280x1920)
3.14 MB
3.14 MB PNG
>>
>>100379714
>excessively verbose nlp, bordering on absurdity
Think you got autism and boomerism confused
>>
File: deza_00033_.png (2.68 MB, 2016x1152)
2.68 MB
2.68 MB PNG
>>100379749
>boomerprompting is excessively verbose nlp, bordering on absurdity

>>100379758
ah, cool

>>100379768
pixart sigma is a non-sd base model with impressive performance and prompt adherence
>>
i like setting CFG very low and boomer prompting. this lets the AI cook
>>
>>100379783
Talking to the kind machine is not "absurd" fucking retard. Jesus, you really just say dumb shit and haven't learned when it shut your underage mouth.
>>
>>100379783
Is it supported by automatic1111? Or do I need to use comfy for this?
>>
>>100379749
FUD.
You, are, a, liar.
>>
File: deza_00034_.png (2.8 MB, 2016x1152)
2.8 MB
2.8 MB PNG
>>100379795
>mad just to be mad
I guess you're having (another) bad day. hope things get better for you

>>100379807
>Is it supported by automatic1111?
I'm not sure but I dunno if I've seen anyone using it in a1111.
>>
>>100379795
>>100379788
I hate "talking to the AI".
The best and most adherent pictures I ever got, to this day, were using danbooru tags on models trained on Novel AI.
I just want to add tags to my prompt
>>
File: PASigma_00441_.png (1.13 MB, 832x1216)
1.13 MB
1.13 MB PNG
>>100379807
Vlad in the next release, Comfy now with https://github.com/city96/ComfyUI_ExtraModels
>>
>>100379832
Most people want more control than a tag search engine with a "I'm feeling lucky" button.
>>
>>100379714
https://github.com/v0xie/sd-webui-incantations

https://stable-diffusion-art.com/perturbed-attention-guidance/#Use_PAG_on_AUTOMATIC1111
>>
>>100379841
>Most people want more control than a tag search engine
And with 99% of the current models you get neither using boomer prompts.
>>
>hair wings
>sword in ass
people in glass houses...
>>
File: viera_0015.jpg (786 KB, 1664x2432)
786 KB
786 KB JPG
>>100379608
When I tried some weeks ago it wasn't working for forge, dunno if they fixed it.
>>
>>100379788
>setting CFG very low and boomer prompting.
Giving the model as much room to breathe as possible while still bending it to your will is a sign of great intelligence and synthographic competency. Very few anons possess this power and those who do not will forever seethe.
>>
File: deza_00036_.png (2.98 MB, 2016x1152)
2.98 MB
2.98 MB PNG
>>100379832
NAI is a different beast. iirc, it's purposefully designed for a specific prompt format. the lean into keywords/tags
>>
>>100379851
Which is better
"car, mustang, red, sunny day, desert, rainbow"

or

"A red mustang driving through a desert on a two lane freeway, the noon sun shines down on it, the morning dew causes a rainbow to show on its windshield"
>>
>>100379839
ugh. I hope it's not one of the cases where auto1111 waits fucking months to release the update.
So far I wasnt really bothered with the wait, but this model looks promising, if anything because it seems to be cheaper to train models for it.
Something I saw on reddit:
> As a result, PIXART-α's training speed markedly surpasses existing large-scale T2I models, e.g., PIXART-α only takes 10.8% of Stable Diffusion v1.5's training time (~675 vs. ~6,250 A100 GPU days), saving nearly $300,000 ($26,000 vs. $320,000) and reducing 90% CO2 emissions.
26k to train a model? That can be easily crowdfunded for any kind of purpose. And at the end of the day, nothing surpasses a specialized model trained just for what you want.
That's also why I'm still using NAI-based models for the stuff I publish on my patreon
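
Quick sanity check of the numbers in that quote, using only the figures quoted above (GPU days and dollar costs); nothing here comes from outside the quote.

```python
# Figures as quoted from the PixArt-alpha abstract above.
pixart_gpu_days = 675
sd15_gpu_days = 6250
pixart_cost = 26_000
sd15_cost = 320_000

time_fraction = pixart_gpu_days / sd15_gpu_days  # share of SD 1.5's training time
savings = sd15_cost - pixart_cost                # dollars saved

print(f"training time: {time_fraction:.1%} of SD 1.5")
print(f"savings: ${savings:,}")
```

So the 10.8% and "nearly $300,000" claims are consistent with the GPU-day and cost figures given.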
>>
File: Sigma.jpg (521 KB, 2048x2048)
521 KB
521 KB JPG
>>100379668
They wanted a SFW shit. Force-feeding -when depicted at all- is almost always NSFW image material.

Add to it that the tech used in SDXL is not -as far as we can tell- actually good at multiple subject attention and interactions. Hence some hype for Sigma and SD3 and the like that do it better.

But there are also the addon features that try to correct some of it, like controlnets, ipadapters, >>100379608 Progress is actually pretty damn fast but there's no one-stop perfect solution for everything.
>>
>>100379682
Too hard for you apparently
>>
>>100379893
It's quite possible that with 2 5090s you could train Pixart Alpha in a few days. We're very close to the point where crowd funding is not needed.
>>
File: ComfyUI_PixArt_00054_.jpg (1.52 MB, 2048x2048)
1.52 MB
1.52 MB JPG
>>
File: 00081-3071652059.png (2.44 MB, 1248x1824)
2.44 MB
2.44 MB PNG
>>
File: hug.png (1.32 MB, 832x1216)
1.32 MB
1.32 MB PNG
>>
File: 00022-4041427135.jpg (401 KB, 1640x1304)
401 KB
401 KB JPG
>>
Also friendly reminder, data hoard. In a few years any of us might be able to train a model from scratch but every year they'll make it harder and harder to download the images needed.
>>
>>100379886
Depends. The former is about precise concepts that were learned fairly exactly.

The latter should be used to augment these tags with positional information, effect modifiers and so on and frankly could use a more exact language - but having the capability to prompt natural language too is practically useful too.
>>
File: 00084-2932686731.png (1.63 MB, 896x1152)
1.63 MB
1.63 MB PNG
>>
>>100380003
Tagging is always going to work, but long prompts give the model way more to work with. Even SDXL and SD 1.5 benefit from longer prompts. The more tokens the model has to work with the higher quality the outputs tend to be. That's why I support using LLMs to extend your prompts like they do with DE3.
>>
File: PASigma_00450_.png (1.16 MB, 832x1216)
1.16 MB
1.16 MB PNG
>>100379841
This

>>100379886
Tags missed dew. I pick more detail and placement.

>>100379893
He gets it!
>>
File: 00036-2835309815.jpg (268 KB, 1640x1304)
268 KB
268 KB JPG
>>100379997
>>
File: file.png (1.49 MB, 1024x1024)
1.49 MB
1.49 MB PNG
Sigma training still looking good boys. After doing some optimizations for dual GPU I got CAME working and switched back from AdamW 8 Bit. Sadly stuck on batch size 4.
>>
>>
>>100380076
looking good
it feels like I'm back in the days when trinart released for 1.5
>>
File: deza_00038_.png (2.71 MB, 2016x1152)
2.71 MB
2.71 MB PNG
>>100380076
good luck, sigma trailblazer
>>
File: 00087-3471128822.png (2.53 MB, 1824x1248)
2.53 MB
2.53 MB PNG
>>
>>100380076
hot
>>
>>100380125
Corn
>>
File: file.png (2.35 MB, 1024x1024)
2.35 MB
2.35 MB PNG
Almost figured out Zoras although it's in the realism stage, I'm putting realism back in after raping it with e621.
>>
>>100380105
NaiTrin 4eva
>>
>>100380128
hole
>>
>>100380157
Cream
>>
File: PASigma_00483_.png (1.33 MB, 832x1216)
1.33 MB
1.33 MB PNG
>>100380076
Woo!
>>
File: deza_00040_.png (2.54 MB, 2016x1152)
2.54 MB
2.54 MB PNG
>>100380148
first loras, then bloras, then doras, now zoras? we're gonna run out of letters
>>
File: file.png (690 KB, 822x417)
690 KB
690 KB PNG
>>100380183
Zora, like from Legend of Zelda. I want to see if it can learn something it has little to no knowledge of. Which it seems it can.
>>
>>100380203
are you using kv compression?
>>
>>100380215
No, the normal 1024 settings. Although I'm training with 896 images.
>>
File: PASigma_03559_.png (1.84 MB, 1344x768)
1.84 MB
1.84 MB PNG
>>
>>100380215
>>100380236
kv compression is for the 2048 model to run faster. It's a lossy compression.
>>
File: 00007-783580664.jpg (280 KB, 1512x1360)
280 KB
280 KB JPG
>>
File: PASigma_00502_.png (1.03 MB, 832x1216)
1.03 MB
1.03 MB PNG
>>
File: PASigma_03575_.png (1.5 MB, 1344x768)
1.5 MB
1.5 MB PNG
>>
File: 00004-835802365.jpg (210 KB, 1512x1360)
210 KB
210 KB JPG
>>
File: deza_00041_.png (2.4 MB, 2016x1152)
2.4 MB
2.4 MB PNG
>>
File: 00005-2072747551.jpg (323 KB, 1512x1360)
323 KB
323 KB JPG
>>
File: Sigma.jpg (557 KB, 2048x2048)
557 KB
557 KB JPG
>>100380025
> tagging is always going to work
it in fact currently needs to be trained

> long prompts give the model way more to work with
But so do more precise, more extensive lists of tags? However their meaning is more defined for one-shot accuracy.

Natural language is obviously a tool completely designed for one or multiple corrections. Maybe one day this works but until then the *far* more proven thing is tags, possibly augmented by natural language.
>>
File: 0811313.jpg (159 KB, 1432x1320)
159 KB
159 KB JPG
>>
File: PASigma_03589_.png (1.74 MB, 1344x768)
1.74 MB
1.74 MB PNG
>>
File: PASigma_00517_.png (1.12 MB, 832x1216)
1.12 MB
1.12 MB PNG
>>100380314
Woooooow

>>100380349
You can disable tags during training by setting real_prompt_ratio=0
>>
>>100380396
>>100380396
>>100380396
>>
File: 00094-174340302.png (2.22 MB, 1824x1248)
2.22 MB
2.22 MB PNG
>>
File: 00095-3966627049.png (1.15 MB, 832x1216)
1.15 MB
1.15 MB PNG
>>
>>100378000
>90% low effort
You must suffer to make art.
>>
File: PASigma_03597_.png (1.76 MB, 1344x768)
1.76 MB
1.76 MB PNG
>>
File: 00101-2432550932.png (3.38 MB, 1344x1728)
3.38 MB
3.38 MB PNG
>>
File: 00102-1748438630.png (1.33 MB, 896x1152)
1.33 MB
1.33 MB PNG
>>
File: 00105-3408738295.png (1.62 MB, 896x1152)
1.62 MB
1.62 MB PNG
>>
File: bdXL__00004_.png (2.33 MB, 1680x960)
2.33 MB
2.33 MB PNG
>>
>>100380381
I assume training both wd v3 tags and llava captions as the Sigma devs thought is worth it; it probably feeds some good information.

We may figure out if this is actually true eventually.
>>
>>100380847
It's just an alternate prompt strategy, it's just saying to the model there are multiple ways to describe the same image and helps it become less rigid.




All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.