/g/ - Technology

Thread archived.
You cannot reply anymore.

File: 1702343914698579.jpg (412 KB, 1260x1680)
Previous /ldg/ bread : >>101038464

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
StableSwarmUI: https://github.com/Stability-AI/StableSwarmUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
ComfyUI: https://github.com/comfyanonymous/ComfyUI

>Auto1111 forks
SD.Next: https://github.com/vladmandic/automatic
Anapnoe UX: https://github.com/anapnoe/stable-diffusion-webui-ux

>Pixart Sigma & Hunyuan DIT
Comfy Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Use a VAE if your images look washed out

>Models, LoRAs & training


>Index of guides and other tools

>View and submit GPU performance data

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Share image prompt info

>Related boards
>1.5 is the best for realism
File: Comfy_My_wiries_00001_.png (1.25 MB, 1024x1024)
Does anyone know what "DPM-SOLVER" and "SA-SOLVER" for PixArt are and where to get them?
File: s.jpg (120 KB, 1280x768)
maybe >>101046686 ?
Reality is an illusion.
You're not wrong (and I'm not talking about living a simulation/matrix type bullshit)
Nice! Cheers m8
File: s.jpg (77 KB, 1280x768)
no problem
wait a fucking minute... are you telling me I've been using the wrong ones?
diffusion probabilistic model sampling vs stochastic adams solver. the second is faster but you can only use it if your name is adam
File: s.jpg (123 KB, 1280x768)
Dunno. Most probably use extra nodes.
damn it, I just wanted the scheduler. It won't work with my workflow.
I thought SAI died, and Stable Diffusion with it, but it seems what died were local diffusion models.
busy training
Any prompt help to get just shower glass door reflection? I tried silhouette and didn't work.
where is collage anon?
this is fake bred
File: file.png (814 KB, 1024x1024)
File: elongated dick general.jpg (2.16 MB, 3264x3264)
try (stylized silhouette through shower glass)
File: tmp2h4wb2s9.png (1.39 MB, 1344x768)
>BREADLY EVENT ANCHOR (Abstract Edition)
img2img this img, and reply with the results!
Recommended denoise: ~0.75
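For anyone unsure what a ~0.75 denoise does to the step count, here is a rough sketch. It assumes the diffusers-style behaviour where the strength value decides what share of the schedule actually runs; other UIs may count steps differently. img2img_steps is a made-up helper name, not part of any UI.

```python
def img2img_steps(num_steps: int, denoise: float) -> tuple[int, int]:
    """Return (skipped_steps, executed_steps) for an img2img run.

    Assumption: the first (1 - denoise) share of the schedule is skipped,
    which is how diffusers-style img2img pipelines treat `strength`.
    """
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0 and 1")
    # Truncate toward zero, and never run more steps than were requested.
    executed = min(int(num_steps * denoise), num_steps)
    return num_steps - executed, executed


skipped, executed = img2img_steps(num_steps=20, denoise=0.75)
print(f"skipped={skipped} executed={executed}")
```

Lower denoise keeps more of the source image because fewer steps get to repaint it; at 1.0 the whole schedule runs and the source barely matters.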
File: WeSoBack.jpg (458 KB, 1589x1422)
>Benefiting from the inherent ability of the LLMs and our innovative designs, the prompt understanding performance of LI-DiT easily surpasses state-of-the-art open-source models as well as mainstream closed-source commercial models including Stable Diffusion 3, DALL-E 3, and Midjourney V6. The powerful LI-DiT-10B will be available after further optimization and security checks.
>10b image model + 7b LLM
no one will be able to run that (even with a quantized LLM)
the example images look incredible, hopefully they'll release something usable for vramlets too. can you offload image models to ram like with llms?
you can offload everything to ram, that's the low-vram option on A1111 or ComfyUI, it will just be really slow though

The advantage of the transformer architecture is that it's resilient to quantization, maybe we can quantize DiT models, is there a paper about that already?
>maybe we can quantize DiT models, is there a paper about that already?
there's this
I'm too much of a brainlet to see if it's good or not kek
File: 2406.11831.jpg (713 KB, 1624x1935)
>We assign the text prompt of “a photo of {class}”
But why. Sounds antiquated and takes me back to early days of dreambooth.. I bet their ass not everything they have in a dataset is photography, so why confuse the future tokenizer. I guess it's mitigated by mixing these captions with synthetically generated ones, and whatever's in that CC12M.
>the example images look incredible
Picrel looks on par with the rest desu
I don't understand why they prompt for atmosphere instead of mood or something like that
>they prompt for atmosphere instead of mood or something like that
Not sure what you mean. There's overlap between the two words, but "quiet and peaceful" or "terrifying" do count as moods I'd guess?
>"quiet and peaceful" or "terrifying" do count as moods I'd guess
For sure. Perhaps I'm too used to 1.5 prompting. There's atmosphere and literal atmosphere which 1.5 mixes up all day everyday. With 1.5 it's better to describe mood, vibe or theme
File: 116775249263548508-SD.png (1.7 MB, 984x1216)
how's it hangin anons?
I'm still not sure how you differentiate mood from atmosphere. For me, atmosphere relates more to visual hints, and moods are well.. moods, emotions. Glimmers of sunshine and chromatic aberration might be atmosphere pieces, as opposed to more vague, emotional notions like peaceful, serene, or breathtaking.
Very nice gen, don't mind if I snatch it for the next collage. I've been really impressed with some of the realistic gens lately, and those used to be my least liked back in the day. I'm still biased, but I'm starting to grow fond of the likes you just posted. Mind if I ask what model?

I keep catching myself on biting the bait, in spite of reminding folks not to do the same. Good otherwise, one thing to worry about less since yesterday. How about you?
training my 100th lora today. cool gen

yeah well clip seems like an autistic zoomer on twitter who takes everything literally, nuance is often lost
File: 116775249263548735-SD.png (1.43 MB, 768x1136)
Im the exact opposite! Ive always chased realism all the way from 1.5

This is just PonyRealism2.1

I hope no janny thinks this is too much for /ldg/
File: c.jpg (88 KB, 832x1152)
I'm not fond of describing the atmosphere subjectively.

"foggy dark scene with occult signs and chromatic aberrations" ok
"quiet and peaceful and sensual" wtf now you're just adding random effect shit with no control
Very tasteful, you should be fine. I've been warned for less. Wish some anons understood that this is the beautiful way to go about posting and censoring risque gens.
File: 116775249263548716-SD.png (2.09 MB, 1120x1384)
lets keep it sfw hehe
File: 116775249263548746-SD.png (2.09 MB, 976x1360)
we have Malenia at home

sorry Miyazaki
File: 116775249263548709-SD.png (2.58 MB, 1120x1384)
lol, lmao even
File: s.jpg (103 KB, 832x1152)
you should have to grind that "generate" button at least for a few hours, it's excellent 'design'
File: 116775249263548752-SD.png (2.07 MB, 960x1360)
I agree!
Are there scripts to train loras or embeddings for Pixart?
File: 1232rwtrehtrhg.png (148 KB, 300x300)
I like what you've got going too. I would only recommend picrel for the eyes/face. Maybe adetailer if you want to automate the process.
Is this with any refiner or inpainting or just an out of the box gen?
File: 116775249263548727-SD.png (2.29 MB, 1120x1384)
nope, no refiner, only adetailer.
wouldve loved to fix the fingers, no patience today
onetrainer has support for training lora but I had some issues with some settings that would usually work. it did train one lora.

maybe there were fixes meanwhile, haven't re-checked.

the upstream repo also has lora training code
Can I run the LLM on one 3090 and the image model on my other 3090?
What's the best model for realism from your experience? Is there anything that is good at making things look like shakeycam/found footage?
i'm guessing that it would be easily possible, it's possible with other current models and it's often also how some of these models were trained
Can't say what's the best overall, but when it comes to Pony, I'm very impressed with: https://civitai.com/models/372465/pony-realism
The couple of realistic gens anon posted itt are good examples of its potential quality.
File: pixart_00653_.jpg (2.54 MB, 2048x2048)
File: 0.jpg (368 KB, 1024x1024)
File: pixart_00655_.jpg (2.78 MB, 2048x2048)
File: pixart_00657_.jpg (2.6 MB, 2048x2048)
File: tmpdzcy0ihb.png (1.4 MB, 1344x768)
>(even with a quantized LLM)
What the fuck do you even mean? Just run a cucked Q4 model on normal RAM, it's not like you need 400T/s to generate ONE image.
I can't quite make out the details. What's in it?
Peak /ldg/ humour.
the 10b imagegen model will be too much for a 24gb vram card though
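A back-of-envelope check on the weights alone, as a sketch: it only multiplies parameter count by bytes per parameter, so activations, the text encoder, the VAE and framework overhead all come on top. The "q4" entry assumes an idealized 4 bits per weight; real quantization formats usually spend a bit more. weight_gib is a hypothetical helper name.

```python
# Bytes needed to store one parameter at each precision.
# "q4" is an idealized 4-bit figure; real formats carry extra metadata.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "q4": 0.5}


def weight_gib(params_billion: float, precision: str) -> float:
    """Memory for the weights alone, in GiB (no activations, no overhead)."""
    total_bytes = params_billion * 1e9 * BYTES_PER_PARAM[precision]
    return total_bytes / 2**30


for label, params, precision in [
    ("10B image model", 10, "fp16"),
    ("7B LLM", 7, "fp16"),
    ("7B LLM", 7, "q4"),
]:
    print(f"{label} @ {precision}: {weight_gib(params, precision):.1f} GiB")
```

By this estimate the 10B model's fp16 weights take roughly 18.6 GiB before anything else is loaded, which is why 24gb looks tight, while a 4-bit 7B LLM is small enough to sit in system RAM.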
>cries in 8vram
File: tmpr_y0zmgu.png (2.18 MB, 1680x924)
File: tmphhfvo6lf.png (1.15 MB, 1344x768)
There's plenty good photo models but they all have a studio photoshoot, staged artistic look, probably because most of them are trained on stock photos and professional photoshoots. I want to achieve something that looks like it was shot with a bodycam/dashcam/amateur smartphone or other rather unprofessional type of photo.
I want to do something like real looking cryptid footage or stuff like images of expeditions to another dimension/planet/fantasy world etc but captured in an amateurish way and showing mundane everyday scenes from these places. So far none of the models or loras I've tried really managed to achieve that look despite trying out all kinds of prompts/negative prompts.
File: tmpkgs7b78n.png (2.27 MB, 1680x960)
Hell yeah. This reminds me of CCTV kinda gens, love those.
Maybe you should look into these kinds of loras?
File: 0.jpg (358 KB, 1024x1024)
This one also looks promising:
File: nakedgun3titles16.134.jpg (38 KB, 718x405)
Nice, I can also try to make some Naked Gun intro style pics
You can also try prompts like dutch angle, chromatic aberration and fisheye lens.
File: file.jpg (77 KB, 1920x1080)
I'm hoping the model will capture some of the amateur stuff. I've put a bunch of ghost hunting videos from YouTube into it. Ultimately the model should be responding to grainy, low light, poor quality, for images like this (screencap example). I also put a lot of urban exploration / liminal content into it.
If you have any example videos (ideally no watermarks) similar to this I'd happily add them into the dataset.
right up my uncanny valley
File: retro_14.jpg (200 KB, 1536x1200)
Why is everyone in /sdg/ so on edge and angry all the time
Very home videoesque.
File: tmpku_hjuse.png (1.04 MB, 1024x1024)
how do you guys avoid sally sameface? all images I get, it's the same girl, even with that detailer thingy it's her again and again and again
Impossible with the raped SDXL and SD 1.5 that are just merges of some autist's 10,000 steps of some girl's face.
with 1.5 I had ton of embeddings I had trained of celebs and models, I just used to mash them together in prompt and got a unique face, with XL and Pony its just impossible to avoid her.

looking for any and all suggestions!
Reminds me of the first FEAR game
It's unavoidable because the people who trained SDXL basically made the model memorize certain faces, there's no getting away from the AI Face. You can always try to put a random name in but the AI Face will always be there.
how much would it need?
damn, I was really hoping there was some way to avoid her.
File: 116775249263548799-SD.png (1.5 MB, 976x1200)
Are there any loras that let you manipulate facial features so you can choose from various facial shapes, change nose length/shape etc?
SDXL (3.5b) asks for ~10gb of vram, so I guess that the 10b will ask for 28+gb of vram
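The 28+gb guess is just a linear extrapolation from the SDXL figure. Spelled out as a sketch (assuming VRAM grows in proportion to parameter count, which ignores resolution, precision and offloading; scale_vram is a made-up name):

```python
def scale_vram(known_vram_gb: float, known_params_b: float, target_params_b: float) -> float:
    """Linearly extrapolate a VRAM figure from one model size to another."""
    return known_vram_gb * target_params_b / known_params_b


# Figures quoted in the post: SDXL at ~3.5B parameters asks for ~10 GB.
estimate = scale_vram(known_vram_gb=10, known_params_b=3.5, target_params_b=10)
print(f"10B model estimate: {estimate:.1f} GB")
```

That lands at about 28.6 GB, matching the post, though it is only as good as the assumption that nothing else about the model changes.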
Im looking for those, if someone knows something pls link.

since there are a ton of anime stuff, would they work?
inpaint, except with a different prompt
inpainted faces never really come out well
File: 116775249263548806-SD.png (1.88 MB, 976x1200)
File: tmpsdjwe486.png (1.15 MB, 1024x1024)
File: 116775249263548818-SD.png (1.8 MB, 976x1200)
thanks anon
>anon would rather shitpost than gen
Use the old celeb name in the prompt trick. Maybe a wildcard of random names too.
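A minimal sketch of the random-name wildcard idea, for anyone who wants to script it instead of typing names by hand. The names below are made-up placeholders and with_random_names is a hypothetical helper; whether a given model reacts to names at all is something you'd have to test.

```python
import random

# Hypothetical placeholder names; swap in your own wildcard list.
NAMES = ["Alma Reyes", "Ingrid Solberg", "Mei Tanaka", "Chloe Dubois", "Amara Okafor"]


def with_random_names(prompt: str, names=NAMES, count: int = 2, seed=None) -> str:
    """Append a blend of randomly picked names to a prompt.

    Mixing two names nudges the face away from the model's default
    without locking onto any single identity.
    """
    rng = random.Random(seed)  # a fixed seed reproduces the same pick
    picked = rng.sample(list(names), k=count)
    return f"{prompt}, face of {' and '.join(picked)}"


print(with_random_names("photo of a woman in a cafe", seed=42))
```

Leave seed unset to get a different pair on every gen, or fix it when you find a combination worth keeping.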
File: Screenshot_1.png (39 KB, 1252x172)
I have a question bros.. Why do i keep getting this error despite my GPU having 6GBs ram (yes its 1660 but 1660 isn't that bad, right?) I also reduced the gpu ram usage to the minimum and trying to gen a 512x512 image, and still get this error. could it be related to something else other than the RAM?
6vram is the bare minimum, and depending on what model you're using, they can be more demanding. With 6vram you probably want every little performance boost you can get. When it comes to models, you probably don't want to go beyond SD 1.5 or Pixart Sigma. Something like SDXL or Pony is probably out of the question. Guessing by the looks of your screenshot, you're probably using EasyDiffusion? I'm not sure how well optimized it is. You might want to try something like StableSwarm, Fooocus, Metastable, or https://github.com/lllyasviel/stable-diffusion-webui-forge
Also if I'm not mistaken, something like 1660 probably misses out on architecture implemented with RTX cards, which doesn't help either.
Yeah i'm using EasyDiffusion, was trying to use Pony. I will test with a SD1.5 model, and if the issue persists I will switch to Forge or something else, maybe easydiffusion is just not optimized..
1660 is pretty old desu, so it wouldnt be surprising that the gpu itself is the issue
That would explain it. I'm sitting on 8vram of my rtx 2080 laptop, and something like Forge gave me the boost I needed for Pony/SDXL. Not sure if you can do the same, but who knows. The UI I mentioned should help with performance, because they're based on ComfyUI, which seems to be better optimized. If that's not enough, you should be able to gen on SD 1.5 models. Pixart Sigma is also efficient, but it's not supported by every UI yet, since it's a new thing. Either way, good luck. Machine learning unfortunately is VERY vram hungry.
Run nvidia-smi to see if anything is using up VRAM.
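If you'd rather check from a script, here's a rough sketch that asks nvidia-smi for used and total memory and parses the CSV. It assumes nvidia-smi is on your PATH and reports plain numbers for both fields; parse_gpu_memory and read_gpu_memory are hypothetical helper names.

```python
import subprocess

# Query name plus used/total memory (MiB) as headerless, unitless CSV.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text: str) -> list[dict]:
    """Turn nvidia-smi CSV output into one dict per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        # Split from the right so a comma inside the GPU name can't break parsing.
        name, used, total = [field.strip() for field in line.rsplit(",", 2)]
        rows.append({"name": name, "used_mib": int(used), "total_mib": int(total)})
    return rows


def read_gpu_memory() -> list[dict]:
    """Run nvidia-smi and return the parsed rows (needs the NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)
```

Call read_gpu_memory() before launching the UI: if used_mib is already high, something else (a browser, a game, a leftover Python process) is holding VRAM.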
you make me want to try out pony again. the first time i saw your gens i almost thought it was midjourney fr
File: tmp8zsofjzg.png (2.23 MB, 1928x1104)
Realismanon, please teach me your ways.
Why does it say 12.52 GiB already allocated if you only have 6? I've never seen the "already allocated" number be higher than what my card has.
voodoo chakra magic, anon sacrificed his virginity to download extra vram
I once allocated more RAM using my HDD to play some vidya that needed a much bigger RAM.. maybe it's that
File: tmp2xsyv121.png (985 KB, 769x1024)
that poster already has a name
what is it?
Finetune status?
pendejo puta madre
no hablo espanyol
its not comfytroon
if that is the case i will cut off my own penis and eat it
File: tmp6b_rizmw.png (107 KB, 356x359)
I regret asking, and I never will again.
who the shit is cardosanon? dont bring sdg shit in here faggots
Cardosanon is a founding member of this general and the discord
File: 1717139111207438.gif (425 KB, 284x639)
I've never really had a use for training my own model before so I know nothing about it, but in theory could I feed the thing an entire manga and make it replicate the mangaka's style exactly?
ty for the reminder <3

You're welcome to join anytime
no thanks
Don't be retarded
Rather be retarded than join that
They never learn...
So you are a retard then?
if you say so
You're the one who said it, actually, but yeah, guess I'm saying it too now.
Down the rabbit hole of deep intellectual discourse.
there's a ldg discord btw
its just that dumbass shockedmonkey and his troon army
>in theory could I feed the thing an entire manga and make it replicate the mangaka's style exactly?
"Exactly" depends on how skilled you are but you can get extremely close.
I wouldn't join a club that had the low standards of letting me in.
File: 116775249263548832-SD.png (1.56 MB, 984x1224)
you should try it out its good!

you dont know me
Any tips outside regular pony stuff?
File: highmelanincontentmeme.png (365 KB, 680x680)
File: 116775249263548819-SD.png (3.15 MB, 1464x1456)
Photography terms seem to help,
the detail lora is a must.
Then its just following pony prompting -- that score_* thing.
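A sketch of what "that score_* thing" looks like when assembled in code. The exact tag list is an assumption based on how Pony-style prompts are commonly written, so check the model card for the checkpoint you use; pony_prompt and the example photography terms are made up for illustration.

```python
# Assumed quality tags; the recommended set can differ per checkpoint.
SCORE_TAGS = ["score_9", "score_8_up", "score_7_up"]


def pony_prompt(subject: str, extra_tags=()) -> str:
    """Build a prompt with the score tags first, then subject, then extras."""
    parts = SCORE_TAGS + [subject] + list(extra_tags)
    return ", ".join(parts)


print(pony_prompt("portrait of a woman by a window", ["35mm photo", "natural light"]))
```

The detail LoRA mentioned above would be loaded through your UI's own LoRA mechanism, which this sketch doesn't cover.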
>and the discord
Which lora is that? The old SDXL detail tweaker?
File: 116775249263548820-SD.png (2.58 MB, 1176x1456)
https://civitai.com/models/122359/detail-tweaker-xl?modelVersionId=135867
File: 0.jpg (274 KB, 1024x1024)
these edits just show /ldg/ can't gen memes
Future's looking bright localchads
>make SD3 tensorrt engine
>try running it using workflows from https://github.com/comfyanonymous/ComfyUI_TensorRT/pull/30
>get picrel
wat do? it looks like a vae issue but i saved it from the SD3 Medium checkpoint
the manga you're looking for is probably already contained inside a fine tune or a lora desu
which one are you looking for specifically?
File: tmpk412kxkg.png (1.6 MB, 968x1240)
I can't believe Florence is almost uncensored.
nice picrel, anon
you just need to turn up brightness on your monitor
File: file.png (2.84 MB, 1024x1024)
shit im retarded
Maybe try a different sampler? I remember hearing some are not supposed to work with it.
File: 00052-2622483916.png (1.68 MB, 1024x1024)
try euler
File: tmpeufp8yge.png (1.2 MB, 768x970)
File: file.png (439 KB, 1865x882)
still the same
Maybe switch off sgm_uniform?
File: file.png (438 KB, 1889x887)
File: 0032.jpg (326 KB, 2064x2856)
File: image10.png (648 KB, 1999x1047)
Try connecting TensorRT Loader directly to KSampler, without the ModelSamplingSD3 node?
File: 0.jpg (325 KB, 1024x1024)
File: file.png (457 KB, 1864x868)
still the same, tried with euler,sgm_u and euler,normal
weirdly enough SDXL models work properly with tensorrt giving speed improvements of around 80%
this is why I'm never going to use uncomfyUI
I'm seeing a bug report from two days ago about similar behaviour. Try switching the amount of steps by 1, and maybe try updating the UI in case it was fixed?
except remember to put that ModelSamplingSD3 node back while at it
File: file.png (310 KB, 1337x642)
the ui and the extension are the newest version, will try genning images with 1-30 steps one by one, with the ModelSamplingSD3 node
oops didnt mean to upload pic
more like SpaghettiUI
is this an actual UI? lol this looks so autistic compared to Easy Diffusion's UI
done genning 30 images with step counts of 1-30, each one is broken
Because comfy is actually autistic and doesn't understand that nodes and settings that don't get updated often (if at all) should be able to be hidden. The basic node workflow should basically be a model selector, positive and negative prompt boxes and an output image, most of everything else is programmer garbage that should be tidied.
File: tmpohf5lqsv.png (2.66 MB, 2016x1134)
I'm installing StableSwarm in the background, and cmd is talking to me in three different languages. Is it trying to intimidate me?
File: Sigma.jpg (250 KB, 2048x2048)
It is good tho.

I don't agree with every single node's design but in general this is the right way to do it.

There are other UI that hide everything and serve users with only the most basic needs. Most definitely we're NOT just selecting models, positive/negative and output image, although if that's your thing you still can use the tinytera nodes or one of the alternatives.
show me a different UI that does PixArt sigma, I'll switch.
It's the right way in that these all need to be nodes. It's wrong in that it doesn't compartmentalize nodes and doesn't have global variables. And if you disagree you're another programmer with zero UX experience. I don't need to state that Comfy has a UX problem, because it does.
SD.Next seems to also run it.
thank you for trying to help anons, guess its sd3 just being broken like always
File: pixart_00669_.jpg (2.09 MB, 1664x2432)
File: u.jpg (295 KB, 2048x2048)
I am definitely not using this as a programmer.

OK just do your better UI to show everyone your UX experience kek.
>OK just do your better UI to show everyone your UX experience kek.
Metastable and StableSwarm, to name examples that utilize comfy as backend.
I don't know why autists who are fucking socially inept defend things they are fucking clueless about. ComfyUI universally is considered a shitty UX. It's not even close. I don't need to provide examples of good UXs, I only need to tell you that EVERYONE hates Comfy. And stop sucking the dick of things that suck just because that's all you can get. It's faggoty.
File: pixart_00675_.jpg (2.24 MB, 1664x2432)
make a highly customizable workflow in a1111
go ahead, I'll wait
>UI Wars
Anon, you're describing visual programming. It's not customization, it's node programming. And as a node programmer, ComfyUI sucks compared to every other one. Stop defending shit just because it's a monopoly.
File: pixart_00677_.jpg (2.21 MB, 1664x2432)
You'd need MORE functions to deal with MORE complexity to be better. Only things that don't get in the way of this guideline are actually better.

These other two UI just do less and are therefore simply worse. Or you can also see them as tools for a different audience that has different needs.

It's good. Feel free to make it better as per above guidelines or switch to the other UIs if you have different needs.
File: ComfyUI_SDXL_0007.jpg (2.36 MB, 2048x2048)
>I use node-based ui's
>I don't want to use node-based ui's
Yes anon, I like Node UI's that are above the bare minimum. You know, like reusable node blocks? Or global variables?
Anon is right that the only reason comfy is used is because no one else has a "good" node based UI on the market.
Lacks all the dumbass community nodes and addons hence not worth it even if it's objectively a better user experience
File: pixart_00679_.jpg (1.9 MB, 1664x2432)
Which thread has higher gen quality? /ldg/ or /sdg/?
No, I'm just commenting why Comfy will always be an autistic workflow that will never get traction. Mostly because ComfyAutist is completely resistant to any and all feedback so his product will remain mediocre out of spite and he'll eternally seethe about A1111 always having more market share despite having a literal monopoly on releases like SD 3.
two different markets anon
>casual consumers
>autists, enthusiasts, "professionals"
Wrong, just autists. Professionals will jump ship the second a professional, not amateur, node editor ships. ComfyUI is the Gimp of node editors, shit made for autists. Not enthusiasts. Not professionals. Autists.
>Professionals will jump ship the second a professional, not amateur, node editor ship
I don't disagree here. This is why they started the comfy org. He pulled the dev from stable swarm among a handful of other very talented devs. Good things are in the future for comfy.
>will never get traction
it long did to the point where it shows up in random intel AI presentations and so on

now just go and make it better in a fork or make a better alternative from scratch if you are so convinced
It's doomed because ComfyAutist is completely hostile to critique so he'll never change it because that is an admission of failure, and apparently hostile to making a good user experience. He has the "my way is right even when it's wrong" attitude and that's why he will always be the Gimp and Blender of node editors (at least before Blender finally admitted maybe left click should be standard).
Oh my it showed up in a presentation!?!?!?! That must mean it's great! ComfyAnon, your UI sucks and your org is going to fail because this is how you act every time your shitty UI is brought up. And you've barely changed it despite it being called out for being shit a year ago.
>It's doomed because ComfyAutist is completely hostile to critique
This is precisely why he's bringing more people into the fold because he knows this is his main weakness (as he also previously stated).
That doesn't help when you have a caustic "my way is right even when it's wrong" personality -- he's going to burn all those people out because he's going to fight to death about petty bullshit especially anything that he programmed no matter how much of a turd it might be.
It's a possibility for sure, we'll see what time reveals.
Spaghetti is hard.
I've been a professional long enough that I can spot the red flags. He's not a programmer that plays nice and absolutely hates real feedback that isn't sucking his dick.
You brought up "traction" - it has very broad traction. You jump to "but my feels are more important"? Nah. Not particularly. Maybe if you make some improvement.
File: isthisbait.png (349 KB, 1024x1024)
yes.... yes it is
It has extremely limited traction and is struggling to gain momentum and you're too blind to see this. It has a bad UX and you have to have autism to use it. It will always be third place or lower.
I'm guessing second place after the simple tool for "everyone".

Until someone does a whole blender UI x IDE type work tool for AI + imagegen with far more changes to the python or w/e foundation too.
Nice ritual
Maybe try sucking his dick if you want some changes then retard
File: 00158-2169118408.jpg (254 KB, 1470x1100)
File: 0.jpg (281 KB, 1024x1024)
File: PAPi_0003.jpg (1.27 MB, 2560x1536)
What's wrong with your gens anon?
No I think he's a faggot who makes something as notorious as Gimp and Yandere Sim. I like to mock him and I'll delight in his continued misery of his self inflicted bullshit.
One more batch. One of the gens will be good.
File: tmp1ctmcfgt.png (653 KB, 1024x768)
File: 00245-3863852316.jpg (208 KB, 1470x1100)
> /aicg/ and /ldg/ unite!
Don't we have more in common with /lmg/?
File: 1715915187734972.png (1.96 MB, 1920x1080)
File: 00273-2141420376.jpg (281 KB, 1470x1100)
File: tmpp_05argf.png (810 KB, 1024x768)
Holy shit Florence is so incredibly good and it's fast. And it doesn't even censor penis.
File: 00160-3913230161.png (854 KB, 960x536)
Are there any easy-to-use webuis for Pixart Sigma, Lumina, and Hunyuan? It's pretty clear that stable diffusion is on its way out as one of the top opensource image generators, but I don't see anything other than comfyui for handling the other three mentioned on this page https://civitai.com/articles/5685/the-evolution-of-image-gen-with-sd3
Thanks no gen, I'll check it out.
>it doesn't even censor penis.
Which team do you play on?
huh? I thought it was just a vlm
I'm straight but I always test caption models on porn. Plus I'm making a model so I'm pretty equal opportunity when it comes to the dataset outside of the weird homo shit.
Boy, you're gayer than two priests fucking an alter boy
Yes, but that's good. It's fucking lightning, almost as fast as wdv3 and it's writing novels for captions. It's going to be the de facto captioning model.
No, I'm gay as a daisy in may.
Yeah I was retarded I thought you meant it was genning images. From my understanding (I haven't used it) it is pretty damn good and punches well above its weight given its parameter size.
Its only problem is you really only get three captions: short, shorter and novel and there's no direction beyond that which sucks when you have a concept and you need to nudge it because it's a dumb AI. Especially for my Youtube video snapshots, Llava does really well being told it's a video and the title of the video.
File: ComfyUI_temp_cuzgn_00041_.png (2.09 MB, 1752x1000)
File: 00002-616486153.jpg (584 KB, 2570x1924)
SD.Next should handle the matter. Stable Swarm uses comfy backend, so it might be possible there. Metastable SoonTM.
File: file.png (121 KB, 256x256)
Dare I say, is that a face?
>1boy wearing jeans and a t-shirt
What am I looking at? Other than a 1boy wearing jeans and a t-shirt.
A 1.3B Pixart Sigma model being trained from scratch
Why 256x256 then. Shouldn't you benchmark it at 512 at least?
Because it's faster to train at 256 then 512 then 1024 then 2K than to start to train at 512. Almost all models are trained this way because if you give it too much information it takes way longer to figure things out.
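To put rough numbers on why the low-res stages are so much cheaper, here's a sketch counting image tokens per resolution. It assumes an 8x VAE downscale and a patch size of 2, which is typical for PixArt-style DiTs but still an assumption here, and it only covers the compute side (self-attention cost growing with the square of the token count), not how quickly the model learns. Both function names are made up.

```python
def dit_tokens(resolution: int, vae_downscale: int = 8, patch_size: int = 2) -> int:
    """Number of image tokens a DiT sees for a square image."""
    latent_side = resolution // vae_downscale    # pixels -> latent grid
    tokens_per_side = latent_side // patch_size  # latent grid -> patches
    return tokens_per_side ** 2


def relative_attention_cost(resolution: int, base: int = 256) -> float:
    """Self-attention cost relative to the base resolution (quadratic in tokens)."""
    return (dit_tokens(resolution) / dit_tokens(base)) ** 2


for res in (256, 512, 1024, 2048):
    print(res, dit_tokens(res), relative_attention_cost(res))
```

Under these assumptions 256px is 256 tokens and 512px is 1024 tokens, so each doubling of resolution means 4x the tokens and about 16x the attention work per image.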
File: Untitled.jpg (116 KB, 2048x2048)
That's the cover image when I put it on Civit.
File: 00340-1146255614.jpg (429 KB, 1260x1680)
File: tmpl8dc15ks.png (711 KB, 1024x768)
Oh, that's very interesting. I heard about them being trained in progressive resolutions, but didn't think much of it in terms of later finetuning by us. Back when I tried training loras for 1.5, I remember seeing better results training at 256 indeed. I guess there's method to this madness after all.
Could get away with being a music album cover.
File: ComfyUI_SDXL_0101.jpg (2.14 MB, 2048x2048)
I mean this isn't finetuning. This is a base model that was established from ground zero using my dataset.
File: file.png (147 KB, 256x256)
This is the same prompt from yesterday.
Slow and steady she goes.
I'm mildly confused. So.. you're not finetuning a 1.3B Pixart, but instead using the same dataset, method, and trying to replicate the results?
So the original Sigma model has 0.6B parameters, and there are obviously limitations to that size. Using their research and methodology I've started a new model with expanded parameters (1.35B) which should be better. I've put together my own dataset of millions of images and I'm currently training it on two 4090s. The reason why I did 1.3B is because it hits a batch size that's acceptable and should be doable even at 1024px. Besides having more parameters, my dataset is uncensored and has a mix of wikiart, datacomp (a new 1B llava 1.5 tagged dataset like LAION), e621, gelbooru, flickr, duckduckgo search images, and stills from youtube. Really whatever I feel like, there's a ton of video game content through the ages, art, photography, etc. I'm trying to make it more than just an "uncensored" porn model.
Coolness !!! Let us know how it goes!
Oh, so you're that absolute legend I've been seeing around lately. Think I got confused because I didn't realise the parameter difference. Thank you for clearing that up. Once again, godspeed to your GPUs. Looking forward to the progress, however smol and incremental.
release the model under faipl
Pixart is copyleft
the license is too, check it out no joke
>Pixart is copyleft
as all things should be
Have you tried dynamic? You are using the right dimensions for the compiled engine but maybe there is a bug there.
File: 0.jpg (442 KB, 1024x1024)
File: 00659-1146255616.jpg (226 KB, 1100x1470)
File: pixart_00692_.png (3.4 MB, 2048x2048)
>and security checks.
it's over
>>and security checks.
>it's over
Those Chicom imperial state mothafuggers take security very seriously.
File: pixart_00693_.png (3.49 MB, 2048x2048)
yes. /sdg/ is exactly like /aicg/ as well.
sdg won
Anon sleep
File: stolen6.png (1.23 MB, 768x1344)
File: 0.jpg (556 KB, 1024x1024)
Isn't it time to merge back with the old thread?
If they accept alternate, open models and SD corporate looks like they can avoid bankruptcy.
File: 1716588068224503.png (1.77 MB, 1024x1024)
not as much trolling as previous days hence slower thread
what's your definition of winning?
Wasn't splitting the thread a NovelAI psyop to make local weaker? You know, the whole “divide and conquer” thing.
popularity, relevance, skill
File: thinker.png (250 KB, 732x768)
File: 0.jpg (580 KB, 1024x1024)
I like Sigma and I like SD models. They have their strengths and weaknesses. I don't like something that requires some remote resources that can change prices and terms on a whim. Having models that you possess and can run on consumer hardware is key. Don't get painted into a corner. Don't be dependent on a single supplier.
Someone tried to have a joint thread but then someone else double baked an SDG so nothing really changed
File: stolen3.png (1 MB, 768x1344)
you left so you can't come back
File: ComfyUI_00085_.png (1.42 MB, 1024x1280)
anons are just in it for the drama but who can blame them. we all fall for that trap
File: ComfyUI_00087_.png (1.56 MB, 1024x1280)
File: 5654455151.jpg (344 KB, 1735x1455)
Results are in for my first ever realistic skin LoRA for Hunyuan. Needless to say, the results are stunning. This is with just 100 quickly picked images chosen under a few simple requirements: 1girl, no bokeh, mostly portrait, Asian. Unfortunately I was only able to find IG fashion thots, but as expected it has learned to make better images (that is, it learned how to make realistic skin). The settings were the same as the ones provided in their LoRA training script. As I said before, I think uncucking this model will be a piece of cake. The LoRA is not perfect, but it's a good sign of what is possible.
that one's real nice
File: stolen2.png (1.46 MB, 768x1344)
thank you for the amazing gens and tips, really great stuff.
File: 00023-171242267.png (2.81 MB, 1280x1920)
Tbh it's time to merge these threads into one again

Now that SAI is on its deathbed we can merge both threads to one called /ldg/

Comfy leaving was the last nail in the coffin
you guys lost and just need to come home already but you're too prideful
i think anon is too attached to sdg, they won't change no matter what
true but you have to hand it to them, they held out for a long time
>this is bait
File: Taylor_0002.jpg (453 KB, 768x1152)
Keep in mind I did not use only high quality pics for this LoRA. Some of them are low quality, so it is copying the grain it saw in some of my pics. For what it is, and how quickly I trained it, this is not a bad result, although a full finetune is still needed to fix details of faces in backgrounds.
File: 00009.png (1.38 MB, 1280x768)
I'm going to have nightmares now.
File: Taylor_0010.jpg (559 KB, 768x1152)
File: 00011.png (1.14 MB, 1280x768)
nightmare nightmare nightmare
File: Taylor_0020.jpg (532 KB, 768x1152)
Fresh (for when anon wakes up)

File: stolen33.png (1.07 MB, 768x1344)
They're mad that it's more difficult to run psyops in two threads as opposed to one
