/g/ - Technology






File: 5-2-87422346.jpg (258 KB, 1824x1152)
258 KB
258 KB JPG
Previous /sdg/ thread : >>100153234

>Beginner UI local install
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io

>Local install
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI (Node-based): https://rentry.org/comfyui
AMD GPU: https://rentry.org/sdg-link#amd-gpu
Intel GPU: https://rentry.org/sdg-link#intel-gpu

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Auto1111 forks
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
Anapnoe UX: https://github.com/anapnoe/stable-diffusion-webui-ux
Vladmandic: https://github.com/vladmandic/automatic

>Run cloud hosted instance
https://rentry.org/sdg-link#run-cloud-hosted-instance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
Inpainting: https://huggingface.co/spaces/fffiloni/stable-diffusion-inpainting
pixart: https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma

>Models, LoRAs & embeddings
https://civitai.com
https://huggingface.co
https://rentry.org/embeddings

>Animation
https://rentry.org/AnimAnon
https://rentry.org/AnimAnon-AnimDiff
https://rentry.org/AnimAnon-Deforum

>SDXL info & download
https://rentry.org/sdg-link#sdxl

>Index of guides and other tools
https://codeberg.org/tekakutli/neuralnomicon
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>View and submit GPU performance data
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html

>Share image prompt info
4chan removes prompt info from images, so share them with the following guide/site...
https://rentry.org/hdgcb
https://catbox.moe

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg

Official: discord.gg/stablediffusion
>>
File: SDG_News_00263_.png (1.79 MB, 1560x896)
1.79 MB
1.79 MB PNG
>mfw Resource news

04/23/2024

>HiDiffusion: Unlocking Higher-Resolution Creativity and Efficiency in Pretrained Diffusion Models
https://github.com/megvii-research/HiDiffusion

>Invoke v4.2.0a2 Adds Regional Control
https://github.com/invoke-ai/InvokeAI/releases/tag/v4.2.0a2

>AnyPattern: Towards In-context Image Copy Detection
https://anypattern.github.io/

>SVGEditBench: A Benchmark Dataset for Quantitative Assessment of LLM's SVG Editing Capabilities
https://github.com/mti-lab/SVGEditBench

>DMesh: A Differentiable Representation for General Meshes
https://sonsang.github.io/dmesh-project/

>PDM-Pure: Effective Purification in One Simple Python Script
https://github.com/xavihart/PDM-Pure

>IDM-VTON : Improving Diffusion Models for Authentic Virtual Try-on in the Wild
https://idm-vton.github.io/

04/22/2024

>MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
https://github.com/bytedance/MoMA/tree/main

>LLaMa3 Stable-diffusion prompt maker
https://ollama.com/impactframes/llama3_ifai_sd_prompt_mkr_q4km

>PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation
https://physdreamer.github.io/

>Training-Free Painterly Image Harmonization Using Diffusion Model
https://github.com/BlueDyee/TF-GPH

>TV100: A TV Series Dataset that Pre-Trained CLIP Has Not Seen
https://tv-100.github.io/

>Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
https://hyper-sd.github.io/

04/21/2024

>FlashFace Inference Code Released
https://github.com/ali-vilab/FlashFace

>ComfyUI MagickWand: Proper implementation of ImageMagick
https://github.com/Fannovel16/ComfyUI-MagickWand

>Moving Object Segmentation: All You Need Is SAM (and Flow)
https://www.robots.ox.ac.uk/~vgg/research/flowsam/

>Image Effect Scheduler Node Set for ComfyUI
https://github.com/hannahunter88/anodes/

>ComfyUI-Tripo: Generate 3D models using the Tripo API
https://github.com/VAST-AI-Research/ComfyUI-Tripo
>>
>mfw Research news

04/23/2024

>GeoDiffuser: Geometry-Based Image Editing with Diffusion Models
https://ivl.cs.brown.edu/research/geodiffuser.html

>TAVGBench: Benchmarking Text to Audible-Video Generation
https://arxiv.org/abs/2404.14381

>Graphic Design with Large Multimodal Model
https://arxiv.org/abs/2404.14368

>MultiBooth: Towards Generating All Your Concepts in an Image from Text
https://multibooth.github.io/

>Infusion: Preventing Customized Text-to-Image Diffusion from Overfitting
https://arxiv.org/abs/2404.14007

>Zero-Shot Character Identification and Speaker Prediction in Comics via Iterative Multimodal Fusion
https://arxiv.org/abs/2404.13993

>RHanDS: Refining Malformed Hands for Generated Images with Decoupled Structure and Style Guidance
https://arxiv.org/abs/2404.13984

>Gorgeous: Create Your Desired Character Facial Makeup from Any Ideas
https://arxiv.org/abs/2404.13944

>MaterialSeg3D: Segmenting Dense Materials from 2D Priors for 3D Assets
https://arxiv.org/abs/2404.13923

>Accelerating Image Generation with Sub-path Linear Approximation Model
https://arxiv.org/abs/2404.13903

>Regional Style and Color Transfer
https://arxiv.org/abs/2404.13880

>Enforcing Conditional Independence for Fair Representation Learning and Causal Image Generation
https://arxiv.org/abs/2404.13798

>Object-Attribute Binding in Text-to-Image Generation: Evaluation and Control
https://arxiv.org/abs/2404.13766

>ArtNeRF: A Stylized Neural Field for 3D-Aware Cartoonized Face Synthesis
https://arxiv.org/abs/2404.13711

>Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models
https://cs-people.bu.edu/vpetsiuk/arc/#

>PoseAnimate: Zero-shot high fidelity pose controllable character animation
https://arxiv.org/abs/2404.13680

>Rethink Arbitrary Style Transfer with Transformer and Contrastive Learning
https://arxiv.org/abs/2404.13584

>LTOS: Layout-controllable Text-Object Synthesis via Adaptive Cross-attention Fusions
https://arxiv.org/abs/2404.13579
>>
>mfw MORE Research news

>Exploring AIGC Video Quality: A Focus on Visual Harmony, Video-Text Consistency and Domain Distribution Gap
https://arxiv.org/abs/2404.13573

>LASER: Tuning-Free LLM-Driven Attention Control for Efficient Text-conditioned Image-to-Animation
https://arxiv.org/abs/2404.13558

>Motion-aware Latent Diffusion Models for Video Frame Interpolation
https://arxiv.org/abs/2404.13534

>FilterPrompt: Guiding Image Transfer in Diffusion Models
https://arxiv.org/abs/2404.13263

>PCQA: A Strong Baseline for AIGC Quality Assessment Based on Prompt Condition
https://arxiv.org/abs/2404.13299

>Generating Daylight-driven Architectural Design via Diffusion Models
https://arxiv.org/abs/2404.13353

>AdvLoRA: Adversarial Low-Rank Adaptation of Vision-Language Models
https://arxiv.org/abs/2404.13425

>Mixture of LoRA Experts
https://arxiv.org/abs/2404.13628

>GScream: Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal
https://w-ted.github.io/publications/gscream/

>Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
https://arxiv.org/abs/2404.13686

>Iteratively Prompting Multimodal LLMs to Reproduce Natural and AI-Generated Images
https://arxiv.org/abs/2404.13784

>PGAHum: Prior-Guided Geometry and Appearance Learning for High-Fidelity Animatable Human Reconstruction
https://arxiv.org/abs/2404.13862

>Mechanistic Interpretability for AI Safety -- A Review
https://arxiv.org/abs/2404.14082

>Towards Better Adversarial Purification via Adversarial Denoising Diffusion Training
https://arxiv.org/abs/2404.14309

>GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting
https://arxiv.org/abs/2404.14037

>Plug-and-Play Algorithm Convergence Analysis From The Standpoint of Stochastic Differential Equation
https://arxiv.org/abs/2404.13866

>Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback
https://arxiv.org/abs/2404.14233
>>
File: PW_66839_.jpg (297 KB, 2048x1152)
297 KB
297 KB JPG
>>
Dude come on, three posts?
>>
File: pa.png (1.2 MB, 1024x1024)
1.2 MB
1.2 MB PNG
>>100155844
>HiDiffusion
There is no auto1111/ComfyUI implementation of this yet, right?
>>
File: fish.jpg (147 KB, 1024x1024)
147 KB
147 KB JPG
>>100155948
seems like a lot has been published
>>
>>100156059
A lot that he doesn't curate or cull.
>>
File: death_count.png (92 KB, 1484x200)
92 KB
92 KB PNG
>>100156048
the more you know anon
>>
File: GraphicNovel.jpg (60 KB, 565x401)
60 KB
60 KB JPG
I'm interested in using this application to create a graphic novel about a battle. See picrel; something along the lines of the art style of this image. Do you think that with some reading up I can get the application to create similar-looking images for a storyboard? Any advice on where I should start with my reading so I can get working towards that goal?
>>
File: 1girl.jpg (222 KB, 1792x2304)
222 KB
222 KB JPG
>>100156073
It is all related to SD and has a short summary/title, seems curated enough to me.
>>
File: Vintage Pop Art.jpg (74 KB, 923x438)
74 KB
74 KB JPG
>>100156082
Also, forgot to ask: what happens if my laptop is a 10-year-old Dell with an Intel HD Graphics 4600, an SSD, and 16GB of memory? Will the laptop catch fire, or will Automatic1111 just refuse to work? Can someone send me a Newegg link to a machine that would be the minimum required to work properly with this application? Sorry for the spoon-feeding request; I'm trying to learn something that could be useful for making some money. Picrel is called vintage pop art. It looks like the old pulp fiction novels. I like it too.
>>
>>100156082
With LoRA training it could certainly do the approximate environment, looks, and a single soldier. That may or may not be enough for a draft/reference, depending on what you do.

SD derivatives up to SDXL (SD3 is not publicly released yet) still aren't that strong at reproducing interactions between two people.
>>
>>100156123
>It is all related to SD
Not arguing that. If he's going to have more than TWO FULL LENGTH posts at the start of each thread the rest should be in a pastebin.
>>
File: PW_66859_.jpg (454 KB, 2048x1152)
454 KB
454 KB JPG
>>100156082
The first thing you do is pick a UI! There are a few in the OP, but I would recommend ComfyUI :]
Take a look at all of em though to make your choice!
To get something like that I think you just need to find a model and LoRA(s) that fit a similar style, as well as prompting for what it is you need
civitai.com is where you'll find models and stuff! :]
>>100156134
Err, it'll probably just refuse to work, if anything, or freeze, haha, I'm not sure
Anything with a GPU that has at least 8GB of VRAM will work; not fast, but it'll work hahaha
>>
>>100156123
Love the 3d model look on this one
>>
File: Sigma.jpg (140 KB, 1024x1024)
140 KB
140 KB JPG
>>100156082
BTW, PixArt-Sigma can handle interactions between soldiers a little better (not perfect either, tho). But you may not be able to train PixArt-Sigma well yet; the tooling there isn't as advanced as for SD.

>>100156134
It will be very slow since all of it runs on a slow CPU.

A gaming-type desktop with a 3060 (12GB VRAM) or similar is more of a workable setup; the absolute minimum requirements aren't actually a good idea or really usable. It's not a single workflow but hundreds to thousands of tools, and if half of them barely work, or take minutes each, you aren't getting much done.

Quite a few people here would likely be running DGX workstations if they were affordable and didn't run as hot. So buying a GPU faster than a 3060, like a 4090, plus a very powerful computer is probably not wrong for productive use either.
>>
File: 00041-2876609155.png (486 KB, 512x768)
486 KB
486 KB PNG
>>
File: 00061-1463310103.png (498 KB, 512x768)
498 KB
498 KB PNG
>>
>>100156134
Just look up a tutorial for RunPod, bro. You can get started in the next hour: pay by the hour, get super fast GPUs, and shut the machine down when you're not using it. An A5000 at $0.36 per hour will work. It's super easy to get started with the InvokeAI RunPod template. Even the base models are really good nowadays; I got great results recently on just the SDXL model downloaded by default on the Invoke template.
>>
>>100155824
cool picture
>>
File: soldat.jpg (82 KB, 888x583)
82 KB
82 KB JPG
>>100156204
>>100156171
>>100156145
>>100156161
Thank you all for the info and good vibes.

>>100156262
Thanks! I'll look into that immediately.

With this technology, can I upload a picture like picrel and try to train the SD to work with a particular image?
>>
File: PW_66879_.jpg (309 KB, 2048x1152)
309 KB
309 KB JPG
>>100156312
Any time, anon! :]
I hope you accomplish your goal!
>>
File: bimbo.jpg (75 KB, 896x1152)
75 KB
75 KB JPG
>>100156161
I sometimes fold or skip them instead of reading them when I'm not interested at the moment.

>>100156188
I have recommended the author's checkpoints a few times by now because they're really good:
https://civitai.com/user/WangKa/models?sort=Highest%20Rated
https://civitai.com/models/400329/pvc-style-modelmovable-figure-model-pony
>>
>>100156312
Yes, find some good tutorial channels. You can probably use image-to-image, or Google "CLIP Interrogator" on Hugging Face, upload that image to it, and reverse-engineer what prompts might give you an image that looks like that. I think there are some other tools like IP-Adapter, but I haven't used those. Just experimenting with proompting and Invoke will give you a lot to play with. The official InvokeAI YouTube channel has some good tutorials on how to use it that will give you a lot of options. You can also try some models on Hugging Face or Civitai if that's not enough.
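If you'd rather do the interrogation locally instead of on a Space, the clip-interrogator package works roughly like this (untested sketch; the image path is a placeholder, and ViT-L/14 is the encoder that matches SD 1.5 best):

# pip install clip-interrogator
from PIL import Image
from clip_interrogator import Config, Interrogator

# ViT-L/14 is the CLIP family SD 1.5 was trained with, so its prompt guesses transfer best there
ci = Interrogator(Config(clip_model_name="ViT-L-14/openai"))
image = Image.open("reference.jpg").convert("RGB")  # placeholder path
print(ci.interrogate(image))  # prints a prompt guess you can paste into txt2img/img2img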
>>
File: bimbo2.jpg (72 KB, 896x1152)
72 KB
72 KB JPG
>>100156312
> With this technology, can I upload a picture like picrel and try to train the SD to work with a particular image?
With a single image it's fairly difficult to predict whether it will pick up whatever you wanted it to understand, but in some instances it works.

More typically you want sets of images; then it can learn characters, photographic/art styles, location details, material textures, and many other "commonalities" between images as a LoRA, also based on the text tags you supply - potentially via a separate "vision AI" that adds them, with you only editing.

By no means do people always succeed on the first training tho; you could need many runs to get a sufficiently good result.
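On the tagging step: if you want that "the vision AI adds the captions, you only edit" flow from a script, a rough sketch with BLIP via transformers looks like this (the folder name is a placeholder; most trainers just expect a .txt next to each image):

from pathlib import Path
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

for img_path in Path("training_images").glob("*.jpg"):  # placeholder folder
    image = Image.open(img_path).convert("RGB")
    inputs = processor(image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=40)
    caption = processor.decode(out[0], skip_special_tokens=True)
    img_path.with_suffix(".txt").write_text(caption)  # then hand-edit the captions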
>>
File: princess494.jpg (340 KB, 1216x1488)
340 KB
340 KB JPG
I had a dream that I was on my former school campus and Donald Trump stopped me and I took a piece of chicken from him and ate it and it was like onions chicken and it tasted awful. Then I was in my grade school classroom and my teacher was berating me for not paying attention and forced me to wash her clothes and her daughter's violin, which I had to smuggle into a cabin.

Don't do drugs.
>>
File: me262.jpg (455 KB, 1920x1080)
455 KB
455 KB JPG
>>100156338
I appreciate the information. I'm also reading up on the OP and the two top mega posts. Lots to read lol.

On that note, I was thinking: can I take public images of, say, a specific aircraft, and then have SD show me variations of the exact same image? For example with picrel, ask/train SD to give variations of that exact image in different art styles, and then maybe even add some background to it, like a city?

>>100156370
I think you may have kind of answered what I just asked, but can you elaborate based on my comment here?
>>
>>100156222
>>100156244
very cute
>>
>>100156392
> can I take public images of, say, a specific aircraft, and then have SD show me variations of the exact same image
Yes. It is, however, not ABSOLUTELY trivial.

There is a typical step in between where you tag the images and train a LoRA against a checkpoint so that it hopefully learns to "understand" this aircraft type. Then you can use the LoRA to ask for the image via the trained tags and pull the city, a forest, or whatever else from what was already trained into the checkpoint.

> maybe add some background to it like a city
We also have other techniques: background removal plus inserting another background, IP-Adapters to pick up the structure or color composition of a reference image, ControlNets to condition on a depth map of the image, AI to get a rough 3D model of the airplane in the pictures... it's a lot of stuff.
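As one concrete example, conditioning SDXL on a depth map with a ControlNet in diffusers goes roughly like this (sketch; the ControlNet repo is the public SDXL depth one, and the depth image and prompt are placeholders):

import torch
from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-depth-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

depth_map = load_image("aircraft_depth.png")  # placeholder: a precomputed depth image
image = pipe(
    "fighter jet over a city, graphic novel style",
    image=depth_map,                       # conditioning image for the ControlNet
    controlnet_conditioning_scale=0.7,
    num_inference_steps=30,
).images[0]
image.save("aircraft_city.png")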
>>
>>100156452
BTW, exactly which training images and tags will work is not perfectly well defined.

But then again, you would also have trouble saying what, how many, or how long artists would have to look at your photo and drawing collection of an airplane until they could draw it from memory and perhaps even distinguish model variants. It's probably a few dozen good images? But it could be less or more. Kind of depends.

We can't really tell in advance with this technology in its current state either. You just kinda try. It tends to work if you have enough examples.
>>
File: PW_66883_.jpg (290 KB, 2048x1152)
290 KB
290 KB JPG
>>
File: gunship.jpg (223 KB, 1536x1024)
223 KB
223 KB JPG
>>
>>100156337
Why the hell a whole checkpoint instead of a LoRA?
>>
File: ComfyUI_04303_.png (1.72 MB, 1216x768)
1.72 MB
1.72 MB PNG
>>
File: jacko.jpg (78 KB, 896x1152)
78 KB
78 KB JPG
>>100156543
Because it works better? They're not exactly the only person who trains checkpoints.

>>100156498
Where did she end up? On top of a forcefield, below the tentacles?
>>
>>100156529
gimme a whole fleet of these
>>
>>100156560
>Because it works better? They're not exactly the only person who trains checkpoints.
A DoRA would work well enough; the difference would be pretty much non-existent. This is stupid.
>>
File: PW_66910_.jpg (363 KB, 2048x1152)
363 KB
363 KB JPG
>>100156560
LOL something like that! I think she would be safer inside the forcefield tho haha!
>>
File: jacko2.jpg (64 KB, 896x1152)
64 KB
64 KB JPG
>>100156570
Whether DoRA or PixArt-Sigma or whatever other new thing works is likely unclear until you invest a lot more attempts into training an equal or better DoRA.

I'm guessing it's a matter of "the more predictable checkpoint training works well enough".
>>
>>100155844
is HiDiffusion compatible with A1111 yet?
>>
File: PW_66908_.jpg (230 KB, 2048x1152)
230 KB
230 KB JPG
>>
Are there any new interesting 1.5 models at this point, or is it all SDXL? The last thing I've heard about was EasyFluff, but it can't hold a candle to AutismMix and Animagine.
>>
File: 00059-1463310101.png (527 KB, 512x768)
527 KB
527 KB PNG
>>100156439
thanks!
>>
Is there any big downside to using Invoke vs Automatic1111? My dumb ass just can't get the latter to run, and it all seems to be about the models anyway, and those seem to be usable with any method, no?

Also, am I doing something very wrong, or does the Draw Things app on iOS have some kind of lock against coomer art?
>>
File: jacko3.jpg (72 KB, 896x1152)
72 KB
72 KB JPG
>>100156827
I think for more generally applicable models it's basically all SDXL now. People still do specialized stuff.

>>100157031
>it all seems to be about the models anyway
Depends. Those hundreds of extensions (in ComfyUI's case, custom nodes) unsurprisingly have their uses. But maybe you are OK with just prompting models?

Also, there are multiple Automatic1111 forks; maybe Vladmandic or Anapnoe works for you. Or even go via https://github.com/LykosAI/StabilityMatrix

Invoke was never as complete as ComfyUI or the Auto1111 forks, but I think it's pretty workable by now.
>>
File: jacko4.png (722 KB, 896x1152)
722 KB
722 KB PNG
>>100157064
>People still do specialized stuff
* on SD1.5 - but it's not as much. Pony and other efforts definitely made SDXL more attractive.
>>
>>100157064
>But maybe you are OK with just prompting models?
I mean, at the start even that bit seems pretty overwhelming with the silly amount of possibilities, and maybe the simpler UI isn't even that bad to begin with. If I were to move on to a fork, could I just copy over the models I've already downloaded?
>>
>>100157122
> I mean, at the start even that bit seems pretty overwhelming with the silly amount of possibilities
Sure, but there is also the advantage of being able to reproduce something you see online, or a configuration, when you come across it. That's still almost all ComfyUI and Automatic (/forks), not so much the other UIs.

> could I just copy over the models I've already downloaded?
Yes, or in many instances you could also change the configuration to point at the other tool's model directories.

That'd be easier with https://github.com/LykosAI/StabilityMatrix tho; it's one of the main things it does - pointing everything to shared directories.
>>
>>100157154
Alright, will check it out, thanks anon!
>>
sd3 when
>>
File: 00091-TFT_12401708.png (2.77 MB, 1536x2560)
2.77 MB
2.77 MB PNG
I'm posting this rabbit a lot, but mostly I've been focusing on getting different styles with Pony as a base.
>>
File: images-101.jpg (28 KB, 620x495)
28 KB
28 KB JPG
I don't know if this is the best place to ask, but I don't know a better one.
What are some free apps for face swapping? This option is no longer available in FaceApp.
>>
>>
>>
File: 00099-TFT_12401708.png (2.38 MB, 1536x2560)
2.38 MB
2.38 MB PNG
>>100157261
same img no PAG (that one was at 3)
>>
>>100155824
does automatic1111 still use a python version from 10 years ago?
>>
>>
File: jacko4.jpg (106 KB, 896x1152)
106 KB
106 KB JPG
>>100157203
At one point the release date was tentatively anywhere from the end of this month to some months into May; I don't think this has been confirmed since.

API access only for now. But you probably have a few new things to play with given all the newer SDXL/Pony trainings, PixArt-Sigma, and extensions.

>>100157281
We don't really do apps here. ComfyUI/Automatic do have quite a few options to either "correct" or completely replace faces, including some that work from reference images (IP-Adapter FaceID, Roop, PhotoMaker, even a bunch of the one-shot talking-head methods).
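If you'd rather script it, diffusers also has a generic IP-Adapter loader you can point at a face reference. A rough sketch (this is the plain IP-Adapter from the public h94 repo, not the FaceID variant; the reference path is a placeholder):

import torch
from diffusers import StableDiffusionXLPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

pipe.load_ip_adapter("h94/IP-Adapter", subfolder="sdxl_models", weight_name="ip-adapter_sdxl.bin")
pipe.set_ip_adapter_scale(0.6)  # how strongly the reference image steers the result

face_ref = load_image("face_reference.png")  # placeholder path
image = pipe(
    prompt="portrait photo, studio lighting",
    ip_adapter_image=face_ref,
    num_inference_steps=30,
).images[0]
image.save("ipadapter_face.png")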
>>
File: 00107-TFT_12401708.png (3.09 MB, 1536x2560)
3.09 MB
3.09 MB PNG
>>100157354
>>100157261
And this is using AutismMix instead of the merged model I made; which do people like best?
>>
>>
>>
I'm getting an error now after trying to install an extension.

RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

Wtf? Do I have to reinstall everything now? I've already tried deleting the venv and letting it reinstall.
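For reference, this is the sanity check I've been running from inside the venv to see whether torch can reach the GPU at all (plain PyTorch, nothing webui-specific):

import torch

print(torch.__version__, torch.version.cuda)  # None for CUDA means a CPU-only torch wheel got installed
print(torch.cuda.is_available())              # False usually means a driver/torch mismatch or CPU-only build
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))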
>>
>>100157354
this is the better of the two.
>>
>>
>>
>>
File: 00039-1549780562.png (2.74 MB, 1464x1736)
2.74 MB
2.74 MB PNG
>>
>>100157642
cute
>>
>>100157456
>which do people like best?
all your stupid shit is awful, just stop and do something useful with your life
>>
>>
File: file.png (16 KB, 559x90)
16 KB
16 KB PNG
>>
>>100157768
Not even gonna bother until I see some good gens. The last sigma model kinda sucked.
>>
Does the one known as Debo still post here?
>>
>>100157688
no, keep yourself safe
>>
>>100157787
Yes they all post here. Literally without reprieve. If there was a god, they would see through the obvious avatar fagging hiding behind generative images in a generative image thread and ban them repeatedly until they learn to blend in with the crowd and discuss stable diffusion as a piece of technology rather than using it as a springboard to delve into how their latest bottom surgery post op checkup went.
>>
>>100157787
I've never once seen this person, but every thread I see 3 shizos malding about them.
I wonder if this Debo ever existed or if it's all in the mind of shizos who refuse to take their meds.
>>
>>100157847
newfag
>>
File: pixelart-Sigma.png (1.08 MB, 1024x1024)
1.08 MB
1.08 MB PNG
>>100157786
>The last sigma model kinda sucked.
I thought it was pretty decent at some things SDXL doesn't do well.

I'll likely try whether I can run 2K after this long queue is done.
>>
>>100157456
I preferred the previous one
The linework on that one was better imho
Posting them side by side would make things easier
>>
>>100157896
I actually find it easier to compare them by opening each in a new tab and flipping between them than by having them next to each other.
>>
>>100157895
I think PixArt has a lot of potential.
It's a pain to run locally though, as you need more than 12GB of VRAM.
>>
>>100157904
Yes, but I am at work looking at it through a 7-inch screen, so...
I prefer your way too when I'm at a desktop.
>>
File: SoldierExplosion-Sigma.jpg (139 KB, 1024x1024)
139 KB
139 KB JPG
>>100157786
Another example. Also somewhat better than SDXL.
>>
File: SoldierExplosion2-Sigma.jpg (162 KB, 1024x1024)
162 KB
162 KB JPG
>>100157786
There's also sort of a difference in how it does the terrain, layered smoke, and other stuff, which I think isn't entirely just the training data. And prompt adherence is a little better too.

The training data overall wasn't as aesthetic though, and it has no chance of winning at anime 1girl at the moment.
>>
I think the PW imitator is being weird
Noooticing
Also post more I’m bored
>>
File: Wizardry-Sigma.jpg (149 KB, 1024x1024)
149 KB
149 KB JPG
>>100157952
Up to 24GB of VRAM it's pretty okay - a typical developed-world hobby expense. A vast number of people are able to do that.

Above that it sucks ATM; stuff like $35k+ for a GH200 is getting pretty exclusive.
>>
>>100157960
>Also somewhat better than SDXL.
are u kidding?
>>
File: 00047-1463310089.png (512 KB, 512x768)
512 KB
512 KB PNG
>>
>>100158065
>24GB VRAM
The Radeon 7900 seems cheap for 24gb. Is it worth it?
>>
File: ComfyUI_01244_.png (1.93 MB, 1024x1024)
1.93 MB
1.93 MB PNG
>>
File: Pixelart3-Sigma.jpg (183 KB, 1024x1024)
183 KB
183 KB JPG
>>100158100
There are multiple more clearly defined ground and air-burst explosions with shrapnel, and other things SDXL tends to struggle with, since it often blends things together a bit too much.

It's not a comment on futuristic mechas being more imposing than rifle-carrying human soldiers.

>>100158152
AMD was more trouble / less supported back when I considered my purchase; not sure about now.
>>
>>100158225
>human soldiers
These horrific abominations with mangled bodies and botched faces are supposed to be soldiers?
This is crap worse than SD 1.4.
>>
File: sci-fi_comic_panels.jpg (210 KB, 1536x1024)
210 KB
210 KB JPG
>>
>>
Spiked forest kino
>>>/wsg/5523881
>>
>>
>>100158483
nice
>>
File: 00178-1092438295.png (2.63 MB, 1080x1536)
2.63 MB
2.63 MB PNG
>>
>>
File: 00166-TFT_12401708.png (3.15 MB, 1536x2560)
3.15 MB
3.15 MB PNG
this figure model is pretty cool
>>
File: ComfyUI_01255.jpg (2.78 MB, 4096x4096)
2.78 MB
2.78 MB JPG
>>
File: 00172-TFT_12401706.png (3.27 MB, 1536x2560)
3.27 MB
3.27 MB PNG
>>100158654
>>
File: Clipboard01.jpg (68 KB, 465x848)
68 KB
68 KB JPG
What's your preferred upscaler for hires fix and photo or photo-like gens? Which one am I missing in my list, and which one is completely useless? (I've seldom used anything other than ESRGAN, R-ESRGAN, UltraSharp, etc.)
>>
File: 00017-1239975740.png (453 KB, 512x768)
453 KB
453 KB PNG
>>100158654
what figure model are you using?
i'm using an old hypernet i got ages ago from another anon.
>>
File: 0.jpg (412 KB, 1024x1024)
412 KB
412 KB JPG
>>
>>100156337
>>100158784
this one
>>
File: 00005-1239975728.png (470 KB, 512x768)
470 KB
470 KB PNG
>>100158815
ah, ok. thanks.
>>
>>100157446
>>100158737
this was so much better I assumed you found something else lol
>>100158815
>>
File: ComfyUI_01264.jpg (2.47 MB, 4864x3328)
2.47 MB
2.47 MB JPG
>>100158780
4x_NMKD-Superscale-SP_178000. I don't upscale much, but it seems like a decent one, though probably quite old.
>>
File: fig5.png (826 KB, 896x1152)
826 KB
826 KB PNG
>>100158654
> this figure model is pretty cool
definitely is
>>
File: fig6.png (929 KB, 896x1152)
929 KB
929 KB PNG
>>100158780
4x-UltraSharp is fine for most uses.

>>100158784 >>100158903
BTW, the SDXL versions of the WangKa models are also good, but the Pony version has more characters and a few more poses trained in.

Anyhow, it's probably the obvious PVC model to try.
>>
>>
File: loracompare.jpg (3.29 MB, 3226x2552)
3.29 MB
3.29 MB JPG
>>100158997
I always use LoRAs even if Pony in theory has the char in it; it's just much better.
>>
>>100159020
and spam this board with the same anime 1girl over and over again until you become a hated avatarfag
>>
>>100159017
mmmm. chocolate heart cake and coffee in the morning.
>>
>>100155979
Just use your own diffusers script to run SD like any reasonable person. You'll never miss out on new stuff again that way.
>>
>>100159130
>own diffusers script
What language?
>>
>>100159106
I only keep using Mirko because I know the LoRA and character are high quality when I'm testing non-character stuff like styles and models.
>>
File: ..png (1.23 MB, 1216x832)
1.23 MB
1.23 MB PNG
>>
>>100158967
Three toes
Long tongue
It sure is <3
>>
>>100159148
Python. It's really easy (I was actually surprised how easy it is). They even have many optimisations that you can toggle, just like Comfy and Auto.
>>
>>100159106
Just be honest and admit he's decent at what he does. He's not avatarposting; you literally have two threads to look at where actual avatarposting is taking place, anon.
>>
>>100159148
import torch
import numpy as np
from diffusers import StableDiffusionXLPipeline, TCDScheduler
from tgate import TgateSDXLLoader
from datetime import datetime

# Load the SDXL checkpoint in float16; PAG comes from a local custom pipeline script
pipe = StableDiffusionXLPipeline.from_single_file(
    "model here",
    custom_pipeline="./scripts/pag.py",
    torch_dtype=torch.float16,
    use_safetensors=True,
    safety_checker=None,
)
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)

# OPTIMISATIONS: T-GATE, FreeU, VAE slicing/tiling, sequential CPU offload
pipe = TgateSDXLLoader(pipe, gate_step=6, num_inference_steps=18)
pipe.enable_freeu(s1=0.6, s2=0.4, b1=1.1, b2=1.2)
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
pipe.enable_sequential_cpu_offload()
pipe.unet.to(memory_format=torch.channels_last)

# PROMPT
prompt = "raw photo of "

negative_prompt = "(worst quality, low quality, normal quality, lowres, low details, oversaturated, undersaturated, overexposed, underexposed, grayscale, bw, bad photo, bad photography, bad art)1.4, (watermark, signature, text font, username, error, logo, words, letters, digits, autograph, trademark, name)1.2"

# Fresh random seed per run
rng = np.random.default_rng()
seed = int(rng.integers(1e6, size=1)[0])

# EXECUTE PIPELINE
image = pipe.tgate(
    prompt,
    negative_prompt=negative_prompt,

    width=832,
    height=1216,

    eta=0.2,

    num_inference_steps=18,
    gate_step=6,

    pag_scale=5.0,
    pag_applied_layers=['mid'],

    generator=torch.Generator(device="cuda").manual_seed(seed),
).images[0]
image.save(f"./images/{datetime.isoformat(datetime.now())}_{seed}_.jpg")
This is what I have right now for a simple t2i pipeline.
>>
Why do retards equate subject matter to avatar use?
Look at a pw/ani/comfy post if you want to see actual avatar faggotry.
>>
File: 0.jpg (415 KB, 1024x1024)
415 KB
415 KB JPG
>>100159197
nice
>>
>>100159269
Pure envy
>>
>>100159257
What is pag.py?
>>
>>100159351
https://github.com/KU-CVLAB/Perturbed-Attention-Guidance
It's supposed to improve generation detail, but it's tricky to get working with TCD.
>>
>>100158997
OK, cool. Sometimes, however, I find ESRGAN gives better results; it makes the image a bit more grainy, which adds to realism. The other models sometimes make the skin too smooth (that works better for anime/cartoon styles, of course).
>>
>>100156082
With current models I see a lot of difficulties with that theme: weapons are a shitshow, though maybe a LoRA can help, and poses like prone or kneeling are going to cause issues for you.
If you don't obsess over consistency you can do it, but good luck replicating the same uniform twice.
My advice: IP-Adapter can help you a lot, don't disregard it.
I am working on a similar project: a visual novel. The visual narration part is much lighter, of course, and I can "turn off" the camera and rely only on text when I can't make the AI do what I want.
Maybe in a few years we will be writing scripts and the AI will churn out full feature films, but in the meantime I think that with the current tools a movie script can be turned into some sort of enhanced storyboard, more like a visual novel than a comic book. When I am done with my current project I want to try something a little more advanced like that.
>>
Why can't we let samplers handle upscaling?
>>
>>100159415
They do?
>>
>>100159351
Not much, what's pag with you, py?
>>
>anon post good gens with a subject matter
>nogen falsely labels them a avatarfag
> tranny larps and avatar post for hours
>has been banned multiple times for this behavior
>silence
>>
File: ..png (507 KB, 672x384)
507 KB
507 KB PNG
>>
>>100159516
peepee poopoo
>>
>>100159376
very interesting
>>
I have a lifelong crush on Sarah Silverman and I’m going to gen naked pictures of her
There is nothing you can do to stop me, I just wanted you to know
>>
>>100159627
I remember you from last summer in /b/
>>
File: Facedetailer_00026_.png (1.54 MB, 1024x1024)
1.54 MB
1.54 MB PNG
>>100159627
It won't work, there are no refridgerator Lora's.
>>
File: ..png (1.23 MB, 1216x832)
1.23 MB
1.23 MB PNG
>>
>>100159398
I tend to prefer the upscalers that don't add much detail because the last step usually is another sampler / hires fix, not the GAN itself.

But YMMV based on subject and settings.
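Scripted, that last pass is just a low-strength img2img over the GAN output, something like this (sketch; the upscaled image path, prompt, and strength are placeholders):

import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

upscaled = load_image("gan_upscaled.png")  # output of whichever GAN upscaler you prefer
refined = pipe(
    prompt="raw photo, detailed skin texture",
    image=upscaled,
    strength=0.3,              # low strength keeps the composition and just re-adds detail/grain
    num_inference_steps=30,
).images[0]
refined.save("refined.png")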
>>
File: sigma_005.png (1.97 MB, 768x1280)
1.97 MB
1.97 MB PNG
>>
File: dZ1623111811.jpg (166 KB, 1416x1024)
166 KB
166 KB JPG
>>100159516
>nogen tries to start drama
>>
File: DislayGenX1116.jpg (202 KB, 1592x1160)
202 KB
202 KB JPG
>>100159017
>>
File: sigma_010.png (1.87 MB, 768x1280)
1.87 MB
1.87 MB PNG
>>
File: sigma_023.png (1.89 MB, 768x1280)
1.89 MB
1.89 MB PNG
>>
>>100159648
I am not that anon but I can identify with their sentiment
>>100159657
She looked 25 at 35, and 20 at 40. I'm convinced she sucks out the souls of TV bozos (through their dicks, ofc) and stays young that way.
>>
>>100159790
>>100159808
>>100159828
I like how the orange pops
>>
How do I make the Forge UI script open a terminal when it runs? I want visual evidence that it's working, and also an easy way to end the process (by closing Konsole). I will also be using the noclose flag.
>>
File: ..png (500 KB, 672x384)
500 KB
500 KB PNG
>>
>>100159941
If your current way of launching Forge is to double click the launch script from a file manager then don't do that. Instead, launch a terminal/console, navigate to where Forge is installed and manually launch it from there.
>>
>>100159978
Well, that gets around that issue. I was trying to be a smidgen extra lazy, I guess it wasn't meant to be. Thanks anon.
>>
>>
>>100159805
tasty
>>
>>100159703
I like this car.
>>
File: ComfyUI_01268_.png (1.27 MB, 1216x832)
1.27 MB
1.27 MB PNG
>>100159868
It's literally Korean baby-foreskin cell injections, I think they call it “Hollywood EGF facial”
so kind of technically a Jew sucking the life out of the goyims children.
>>
Which local model is the best right now for anime girls and hentai? Still NAI?
>>
lol ai art is a joke
a woman's head can't be that big
that's bigger than mt everest
can you even imagine what would happen if she sat on your house
ridiculous
totally fake
>>
>>100160071
>its literally this nuts thing I made up
>>
File: color_wave.jpg (200 KB, 1024x1536)
200 KB
200 KB JPG
>>100159306
>>100159790
>>100159793
very good
>>
File: 0-AFH028302024.jpg (247 KB, 1288x1288)
247 KB
247 KB JPG
>>
>>100160161
https://www.theguardian.com/lifeandstyle/2018/dec/07/foreskin-facial-treatment-baby-salon-wrinkles

Read em and weep.
>>
File: ..png (1.2 MB, 1216x832)
1.2 MB
1.2 MB PNG
pothole in car-utopia
>>
>>100160133
>lol ai art is a joke
>a woman's head can't be that big
>that's bigger than mt everest
>can you even imagine what would happen if she sat on your house
>ridiculous
>totally fake

used this rant as a prompt
>>
File: 00178-3533321041.png (2.27 MB, 1024x1536)
2.27 MB
2.27 MB PNG
>>100160108
I think it depends on what you want from it. If you want it to support high quality hardcore explicit hentai it can limit the choice significantly.
>>
File: ..png (515 KB, 672x384)
515 KB
515 KB PNG
>>100160240
Vangelis - Ask the Mountains
https://www.youtube.com/watch?v=jDPyw9OQ4TM
>>
File: 1702987602685583.png (1.91 MB, 1024x1024)
1.91 MB
1.91 MB PNG
>>100160265
In early 2023, when LoRA had just come out, I was trying to replicate the drawing style of the artist below, but it wouldn't get it right (especially the ribcages), so I kinda gave up then. Which model do you think works best for this art style?
https://gelbooru.com/index.php?page=post&s=list&tags=yuzawa+
>>
File: ..png (1.34 MB, 1216x832)
1.34 MB
1.34 MB PNG
>>
File: BMP_10003_.png (2.66 MB, 1328x1328)
2.66 MB
2.66 MB PNG
>>100160133
>>
File: ..png (482 KB, 672x384)
482 KB
482 KB PNG
>>
File: 00208-392656936.png (2.35 MB, 1024x1536)
2.35 MB
2.35 MB PNG
>>100160499
I can't think of any specific model, sorry. But I think for the most part you can just prompt for it on a model that is general enough. That won't be very easy though (if you have at least a half-working LoRA, it will probably help). Maybe it's worth scrolling through Civitai to see if there is anything similar enough, downloading it and starting from there? Btw, I also like this artist. I'll now try and see if I can generate something similar with the models I use.
>>
>>100160666
I- I don't think this is covered by my insurance...
>>
I installed forge, and symlinked my model and lora folders, but they are not showing up in the webui. Also, some extensions I installed the other night are no longer present. Anybody know what happened? Did I overlook something?
>>
File: sigma_032.png (2.01 MB, 1024x1024)
2.01 MB
2.01 MB PNG
>>
File: ella_029.png (1.28 MB, 1024x1024)
1.28 MB
1.28 MB PNG
>>
>>100160108
autismmix
>>
File: ComfyUI_06684_1.png (2.28 MB, 1536x1536)
2.28 MB
2.28 MB PNG
>>
>>100160071
This is extremely Jewish, like, wauw...
Also SD enlarges Sarah's nose automatically with just a picture and name prompt
not nice
>>
File: 00293-2093366921.png (2.2 MB, 1080x1536)
2.2 MB
2.2 MB PNG
>>
Can I use --always-gpu on forge yet?
>>
>>
I've seen some XL models with clip skip 2 in the recommendations, but when I use it, it makes literally zero difference; it's the exact same image.
Do I have to change anything in the UI to make clip skip affect XL models, or what?
>>
dont forget your taxes
>>
>>100161371
UIs automatically switch to clip skip 2 because XL was trained on the penultimate layer. It's always clip skip 2.
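If you want to see what that means outside the UI, this is roughly the difference in raw CLIP terms (sketch with plain transformers; layer-norm details differ slightly between implementations):

import torch
from transformers import CLIPTokenizer, CLIPTextModel

tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
enc = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

ids = tok("1girl, cyberpunk city", return_tensors="pt").input_ids
with torch.no_grad():
    out = enc(ids, output_hidden_states=True)

final_layer = out.last_hidden_state  # what "clip skip 1" refers to
penultimate = out.hidden_states[-2]  # "clip skip 2": the layer SDXL-style models were trained against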
>>
File: ComfyUI_06687_.jpg (1.25 MB, 2048x2048)
1.25 MB
1.25 MB JPG
>>
File: fooks1.jpg (735 KB, 1792x2304)
735 KB
735 KB JPG
>>100161381
I actually did them 2 days ago
now to wait for the hit, which might be far less than I was dreading
Mostly bc I've worked my tail off for pennies, in hindsight
fooks tiem
>>
File: sigma_035.png (1.94 MB, 1024x1024)
1.94 MB
1.94 MB PNG
The T5 encoders are okay, but I dunno if they're worth the extra overhead yet. The text-encode step is incredibly sluggish, so iterating on a prompt is pretty painful, but if you leave the prompt alone you can do lots of sampling/seed exploration at your regular gen rate, it seems. Thus, lots of orange scrapper girl. Moving from an 8GB 2070 to a 24GB 4090 today tho, hoping it becomes a lot more feasible.
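What helps in the meantime is encoding the prompt once and reusing the embeddings across seeds; roughly like this with the diffusers PixArt-Sigma pipeline (sketch; the model id is the public Sigma 1024 checkpoint and the seed loop is just an example):

import torch
from diffusers import PixArtSigmaPipeline

pipe = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS", torch_dtype=torch.float16
).to("cuda")

# Run the slow T5 text-encode step once...
prompt = "orange scrapper girl, scrapyard, overcast sky"
p_emb, p_mask, n_emb, n_mask = pipe.encode_prompt(prompt, negative_prompt="")

# ...then reuse the cached embeddings for as many seeds as you like
for seed in range(4):
    image = pipe(
        prompt_embeds=p_emb,
        prompt_attention_mask=p_mask,
        negative_prompt_embeds=n_emb,
        negative_prompt_attention_mask=n_mask,
        generator=torch.Generator("cuda").manual_seed(seed),
    ).images[0]
    image.save(f"sigma_seed{seed}.png")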
>>
>>100161478
1bit quantization might be really great
>>
>>100161388
w-what did she just sit on??
>>
File: 00324-TFT_12401714.png (3.47 MB, 2048x2048)
3.47 MB
3.47 MB PNG
Inpainted and cleaned up an image from the figure model; you could probably hand this straight to the 3D designers as concept art to build from...
>>
>>
File: ella_033.png (1.11 MB, 832x1152)
1.11 MB
1.11 MB PNG
>>
File: ..png (504 KB, 672x384)
504 KB
504 KB PNG
>>
File: 1687221941933590.png (2.13 MB, 1200x1200)
2.13 MB
2.13 MB PNG
First gen with Forge. And I would never have been able to gen this resolution off the bat with a1111. So far, I'm impressed with Forge.
>>
>>
File: 00125-457547253.png (1.42 MB, 1160x1688)
1.42 MB
1.42 MB PNG
I used to be able to apply LoRAs on SDXL models, but ever since I updated a couple of weeks ago, doing so crashes Auto. Is there a fix for that, like an argument I can use? I think I'm running out of memory.
>>
File: fooks2.jpg (893 KB, 1792x3104)
893 KB
893 KB JPG
>>
File: 00166-158206337.jpg (193 KB, 1024x1024)
193 KB
193 KB JPG
>>
File: BMP_09986_.png (1.77 MB, 976x1328)
1.77 MB
1.77 MB PNG
>>100161428
You only have to pay a fee if you owe money; the IRS doesn't care if you're getting money back late.
>t. didn't file for 3 years straight once
>>
>>
>>100161747
--medvram? Otherwise, enable fp8 with fp16 LoRAs in the settings. I have 6GB of VRAM and am able to run SDXL with LoRAs that way.
>>
>>
File: 0.jpg (564 KB, 1024x1024)
564 KB
564 KB JPG
>>100160167
>very good
thanks
>>
>>100161781
>didn't file for 3 years straight once
I did neither; I decided to fix it instead of worrying about my letterbox and making myself go insane.
I'll get a fine and will owe some, but whatevs, hang me for all I care; I have nothing but what I possess.
The irony is ofc that, all over the western world, the tax agencies already know everything about you, in more detail than you know yourself.
They just insist on you telling them.
>>
>>100161795
>--medvram
already using that
>fp8 with fp16 loras in the settings
I'll try that
> I have 6gb vram and am able to run sdxl with loras that way
mine is 8gb, which makes it even weirder
I changed distros recently but I don't think it has anything to do with that
>>
>>100161837
Just claim 0 or put in extra if you're worried about being late each year
>>
File: ella_035.png (1.11 MB, 704x1408)
1.11 MB
1.11 MB PNG
>>
File: ComfyUI_06699_.jpg (1.3 MB, 2048x2048)
1.3 MB
1.3 MB JPG
>>
>>100161868
It is my express intention not to make any profit but to put it all back into growing the business.
In the EU I'll have to file regardless of what I do, but now that I know how, it gets exponentially easier to do.
>>
>>
>>100161968
>EU
Oh shit. Forget anything I said then, dunno about the laws there. Cute fox btw.
>>
>>100161844
I think my problem has something to do with this
https://nvidia.custhelp.com/app/answers/detail/a_id/5490
>>
File: 00028-2877872241.jpg (627 KB, 2120x1728)
627 KB
627 KB JPG
mfw haven't been able to pay the phone bill
>>
File: ComfyUI_06707_.jpg (1.29 MB, 1792x2304)
1.29 MB
1.29 MB JPG
>>
File: ComfyUI_temp_qmxhj_00033_.png (1.34 MB, 1024x1024)
1.34 MB
1.34 MB PNG
>>100162029
ty and ty for the advice regardless
>cries in 21% vat
>>
File: ComfyUI_23_.png (1 MB, 768x1152)
1 MB
1 MB PNG
1girl
>>
File: moneyfox.jpg (218 KB, 1024x1536)
218 KB
218 KB JPG
>>100162281
>>
File: telephone awoo.png (597 KB, 1120x741)
597 KB
597 KB PNG
>>100162083
>>
so are the SD models nowadays any better than SD 1.4/1.5/whichever 1.x version was the best?
>>
File: ..jpg (110 KB, 1216x832)
110 KB
110 KB JPG
>>
>>100162373
I wish
>>
>>100162464
no, stick with 1.5 forever!
>>
>>100162464
Yes and no. Maybe, it depends.
>>
>>100162083
Was it unexpectedly high, or where were you when you spent your money?
>>
>>100161837
>They just insist on you telling them.
Submission ritual
>>
>>100162464
always have been
>>
>>100162464
Ride the pony if you have a real man's GPU
>>
Fixed major issues with my webui where green garbage images would be generated with some tags. The solution was to change the emphasis mode from "Original" to "No norm".
>>
>>100162496
but really
watch how in 10 yrs, with the CBDC, you'll still have to file
>>
File: flower_fox.png (3.04 MB, 1160x1696)
3.04 MB
3.04 MB PNG
>>
File: taxman.jpg (153 KB, 1024x1536)
153 KB
153 KB JPG
>>100162472
>>
Morning anons
>>
File: ComfyUI_35_.png (1.31 MB, 768x1152)
1.31 MB
1.31 MB PNG
IP Adapter is so much fun.
>>
>>100155824
Can you recommend a model/lora for character portraits? Not anime. I'm going for semi-realistic fantasy aesthetics for a personal RPG prototype. Something like stableinkdiffusion back in the day.
>>
File: ComfyUI_16357_.png (2.02 MB, 1024x1536)
2.02 MB
2.02 MB PNG
>>
>>100162660
How are things over at SAI?
>>
>>100162660
She always looks so happy wherever she is!
>>
File: ..png (421 KB, 672x384)
421 KB
421 KB PNG
>>
>anime style girl
>realistic background
>>
File: ComfyUI_16363_.png (2.38 MB, 1024x1536)
2.38 MB
2.38 MB PNG
>>100162684
Things are fine.

>>100162695
Yes, she's the cutest.
>>
>>100162660
>>100162827
What's the line at the top of these?
>>
File: fantasy_characters.jpg (166 KB, 1536x1024)
166 KB
166 KB JPG
>>
File: 00022-2541722287.png (1.14 MB, 832x1216)
1.14 MB
1.14 MB PNG
>>
I'm trying to use an earlier version of Automatic, but it keeps updating to the latest when I execute ./webui.sh.
How do I stop it from updating?
>>
>>100162520
how did you figure it out?
>>
>>100162907
What are the ARGS in webui.sh? I assume it works similarly to Windows.
>>
>>100162912
Someone suggested it, I think
>>
>>100162912
By switching to comfy
>>
File: ComfyUI_16390_.png (2.36 MB, 1024x1536)
2.36 MB
2.36 MB PNG
>>100162840
It's an issue with DiT models.
>>
>>100161650
sweet stuff.
>>
File: 00062-1000559233.png (1.66 MB, 992x1456)
1.66 MB
1.66 MB PNG
>>
File: Maid Marian.png (1.16 MB, 1024x1024)
1.16 MB
1.16 MB PNG
I remembered the First Fox04vvw

Must be the first anthro fox I've seen
>>
Next Thread

>>100162844
>>100162844
>>100162844
>>
>>100163041
fkin typo
>>
File: 1689444375247969.jpg (7 KB, 128x112)
7 KB
7 KB JPG
>>
>>100163126
you again
>>
bake?
>>
>>100163041
kino



