/g/ - Technology






You're Not Alone Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>106995676

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://civitai.com/models/1790792?modelVersionId=2298660
https://gumgum10.github.io/gumgum.github.io/
https://huggingface.co/neta-art/Neta-Lumina

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
File: 00039-2972526174.png (1.93 MB, 1248x1824)
>>
File: 1735142585245396.mp4 (3.76 MB, 564x576)
PARTY!!!
>>
>>107001464
hot
>>
>>107001552
*heavy trap bass beat starts playing*
>>
File: 00047-612365413.png (2.02 MB, 1248x1824)
>>107001555
chisato is literally 10/10 wife material and mogs the rest of the girls.
>>
Local LLMs for generating creative, non-repeating prompts? Is it doable at all? /lmg/ ignores me.
>>
File: 00056-3989700339.png (2.42 MB, 1536x1536)
>>
posting this last trick or treat lain, she came out nicely, now back to anime backlog watching
>>
>>107001586
i'm not sure what you mean by "not repeating prompts" but you can probably set up ollama and find some way to call it
>>
>>107001586
it is entirely doable.
I don't like doing that because I actually want full control over what I gen, and I'm not bored enough to want a 'surprise me' button for random gens.
I do currently use one to automatically sort my prompt tags, but it's really not needed at all, I just use it to tidy up

You have two choices really:
- llama-cpp-python (will start/close/be used within your comfyui workflow)
- external openai compatible LLM (llama.cpp, ollama, lmstudio, etc...)

Now the real problem is that LLMs are WAY MORE EXPENSIVE to run compared to diffusion models. Models in the ~7B dense range require 8GB of VRAM to run at non-retarded quants (Q8). With LLMs you could theoretically go down to Q4, some imprecision shouldn't matter much here. If using MoE models, you can get away with running bigger models by offloading the expert layers to CPU while keeping the cache and the rest of the layers on GPU. Since these are one-off gens, you can keep the context to a minimum (1000 tokens should be more than enough).

if you want something really MINIMAL (500M params):
z-tipo (what I currently use for sorting); it requires you to manually install the CUDA version of llama-cpp-python and it lives within comfy.
there are multiple nodes in comfy, just search for llama-cpp-python, ollama or OpenAI/OAI-compatible shit
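the external option is really just one HTTP call. a minimal sketch of hitting an OpenAI-compatible endpoint from Python, stdlib only (URL, port and model name here are assumptions, point them at whatever server you actually run):

```python
# Sketch of the "external OpenAI-compatible LLM" route: build a chat request
# and POST it to a local server (llama.cpp, ollama and LM Studio all speak
# this API). Server URL and model name are placeholder assumptions.
import json
import urllib.request

def build_payload(tags, temperature=1.1, max_tokens=200):
    # higher temperature = more varied outfits/poses between calls
    return {
        "model": "local",  # llama.cpp's server ignores this; ollama needs a real name
        "messages": [
            {"role": "system",
             "content": "Rewrite the given booru tags into one varied, "
                        "comma-separated image prompt. Output only the prompt."},
            {"role": "user", "content": tags},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

def augment_prompt(tags, base_url="http://127.0.0.1:8080"):
    req = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(build_payload(tags)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"].strip()

# augment_prompt("1girl, halloween, night city")  # one fresh prompt per call
```

since it's just a POST, the same sketch works unchanged against llama.cpp's server, ollama or lmstudio, only base_url and model change.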
>>
>>107001586
You can cover 90% of prompts with smart use of wildcards.
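for the curious, the wildcard mechanic itself is tiny. a toy version (the category names and entries here are made up, real wildcard packs ship their own .txt lists):

```python
# Toy version of what wildcard nodes do: every __name__ token in the prompt
# is swapped for a random entry from the matching list. Categories and
# entries below are invented examples, not a real wildcard pack.
import random
import re

WILDCARDS = {
    "outfit": ["gothic dress", "tracksuit", "kimono", "maid outfit"],
    "place": ["rooftop at night", "convenience store", "forest shrine"],
}

def expand(prompt, rng=None):
    rng = rng or random.Random()
    # replace each __key__ with a random pick from WILDCARDS[key]
    return re.sub(
        r"__(\w+)__",
        lambda m: rng.choice(WILDCARDS[m.group(1)]),
        prompt,
    )

# expand("1girl, __outfit__, __place__") -> a different combo every call
```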
>>
File: file.png (359 KB, 1962x1012)
>>107001699
yeah, or just use impact wildcards, this is my current setup
you can see how the normal prompt comes out vs the augmented prompt.
SADLY tipo creates a trash augmented prompt, so I just randomize artists really
>>
File: 00064-2779316287.png (2.36 MB, 1824x1248)
>>
>>107001652
It means that it does not output "top with v cut" every single time I ask for a girl's outfit.

>>107001687
I have a setup, the question is more about the models themselves and techniques for getting a good prompt. I have tried abliterated Qwen 4B and 8B and they didn't follow my prompts well enough or were very uncreative.

>>107001699
Yes, but you have to collect wildcards first. That takes too much effort for me.
>>
>>107001734
small models are garbage sadly.
You could try Nemo Instruct, or a recent abliterated Gemma.
If you're asking about prompting techniques, then you'll have to play around with samplers; the more randomness you want, the higher the temperature. there are some samplers that help keep the bot coherent at high temp (but I forgot the name, I usually use LLMs for work at low temp), I'd suggest you ask chatgpt or lmg for this.
For prompting itself, it usually works better if you give the chatbot a list to choose from (but at that point it would be the same as wildcard substitution), and the prompting technique GREATLY varies between models, so there's no general way to do it
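to make the temperature point concrete: logits get divided by T before softmax, so T > 1 flattens the token distribution (more random picks) and T < 1 sharpens it. toy numbers, no real model involved:

```python
# Temperature sampling in miniature: scale logits by 1/T, then softmax.
# Higher T -> flatter probabilities -> more varied sampling.
import math

def softmax_with_temperature(logits, t):
    scaled = [x / t for x in logits]
    m = max(scaled)                      # subtract max for numeric stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0]
low = softmax_with_temperature(logits, 0.5)   # top token dominates
high = softmax_with_temperature(logits, 2.0)  # probabilities even out
```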
>>
Oops didn't see the new thread
>>107000663
>just try it out yourself
I have and it's pretty shit sadly. 1girl already works just fine on sdxl. Boomer prompts start giving shit anatomy and body horrors pretty quickly. Prompting for text doesn't work beyond 1-2 words it seems. I think every time someone shills a model they should be required to present a complex gen that cannot be done with illust at a fraction of time and VRAM with metadata attached. I'm gonna assume from now on that "uuuh skill issue just gen yourself" people are all LLMs or paid indians.
>>
>>107001716
That's slick. Mine isn't as elaborate, I might have to change my setup.
>>
>>107001586
Find cool image you like, i2prompt it and then use that
https://github.com/1038lab/ComfyUI-QwenVL
>>
File: 1745319983509499.jpg (891 KB, 1336x2008)
>>107001759
show me what this prompt looks like with base ilu at this res on the first pass
https://files.catbox.moe/atrr5z.png
>at a fraction of time and VRAM
when was this claim made?
>>
File: 1753393023324645.jpg (851 KB, 1336x2008)
>>
File: 00065-1589373952.png (2.26 MB, 1248x1824)
>>
Are there any AI that can edit video game and anime characters into being naked? Gemini doesn't allow that
>>
>>107001586
>/lmg/ ignores me.
There must be a reason
>>
>>107001804
>abstract, ghost, fireplace
Wow you're really going out of your way to demonstrate complex composition, character interactivity and anatomy with your gen
I retract my previous statement, shills seem to be just retarded
>>
File: 00078-225239598.png (1.79 MB, 1792x1024)
https://youtu.be/Dh56pv7gESM
>>
File: 1748446280861521.jpg (735 KB, 1336x2008)
>>
>>107002059
Where's the 1girl?
>>
File: input.jpg (181 KB, 1349x2185)
>>107001841
Qwen Image Edit 2509 with clothing remover lora
lora: https://limewire.com/d/AvpLO#Gd7AyXiz1r
result (nsfw): https://files.catbox.moe/na96bw.png
>>
>>107001841
It does, you just have to write it in the most verbose way possible so it thinks you're doing something artsy
>>
>>107001842
They have miku in op, I'm not surprised.
>>
>>107002093
i want to cum inside 2b
>>
>>107001819
bowsette a shit
>>
>>107002180
sylvanas a shit
>>
File: chfp8_a_00008_.png (1.38 MB, 1024x1024)
tell me about neta lumina. I see it being pushed hard now. it's only a 2B-parameter model, isn't it? why use it over wan or qwen?
>>
>>107002199
it knows artists, unlike wan or qwen which only know a very limited set of styles
>>
File: 251025-165641-wan5s_00001.mp4 (2.57 MB, 1088x1600)
>>107001819
>>
>>107002226
ahegao lora a shit
>>
>>107002093
now try on realistic
>>
>>107002211
so does sd1.5, but because it's so small it can't compete now. is neta yume lumina's quality still good compared to the big ones?
>>
CeFurkan is back shilling
>>
Ran took everything from me.
>>
>>107002255
I've been posting some this and last thread. Compared to sdxl (illu/noob) it doesn't need upscalers or detailers. Granted, gens take way longer, but I'm generating at the resolution you see, and I don't mind waiting since it manages to oneshot most of it. You can additionally use NL, which helps a lot in posing the girl in the composition you want.
>>
anons, what are your gen times on qwen image with and without the 4/8 step lora?
>>
File: 1758656043186527.png (1.36 MB, 832x1248)
>>107002082
hiding in the cabin
>>107002199
qwen is large and wan is a middling image model
>>107002255
it uses a 16ch vae if thats what you mean
>>
>>107002293
and last one. wish one of these stupid anime thots would come trick or treating me IRL. SAD.
>>
>>107002300
>qwen is large
as in qwen is not preferred because it's too big?
>>
>>107002324
maybe he meant to say he's poor. but qwen's problem is not its size (it can still fit in 16GB with some offloading at Q8, or completely in 24GB). The results are almost always GOOD, meaning you don't need to re-roll your gens as much, but even fully fitting in a GPU, genning is slower (due to genning at a high 1.3MP size), and it's slopped and has bad style knowledge/no artists
>>
>>107002199
>wan or qwen
Both need LoRAs to do anything even resembling kino.
>>
how do I speed up wan 2.2 i2v
>>
File: n9t2asb11ywf1.png (174 KB, 640x640)
absolute legend?

https://files.catbox.moe/2dyn9a.mp4
>>
>>107002295
A lot or not much.
>>
>>107002452
bruh that isnt helpful. some numbers would help
>>
>>107002448
lost
>>
>>107002456
How many cuda cores you have?
>>
>>107002472
10,752
>>
File: 1741658881115885.jpg (738 KB, 1336x2008)
>>
>>107002434
Use 2.5 instead
>>
>>107002474
Should be pretty quick then.
Think of it as a ballpark: if an action takes under 10 minutes it's still usable.
Renders can take 8 hours per frame, in the past and still today.
With AI slop that gets condensed.
>>
>>107002487
I don't think anyone waits 10 mins for an image
>>
>>107002496
I don't think you have ever been employed or done graphics for a client.
>>
>>107002496
Maybe English is a problem for you. Is it?
>>
>>107002503
youre talking to a motion designer. literally no one waits 10 mins for imagen. youre joking
>>
>>107002509
English is a problem.
>>
>>107002508
yes, I'm German, English isn't my first language.
>>
>>107002448
based turk working hard
>>
>>107002448
Not bad anon...not bad...but BEHOLD! MY GOONJITSU!
https://files.catbox.moe/vqq4u0.mp4
>>
>>107002509
If you are such a professional you should already know...
>>
How to make Chroma good?
>>
>>107002541
it's funnier seeing him suck dicks desu, this is a bit too much and well, it's literally a woman with his head.
>>
>>107002549
delete chroma, download qwen + analogcore lora and some insta thot loras for 1girls and you're done, way better realism than whatever chroma shits out
>>
>>107002554
>this is what qwenfags believe
Advise me of that again when your model actually becomes non-shit.
>>
File: 1739217727555737.png (512 KB, 875x355)
the cartoon character in the red shirt is very fat and holds up a sign at the beach saying "tomorrow i'll gen 1girls", while Sonic the Hedgehog looks at him
>>
>>107002579
Why not take a full pic of robotnik?
>>
>>107002612
just to test if it still works, seems fine even with a cropped image.
>>
>>107002627
Yeah it did a good job that's true.
>>
File: 1734447632356783.png (814 KB, 792x1320)
the pink hair anime girl is sitting at a table in a walmart staff room, smoking a cigarette while sitting at a white table. the walmart logo is on the wall.
>>
how do i make a comfyui tagger workflow that captions multiple images to txt for training a lora?
>>
File: 1740344971354841.png (796 KB, 792x1320)
>>107002643
>>
> Some nodes require a newer version of ComfyUI (current: 0.3.66). Please update to use all nodes.
> Requires ComfyUI 0.3.63:
> c46c74c1-cfc4-41eb-81a8-9c6701737ef6
qwen edit, wtf
>>
>>107002656
Cum ui has gone from being a nice little javascript python shit for images to literal malware.
A year ago it was still okay.
>>
File: WAN_00009_.png (1.9 MB, 1080x1352)
wan >>>>>>>>>>>>>>>>>> qwen
>>
File: 1731419830992770.png (652 KB, 944x1104)
the videogame girl is sitting at a computer and typing in a cave near a fire, on the back of the white CRT monitor is the text "LDG". keep her in the same polygon style.

why does a cave have power? it's a videogame cave.
>>
File: 1742610876525001.png (657 KB, 944x1104)
>>107002697
>>
>>107002697
solar powered PC
>>
>>107002093
Prompt nodes don't have image input links, correct?
>>
The only way to train a wan 2.2 lora is with cloud, isn't it? Aren't you locked out of your computer for like a week with a 5090?
>>
>>107002758
>locked out of your computer for like a week with a 5090

>xhe spent multiple thousand $ on a single pc component but doesn't have an old gpu or money to buy a 70$ 1070
>>
>>107002758
There is a guy who trained a wan 2.2 lora with ~250p clips on high and ~400p clips plus ~700p images on low, 3-second clips, with pretty good results. Shouldn't take that long.
>>
ran is not satisfied with his discord
users need blogposting
i will post images and make ran seethe
>>
>Tsukuyomi
>>
>>107002780
I'm sure the results are fine, but it takes so long to train doesn't it?
It's my work pc.
>>
>wan2.2_i2v_A14b_high_noise_lora_rank64_lightx2v_4step_1022
got really fast movement with 3.0 strength
>>
>>107002830
link for lora
>>
>>107002842
bruh literally just type wan2.2_i2v_A14b_high_noise_lora_rank64_lightx2v_4step_1022 in google
>>
>>107002830
>>107002842

Man what the fuck are you retards doing with that insane low quality flashing with your light lora setups

New HIGH:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/LoRAs/Wan22_Lightx2v/Wan_2_2_I2V_A14B_HIGH_lightx2v_MoE_distill_lora_rank_64_bf16.safetensors

Old LOW:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/LoRAs/Wan22-Lightning/old/Wan2.2-Lightning_I2V-A14B-4steps-lora_LOW_fp16.safetensors

4 steps, cfg 1, unipc
>>
>>107002853
what about t2v?
>>
File: 1743770310436836.png (951 KB, 960x1088)
the white character is wearing a white tshirt and blue shorts sitting at a computer desk holding a green lightsaber in his messy bedroom. A large STAR WARS sign is in the background and various STAR WARS merchandise. the image is black and white. keep their facial expression the same.
>>
File: lightx2v_1022_1.25str.mp4 (745 KB, 448x600)
>>
File: lightx2v_1022_2.0str.mp4 (849 KB, 448x600)
>>
File: lightx2v_MoE_1.25str.mp4 (736 KB, 448x600)
>>
File: lightx2v_MoE_2.0str.mp4 (814 KB, 448x600)
ultimately it's just RNG
>>
one more please?
>>
File: 1753483774945303.png (1.3 MB, 1024x1024)
the man is sitting at a poker table in a casino, keep his expression the same.

was just a cropped headshot of kaiji. zawa zawa...
>>
File: 1747537332271143.jpg (53 KB, 500x479)
>>107002245
https://litter.catbox.moe/k2q4xggehhc770ms.png
>>
>>107002968
>>107002958
>>107002951
>>107002949
These are great! Would be cool to see more.
>>
reasons not to use lightx2v?
>>
It's out.

https://civitai.com/models/1901521/v7-base
>>
>>107003075
It's noticeably shit when comparing to not using it
>>
>>107003101
you had my hopes up with that image.
>>
>>107003101
kek, dalle-mini is so sovlful even after all those years
>>
File: 1732994965175552.png (464 KB, 609x679)
>ctrl-f Pony
>0 results
What do we think? I couldn't care less about weeb slop, or steven universe and furry faggotry. How's the realism? I doubt it can ever stand up to Chroma
>>
File: WAN_00046_.png (2.35 MB, 1080x1352)
>>
>>107003146
just two more finetunes
>>
File: 1759517492477396.png (1.2 MB, 1128x920)
>>107002295
qwen image edit, 8 steps, 1 megapixel images, rtx 3090
first gen: 156 secs
second gen, same image and prompt: 49 secs
change image: 91 secs
change prompt: 62 secs
disable 8 step lora, 20 steps: 95 secs
>>
>>107002643
now this is autism
>>
can you train qwen loras with 16gb? does it take 12 hours per lora? do the loras come out even remotely well with what i assume is 40 block swaps?
>>
File: chroma multi concept lora.jpg (1.67 MB, 3936x1264)
Looks like multiple concepts in Chroma loras work after all. You just need to crank up the early-training LR without frying the whole thing. I think Chroma/Flux are super sensitive to gradients that pull in opposite directions, so you need to let concepts reserve their own space during early training. Otherwise one will just overwrite everything and you'll get a generalized mess with combined concepts.
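the "high early lr, then back off" idea can be sketched as a schedule. numbers below are illustrative, not the actual training config:

```python
# Sketch: short linear warmup to a boosted LR, then cosine decay back to
# the base rate. base_lr, boost and warmup fraction are made-up examples.
import math

def lr_at(step, total_steps, base_lr=1e-4, boost=3.0, warmup=0.05):
    warm_steps = int(total_steps * warmup)
    peak = base_lr * boost
    if step < warm_steps:                      # ramp up to the boosted LR
        return peak * (step + 1) / warm_steps
    # cosine from peak back down to base_lr for the rest of training
    t = (step - warm_steps) / max(1, total_steps - warm_steps)
    return base_lr + (peak - base_lr) * 0.5 * (1 + math.cos(math.pi * t))
```

whether the boost fries the lora depends on the model; the point is only that the aggressive LR lives in the early steps where concepts carve out their space.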




All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.