/g/ - Technology





Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107284812

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>WanX
https://rentry.org/wan22ldgguide
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
https://gumgum10.github.io/gumgum.github.io/
https://huggingface.co/neta-art/Neta-Lumina

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
File: 1733841816588819.png (625 KB, 1033x1305)
>>107294986
>Calling it now: Flux2-Small, the only version they will release the weights for. Will probably be pretty good actually
https://xcancel.com/bdsqlsz/status/1992248711603454072#m
it might be Flux 2 small, he said that it won't be a chinese model
>>
>>107294079
>why does it look bad?????
There are wan 2.1 workflows that are simpler than and look better than mine too


I was also asking because I'm not sure what the point of the second sampler in anon's workflow even is, beyond an upscale step to get high resolution, and I can't replicate it on a basic ksampler

Actually I was looking for the civitai workflow to show you, but I found like 3 other ones, so honestly I'll just do my own research tonight and find a really good text-to-image workflow that uses as few custom nodes as possible
>>
File: 1743762478731903.png (1.79 MB, 1120x1440)
>>
>>107295046
>SOTA in terms of portrait photography
>posts the sloppiest base SDXL-esque picture ever

Are these guys just slop blind? That literally looks like a 3D render.
>>
>>107295185
>Are these guys just slop blind?
they probably are, in china women have so much makeup they think plastic shit is the default skin or something
>>
>>107295208
>in china
>>107295046
>he said that it won't be a chinese model
zero reading comprehension award
>>
>>107295224
the guy that says it looks realistic is chinese, are you fucking retarded?
>>
>twitter screencap
>>
>>107295235
no u
>>
>>107295245
>X (the everything app) screencap
FTFY
>>
Do you think we might get local Sora in ~5 years? The porn (and electricity bills) would be unreal.
>>
>>107295267
This depends entirely on the GPU market.
>>
>>107295208
It's pretty bizarre. Maybe he thinks more detail means SOTA in realism rather than just a proper photoreal look, or he could be talking purely about aesthetics (think MJ-type aesthetics). MJ's "realism", while up there in details and aesthetics, isn't strictly realistic either.
>>
>>107295267
>Do you think we might get local Sora in ~5 years?
Sora has IP characters and IP styles in there, no company is gonna give that to localkeks so, never
>>
is it still itoddler only kek
>>
is hyvid2 good as a t2i model at least?
>>
>>107295185
>one tiny image in a screencap from an unreleased model
>entire model is slop trash

please, please, touch grass
>>
>>107295158
That's pretty great, mind sharing prompt?
>>
>>107295389
Often a single image is all one needs to see. Like those guys calling Seedream 4 photorealistic but then they're posting slop that looks like SDXL.
>>
File: comfuier.png (107 KB, 1514x921)
>installed comfyui
>put this safetensor file GonzaLomo XL/Flux/Pony in checkpoint folder and tried unet folder
>it doesn't show up in comfy so I drag the file in
>install the custom nodes
>now this

WTF. Any help?
>>
New Hunyuan is just 2-second long???
>>
>>107295612
no, 5 seconds takes an age but 2 is reasonable
>>
>>107295527
Click on the selector list and pick it again. Also try putting it in the checkpoints folder rather than unet.
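For reference, this is the stock ComfyUI model layout; a sketch where COMFY and the filename are placeholders for your own install and download (the node names are the current ComfyUI ones, double-check against your version):

```shell
# Stock ComfyUI model directory layout (adjust COMFY to your clone location).
COMFY="${COMFY:-$HOME/ComfyUI}"
mkdir -p "$COMFY/models/checkpoints" "$COMFY/models/unet"
touch GonzaLomo.safetensors          # stand-in for the downloaded file
# Full checkpoints (SDXL/Pony merges) go in models/checkpoints and load with
# the "Load Checkpoint" node; UNet-only releases go in models/unet and load
# with "Load Diffusion Model" instead. Refresh (R) after moving a file.
mv GonzaLomo.safetensors "$COMFY/models/checkpoints/"
```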
>>
>>107295288
>>107295185
SD1.5 as a video?
>>
How does hunyuan1.5 compare to wan2.2?
how does it perform with anime?
>>
>>107295906
notwithstanding the myriad of snake oil speedups wanx has: wan is better at sequential actions, a broader range of single actions, likeness conservation, and clarity. Hyvid is 24fps and... vramlet friendly
>>
>>107295906
>How does hunyuan1.5 compare to wan2.2?
it's bad, stick with wan
>>
>>107295906
the current model is incredibly slow. we need a 4steps version. the good news is that the t2v model is uncensored. i don't know about i2v
>>
File: 1734615573946135.png (2.9 MB, 1920x1088)
The secret to good text-to-image with WAN is literally just using karras scheduler instead of simple

picrel 40 steps cfg 2.0 euler/karras low noise only
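For anyone wondering what the karras option actually changes: it only swaps how the noise levels (sigmas) are spaced across the steps. A self-contained sketch of the Karras et al. (2022) schedule that ComfyUI's `karras` scheduler implements (sigma_min/sigma_max/rho here are generic illustrative defaults, not WAN's exact values):

```python
import numpy as np

def karras_sigmas(n, sigma_min=0.03, sigma_max=14.6, rho=7.0):
    """Karras et al. (2022) noise schedule. Interpolates linearly in
    sigma^(1/rho) space, which packs more steps near sigma_min than a
    plain linear ('simple') spacing does."""
    ramp = np.linspace(0, 1, n)
    inv_rho = 1.0 / rho
    sigmas = (sigma_max**inv_rho + ramp * (sigma_min**inv_rho - sigma_max**inv_rho)) ** rho
    return np.append(sigmas, 0.0)  # terminal sigma for the final denoise

sigmas = karras_sigmas(10)  # strictly decreasing, dense near the low end
```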
>>
>>107295612
Yep, what this anon >>107295626
said, 2 secs takes only 2 mins on my 3090, good for testing lots of different prompts
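The 2-vs-5 second difference maps directly to frame count; a quick sketch (the 4n+1 frame constraint comes from the video VAEs' 4x temporal compression, and 16 fps for Wan / 24 fps for Hunyuan are the published defaults, so treat the exact numbers as assumptions):

```python
def frame_count(seconds, fps):
    """These video models generate 4n+1 frames: the temporal VAE packs
    4 pixel frames per latent frame, plus one leading frame."""
    return 4 * round(seconds * fps / 4) + 1

frame_count(5, 16)  # Wan 2.2's usual 5s at 16 fps -> 81 frames
frame_count(2, 24)  # 2s at Hunyuan's 24 fps -> 49 frames
```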
>>
Here is 1080p imagegen with the lightning Seko rank64 lora. 37 seconds 10 steps euler/karras

>>107296057
I'm getting flashbacks to December 2024 with this garbage
this shit is dead on arrival. The only reason I'd even bother downloading these weights is that they mention using both FlashAttention and SageAttention in Hunyuan, which is retarded since SageAttention is a replacement for FlashAttention, so there may be a free lunch there. but who even cares when wan 2.2 still mogs
>>
>>107295923
The clarity in Wan may be there but in T2V the slop is more noticeable and the video's motions are slowed. A decent NSFW tune and Hunyuan comes out on top due to being easier to adopt.
>>
With the original 2.2 lightning t2v lora
>>
and finally the original wan 2.1 lightx2v lora

I think I am now a Seko-recommender over just using the 2.1 lightning lora for text-to-x purposes

>>107296106
>in T2V the slop is more noticeable and the video's motions are slowed
video motion was mostly fixed for me with the release of the Seko T2V lightning loras and I haven't really seen anything but improvements using it compared to previous lightx2v releases, especially the wan 2.2 ones
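For context on what a "rank64" lora file actually contains: a pair of low-rank matrices per patched layer, applied as a weight delta scaled by the strength slider. A minimal numpy sketch (the layer shapes and strength are illustrative, not the Seko file's real dimensions):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 512, 512, 64           # rank 64, as in the Seko release
W = rng.standard_normal((d_out, d_in))     # base model weight
A = rng.standard_normal((rank, d_in)) * 0.01   # LoRA "down" matrix
B = rng.standard_normal((d_out, rank)) * 0.01  # LoRA "up" matrix
strength = 1.0                             # the LoRA strength slider
W_patched = W + strength * (B @ A)         # delta has rank at most 64
```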
>>
>>107295985
>we need a 4steps version.

No need to slop it further. That's the issue with Wan: it takes up to 20 mins for a simple smooth non-jagged video because the community has accepted color-shifting speed LoRAs as the standard.

>>107296092
Looks pretty good on some prompts desu. Way better solution than lightx2v Wan.
>>
>>107296142
>tested one prompt
>draws definite conclusion
>>
>>107296178
>color shifting speed LoRAs
never had the problem with this even for older loras, this is a quantlet/wf issue
>>
>>107294974
Is the "1girl and Beyond" rentry guide and its accompanying workflows still worth using in the current year?
Would it be worth exploring different workflows for WAI? Any worthwhile benefits?
>>
>>107295267
>Do you think we might get local Sora in ~5 years?
we will definitely get some form of a video + audio model with lipsync in 2026. all the technology has been invented, now it's just up to someone to collect the data and train it

at worst it's going to be wan 2.2 + an audio addon, but at best it'll be a fully new model. and of course it will need no less than 24gb of vram to run, probably 32gb minimum at Q6_K, so mentally prepare for that because memory is not getting cheaper or more available any time soon
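That 24-32 GB guess is easy to sanity-check with back-of-envelope math (the 27B parameter count is a hypothetical, and ~6.56 bits/weight for Q6_K is borrowed from llama.cpp's packing, so these are assumptions rather than measured figures):

```python
def weight_vram_gib(params_billion, bits_per_weight, overhead=1.15):
    """Rough weight-memory estimate: params * bits / 8 bytes, plus ~15%
    slack for embeddings and runtime buffers. Back-of-envelope only."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30 * overhead

weight_vram_gib(27, 16)    # hypothetical 27B model in bf16: ~58 GiB
weight_vram_gib(27, 6.56)  # same model at Q6_K: ~24 GiB
```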

>>107296178
>Looks pretty good on some prompts desu. Way better solution than lightx2v Wan.
seko is just the newest t2v release by lightx2v
seko v2 actually, i guess v1 sucked since i never heard about it
https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-rank64-Seko-V2.0

>>107296186
you can't demotivate me deebster, I already got within 10% of the workflow with shark-autism nodes in this gen

>>107296198
it's hard to take opinions seriously because people may have misconfigured something. i spent 11 hours 3d printing something and thought the model was the problem, and I was super pissed at all the retards that gave it 5 stars until I realized I made a mistake
>>
File: ComfyUI_00030_.png (1.2 MB, 1024x1024)
>>
>>107296142
is this fucking AI ?
>>
>>107296142
>video motion was mostly fixed for me with the release of the Seko T2V lightning loras

It still looks pretty poor in some cases
https://huggingface.co/lightx2v/Wan2.2-Lightning/discussions/64

And they all have something in common: slow motion and slop (though even the base model has the plastic skin, so that can't be fixed easily).

Still waiting on updated I2V LoRA though, maybe it's currently better than I think it is for that.


