/g/ - Technology


File: tmp.jpg (1.36 MB, 3264x3264)
1.36 MB
1.36 MB JPG
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102434568

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/kohya-ss/sd-scripts/tree/sd3

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/aco/sdg
>>>/aco/aivg
>>>/b/degen
>>>/c/kdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vt/vtai
>>
Why the f do people need to offload to CPU in the default workflow for CogVideoX-5b on a 16gb card?
Should i just use clip-l or something instead of t5xxl-fp8?
480x720, 40 frames.
>>
File: file.png (1.68 MB, 3400x1579)
1.68 MB
1.68 MB PNG
>>102446752
I'm not doing that, I'm putting t5 fp16 on my second gpu
>>
File: 0.jpg (189 KB, 1024x1024)
189 KB
189 KB JPG
>>
>>102446800
I'll try the pruned model next.
>>
Finally got the newly released Qwen2-VL-72B-Instruct running locally. Their 8bit GPTQ seems broken, at least in my virtual environment. I upgraded the optimum library and now it loads correctly and runs, but produces gibberish text.

The fp16 model works with bitsandbytes 4bit quantization. BUT you must not quantize the vision weights (the "visual" module), or else it generates completely coherent text but can't understand the image at all. bitsandbytes 8 bit quantization seems like it might take just a bit too much memory for 96GB VRAM, since you have to leave the vision weights in fp16. But it doesn't balance between GPUs well, need to play with the device_map and see if I can squeeze it in.

Even with bitsandbytes 4bit quant, it seems pretty good from early testing. Possibly surpasses InternVL2 40b for captioning NSFW images. I will need to do side-by-side comparisons to know for sure.
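
For anyone who wants to try the same thing, my loading code is roughly the sketch below; double-check the class and argument names against your transformers version, and note I'm assuming llm_int8_skip_modules is also honored on the 4bit path:
[code]
import torch
from transformers import AutoProcessor, BitsAndBytesConfig, Qwen2VLForConditionalGeneration

# 4-bit quantize the language model but keep the vision tower ("visual") in fp16,
# otherwise you get coherent text that completely ignores the image
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    llm_int8_skip_modules=["visual"],  # assumption: the skip list applies to 4-bit too
)

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-72B-Instruct",
    quantization_config=bnb,
    torch_dtype=torch.float16,
    device_map="auto",  # spreads layers across GPUs; tweak manually if one card OOMs
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-72B-Instruct")
[/code]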
>>
>>102446869
>Possibly surpasses InternVL2 40b for captioning NSFW images.
I thought Qwen was cucked and couldn't do NSFW images
>>
File: file.webm (63 KB, 416x576)
63 KB
63 KB WEBM
>An anime woman with long, flowing hair, wearing a casual outfit consisting of a light blue hoodie and denim shorts, is sitting at a cozy outdoor café. She has a cheerful expression on her face as she takes a big bite out of a juicy hamburger. The background features a sunny day with a clear blue sky, a few fluffy clouds, and some greenery around the café. The table in front of her has a plate with fries and a drink, adding to the casual dining atmosphere.
it's not really good, I guess the model is just bad or I did something wrong with my workflow?
https://files.catbox.moe/2xh3yx.JPG
>>
>>102446962
It does seem kind of "soft censored", where it understands what's in the image but won't describe explicit details. But my use case is a bit unique, I guess. All my images are already tagged. I caption them with multiple VLMs, then feed all captions + tags into Mistral-Large with instructions on how to combine everything into a final caption. So for me the VLMs just need to accurately describe the overall scene, it's okay if they leave certain explicit details out, the LLM will just handle that by looking at the tags.
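
To picture the combine step, it's really just prompt assembly before the LLM call; a minimal sketch (the tag/caption values here are made up, and the printed string goes to whatever backend you run Mistral-Large with):
[code]
def build_merge_prompt(tags: list[str], captions: list[str]) -> str:
    """Build the instruction that asks the LLM to merge the VLM captions + tags."""
    caption_block = "\n".join(f"Caption {i + 1}: {c}" for i, c in enumerate(captions))
    return (
        "Combine the following captions of one image into a single accurate caption. "
        "The tags are ground truth; add any explicit details from the tags that the "
        "captions left out, and drop anything the captions got wrong.\n\n"
        f"Tags: {', '.join(tags)}\n\n{caption_block}\n\nFinal caption:"
    )

# example values; the resulting string is what gets sent to Mistral-Large
tags = ["smile", "squatting", "outdoors"]
captions = ["InternVL2 caption here...", "Qwen2-VL caption here..."]
print(build_merge_prompt(tags, captions))
[/code]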
>>
File: CogVideoXfun.webm (935 KB, 840x1024)
935 KB
935 KB WEBM
>>
File: file.png (1.32 MB, 3272x1573)
1.32 MB
1.32 MB PNG
>A woman is burning the book she holds on her hands
>Nothing's happening
it's terrible at prompt understanding, the fuck?
https://files.catbox.moe/8kxzfr.webm
>>
>>102447304
You really didn't give it much to work with
awkward prompt and awkward starting frame
>>
>>102447356
>awkward prompt
how so?
>awkward starting frame
how so?
>>
>>102447366
>on her hands
broken english
You hold a book with your hands.

There is no fire and no way for her to start a fire. She is looking at a book she is holding in her hands. Based on her pose, it would be something like "a woman is reading a book that bursts into flames"
>>
File: CogVideoXfun2.webm (841 KB, 840x1024)
841 KB
841 KB WEBM
>>
File: ComfyUI_01158_.jpg (1.38 MB, 4096x4096)
1.38 MB
1.38 MB JPG
>>
File: file.webm (594 KB, 512x512)
594 KB
594 KB WEBM
>A woman drinking coffee
Bruh, do something

>>102447401
ok I'll try your prompt in a min then
>>
>>102447435
did you try multiple samplers? and if yes which one seems to be the best?
>>
any happenings?
>>
>>102447492
we can use CogVideoX as an image-to-video model locally, but the results are mixed >>102447435
>>102447458
>>
>>102447458
It depends on how they captioned the video. Find a video clip captioner and see what it writes.
>>
File: file.png (31 KB, 633x444)
31 KB
31 KB PNG
https://github.com/kijai/ComfyUI-CogVideoXWrapper
>start_img
>end_img
wait, does that mean we can make keyframes like on Luma?
https://www.youtube.com/watch?v=KshorouF0s4
>>
>>102447492
Getting blue balled by pixart
>>
>>102447540
did they actually say they're releasing something?
since they're working with Nvidia I expect their release to coincide with the 50 series launch
>>
>>102447524
Fuck, i didn't even see that.
>>
File: file.webm (187 KB, 512x512)
187 KB
187 KB WEBM
>>102447401
>a woman is reading a book that bursts into flames
:(
>>
>>102447555
Some kind of rep said in their discord "we hope to let it out in September" so who knows
>>
>>102447574
hope to usually means add a month
>>
File: CogVideoXfun3.webm (365 KB, 976x584)
365 KB
365 KB WEBM
>>102447476
cogvideoxdpm seems to do better but there isn't a big difference tbqh, i'm trying to get rid of this ghosting effect i'm getting

>>102447458
https://github.com/kijai/ComfyUI-CogVideoXWrapper/pull/61

try testing this, someone is using dual clip (t5xl and clip l) to prompt with cogvideo
>>
what kind of vram usage does that cogvideox thing have? can i run it with 10gb vram? i'm assuming you can offload the t5 to cpu
>>
>>102447592
Especially in corpo dystopia, if Nvidia is making Pixart their model they're going to make sure 1) it lines up with marketing and 2) that it competes against Flux.
>>
>>102447639
>10gb vram
it doesn't get better from here anon
>>
File: file.png (37 KB, 662x351)
37 KB
37 KB PNG
>>102447639
>what kind of vram usage does that cogvideox thing have? can i runs it with 10gb vram?
for me, a base_resolution of 512 is asking for 15/16gb of vram, you could go for fp8 though, it would work for your 10gb card

>im assuming you can offload the t5 to cpu
yes you can, with the force/clip node >>102446800
https://reddit.com/r/StableDiffusion/comments/1el79h3/flux_can_be_run_on_a_multigpu_configuration/
>>
File: file.webm (392 KB, 512x512)
392 KB
392 KB WEBM
>>102447524
>wait, does that mean we can make keyframes like on Luma?
looks like it, for that one though the transition wasn't smooth at all lol
https://files.catbox.moe/3lazta.JPG
>>
>>102447672
>>102447680
cogvideo is a transformer right? wouldn't it be gguffable?
>>
>>102447729
videos by their very nature have long attention sequences, that's vram expensive
>>
>>102447729
it could yeah, if we can make it into fp8, we can gguf it
>>
>>102447639
>>102447672
>>102447729
>>102447743
>THE MORE YOU BUY THE MORE YOU SAVE


if Nvidia is working on anything, expect it to coincide with a 5000 series announcement where they either go all in with their jewish vram schemes, or finally give us what we want (and burn our houses down in the process).
>>
>>102447759
We already know what they're doing. 28 GB VRAM 5090 and 32 GB RTX TITAN AI, that's unlikely to change and they're both housefires.
>>
>>102447780
do we "know"? Not a single "leak" ive seen has been really all that credible, hell compared to the switch 2 leaks they might as well not exist.
>>
File: 00089-1075265336.png (291 KB, 512x512)
291 KB
291 KB PNG
This shit is driving me insane. I'm trying to learn RegionalPrompt/LatentCouple. I want a picture of two different anime characters and I guess I'm supposed to use Composable Lora based on my searches?
I've followed several guides and this is the best one I can do. 90% of the time it generates the blonde haired girl by herself. Sometimes it fuses her with the boy, but that's rare. I've only gotten two images of them together. I don't fucking get it man.
>>
File: 00091-1475306688.png (316 KB, 512x512)
316 KB
316 KB PNG
>>102447808
Like seriously what the fuck. I've even tried the mask shit and it still doesn't work.
Someone please, I'm going fucking insane.
>>
>>102447111 (me)
Okay I did more testing. Qwen VL 72b is slightly worse overall at describing an image compared to InternVL2 40b. And it is EXTREMELY fucking cucked. My setup is that I'm feeding in tags + image to a VLM and having it caption it, using the tags to help it. Here's an example, a woman peeing outdoors.

Tags: smile, squatting, spread pussy, outdoors, peeing, panties aside

Let's look at InternVL2's caption
>In the image, a woman with long, wavy hair is squatting behind a large rock in a public outdoor setting. She is smiling at the camera, revealing her spread pussy as she pees while holding her panties aside. She is wearing a light-colored jacket over a pink top, black shorts, and silver sandals. The background features a body of water and some distant buildings, with bare tree branches framing the scene.
Almost completely correct, except it's a black skirt, not shorts. Trying again:

>In the image, a woman with long brown hair is squatting in front of a large rock. She is smiling at the camera, wearing a white blazer over a light-colored top and a black skirt. Her panties are pulled aside, and she is peeing while squatting, with a stream of urine visible. She is also wearing silver strappy sandals. The background features a body of water and some buildings, with bare tree branches above her.
Completely correct, but neglects to mention her pussy.

Now let's try qwen:
>In this image, there's a person sitting on a rock near some trees. They're wearing a white jacket over a light-colored top and black shorts. The person is also wearing silver sandals. They appear to be squatting down, with their legs spread apart. There's a stream of water coming out from between their legs, suggesting they might be peeing. The background shows some buildings and possibly a body of water, giving it an outdoor, urban feel. The person has long hair and is smiling at the camera.
lol, lmao even
>>
>>102447793
They're all generally credible, because Nvidia likely does these leaks itself to gauge sentiment, and to get cards out this winter they'd already have to be going into boxes in a factory somewhere. Usually when you're a few months away from the announcement/launch everything is pretty solid. What you see on the various sites with speculated specs is 90% likely to be what you get, and any differences will be minor last-minute changes.

With that said, I think the Titan AI will be a generally good value as a mini AI supercomputer. Probably faster than a 4x3090 build for training any model that fits in its VRAM.
>>
File: 00026-318067804.png (1.81 MB, 1376x1072)
1.81 MB
1.81 MB PNG
>>102447808
>>102447830
oh shit nigger what are you even doing? it's easy as shit.
use latentcouple, just set your prompt and regions like in the instructions and that's it, thats how i got picrel.
>>
>>102447836
Oh I should mention, qwen VL will NEVER mention a person's gender. Even when directly instructed to do so, as I did in my example. It's always "person", "they", "them". And it will never mention anything related to NSFW stuff even when given in the tags. I actually can't believe the fucking chinks are doing this gender neutral troon shit now.

InternVL2 remains king for open-weights captioning, it seems.
>>
>>102447848
Titan AI? never heard of it, is it this generation's Titan X? Fuck i'd buy it if it's worth the monetary plunge.
>plus i need to finally get something on credit anyway
>>
File: 00093-1736028504.png (289 KB, 512x512)
289 KB
289 KB PNG
>>102447849
I took out my normal negative prompt and now it seems to be behaving. This shit is weird man.
>>
>>102447876
Rumored to be 600w or more, 60%+ faster raw performance than a 4090, 32 GB of VRAM which is also upgraded. Probably $2500.
>>
File: sonic sackboy shock.png (297 KB, 469x679)
297 KB
297 KB PNG
>>102447958
>32 GB of VRAM
>on a NEW TITAN
no way, that would be WAY too retarded and jewish even for the road nvidia is headed down. That would piss off even the most hardcore enthusiasts, paying over 2 grand for a tiny step up in memory on something INTENDED for high memory load apps like AI. No way.
>>
>>102447975
Pay $4000 for last generation's enterprise card. The fact is you're in the price range of people who have money and are using cards to make money. Once things are giving you ROI prices scale up.
>>
>>102447989
Then fuck these kikes, ill make my money off two 3090's, buying them used, and never give them a cent. Just because i upgraded to industry grade hardware doesn't mean i'd like to get fucked like an industry grade cuckold.
Don't exactly need a 600w 5090 to sell slopgens to indians.
>>
>>102447989
its a bot anon, don't get baited
>>
>>102448020
Or you could get a job
>>
>>102448032
>>102448036
all me btw
>>
File: file.webm (603 KB, 512x512)
603 KB
603 KB WEBM
>>102447615
>try testing this, someone is using dual clip (t5xl and clip l) to prompt with cogvideo
>>102447458
>A woman drinking coffee
kek now she needs to stop coffee, that shit is definitely bad for her health
>>
>>102447888
so does this mean my Loras aren't compatible or something? I've been trying to figure out how to do this for a while and I'm so damn close.
>>
It's really funny to see people complaining about the price of speed boats. There are much more expensive hobbies out there.
>>
>>102448077
>Now, paying for a Gym Subscription will cost 5000 dollars per month
>Why are you so angry? I mean, there are much more expensive hobbies out there
you realize how retarded your "argument" sounds anon?
>>
>>102448057
bro clear out that entire negative prompt what the fuck are you doing, this isn't 1.5.
>>
>>102447540
>Called PixArt
>Not a pixel art model

What kind of idiots come up with this stuff
>>
>>102448057
you're genning at 512 resolution, sdxl models work at 1024
>>
>>102448151
really? I generally notice improvements but I'll take your word for it and try it out.
Please tell me what the fuck else I'm doing wrong, I replaced Remilia with Tohsaka to see if it was the Loras but this shit is apparently gonna take 20 minutes to see if my hypothesis is correct. GTX 1080...
>>
>>102448106
people spend $5000 on gym equipment for their home gym all the time, not quite the own you think
come to think of it, I heard you can rent GPUs, that's like a gym membership lmao
>>
>>102448207
but we're talking about a gym subscription there, why are you changing the subject?
>>
>>102448220
is a gym subscription owning a gym?
it's hardly relevant and it's arguing my case given everything in a gym costs thousands of dollars each, each treadmills is $5000+
>>
>>102448194
70%, it's not looking promising.

>>102448187
If I generate it any bigger than this my PC shits the bed. I'm stuck at 512x512, I did manage to get hires fix working finally, but fuck this fucking bullshit is still fighting me.
>>
>>102448194
>fellow 1080 enjoyer
i will go out of my way to detail helping you out
first off, ditch whatever retarded shit you're using right now and get
>pony autismmix or whatever higher rated anime focused model you need
>latent couple https://youtu.be/uR89wZMXiJ8?t=282
>make sure you're on reforged instead of standard forge, standard forge has AIDS and can't even unload the models correctly taking up all your system ram and crippling everything
>ALWAYS follow model instructions for what negatives you really should use, if any, most good models don't need negatives beyond score_1, score_2, score_3
>you can easily crank out 768x1080 at a reasonable upscale, or just render the actual resolution you want up to, i would think, 1280 max; i generally go for wallpaper sizes since pony works better with it + faster s/it
>99% of the time you just use euler A karras at 20 steps, but it depends on the complexity of what you're doing, and even then upscaling fixes whatever fine details got crushed

>>102448240
yeah get reforged
>>
>>102448250
The latest version of Forge absolutely shits the bed on me. I switched to Automatic1111's because I was tired of it crashing on me. Do you use old forge or is there some black magic with nu-forge I can't find to make it not shit?
>>
>>102448231
>is a gym subscription owning a gym?
it's a hobby that has a cost, like what we're doing right now, so it's completely relevant
>>
>>102448267
reforge is forge but not retarded and actually doing the shit forge was supposed to be in the first place
i dont know what the forge dev did to make it somehow worse than a1111 lately but he did.
https://github.com/Panchovix/stable-diffusion-webui-reForge
>>
>>102448271
Yeah anon you can use Runpod and rent a 4090 for $0.40/hour. Again, not quite the own you think it is. I know you're actually 80 IQ, but you're complaining that good treadmills cost $5000 when we use your gym analogy. Given you're a greedy asshole I'd imagine you'd understand Nvidia's perspective.
>>
File: file.webm (286 KB, 512x512)
286 KB
286 KB WEBM
holy shit, that keyframe transition was smooth!
https://files.catbox.moe/57lyrk.JPG
>>
>>102448281
holy fucking shit dude thank you, how have I not heard of this before today.
If I scroll up and see it in the OP I'm going to cry.
Okay thank god it's not there I was going to feel so stupid. You're a fucking champion, anon. Thank you.
>>
>>102448306
when you faggots act like this you don't convince a single person to buy ((nvidia))'s garbage overpriced graphics cards, in fact it only incentivizes me further to buy them used.

>>102448319
yeah welcome to /g/ generals the OP's are never updated.
>and this is ComfyUI turf anyway
>>
>>102448306
You talked about "hobbies" there >>102448077
>There are much more expensive hobbies out there.
if you want to pretend that going to the gym (and therefore, paying for a subscription) isn't a hobby, then I can't do much more for you anon, you probably have a 2 digit IQ I'm afraid
>>
>>102448328
Oh no that personally affects me
>>
>>102448271
>>102448328
nta but are you the laissez faire capitalist?
>>
>>102448344
you're quoting two different people bud, i disagree with the retard that's trying to get us to justify $2,000 for a new 600w 32gb gpu in 2024.
>>
>>102448337
You made a textbook strawman and, worse, attempted an analogy without the brain cells to think it through. It's particularly funny because going to the gym is an expensive hobby, so really it just shows you're not only poor but also extremely fat.
>>
>>102448356
I'm not trying to get you to buy shit. I'm merely telling you you're bitching about how your luxury hobby is expensive when, compared to just about every other hobby out there, it's quite cheap. Do you know most artists spend several thousand dollars on art supplies every year? In the grand scheme of things, computers and AI are one of the cheapest luxury hobbies out there.
>>
>>102435695
If it's that hand fixing method that uses MeshGraphormer, the way it's meant to work is hand detection (bounding box) -> fitting a hand mesh to the deformed hand as well as possible with MeshGraphormer -> rendering the depth map of the fitted hand mesh, which is thresholded to also provide a mask -> inpainting the region defined by the dilated mask, using a depth Controlnet with the hand depth map as input (rough sketch of that last step below).
For anime, based on what I've seen using the hand refinement stuff in the Acly Krita plugin, either the bounding box hand detector or the MeshGraphormer fails to detect/fit the hand, because neither was trained on anime I think.
While training a bounding box detector for anime hands would be doable and might already exist, a specialized MeshGraphormer would be harder. Maybe using a synthetic dataset with toon-shaded anime models, perhaps augmented with light img2img.
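For reference, the final inpaint stage looks roughly like this in diffusers, assuming the hand depth map and dilated mask already came out of the detection/MeshGraphormer stages (checkpoint names are just the usual SD1.5 depth ControlNet ones, swap in whatever you actually run):
[code]
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetInpaintPipeline

# inputs assumed to come from the earlier detection / MeshGraphormer stages
image = Image.open("gen.png")                    # the original generation
hand_depth = Image.open("hand_depth.png")        # rendered depth of the fitted hand mesh
hand_mask = Image.open("hand_mask_dilated.png")  # thresholded + dilated hand mask

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11f1p_sd15_depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

fixed = pipe(
    prompt="detailed hand, five fingers",
    image=image,
    mask_image=hand_mask,      # only this region gets repainted
    control_image=hand_depth,  # the depth controlnet holds the corrected hand pose
    strength=0.75,
).images[0]
fixed.save("gen_fixed_hand.png")
[/code]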

(reposting because new thread)
>>
>>102448306
NTA but I thought the big criticism against Nvidia was that VRAM itself is quite cheap and is marked up significantly on their GPUs

They could offer much higher VRAM in consumer GPUs, but they choose not to because then people wouldn't need to upgrade as much
>>
File: steve smug smile.gif (502 KB, 207x215)
502 KB
502 KB GIF
>>102448388
wow man i stopped reading after the first sentence. Good day.
>>
>>102448401
yeah that's the debo effect, that's so retarded you stop reading because you know it will burn some of your braincells as well
>>
>>102448400
I don't agree with what Nvidia does but I also don't believe people should do things out of the kindness of their hearts for free either. Nvidia doesn't need to add VRAM so they don't. I think this will bite them in the ass in the end when a true AI competitor comes along and profits off the ill will Nvidia is generating.
>>
File: file.png (1.23 MB, 1280x720)
1.23 MB
1.23 MB PNG
>>102448423
>I think this will bite them in the ass in the end when a true AI competitor comes and profits off the bad will Nvidia is generating.
I hope so anon
>>
File: 00050-60544550.png (1.66 MB, 1632x1152)
1.66 MB
1.66 MB PNG
>>
>>102448423
As an investor, not sure if I want that

As a hobbyist, I would love that but it seems unlikely since all of the current AI tech works off the proprietary cuda cores and such. Either a regulatory agency with guts would need to get Nvidia to stop monopolizing AI tech or AI tech would need to be practically rebuilt from scratch
>>
>>102448504
It's going to happen because there's too much money to be made. Anyone can essentially look at the pytorch code base and make hardware based around it and given we're talking about billions of dollars on the table, someone is going to take a bite.
>>
>>102448524
We'll see. GPUs have to be up there with some of the most difficult-to-produce technology that exists, and none of the existing competitors have made any headway so far
>>
>>102448493
pretty style
>>
>>102448580
When increasing profits, cutting out your supplier is #1. Remember how Amazon replaced UPS and Fedex for shipping? It's how it's always done. Google, Amazon, Microsoft, or one of the many other major corpos will either get tired of paying the Nvidia tax or feel Nvidia is crippling their competitiveness. One of the reasons Amazon cut out UPS and Fedex was because they completely borked shipping over the holidays one year.
>>
>>102448623
And then Google/Amazon/Microsoft will pass the savings to us and provide consumer-priced high VRAM GPUs to the masses, right?

...right?
>>
>>102448650
Yes because to break into the market they have to provide a good value proposition. You have to remember there's major resistance to change so bribery is the main way to do it. We saw it with the Intel cards which have a very good performance/$ ratio.
>>
>>102448688
We shall see. My optimism for the free market died a long time ago, especially for the GPU market
>>
Does CogVideoX work on 6GB vram?
>>
File: file.png (59 KB, 1494x733)
59 KB
59 KB PNG
>>102448794
nope, if you go for bf16 it's asking for 16gb of vram, if you go for fp8 it's asking for 10.6gb
>>
>>102448746
Because consumers are retards every time; they won't make the decisions that would actually keep the market competitive. Look at people every time talking about how AMD makes good mid range cards, then not supporting them, then pearl clutching about DLSS and other shit.
Even with Zen, it took 3 iterations for people to start buying them. I remember actually going out to buy the first iteration 1700X. Thank god that was enough for AMD to keep chugging to Zen 2, which people then went out to buy.
So now I bought an Intel A770 like a fool and am waiting for Intel to bring Celestial for the generation I'll upgrade to. It was easier than ROCm, but because people can't comprehend installing a few more packages than normal, god forbid, we're "stuck" with a Nvidia monopoly for now. I wonder if I'm going to look like a fool for doing this.
>>
>>102447645
I don't think anyone knows the extent to which Nvidia has any say. I'm sure regardless the team wants it to compete with Flux, but I can't say I'd be thrilled if Nvidia "makes sure it aligns with their marketing".
>>
>>102448524
>given we're talking about billions of dollars on the table
It's billions of dollars on the table in the enterprise segment.
Nowhere fucking near that much in the consumer segment. In fact, consumer shit is an extremely minor market for this stuff. Even more in fact, for the VAST majority of supposed consumer needs, the "correct" market approach is actually going to be iGPUs that are "good enough" for low level LLM use, which we're already starting to see getting pushed into the market.
We might end up seeing some large VRAM dedicated consumer cards starting to show up, but that's not going to happen for another 2-3 years minimum due to design time and fab timeslots/cost. If intel were actually really fucking smart they would have scrapped literally everything they've been working on with Arc and redesigned it from the ground up with AI use as the primary design goal and gaymen as the secondary, but I really doubt they managed to do it, or at least managed to do it in time for it to actually come into play any time in the foreseeable future.
>>
>>102448794
https://github.com/MinusZoneAI/ComfyUI-CogVideoX-MZ
2gb too short
>>
>>102448929
Nvidia would have a lot to say because they're paying them. What I meant by aligning with their marketing is they'd likely design the model around making the 50 series shine, something they can showcase during their announcement as a SOTA model.
>>
>>102448962
I want to believe that average consumers will eventually be able to shift to using AI locally for many tasks, and that they'll move away from subscription based services like ChatGPT as more powerful GPUs become widespread

Nvidia's business practices are the biggest blockade preventing this from becoming a reality right now
>>
>>102448962
It's like being in the 80s and saying PCs are enterprise only. We both know AI is a hundred billion dollar industry that impacts everything we know about life, it's just the next layer of computing from personal computers to the internet and now AI.
>>
File: CogVideoXfun4.webm (302 KB, 968x584)
302 KB
302 KB WEBM
my reaction when someone asks about local image2video
>>
File: fishbed compressed.png (2.97 MB, 3550x1913)
2.97 MB
2.97 MB PNG
Can someone please extend this drawing upwards using an AI generative function that extends pictures beyond their borders?
I need it for a custom phone case, and the sky needs to be way taller.
>>
>>102449057
My hope for pixart just dropped to the floor kek fuck this gay ass earth
>>
>>102449150
Well the good news is they'll still want to sell 5060s. If I were to guess, they'll do something like SAI and release a small model (for the 5060s and 5070s) and a large model (for the 5080s/5090s)
>>
>>102449131
we can do video now?
>>
>>102449131
we'll get there, but yeah for the moment it's rough, but it's something, trust ze plan
https://blackforestlabs.ai/up-next/
>>
>>102449211
yeah we can >>102448308
>>102447615
https://github.com/kijai/ComfyUI-CogVideoXWrapper
>>
File: ComfyUI_01593_.png (343 KB, 512x512)
343 KB
343 KB PNG
>>102449228
woah, based
>>
>>102449213
https://reddit.com/r/SoraAi/comments/1fk4nfk/i_got_access_to_sora_and_made_this_kpop_inspired/
I wonder if they'll do something about Sora at some point, people are seriously catching up to them
>>
File: 00039-746937976.png (370 KB, 512x512)
370 KB
370 KB PNG
Anon trying to do Latent Couple again.
https://litter.catbox.moe/5zzkc6.png
Posting via catbox so I don't contribute more to image limit.
Any ideas? I read I was supposed to disable xformers, but it's not installed in this version.
Sorry I keep bugging you guys about this, I'm pretty stupid and this shit is sort of obnoxious to Google.
>Oh here's my issue!
>Posted 2 years ago
>Fixed in the newest version :)
>>
File: file.webm (362 KB, 512x512)
362 KB
362 KB WEBM
>>102448308
kek
>>
>>102449479
increase image resolution to 1024, reduce cfg to 3-4 and you will win. i am pretty sure a 1080 can handle sdxl at 1024 no issues, if it's taking forever then it's probably because of that nvidia driver thing on windows where it offloads vram to ram, i cannot remember what it was called, shared memory i think?
>>
>>102449479
>so I don't contribute more to image limit.
we never hit image limit so don't worry about it
>>
>>102449702
I also wanted to post Lain

>>102449680
I'm still getting the same error I think I'm missing something.
Am I supposed to be using the latest Python version, because it refused to launch until I forced it to use 3.11 like with Automatic1111's.
>>
>>102449730
what ui are you using? i think both forge and automatic1111 use an older version of python. also, are you using a venv? use a venv
>>
>>102449228
retard here, how do I install this?
pls help
>>
>>102449745
I'm using venv. Trying reForge right now at another anon's recommendation.

IndexError: index 1 is out of bounds for dimension 0 with size 1
index 1 is out of bounds for dimension 0 with size 1
*** Error completing request

If I'm reading this right, even though I have Latent Couple enabled, it's only recognizing one region for some reason, so it's failing to generate because it doesn't know what to do with the AND?
>>
File: file.png (105 KB, 3037x572)
105 KB
105 KB PNG
>>102449764
do you have comfy manager? if yes then you just install it that way
>>
>>102449789
sorry but i've never used reforge or latent couple before so i don't know how to help you there. if you're really desperate have you given comfyui a try? the learning curve is much higher but it has the best performance out of all the uis
>>
>>102449875
I've used UI's similar to ComfyUI in the past. I do not understand them. I'll keep unga bunga'ing this for now then.
>>
>>102449833
ok thats what I did, anything else I need to download?
>>
>>102449936
not really, you use a workflow, make sure you're using t5 for the clip encoder, generate, and then the repo will download the CogVideo model before running it for you, it'll take a while tho, that's a big motherfucker kek
>>
File: 1697480628758702.png (473 KB, 728x720)
473 KB
473 KB PNG
>mfw i fucked up comfyui trying to install cogvideo
someone..
test video inpainting for me..
please..
>>
>>102449883
by the way, you want that latent couple thingamajig to prompt multiple characters right? then give artidiffusion a try, their thing is that it's better at multiple subjects and prompt comprehension. just make sure to read the model page first to learn how to prompt it
>>
>>102449965
You are so close and yet so far...
>>
File: awsfwasfwasgfwasgwa.png (568 KB, 1713x923)
568 KB
568 KB PNG
>>102449789
follow the latentcouple instructions you ungabunga dingaling
stopping my 7 minute 4 image gen process just to screenshot my last latentcouple gen for you
>>
File: wasfgv wsafgvtwas.png (31 KB, 1508x198)
31 KB
31 KB PNG
>>102450031
>literally was in such a hurry that forge still showed what i was last prompting
kek
anyway format your prompt right, like in the instructions, background first then left character then right character separated by AND

>and if it gives you shit like duplicating a character just try my neg, like i advised earlier, dont go schizo with your negs and you wont have problems

and TURN your resolution up your gtx 1080 is calling you a pussy bitch, i bet you didn't even undervolt and overclock it either.
>>
>>102450031
>>102450052
That's what I'm doing but I can't get past this error. I'm missing something and I'm really not sure what.
I'm currently on a sleep cycle reset so I'm insanely tired, and I'm already pretty dumb so it's really hard for me to elaborate right now. Something is fuckered and it's refusing to work. It's being a little bitch ass.
I'll turn it up when I know this is working, I'm trying to just shit these out quick so I can see if it works or not. and I don't no, last time I tried to OC it the fucker almost died on me and my friend who actually understands hardware doesn't believe in overclocking so he won't help me.
I'm gonna keep trying to unga bunga but I think something is wrong with like... the install/packages or something. Regional Prompt crashes, Latent Couple crashes. Both with the same error. Regional Prompt sort of works if I don't use latent and I use the other one, but it also generates a garbage image so I dunno.
>>
>>102450129
you know what? I think we're sleepybros for life here, i'm so fucking brain gone from doing twenty different tasks right now on top of exhaustion, i DROPPED latentcouple for regional prompter when i realized that was working better
i'll get some better instructions going in a second for you to follow when you're not brain gone, honestly my bad, i have no idea how i completely forgot this within the span of not even 24 hours
>>
>>102450129
>overclocking
nta but undervolting isn't exactly the same as overclocking... sort of, you should look into how to do it. reduces temp and power consumption while keeping (or even increasing) performance, free performance!
>but it also generates a garbage image so I dunno.
did you gen at 512
>>
File: CogVideoXfun5.webm (243 KB, 1216x736)
243 KB
243 KB WEBM
finally managed to get decent image quality
>>
>>102450129
alright there you go, copy it verbatim or swap the regions depending on what youre trying to do
have fun, see you in dreamland.
>>
File: ComfyUI_01636_.png (1019 KB, 896x1152)
1019 KB
1019 KB PNG
>>
File: file.png (1.3 MB, 1024x768)
1.3 MB
1.3 MB PNG
https://civitai.com/models/773474/flux-lora-doom?modelVersionId=865100
was about fucking time
>>
File: 00023-3229026810.png (1.56 MB, 1024x1024)
1.56 MB
1.56 MB PNG
>>102450215
Anon if I was gayer I'd suck your dick, teeth and all.
Thank you so fucking much. Remilia's lora seems to be a bit strong, but this is so much fucking progress.
>>
>>102450370
yeah fuck with LORA strengths, trying to do a gen of stocking anarchy and princess peach with peach's tits out, it wasn't cooperating till i lowered stocking's lora and removed a detailer
its very anal about loras for sure.
>tomorrow morning or later tonight after a nice nap im absolutely going ham and learning how to do complex sex scenes with this
>>
File: file.png (2.19 MB, 1024x1024)
2.19 MB
2.19 MB PNG
>>102450360
>>
File: CogVideoXfun6.webm (506 KB, 1216x736)
506 KB
506 KB WEBM
now we're getting into something
>>
>>102450522
ehh, their face looks normal now, what did you do to improve it?
>>
File: CogVideoXfun7.webm (402 KB, 848x1024)
402 KB
402 KB WEBM
good, now time to add RIFE and baby you've got a local img2video model going
>>
>>102450580
>RIFE
what's that?
>>
File: cogvideoxfun.png (105 KB, 1575x719)
105 KB
105 KB PNG
>>102450536
changed the tile height/width to 256 and that seemed to fix the weird glitching/ghosting I was getting.

This is the workflow I'm using, it's a mix of Kijai and CogVideoX-MZ
>>
>>102450635
>its a mix of Kijai and CogVideoX-MZ
didn't know they were compatible, that's cool

>changed the tile height/width to 256 and that seemed to fix the weird glitching/ghosting I was getting.
nice, did that increase the VRAM or something? I have no idea what it does so I didn't change it kek
>>
>>102450630
frame interpolation, CogVideoX outputs 8 fps
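if you don't want to set up RIFE, ffmpeg's minterpolate filter is a quick-and-dirty way to get the 8 fps output up to 24 fps (worse quality than RIFE, but it's one command):
[code]
import subprocess

# naive motion interpolation from cog's 8 fps output up to 24 fps; RIFE will look better
subprocess.run([
    "ffmpeg", "-i", "cog_output.webm",
    "-vf", "minterpolate=fps=24:mi_mode=mci",
    "cog_output_24fps.mp4",
], check=True)
[/code]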
>>
all i want are decent live2d-esque animations and i'll be a happy man
>>
File: 1726709696.png (1.38 MB, 1024x1024)
1.38 MB
1.38 MB PNG
>>
>>102448400
Cutting the memory bus down saves them die space as well
>>
>>102450657
nope, but vae tiling allows me to gen bigger resolution images without going OOM, i tested several heights and widths and all gave me weird seams or glitchy results, 256 seems to work fine in preserving image quality
>>
File: 1726710856.png (1.72 MB, 1024x1024)
1.72 MB
1.72 MB PNG
>>
Couldn't CogVideo be used to make full movies/tv shows?
Anyone can make a film or tv show by gluing together clips. Each final frame of the 2 seconds could be tweaked a little by a standard image-2-image model to adjust it slightly so the model doesn't get lost in hallucination, but then you could just use the final frame to make new videos. If you want higher FPS there just needs to be a slowmo LORA or finetune.

Here would be the workflow:
1. set up CogVideo with a "3x slowmotion" finetune (or LORA?)
2. generate an image from your favorite text-to-image model
3. have CogVideo use image-to-video on your image to make an 8fps 6 second clip at 3x SlowMo
4. Speed up the clip by 3x to get back to normal speed, making it a 2 second 24fps clip (like a regular Hollywood movie)
5. Take the final frame of the clip and feed it into an image-to-image model at very low denoise strength just to adjust it and fix any defects
6. Using your touched up final frame image, loop back to step 3 and make the next 2 seconds of your movie (rough sketch of the loop at the end of this post)
There still needs to be a solution to keep the narrative of your movie going, it's not like you can plan out a scene and guide Cog toward it.
Maybe you can just use a jump cuts and different angles?

But overall it seems like generating movies / TV shows (or even just scenes / skits / backgrounds for green screens) is already technically possible?
None of the steps above are currently impossible.
Am I missing something, or are we here /g/bros?
Full movie / tv show workflow coming very soon?
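
In pseudo-Python the loop would look something like the sketch below. Every function in it is a placeholder for whatever model or node you'd actually wire up (Flux for txt2img, CogVideo plus a hypothetical slowmo finetune, any img2img pass, ffmpeg for the speedup/concat); none of this is a real API:
[code]
def generate_movie(shot_prompts, start_image):
    """Chain 2-second clips by re-feeding a cleaned-up last frame (all calls are placeholders)."""
    clips = []
    frame = start_image                                        # step 2: from your txt2img model
    for prompt in shot_prompts:                                # one prompt per ~2 second shot
        slowmo = img2video(frame, prompt, fps=8, seconds=6)    # step 3: 8 fps, 6 s, 3x slowmo clip
        clip = speed_up(slowmo, factor=3)                      # step 4: -> 24 fps, 2 s
        clips.append(clip)
        frame = img2img(last_frame(clip), denoise=0.2)         # step 5: low-denoise touch-up
    return concatenate(clips)                                  # glue the shots / jump cuts together
[/code]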
>>
File: 1726711102.png (1.72 MB, 1024x1024)
1.72 MB
1.72 MB PNG
>>
File: bComfyUI_114622_.jpg (1.36 MB, 3072x1536)
1.36 MB
1.36 MB JPG
awkward lookin fuckers
>>
>>102450850
Key frame anime / 2d animation will be what people will be doing first, combine it with a model like Flux trained on storyboards and key frames and you got something interesting.
>>
>>102450861
imagine if one of those comes up your toilet bowl while you're taking a shit and goes inside
>>
File: ComfyUI_01619_.png (1.35 MB, 896x1152)
1.35 MB
1.35 MB PNG
>>102450987
kino
>>
File: file.png (283 KB, 2773x1343)
283 KB
283 KB PNG
https://github.com/kijai/ComfyUI-CogVideoXWrapper/issues/54#issuecomment-2359159867
>Official one looks to be better quality, but it's still resolution locked to the same default as CogVideoX, while the "Fun" version isn't. Both are running with these nodes now to some extend, and the "Fun" version works with fp8, including fast mode, and the Comfy T5 so I'd say it's better than the official wrapper.
hmm... I thought the only difference between the official and "fun" versions was the resolution change, but turns out the quality is different too
>>
>>102451108
so is cogvideo being updated or is it the same stuff that we've been testing on that one space since release?
>might re check out the space in that case
>>
>>102451128
>it the same stuff that we've been testing on that one space since release?
it depends which one, personally I tested the space of both "official" and "fun" so... ^^'
>>
File: 00000-468246386.png (2.12 MB, 1496x1496)
2.12 MB
2.12 MB PNG
>>
Has anyone found a good way to prevent the Flux grid pattern overlay from showing up yet? Picrel. It's a good picture, but you can see the subtle grid pattern, most obvious on the admiral's forehead. Generated at FP16, so not a quantization artifact.
>>
File: 00000-2312638564.png (1.69 MB, 1216x1216)
1.69 MB
1.69 MB PNG
>>
>>102451144
do you use cfg > 1? show us your workflow anon
>>
File: 00000-3346979905.png (2.48 MB, 1024x1472)
2.48 MB
2.48 MB PNG
>>
File: 00000-1135797184.png (1.57 MB, 1088x1088)
1.57 MB
1.57 MB PNG
>>
holy KINO cog is AWESOME
>>
File: CogVideoXfun8.webm (1.32 MB, 848x1088)
1.32 MB
1.32 MB WEBM
>>102451108
lets test it
>>
>>102451144
I haven't noticed this at all, can I try your prompt?
>>
>>102451241
scary stuff
>>
>>102451237
kek, looks like the fish is stuck on her hand, reminded me of this
https://youtu.be/vXpXKfQ18mw?t=35
>>
>>102451251
yeah its funny that this is what BING has to offer, i can do better in XL.
which makes me want to, i just need to figure out what i would prompt to get that kind of extreme camera angle and focal point.
>>
File: 00001-523217599.png (1.79 MB, 1024x1024)
1.79 MB
1.79 MB PNG
>>
File: CogVideoXfun9.webm (1.37 MB, 848x1088)
1.37 MB
1.37 MB WEBM
>>102451249
indeed
>>
>>102451263
gentlemanly fellow
>>
>>102451237
wonder what a finetune of this thing could do
>>
>>
File: 00002-620219287.png (1.9 MB, 1024x1536)
1.9 MB
1.9 MB PNG
>>
>>102451289
we'll get something better soon, CogVideo will be made obsolete by BFL's video model, 2 weeks, trust the plan
https://blackforestlabs.ai/up-next/
>>
File: 00012-3912686238.png (1.82 MB, 1024x1472)
1.82 MB
1.82 MB PNG
>>
looks like 2023 in here
>>
File: bogsylvania.webm (211 KB, 720x480)
211 KB
211 KB WEBM
was it really always this good at reading/understanding prompts? damn.
>An ominous looking man stands in front of a wide window, as lighting strikes in the background, casting strong shadows and flashes of light in the foreground, he slowly shakes his glass of wine.
i probably promptleted the last bit which is why he's not moving his arm but wow


>>102451323
i trust this plan given this wasn't even fathomable a year ago
>>
File: 00340.png (993 KB, 1072x1072)
993 KB
993 KB PNG
>>
File: file.png (412 KB, 544x500)
412 KB
412 KB PNG
>>102451342
>he prompted?
>>
>>102451323
western shemale model only have 0.000001% soul of china super models (currently in training) heaven and earth cannot compare
>>
File: 00029-4182980268.png (1.88 MB, 1024x1024)
1.88 MB
1.88 MB PNG
>>
>>102451362
Wait for my all American model.
>>
File: ComfyUI_00840_.png (1.48 MB, 1024x1024)
1.48 MB
1.48 MB PNG
>>
File: 00035-2982740582.png (1.29 MB, 1024x1024)
1.29 MB
1.29 MB PNG
>>
>>102451362
>china super models (currently in training)
such as?
>>
>>102451383
我爱北京天安门 16b
>>
>>102451390
kek
>>
>>102451383
i upload to github soon
>>
>>102451397
2 weeks lets goo
>>
File: dark knight lau.jpg (84 KB, 1460x801)
84 KB
84 KB JPG
>>102451397
>I go to github, out of openAI jurisdiction
>>
>>102451362
>western shemale model
kek, indeed
>>
File: file.png (233 KB, 546x711)
233 KB
233 KB PNG
A new model is like a blank canvas that could go anywhere. Thanks Pixelprose and Coyo for having a good diverse dataset for priming a 16 channel VAE.
>>
>>102451362
>western shemale model
not the right time to say that when Qwen2-VL just got out today and this model describes every single human as a "character" and "they", it doesn't know what a man or a woman is
>>
>>102451417
you're training a 16ch VAE anon? for which model?
>>
File: ComfyUI_00774_.png (1.02 MB, 1024x1024)
1.02 MB
1.02 MB PNG
>>
>>102451440
would
>>
>>102451431
I updated Pixart Sigma to use Flux's VAE. I'm also experimenting with using Dynamic Routing in the training for the attention, text encoder, and layers. I'll be training a 1B Pixart model using this.
>>
File: file.png (1.47 MB, 800x1920)
1.47 MB
1.47 MB PNG
Welp I guess it's haunted
>>
>>102451420
china communist country,,,, everybody equal,, no need to know if man or woman still can work at temu warehouse,,, not because shemale
>>
YYYYEEEAAAAAHHH COG CAN DO DANCING RAT ON A STUMP WE ARE SO BACK
>>
File: file.png (100 KB, 640x640)
100 KB
100 KB PNG
>>102451462
>no need to know if man or woman still can work at temu warehouse
kek
>>
>>
File: ComfyUI_00780_.png (1.01 MB, 1024x1024)
1.01 MB
1.01 MB PNG
>>
>>102451511
give her armpit hair
>>
https://reddit.com/r/aivideo/comments/1fk9dn8/kling_15_is_amazing/
holy fuck Kling upped their game big time, guess they felt that pressure from MiniMax, damn I wish we would get the same kino on local...
>>
File: 00039-107687397.png (1.48 MB, 1024x1024)
1.48 MB
1.48 MB PNG
My dumbass passed out thankfully only for an hour. Thanks again Anon you've gifted me a dark and terrible power and I promise only to use it for anime girls.
>>
>>102451680
kek i maybe got half an hour of real sleep tops
what lora strengths are you working with there? I let 10 process while i took a dump and it looks like maybe a few of them are usable.. I wonder if it's REALLY struggling with even just one lora at the low cfg scale i set (apparently instructions say to do that if loras are having trouble)
https://files.catbox.moe/0txlgj.png
>>
>>102451585
shitty vfx ai videos are so overrated, that redditor doesn't show anything good, MiniMax can output really good quality humans with physics
>>
File: CogVideoX-I2V_00004.webm (462 KB, 720x480)
462 KB
462 KB WEBM
>>
>>102451775
slowly transitioning into a man
>>
File: CogVideoX-I2V_00006.webm (634 KB, 720x480)
634 KB
634 KB WEBM
So CogVideoX has better quality, I just wish it wasn't resolution locked
>>
>>102451831
guessing it's better *because* it's resolution locked and that's just how good it's gonna be till a model update.
>>
File: 00043-440361678.png (1.31 MB, 1024x1024)
1.31 MB
1.31 MB PNG
>>102451706
I had them both set to 1. I just moved Remilia down to 0.8 because she's bleeding over Rachel sometimes. Gonna move her up and down from there, but this does JUST WERK (once you follow the instructions fully...)
>>
>>102449131
how did the chinks manage to do this
it's so over 2bh
>>
File: ComfyUI_33913_.png (1.38 MB, 768x1024)
1.38 MB
1.38 MB PNG
>>
File: ComfyUI_33918_.png (1.32 MB, 768x1024)
1.32 MB
1.32 MB PNG
>>
>>102451457
AI is dangerous. It's all fun and games until AI destroys humanity.
>>
File: 57063.jpg (2.65 MB, 3600x3600)
2.65 MB
2.65 MB JPG
>>102452390
perhaps humanity deserves to be destroyed?
>>
File: 00059-2017228288.png (1.36 MB, 1024x1024)
1.36 MB
1.36 MB PNG
Fuck it I'm hanging out with you guys. I'm more of a /v/irgin than /g/entooman but those threads are full of discordfaggotry and circlejerking.
I don't care what cunts on Discord are doing. I just want to make anime girls.
>>
File: cogvideotest.webm (502 KB, 720x480)
502 KB
502 KB WEBM
wish this shit would work with more than 8fps and could generate longer videos.
>>
>>102452695
I haven't done video editing in a few years but you should be able to artificially fix the FPS in pretty much any video editor.
>>
>>102452692
ask cogvideotest anon to animate your gen and make them kiss
>>
>>102452726
Hey that's a good idea!

>>102452695
Hey anon would you do me a favor?
>>
File: ComfyUI_01654_.png (620 KB, 720x480)
620 KB
620 KB PNG
>>102452700
>but you should be able artificially fix the FPS in pretty much any video editor.
how?
wouldnt I need to generate more frames for that?

>>102452761
>Hey anon would you do me a favor?
sure what you need buddy?
>>
>>102452769
Can you make 'em kiss?
>>102452692
Think about it, it'll be great practice.

... I'm also the anon that mentioned fixing the FPS. It's some fucking black magic with frame interpolation. Adobe Premiere was the only decent one I remember, but apparently AI ones are pretty fucking good now (of course). I haven't fucked with it in like 5+ years though and I remember it was black magic when I was using it. Even some of the easier to pirate paid software should have decent frame interpolation without AI involved.
>>
File: cogvideotest2.webm (684 KB, 720x480)
684 KB
684 KB WEBM
>>102452798
>Can you make 'em kiss?
sure
>>
>>102452820
what on earth happened during that pause to make her brain orgasm like that
>>
>>102452820
10/10 thank you anon. Next time I boot into Windows I'll find a totally legal copy of Sony Vegas or whatever the new hotness is and fuck around with one of your gens for you.
>>
File: ComfyUI_01597_.png (1.07 MB, 1024x1024)
1.07 MB
1.07 MB PNG
>>102452831
kek I have no idea.
I kept the prompt simple with
>"two anime girls kissing each other"
>>102452832
>Next time I boot into Windows I'll find a totally legal copy of Sony Vegas or whatever the new hotness is and fuck around with one of your gens for you.
based
>>
File: eimiturningaround.webm (297 KB, 720x480)
297 KB
297 KB WEBM
>takes 6 fucking minutes for 6 seconds of video
>>
File: 1725579540233413.png (574 KB, 712x480)
574 KB
574 KB PNG
>>102450360
Nice, looks better than mine
>>
>The comfy T5
What exactly is this?
googlet5, llamat5, smooth t5 ?
Is it inbuilt to comfy, how big is it etc?
16gb vramlet here trying to not oom in Cog.
>>
>>102447458
>>102447458
This is what happens when you use budget animation studios.
>>
i wonder if we'll at least get 2d anim capability first really soon
i want THAT at least. man. being able to gen my favorite waifus riding dildos and shit on my ancient hardware would make my whole year. possibly life!
>>
Does adding more sampling steps to an image gen drastically change the result for a given seed? I notice that the more complex I want a scene, even if I'm keeping the style and subject, say she's bending over but now holding a burger with one hand, then in order for the model to not duplicate hands, add a burger to censor her pussy(?), or break some other detail, I need more steps, and then it'll do it fine.
I never figured steps had THAT much sway, I thought it was mostly a detail thing.
>>
>>102449097
I want to believe that too, but that doesn’t make it plausible
What makes you think that they will switch from subscription services to local?
>>
>>102452820
Brings back memories
Weren’t we here half a year ago?
>>
File: ComfyUI_33980_.png (905 KB, 1024x1024)
905 KB
905 KB PNG
>>102454143
>>
>>102454531
nice gen
i got my answer anyway https://civitai.com/models/288584?modelVersionId=324692&dialog=commentThread&commentId=538229
>sweet spot at 21 cfg 3-5
https://www.dropbox.com/scl/fi/7kxnrg3z88rqlubsfbn2i/xyz_grid-0000-1012054079.png?rlkey=8gkahu4e1m2p0cim5c56td54g&e=1&st=zq1366yh&dl=0
https://www.dropbox.com/scl/fi/diwn1i4tntdm5ezxn5iio/xyz_grid-0000-10120540792.png?rlkey=ky40uikdi0fgjhglv00sqlms0&e=1&st=oog7vdac&dl=0

at least for autismmix.
>>
>>102454549
not him but if you don't have success you can look into modifying how cfg is applied during the image process, i.e. x.x for up to 80% then x.y for the last 20%
i can't recall the actual term but it's something like CFG scheduling or adaptive CFG.
>>
>
>>
File: 0.jpg (233 KB, 1024x1024)
233 KB
233 KB JPG
>>
https://www.redgifs.com/watch/cuteslowvulpesvelox

rate my animated attempt
>>
https://github.com/cubiq/PuLID_ComfyUI/issues/69#issuecomment-2360010118
>it's a bit more complicated than I'd hoped, so it will take some more time as it's not my top priority now, sorry guys. the model is still not perfect anyway so you are not missing too much
bruh

>>102455519
not bad, you used the regular model or the "fun" one to do this keyframe?
>>
File: 0.jpg (320 KB, 1024x1024)
320 KB
320 KB JPG
>>
>>102455533
regular but i think i need to find a way to manage vram cause with only 12gb things are not getting as far as i want to
>>
File: file.png (105 KB, 3694x849)
105 KB
105 KB PNG
this is a joke, c'mon
>>
in comfyui how do i make a custom node that detects when i click on it (in the front end)
>>
File: file.png (850 KB, 3840x1750)
850 KB
850 KB PNG
can't make CogVideoX I2V work, did I do something wrong? All I did was to use the workflow that was on the repository
>>
how the fuck do you get openpose to work with xl? all it does is output gray nothingness
fucking downloaded the models i needed and picked dw openpose as the preprocessor
and fuckin animatediff doesn't work anymore so i can't use that either
cog-sama save me..i just wanna animate ten fucking frames god damn
>>
>>102455519
using this
https://github.com/MinusZoneAI/ComfyUI-CogVideoX-MZ/blob/main/examples/workflow_I2V_GGUF_Q4_0.png
Vram usage is 50% of my 16gb vram and 35% of my 64gb ram
I added the force set clip device node to offload the clip to cpu, and res is 720x480 for 41 frames, which takes 350-odd seconds on a 4060ti
>>
File: 00166-3605627581.png (483 KB, 768x768)
483 KB
483 KB PNG
gonna stick an icepick up my nose and penetrate my brain folds
>>
>>102455647
kek I wanted to use the dual clip encoder but it doesn't seem to work for the regular weights
>>
>>102455695
Forgot to add
>Are you using this...?
sorry.
>>
File: file.webm (74 KB, 720x480)
74 KB
74 KB WEBM
>>102455647
>>102455738
>A woman is reading a book that bursts into flames
I WAITED 15 MIN FOR THIS REEEEEEEEEEE
>>
How the fuck do you guys sleep when you could be generating one more image, it's getting late
>>
>>102456069
when flux came out I was sleeping like 4 hours a day and I always said
>just one more gen just one more gen
>>
>>102456117
That is literally me right now, two left in the pipeline and it's 2am. I gotta move my pc to the other room and batch this or something.
>>
>>102456069
ive been dicking around with posing/how2animate the past 4 hours pushing past exhaustion purely out of anger and spite
why the FUCK doesnt shit work nigger DICK
>one more image maybe itll work this time
>wow nice tits but its still wrong
>one more image
>>
>>102456139
catch up on some sleep bro
>>
File: 00082-4030818080.png (2.43 MB, 1120x1440)
2.43 MB
2.43 MB PNG
>>
File: bComfyUI_114595_.jpg (1.2 MB, 3072x1536)
1.2 MB
1.2 MB JPG
>>
File: 00001-4243765995.png (1.17 MB, 1024x1024)
1.17 MB
1.17 MB PNG
Last gen before I passed out last night. I'm up in the morning and not the late afternoon for the first time in a week, feels good man.
>>
>>102455534
nice
>>
>>102450813
Probably my favorite gen since the beginning of SDG/LDG
>>
>>102455715
Do a flip
>>
>>102455533
Cubiq has been exposed so bad after flux was released, dude hasn’t been able to implement any of his schizo stuff and his flux findings on twitter are really bad
>>
File: output2.webm (237 KB, 608x416)
237 KB
237 KB WEBM
Not bad
>>
>>102457303
I'm more surprised by the fact that no one has managed to implement PuLID on ComfyUI yet, it's been more than a week at this point. I saw this but it's not really working either:
https://github.com/lschaupp/PuLID_ComfyUI_FLUX
>>
Is AMD Radeon Pro VII 16GB good for this?
>>
>>102457356
They are waiting for someone else *cough Illya cough* to implement it in webui, then work from there
>>
>>
i would like to make an extension for my panel that displays my comfyUI status. does anyone know if there's any way to communicate with the server running in my terminal?
>>
File: 00006-1313012301.jpg (182 KB, 1984x1288)
182 KB
182 KB JPG
Sometimes I love hitting Pony models with just random style prompt cards and nothing specific and just seeing what happens.
>>
>>
File: 00007-498211980.jpg (174 KB, 1984x1288)
174 KB
174 KB JPG
>>
File: output6.webm (174 KB, 608x416)
174 KB
174 KB WEBM
I got a video with bad faces. I extracted the last frame, ran facedetailer, and ran cog again with that as the last frame. With a character lora it would have turned out better but it's workable.
>>
File: 00012-3747794771.jpg (229 KB, 1984x1288)
229 KB
229 KB JPG
>>
File: 00022-235855301.jpg (216 KB, 1984x1288)
216 KB
216 KB JPG
>>
File: output.webm (468 KB, 720x480)
468 KB
468 KB WEBM
Posting some shitty cogi2v gens I did. Anyone have much success with anime style starting images? Seems like it's only really trained for realism.
>>
>>102457402
Many such cases
>>
File: 00014-1429373809.jpg (203 KB, 1288x1984)
203 KB
203 KB JPG
>>
File: bComfyUI_116084_.jpg (623 KB, 944x1200)
623 KB
623 KB JPG
>>
File: 00031-2980058575.jpg (203 KB, 1288x1984)
203 KB
203 KB JPG
"asian" in negatives means blue eyes
>>
File: 00038-2884515977.jpg (178 KB, 1288x1984)
178 KB
178 KB JPG
>>
File: 00014-1746523043.png (1.02 MB, 1024x1024)
1.02 MB
1.02 MB PNG
Regional Prompt is pretty entertaining.

>>102458272
>Seems like it's only really trained for realism.
Weeabros we keep losing...
>>
File: output.webm (489 KB, 728x486)
489 KB
489 KB WEBM
>>102458486
It does ok sometimes I guess.
>>
For those of you messing around with the new CogVideo model what kinda gen times are you seeing? CogVideoX-5b-I2V takes like 5 minutes for me on a 4090 with fp8, just wondering if that's typical or if there's some tricks to get it to go a little faster.
>>
>>102458641
Oh shit yeah that's not bad at all. Still a bit uncanny but not as bad as I was expecting.
>that titty jiggle
lol. amazing.
>>
>>102458641
did you prompt jiggle for that or is it just rng? also is it cogvideox-fun or the official release?

I've tried with the official release but the bodies look static
>>
>>102458706
reduce steps to 30, anything below that will give you glitches and 30 seems to maintain quality
>>
File: output3.webm (486 KB, 728x486)
486 KB
486 KB WEBM
>>102458931
I prompted for the jiggle. It doesn't seem to be that much RNG, all the prompts I tried describing jiggle have been getting it. It seems to like to blur towards the end sometimes. It's with the official CogVideoX-5b-I2V release.

Have another.
>>
>>
>>102458706
>CogVideoX-5b-I2V takes like 5 minutes for me on a 4090 with fp8, just wondering if that's typical or if there's some tricks to get it to go a little faster.
don't worry, we'll get a 50X speed boost with the next Nvidia cards, trust the plan
https://x.com/tsarnick/status/1836516258877182299#m
>>
>>102459113
Can't wait to spend 5000 bucks for it.
>>
File: file.png (1.42 MB, 1574x1000)
1.42 MB
1.42 MB PNG
>>102459136
to be fair, if I have to pay 5000 bucks to generate a video in 6 sec instead of 5 min, you're damn right I'm gonna pay
>>
>>102459258
people would pay $5000 for a GPU that generated pornographic videos reliably, we're still early but give it 10 years, when AI software and hardware have stabilized, and you'll see people spending $5000 or more for a setup
>>
>>102459113
if that's legit true I would suck Jensen Huang's microdick and buy such a card on day 1
>>
>>102459258
If the speedup is anywhere close to what he says (even just 2-3x honestly) then it's a no brainer, still gonna suck to spend 5000 bucks on a GPU.
>>
File: ComfyUI_Flux_14141.jpg (413 KB, 832x1216)
413 KB
413 KB JPG
>>
>>102459579
that the dark fantasy lora?
>>
>>102459347
>>102459337
that Blackwell chip will only be for Data Centers right? it's not for the upcoming 5090 if I understand correctly
>>
File: ComfyUI_Flux_14149.jpg (377 KB, 832x1216)
377 KB
377 KB JPG
>>102459641
https://civitai.com/models/721039/retro-anime-flux-style
>>
File: bComfyUI_115892_.jpg (456 KB, 720x1080)
456 KB
456 KB JPG
>>102459681
oh that one, it kept giving me long ass hilts too. did good on poles and staffs tho.
>>
>>102459651
if thats not the 5090 I'm going to kill myself and blame it on Jensen Huang
>>
>>
Come and get that next loaf...
>>102460029
>>102460029
>>102460029


