6b Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107332452

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>WanX
https://rentry.org/wan22ldgguide
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
https://gumgum10.github.io/gumgum.github.io/
https://huggingface.co/neta-art/Neta-Lumina

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
6B for life
yumebros.... our days are numbered?
>SexB
>>107334502does anistudio support z image yet?
>RuntimeError: CUDA error: HIPBLAS_STATUS_ALLOC_FAILED when calling `hipblasCreate(handle)`NEVER UPDATE COMFYUIFUCK
Prompt:
人物特征:东亚年轻女性,齐肩中长发,发尾微内扣,深棕色头发,带有空气刘海;佩戴黑色大圆框眼镜,镜腿有蓝色细节;化淡妆,唇色为淡粉色,左耳戴一颗小巧耳钉,颈间有细链项链;穿着米白色针织上衣。风格氛围:清新日常的自拍风格,光线柔和,人物面带自然微笑,神态亲切。背景细节:后方有带有 “L NREODNAZ SOVEK ALLERHAO” 字样的圆形标识,背景墙是大理石纹理与木质材质的组合,带有暖光照明。

Translated to English:
Subject: A young East Asian woman with shoulder-length hair, the ends curled slightly inward. Her hair is deep brown, styled with wispy, airy bangs. She wears large black round-frame glasses with blue detailing on the temples. Her makeup is light, with lips in a soft pink shade. She has a small stud earring in her left ear and a thin chain necklace at her neck. She is dressed in a cream-colored knit top.
Style & Atmosphere: A fresh, everyday selfie style. The lighting is soft, and she smiles naturally, her expression warm and approachable.
Background Details: Behind her is a circular sign with the text “L NREODNAZ SOVEK ALLERHAO.” The wall combines marble texture and wood, illuminated by warm lighting.
>>107334491
WAN seems to avoid the "AI Slop" look, but it's hard to get it to follow lighting prompts (e.g. blue light) and the text in the image is weird.
Qwen + Wan seems to be the best combo: Qwen for the prompt adherence and good text, and Wan to remove the "AI Slop" look
---
https://files.catbox.moe/24i01w.png
Remember that china has won (but seriously though, when will they release it? I'm getting tired of that edging)https://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo/picture
nb4 z is a total flop and the anon who pretended to hype it the most spends days trolling like "AHAHAHA LOCALKEKS WERE SO EXCITED BUT I KNEW IT WOULD BE SHIT"
IT'S COMING
>>107334550flux bro. its over.
>>107334550sovless vs sovl
>>107334550Black Forest Labs? ... Nah.More like Big Fucking Losers.
HAIL ALIBABAHAIL CHINAHAIL XI JINPING THOUGHT
>>107334562This will most certainly happen.
>>107334574It never began. Eurocucks can't stop losing>>107334575truthnvke>>107334581all my niggas chicoms https://files.catbox.moe/lhf9on.png
>>107334566Niggel prease let this be true!
>Decoupled-DMD: The Acceleration Magic Behind Z-Image
>DMDR: Fusing DMD with Reinforcement Learning
this is big, DMD was fucking great for SDXL
>>107334566AIEEEEEEE LETS GOOOOOO
Retard here.What exactly does turbo mean? Can the turbo version be finetuned or do we need to wait for the base one for that?
You guys get your hopes up way too high, every, single, time. Just wait till it releases.>>107334545>he pulled?
>>107334562>>107334586why would it happen, we can see non cherry picked images on the site (made by regular users) and those images all look goodhttps://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo/picture
I love Beijing's Tiananmen, the sun rises over Tiananmen; our great leader Chairman Mao guides us forward.
>>107334608>What exactly does turbo mean?they explained everything herehttps://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo/summary?version=master
>>107334608
>Can the turbo version be finetuned or do we need to wait for the base one for that?
it's really hard to undistill a model, look how much we struggled with flux dev and flux schnell, we'll have to wait for the base model for that
>>107331027
Flux2 scheduler glues sigmas to the ceiling at extreme resolutions and that seems to be part of the problem. Genning with regular beta 0.9/0.8 helps, though dithering isn't gone for good.
(I'm not repeating that experiment, though)
alternative collage for last thread
>>107334620>>107334626>>107334629Thanks! Guess we're waiting a bit longer until shit gets real.
>>107334595
those 2 methods are the ones being used to get good quality at 8 steps, right? they're not linked to the base model?
have you started learning chinese to prompt yet?
>>107334579a particularly malicious individual might opt for an alternative, semi-rhyming substitute for "lab", but i'm glad we have class here in /ldg/
>>107334566https://huggingface.co/collections/Tongyi-MAI/z-image
>>107334614>You guysThese are Chinese bots, mostly. They always swarm on release. Some of their models turn out legitimately good, though.
>>107334630incredible research anon
>>107334642bfc
BFL on suicide watch
>>107334660>BFL on suicide watchgood, they fucking deserved it
--ALERT ALERT--A REDDITOR HAS SUCCESSFULLY PROMPTED AMERICANA THOThttps://www.reddit.com/r/StableDiffusion/comments/1p7b016/just_a_few_zimageturbo_shots/JESUS that's detailed for a turbo model and low resolution.
>one of the z-image devs liked the Neta repositoryLUMINABROS WE ARE AVENGED
>Flux 2 is released>all people are talking about on r/Stablediffusion is the upcoming Z-Image modelAIIIIEEE SLOPMAN SAVE MEEEE
>>107334670look at the hair, it's like a realplayer screengrab
>>107334677really? give me his twitter account plz
>>107334660When was you when Flux 2 was kill ?
>>107334556cute boy but what's with the bra?
>>107334620>fits comfortably on a 16GB GPUThat's they key to success, it's in the roughly same spot as XL when it came out when 8GB only were starting to become the norm. 12GB should be able to run and train loras with it without too much problems too, meaning that alot of people will be able to use the model right away without unholy jewish tricks, unlike the fuckhueg novelty models that kept dropping and failing.
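For a rough sense of why a 6B model is plausible on a 16GB card, here is a minimal back-of-the-envelope sketch. It counts weights only: the text encoder, VAE, activations and any framework overhead are not included, so real usage will be higher. The 6B and 32B parameter counts are the ones quoted in this thread; the function name and the dtype table are made up for illustration.

```python
# Weights-only memory estimate. Assumption: memory ~= parameter count * bytes per parameter.
# Everything else (text encoder, VAE, activations, CUDA context) is deliberately ignored.

BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0}  # illustrative dtype table, not exhaustive


def weight_memory_gib(params_billions: float, bytes_per_param: float) -> float:
    """Return the size of the raw weights in GiB."""
    return params_billions * 1e9 * bytes_per_param / 2**30


if __name__ == "__main__":
    for label, params in [("6B", 6.0), ("32B", 32.0)]:
        for dtype, nbytes in BYTES_PER_PARAM.items():
            size = weight_memory_gib(params, nbytes)
            print(f"{label} @ {dtype}: {size:.1f} GiB")
```

Under these assumptions the 6B weights come to about 11 GiB in bf16, which leaves some headroom on a 16GB card, while 32B in bf16 is far beyond any single consumer GPU without quantization or offloading.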
god i hope they can get it out by lunchtime
>>107334550turn the flux guidance down
>>107334660
BFL announced their upcoming video model, Wan is released, BFL quietly cancels their video model
BFL announces Flux 2, Z-image is released...
China is so far ahead at this point, the only thing the west is better at is crippling censorship
>>107334550Eh
>>107334699actually three of the huggingspace team members also did the Lumina paper
>>107334745>>107334749Not even a pajeet would believe these are real
>>107334745to what?
>>107334589the eurocuck model was cencored (like all of yuropooria) so who cares lol
>>107334748Don't forget that one>BFL releases Flux kontext dev, gets BTFO by Qwen Image EditAlibaba 3 - BFL 0
>>107334766If I resave it as jpg 40 times, will you be convinced?
>>107334771isnt bfl funded by musk?
>>107334769>But they are already available.where?
>>107334502>>107334635Good collages with many images means healthy bread :-)
>>107334752>Why do you never give up>Because the lumina devs never did Bigma is finally real lmao
>>107334768
1.9 gets interesting but you start to get artifacts, start at 2.5 and go where your heart desires.
>>107334781I am retard, sorry.
>Husbando, you buy too much ram to run Flux 2 dev, now we are homeress
I tried Rachel Green from Friends, desu it's way closer than I expected
>>107334845>don't worry honey, i was one of the original rammaxers back in llama 1 times
>>107334773
It's not that it's too clean, heck you can see lots of macro blocks on the bangs as is, it's how the lighting affects the skin. You can lower the image quality as much as you want, it will always look uncanny, and that's a clear result of them training primarily on synthetic data
Temper your expectations.
>>107334871>Tempers your nuts on a forge
>>107334869
>it's how the lighting affects the skin
this, those slopped models can't help but make the skin too smooth and shiny, only Chroma (and now Z-Image) managed to get past that uncanny valley
>>107334779IIRC X used their image service initially, but that was like a year ago, now everything runs on the Grok model
>>107334864
>sort by color
>tide
surprisingly coherent text. especially the "sort by color" sign, since it's out of focus
>>107334752Based
>>107334869>training primarily on synthetic datazero facts, all schizo posting
>>107334550
>>107334845Hmm. So the secret trick to making it look realistic is simply to slap heavy jpg compression on it?It looks fine in the mini preview, but you shouldn't open it lol kek
>>107334901then explain why that 32b model is so slopped and that 6b isn't
>>107334888>model not even released yet>absolutely deepthroating the devs
>>107334773>If I resave it as jpg 40 times, will you be convinced?>>107334913>So the secret trick to making it look realistic is simply to slap heavy jpg compression on it?no need to say the same argument twice we saw it debo
>>107334915define slopped
>>107334919>t. seething bfl employeeget back to the lab and make a better model than this bloated shit nigger
>>107334871Impossible, did you see ldg when wan2.5 released? Oh it was full blown kicking, screaming, pants shitting meltdown when they announced api only, kek.
Is that badit still lets me slop
>state of the art western model is released>chink shills and poorfags flood the thread to do damage control for their master Xikek
>>107334946>Oh it was full blown kicking, screaming, pants shitting meltdown when they announced api
>>107334925It's nice that others notice it too. You can't recognize anything in the pictures, not even the teeth are visible because of all the artifacts.You chink vram poor botBut I'm happy for all third-worlder that they also have something to play with.
>>107334957>state of the art western model is released
>>107334919I hope it's a good model too but this much hype before we even get our hands on it is insane. What a fucking madhouse.
>>107334957link?
>reddit reposting general
>>107334946>t. the guy who was shitting his pants when 2.5 announced API only >>107334562
A natural-light museum exhibit photographed with a neutral documentary tone. The image shows a detailed diorama behind glass, lit primarily by soft overhead museum lighting and ambient daylight leaking from nearby windows. In the center of the display is a lifelike model of a fictional prehistoric animal labeled “VRAMLET.” The creature resembles a medium-sized mammal with a long, droopy nose similar to a tapir’s trunk but slimmer, sagging downward in a comical way. It wears oversized, thick-rimmed nerd-style glasses that sit awkwardly on its snout.In front of the VRAMLET is a retro beige CRT computer on an old desk surface. The monitor is bulky, with a curved screen and visible ventilation slits. Gray smoke billows from the computer’s vents and keyboard, curling upward toward the diorama’s lighting. The VRAMLET model is posed as if actively using the malfunctioning machine—forelimbs resting on the keyboard, head tilted forward as if concentrating.Museum placards and descriptive text sit off to one side, slightly out of focus. The background includes typical exhibit elements: faux foliage, painted prehistoric landscape mural, and textured ground materials like resin dirt and artificial rocks. The glass barrier in front reflects faint silhouettes of visitors, preserving the natural candid feel. The overall style is realistic, mildly humorous due to the glasses and smoking CRT, and consistent with standard museum photography.
>>107334978I mean, the images do look good, so if this what we'll get from ComfyUi then we are definitely saved
Didn't realize it would work to slot in color match, figured it'd break the merges. Very very nice.
>>107334967>Nooooo, small models can't be good! You have to stack moar layers to get better results! Why did I pay for an RTX 5090 then???No refund.
>>107335005Agree to the terms and conditions goy, and we MAY let you use it to generate pictures of kittens
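Prompt:
在画面中央,一位年轻女子,有着长长的、飘逸的棕色头发和柔软的刘海,温柔地凝视着观众。她的表情平静而沉思,精致的五官和眼睛似乎蕴含着安静的惊奇。她穿着一件轻便的无袖服装,与飘逸的环境巧妙地融为一体,让她的皮肤和头发捕捉到周围的光芒。她的姿势很放松,几乎就好像她是场景的一部分,而不是观察它。这里的环境似乎是一个梦幻般的冬季花园,花朵被闪闪发光的冰晶包裹着,营造出超凡脱俗的景观。背景柔和模糊,充满冷蓝色调和散景亮点,让人联想到冰冻的花朵或发光的霜。前景还以这些冰冷的花朵形式为特色,有些部分处于失焦状态,增加了深度并强化了神奇的氛围。整个环境感觉时间暂停了,仿佛大自然在冬季的魔力下暂停了。灯光在塑造气氛方面起着至关重要的作用,在女人的脸上和肩膀上投射出温暖的金色光芒,而周围的世界仍然沐浴在凉爽的蓝色和白色中。

Translated to English:
In the center of the frame, a young woman with long, flowing brown hair and soft bangs gazes gently at the viewer. Her expression is calm and contemplative, and her delicate features and eyes seem to hold a quiet wonder. She wears a light sleeveless garment that blends subtly into the ethereal surroundings, letting her skin and hair catch the surrounding light. Her pose is relaxed, almost as if she were part of the scene rather than observing it. The setting appears to be a dreamlike winter garden where flowers are encased in glittering ice crystals, creating an otherworldly landscape. The background is softly blurred and filled with cool blue tones and bokeh highlights, evoking frozen blossoms or glowing frost. The foreground also features these icy flower forms, some of them out of focus, adding depth and reinforcing the magical atmosphere. The whole environment feels as if time has paused, as though nature were suspended under winter's spell. Lighting plays a crucial role in shaping the mood, casting a warm golden glow on the woman's face and shoulders while the world around her remains bathed in cool blues and whites.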
>>107334992If they still look so badly compressed locally, it's DOA
Z-Image looks interesting. I somewhat doubt it's more lightweight architecturally than Lumina 2.0 in practice though, meaning it's even less likely to be an "immediate SDXL killer" I'd say. Flux 2 is decent also, a bit huge though. Looks way better than Flux 1 for photographic gens out of the box in any case.
>gen image assets and day dream about that perfect weg you gonna make with themI don't even feel the need to play video games at this point that much, this is just too good.
white synthographers crave big chang cock
>>107334986It's just him, he does it all the time.
>>107335031
Flux 2
>Oversaturated colors
>plastic skin
>low details
Z-image
>actual kino skin
>subtle colors with a smooth color gradient
>really nice details especially the hair
32b vs 6b btw
>>107335064>big chang cockbut alas, this doesn't exist.
>>107335079>himpoopdickschizo?
>>107335031how the FUCK do labs see something like the left most image and say "wow it looks so realistic... SOTA! :D" idgi
res_3m/linear_quadratic 28 steps, i guess you don't have to use the flux scheduler at all
A documentary-style photograph taken inside an older subway station with dirty off-white tiled walls and overhead fluorescent lighting. The main subject is a large rectangular advertisement poster mounted along the platform wall. The original ad has been replaced with a clean, minimalist promo for Flux 2, the new AI image generator from Black Forest Labs. The design is corporate and sleek: large bold sans-serif title “Flux 2” centered on a bright white background, with a smaller tagline beneath such as “next-generation image synthesis” and a Black Forest Labs logo along the bottom edge.The ad has been heavily vandalized with multiple layers of graffiti in different handwriting styles, all in the sloppy marker-pen aesthetic commonly found in real subway ads. Examples of the graffiti include:“fluxbros… it’s over” scrawled unevenly across the top in fading blue marker.“sovless” written diagonally near the center in aggressive, jagged lettering.A block of political graffiti written in red marker, formatted like a chant:“HAIL ALIBABAHAIL CHINAHAIL XI JINPING THOUGHT”(clearly presented as vandalism, not part of the ad).More random doodles, arrows, crossed-out text, and crude speech bubbles occupy the rest of the space, giving the scene a chaotic, defaced look. The lower edge of the poster shows wear, peeling corners, and grime. The composition is realistic, slightly wide-angle, capturing the gritty authenticity of subway-ad vandalism.
cant imagine not having at least 96gb ram btw
>>107335082Z-Image isn't kino at all there IMO, it looks like it has the same JPEGmaxxed problem as Cosmos 2 and HiDream kinda
>>107335138
Because it's been converted to jpeg on generation, as the original image on the promo site is only available as jpeg, and then it was encoded again into jpeg for that comparison. When generating locally you will be saving directly to png.
> JESUS that's detailedlol chinks
>>107335138cope
>>107335157anon you are making apologist explanations for an unreleased model, have some self respect
>>107335031
I would take faux jpeg compression over hyperslop 2.5d soulless bullshit any day. You are retarded.
>>107335157i have seen this cope before
Prompt: A gritty, documentary-style photograph inside an aging subway station with stained off-white tiles, metal grime, and cold fluorescent lighting. The main subject is a large rectangular advertisement poster mounted on the wall. The ad itself promotes Flux 2, the new AI image generator from Black Forest Labs. The design is sterile and corporate: stark white background, bold black sans-serif “Flux 2” headline, a small tagline like “next-generation image synthesis,” and a Black Forest Labs logo near the bottom. The poster has been aggressively vandalized with layers of graffiti made using different markers and handwriting styles, giving it a chaotic, real-world subway defacement look. The graffiti includes: “fluxbros… it’s over” scribbled in uneven blue marker across the top. “sovless” written diagonally in jagged red lettering. “cringe model desu senpai” scrawled across the lower center in rushed bubble-letter style. “>mfw they release a better model than mine’s the day I announced it” written in green marker, using an imageboard-style quote arrow. A crude graffiti drawing of a frog, low-effort and lopsided, in black marker beside the text. Additional random scratches, lines, scribbles, and half-erased tags clutter the edges. The poster’s corners peel slightly, and grime accumulates along its edges, reinforcing the worn, urban, photo-journalistic realism.
Sampling Steps: 9
Sampler Method: euler
CFG Scale: 7.5
Random Seed: 1451342554
Size: 864x1152
Elapsed Time: 2.5 s
>>107335119is this chroma?
>>107335179This
>>107335172anon, you are making libel for an unreleased model, have some self respect
>>107335179>I would take faux jpeg compression over hyperslop 2.5d soulless bullshit any day. You are retarded.100% this, artifacts >>>> plastic
>>107335166> 6b bro> turbo bro
>>107335242i'm actually pretty pumped for the new model too, but it's not a panacea before it's even been released
>>107335191>>107335172Heres the zoomed in corner of your own png image on the left compared to the encoded with the default ffmpeg jpeg preset on the right, retard
Does the Z-Image use any existing vae/TE or is everything new again?
>>107335270fair enough
>>107335284I guess you have to look at Comfy's code to see the answer https://github.com/comfyanonymous/ComfyUI/commit/e9aae31fa241a6a63a368800146ea91629d4e8c2(can't help you further I'm a codelet)
>>107335179those who prefer the second one should not be taken seriously and they likely reside in their states custody in some kind of prison or mental facility
>prompt: womanhagbros...
>>107335297based
>>107335031the blur is so beautiful to look at on Z-image, it gets stronger and stronger in a really smooth transition,
>>107335272Doesn't look Indian though.
>>107335302please do not mock my adult woman fetish
>>107335284Text encoder is new. Just a few months old and afaik nothing else used it.No clue about which vae it uses though.
>>107335280Here's zoomed in more.There is a reason why jpeg is so much smaller in size, retards.
>>107335272>>107335319yeah he looks like a spic
>having a melty posting pixels from jpegs to show how compression worksishygddt
>>107335356>snakebite At least give XL a fighting chance kek
>>107335356>"The man is giving a thumbs up">Only Flux 2 pro doesn't respect that, it has a 24b text encoder btwAIEEEEEEEEEE
qwen edit v3 where2509 is fine but I wonder what is changed.
>>107335284
>>107335295
The code seems to indicate it's using the Qwen3 4B text encoder, which makes sense. It's smaller than T5 thankfully
This is also fun because there are various sizes of TE in this architecture so maybe we could try projecting the smaller 0.8B output into it, for instance
>>107335351
>>107335356sdxl: looks slightly like Ai-slopZ-image: amateur photograph pictureFlux.2 pro: professional photograph picture with slightly too much post editing or filtering
>>107335382*i mean it's smaller than t5-xxl, also i meant 0.6B
we're still scratching the surface i guess.i wired up the Q8 GGUF to the clownsharksampler and ran this with res_3m/kl_optimal and the detail boost node. 28 steps. it's still dog slow but it's fun to fuck around with. see y'all when z drops.>a capture from an old tv show. a scene depicting a gothic woman as she stands on the set of a 90s sitcom. she wears a black high-waisted thong with skulls on it and a long sleeve very short crop top. she has heavy goth makeup her shirt says "Death" on it in gothic letters. she has wide full hips and a slender stomach. her hair is black with dark purple highlights. she looks as if she is about to say something, looking left off screen, mouth open. behind her are all the makings of a 90s living room, a beige sofa, a brass lamp on a lamp table, a bookcase full of spooky toys etc. on the wall is a pentagram themed hanging. beside stands a stuffed black goat, wearing a gold pentagram chain. the goat is almost as tall as she is, vhs artifacts are visible on the image
>>107335351Did BFL take notes from Lodestone and train on troons?
>>107334540a moment of silence for the finetuners' wasted computeand for all the unreleased illustrious models kek
>>107335225Yes
>>107335351>that jawline
>>107334540The examples from even Neta (not Yume) look better than what we've seen from Z but obviously we need to wait until release to do direct 1:1 comparisons.
>>107335356>1080x778I see how you're downsampling all the comparison images to hide how badly artifacted z is!
all models that get deprecated by another model indirectly contributed a large amount of pressure on the company to actually publish a model that is better than that previous model
>>107335387>staircase in zimage>not slopped outlol
>>107335447bro just learned what is competition
>>107335445
you understand that if that were the case then z-image would be even more impressive, since it's easier to create details in a large pixel space and then compress them down than to rawdog a locked resolution, right?
this is literally how supersampling works in games: as an antialiasing method it is the best but also the most costly.
>>107335356I mean, for sdxl, that looks pretty decent. Might have to break out the old sdxl workflows.
>>107335453im reminding the retards here that dont seem to understand it as they shit on specific models as completely worthless all the time while wondering why companies ARE realising specific models and then also wondering why they are not at other times
So while waiting for the actually relevant model, I tested the Flux 2 vae a bit.
Decent improvement but nothing revolutionary or mind blowing.
It's fucking 128 channel but also compresses images 4 times more than the previous one. 16x16=256 times compressed instead of 8x8=64.
After Wan 2.2 5B I was skeptical of whether stacking copious amounts of latent channels can properly compensate for decreased resolution, I guess I am proven wrong.
It's the current SOTA for VAE quality.
>>107335447It doesn't always work like that, though.
Z-Image can almost perfectly reproduce a lot of cars. Pretty damn impressive. It knows the difference between old & new model years, even
lul
>>107335464that works for antialiasing in games, but it's not how sampling of photographs or illustrations works.
>>107335503
>>107335520
>>107335517if you gen an image at a higher res than those z-image generations, which are 1MP, and then you downscale everything to 1MP to match z-image, you are doing the exact thing that i described and those initially bigger images will ultimately have more detail
>>107335496
>16x16=256 times compressed instead of 8x8=64
They just moved the 2x2 patching into the VAE instead of at the input to the model. So the diffusion model input is 16x16 total compression, same as most other models (which do 8x8 in VAE then 2x2 patches separately at model input).
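The arithmetic in that correction can be checked with a small sketch. It compares two layouts that both end at 16x16 total spatial compression: an 8x VAE followed by 2x2 patching at the model input, and a 16x VAE with no further patching. The 128-channel figure is from the post above; the 16 latent channels used for the older layout and both function names are assumptions for illustration only.

```python
# Compare token count and per-token width for two latent layouts.
# Assumption: the older layout uses 16 latent channels; 128 channels is the figure quoted above.


def latent_tokens(width: int, height: int, vae_factor: int, patch: int) -> int:
    """Number of tokens the diffusion model sees for a width x height image."""
    total = vae_factor * patch  # total spatial compression per axis
    if width % total or height % total:
        raise ValueError("image size must be divisible by vae_factor * patch")
    return (width // total) * (height // total)


def values_per_token(latent_channels: int, patch: int) -> int:
    """Values packed into one token after patching the latent."""
    return latent_channels * patch * patch


if __name__ == "__main__":
    layouts = {
        "8x VAE + 2x2 patch": (8, 2, 16),   # (vae_factor, patch, latent_channels)
        "16x VAE, no patch": (16, 1, 128),
    }
    for name, (factor, patch, channels) in layouts.items():
        tokens = latent_tokens(1024, 1024, factor, patch)
        width = values_per_token(channels, patch)
        print(f"{name}: {tokens} tokens, {width} values per token")
```

Both layouts give 4096 tokens for a 1024x1024 image, so the sequence length the model sees is unchanged. What differs under these assumptions is how many values each token carries (64 vs 128).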
>>107335510The realism is on point, even in the background the details are good
>>107335534
1998 -> 2025
modelscope won't let me login (not receiving the verification code), does anyone want to run this prompt?>A drawing of hatsune miku with dreadlocks and light black skin skateboarding in New York at night. She is holding a smartphone on her left hand and a multicolored ball on her right hand, she has a red t-shirt with text on it that says: "MIGU". A pikachu can be seen on the top of her head. Her speech bubble says "Hard to keep me in style huh?", neons, 50's comic book style
>>107335538>will ultimately have more detail*compared to what those non-Z models would have if they were to gen something directly at 1MP, i mean
i'm retarded and don't know how to install flux
>>107335560https://justgetflux.com/
>>107335560just look at a youtube tutorial
>>107335534>>107335549Neato
https://catbox.to/3KkRlTrKEsIKJGf/preview
>>107335319>>107335329My bad. It missed the nationality, but got the skin color>Right Panel: A candid, slightly blurry outdoor shot of a man with dark hair and a beard. He is wearing a blue long-sleeved shirt and is squatting on the ground in a wooded area with dry leaves and dirt. His pants are pulled down around his ankles, and he is looking back over his shoulder at the camera with a startled or distressed expression.
>>107335031Flux 2 can never get realistic lighting and depth right. It always looks flat.
Hmm, neither is perfect, but considering the size difference Z-Image is pretty impressive.
>>107335539
Oh?
I just lazily read debug tensor shapes and drew conclusions about model structure from that, thanks for the correction.
Then it sounds like 128 channels is indeed overkill with limited returns.
uh oh Z-Image is pretty ghibli slopped
>>107335538
If you have a natively 12MP photo and a 2MP photo, and you downscale the 12MP photo to 2MP, you won't notice a difference in quality as long as you didn't do something stupid like nearest-neighbor resampling. You won't get a supersampling benefit like with rasterized CG graphics. You won't be able to tell that the formerly 12MP image was originally much more detailed.
>>107335594even flux 2 pro looks less realistic than that 6b model, damn that's brutal, those chinks caught lightning in a bottle
>>107335566>indian accent
>>107335600may I see it?
It seems Z-Image is by people who were poached from Lumina 2.0
I got banned for posting hitler on a pony, are you fucking kidding me?
>>107335631that's why you go local my nigga
https://xcancel.com/bdsqlsz/status/1993733868181246163#m>wait it release tomorrow.aww... :(
>>107335631Previous threads would get insta nuked if one out of the two dozen gens in the collage featured a pony. They look for any reason to report because they hate the fact that this thread exists.
>>107335645For those who can't test it out on the chink site you have fal nowhttps://fal.ai/models/fal-ai/z-image/turbo
do normalfags still believe that all ai has a piss filter and looks like ghibli or have they learned yet
>>107335631Maybe post that edgy fetish stuff on the appropriate boards. That you are already ban evading to whine also tells it own story.
>>107335631some anon must've reported you thats the only way to get banned here
>>107335686>some anon must've reported youit's this anon -> >>107335685
>>107335602
the difference is all ai models currently have a problem when they are generating at a set resolution: they are generating everything to look mostly right unless you get too close, for example skin details and small fingers on a large full body image. they can look good when you're looking at everything, but when you look closer with your eyes to the screen you will notice small things that don't make sense physically, things that would have been fixed if the model was generating in large resolution right away and then downscaled its output

in other words, it's different because models when generating are basically searching for a solution in a limited pixel space. the bigger the pixel space, the more refined the solution they can find for everything (as long as they are trained to support those resolutions)

this isn't the case for real photos because real photos are not diffusing a solution from nothing to something they vaguely "remember", and thus creating a lot of small mistakes. instead real cameras have direct access to and are just compressing what is already "perfect", the real world, where all the physics have to be right from the get go
>>107335685imagine being this much of a faggot
>>107335663Thanks.>A drawing of hatsune miku with dreadlocks and light black skin skateboarding in New York at night. She is holding a smartphone on her left hand and a multicolored ball on her right hand, she has a red t-shirt with text on it that says: "MIGU". A pikachu can be seen on the top of her head. Her speech bubble says "Hard to keep me in style huh?", neons, 50's comic book style
>>107335697Why would that be the case?
>>107335714now translate the prompt to chinese
>>107335680This isn't a gen.
>>107335714bruh where are the dreadlocks? :(
>>107335722it is
>>107335712>hurr durr i got banned and directly ban evadedBans have become the new slap on the wrist, actually it happened years ago, but here you whine around like you could finally post again after a 30 day ban. Fucking crybabies.
>>107335740>Fucking crybabies.says the faggot crying when he sees an edgy image btw
>>107335748There is a time and a place, newfag.
>>107335663>https://fal.ai/models/fal-ai/z-image/turbo>A young white woman skateboarding in Tokyo, she is holding a Kasane Teto plushnoo it doesn't know teto!! (this shit is so fast though I got that image in less than a second)
so tomorrow is when 1asiangirlGODS win...
Pleasantly surprised with basic flux 2 1girl results considering all the hate
>>107335759its basically better qwen overall with no low seed variance issue, its just that it wont see much community support nor loras given the size
>>107335738Prove it.
>>107335722checked
>>107335756
>>107335714
>>107335785This also isn't a gen.
>>107335802don't make me pull out the reddithttps://www.reddit.com/r/ChatGPT/comments/1p6lcj6/comment/nqx1l03/?context=1
>>107335817Why are you posting cloud in the local thread? Are you dumb?
>>107335631kek i also got banned once and muted the next time for posting an hatsune miku pony in lmg thats just the state of things when eunuchs are in court
>>107335829No, just lonely :(
>>107335794
>>107335829I'm just here to T you off (again) ;^)
>>107335756an manga of Hatsune Miku with a speech bubble saying "no troons allowed!"
I've been flapping to a lot of AI porn lately, I may as well make my own. Is the guide in the OP still good? And am I doomed with amd gpu (6700xt)?
>>107335847wanna be friends?
>>107335901>6700xtI think that's sufficient for generating 1girl with SDXL.
https://xcancel.com/LumaLabsAI/status/1993735476264481126#m>Introducing Terminal Velocity Matching: a scalable, single-stage generative training method that delivers diffusion-level quality with a 25× fewer inference steps, now trained at 10B+ scale. damn
ALIBABY WONBLACKED FOREST LOST
>>107335901>12gb vramnot great not terrible
>>107335945I will care when I see a decent model using it.An entire graveyard full of breakthrough methods that promise 10x quality at 100x speed.
>>107335901
SDXL is as far as you can push at comfy speeds with that card without heavy quantization.
Could have run nunchaku flux if it were nvidia.
Like you can run larger and newer models of course but the speed will suck.
>>107335901
You could get away with some small wan gens. As for the guide, well.... it's a mess and slightly dated but it's still usable. Definitely go for gguf/native instead of kijai, so that's Bullerwins quants for your models: https://huggingface.co/bullerwins/Wan2.2-I2V-A14B-GGUF/tree/main and https://github.com/pollockjj/ComfyUI-MultiGPU for your wan/model nodes. I would definitely save up for a stronger gpu for the long term
Kek so Z image is basically Flux 2 Pro but on 6B? We really are getting BTFO'd by Chinaman.
>>107335901
static: get illustrious (wainsfw v15 is the latest I think)
moving: wan 2.2 + lightx2v + nsfw loras
also, qwen edit can make lewds of people or change their clothes.
>>107335714>>107335796
>>107336038
And they will release Z-Image Base which will be larger and better quality than Z-Image Turbo, but obviously not as large as Flux 2
BFL is so done
>>107336058>wainsfw v15why would you set anon up for slop like that
>>107334957Chroma is already z image tier at photorealism, nice to know China has caught up in one of their base models though. But they love to pretend Western models don't exist
>>107336078for 2d anime it's waiNSFW or hassaku, knows almost every character
>>107336076
>larger
it's still 6b, the only difference between base and turbo is that turbo has a step distillation method, but they're the same size
>>107336094
And yet they make the most soulless plastic sloppa imaginable.
Use Noob V-pred 1.0.
>>107336038>>107336076ok chang
>>107334550Illustrious still on top!
But anon... how can you prefer FAKE ai bug women instead of strong white women you can meet in real life?
>1boy,JoJo's Bizarre Adventure
it's this simple lmao
bbw bros we are so back
>>107336150wut
>>107336094>2dYou mean gross 2.5d and not even the semi respectable "realism" kind
>>107336168>big bug woman
>>107335594Can it do soles?
>>107336187
try it, you can make 5 free images before it starts asking you for money
https://fal.ai/models/fal-ai/z-image/turbo
>>107335714>>107335723it's never been this over
>>107336228we'll see if the prompt adherence is better on the base model though
>>107335723
They have a separate reasoning model, don't think that anon is using it.
>>107335031>32b looks worse than 6bso this is the power of safety...
>>107336266Just distillation. Always has been an issue.
>>107336276
>Just distillation.
Z-image turbo is distilled too lol
>>107336085
True, but it is slow. If you can get the same quality but MUCH faster then it's a pure win.
That said, Chroma is THE best model for realistic NSFW. Time will tell if Z-Image can be enhanced with lora/finetune to be as good, it certainly won't be out-of-the-box.
>>107336085
>Chroma is already z image tier at photorealism
no it's not, chroma has bad details compared to z-image
>>107336102Really ? I thought the Z-Image base would be larger, not just distilled.Chroma trains relatively fast (much faster than Flux and Qwen), but if Z-Image Base (which is the one you will train on) is just 6b, that should theoretically bring training time down by half compared to Chroma.Chinks be based!
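Quick sanity check on the "half the training time" guess, as a rough sketch. It assumes cost per training step scales linearly with parameter count (same resolution, batch size and step count), and it assumes Chroma is about 8.9B parameters; that figure is not stated in this thread, so swap in the real number if it differs. The 6B figure for Z-Image is the one quoted above.

```python
def relative_training_cost(params_new: float, params_ref: float) -> float:
    """Cost of training the new model relative to the reference model.

    Assumes cost per step is proportional to parameter count, which
    ignores attention cost, text encoder, VAE and data loading.
    """
    return params_new / params_ref


Z_IMAGE_PARAMS = 6e9    # "6b", as quoted in the thread
CHROMA_PARAMS = 8.9e9   # ASSUMPTION: not stated in the thread

ratio = relative_training_cost(Z_IMAGE_PARAMS, CHROMA_PARAMS)
print(f"relative cost: {ratio:.2f} (about {100 * (1 - ratio):.0f}% less)")
```

Under those assumptions the saving is closer to a third than a half.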
>>107336320where are these mythical chroma nsfw gens?
>>107336150nose ring theory is pretty confirmed at this point...
>>107336085
Chroma is a deformation fest. I haven't tried Z image, but if it can hold better coherence than Chroma while being faster, it has more potential.
Main factor is if training NSFW into Z image is more feasible than fixing Chroma.
>>107336326
It may also have worse prompt comprehension. We'll see.
Of course Chroma will also have better NSFW training. There's no way they put an equal amount of lewd into a Chinese model.
>>107335167With what? In comparison to what? Wtf did this mean lmao?
>>107336373it's funny because it makes them look like cattle.
>>107336362Have you been living under a rock
>>107336228>>107336243>>107336253We're so back
I fucked around with it a bit over FAL. First impressions:
Not super great but knows nipples.
Doesn't know genitals too well.
Can't conclude this with a high degree of confidence, but it doesn't seem too poisoned? Will draw people in sex-like positions when asked, just with a deformed genital-like blob between them. I am going to guess that this can probably be finetuned like how we beat genitals into SDXL, unless the model is somehow completely unresponsive to training or requires flux levels of surgery.
Doesn't know too many celebs.
Of the celebs it knows, it doesn't mind adding boobs to them.
That's all for now, gonna try more artistic stuff, text, copyrighted characters.
Oh and sadly it doesn't seem to respond too well to short prompts, it seems some word salad is needed. (Though I need more tests to conclude this confidently.)
>>107336454show some images nigga, I don't care about your wall of text I want to see the images with my own eyes
>>107336468All the images included ponies so I can't post them.
>>107336426He's just trying to be a contrarian-fag
such quick threads and not due to trolling or bot posting nice
>>107336500half of these posts have to be trolls, i refuse to accept anon is this retarded
the local z-image model still isn't released yet, right? all the posts so far are from SaaS?
>>107336518yea it releases locally tomorrow
>>107336468Suck my balls dipshit but you can get one.ibb DOT co SLASH V0wPTmc5
>>107336525>dipshitdebo's favorite word
>>107336525
Hopefully the model isn't garbage to finetune, it certainly has potential
>>107336153bruh even most dedicated anime models can't do jojo's style out of the box wtf. a booru finetune of z image would be insane>>107336276>>107336285z-image makes small optimizations to model architecture and is already BTFOing the bloatmaxxed, benchmaxxed models. this shows that if model bakers would actually use some more of the fucking optimization papers that get published, we'll be able to improve the size/quality ratio even more.it's almost over for bloatmaxxers, apicucks, and jews
>>107336518oof, not gonna lie but it hurts my eyes to see the mess of chroma radiance after seeing so many Z-image kino today
>>107336153damn impressive did i waste time learning prompt-fu? wtf
>>107336538>it's almost over for bloatmaxxersbased
>>107336518>all the posts so far are from SaaS?Characterizing them as SaaS is incorrect. It's normal for anon to showcase gens from new local models via demos hosted on the cloud. It's actually retarded for a lab to NOT host a demo to showcase their new shit.
>it'll come out within the hour!>well, sometime today. get ready!>actually, it'll be released tomorrow but for real this time!
>>107336575Do you think Comfy would waste his energy implementing the inference code if he knew the model wouldn't be released? lol
>>107336575
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
AHAHAHAH, IT'S HERE LMAOOOOOOOOO
>>107336575
>>107336518
IT'S UP
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
>>107336595>U. G. Krishnamurtihello sar
>>107336592
>>107336595
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo/tree/main/transformer
how do you merge multiple safetensors files though?
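On the shard question: a minimal stdlib-only sketch of merging shards into one file, assuming the shards are plain safetensors files (8-byte little-endian header length, a JSON header mapping each tensor name to dtype, shape and data_offsets, then the raw tensor bytes). The function names here are made up for illustration and are not from any library. It reads each shard fully into RAM, so it is only meant to show the idea; loaders that understand sharded checkpoints don't need this step at all.

```python
import json
import struct
from pathlib import Path


def read_safetensors(path):
    """Return (header dict without __metadata__, raw tensor byte buffer)."""
    data = Path(path).read_bytes()
    (header_len,) = struct.unpack("<Q", data[:8])
    header = json.loads(data[8:8 + header_len])
    header.pop("__metadata__", None)
    return header, data[8 + header_len:]


def write_safetensors(path, entries):
    """Write (name, dtype, shape, raw_bytes) entries as one safetensors file."""
    header, offset = {}, 0
    for name, dtype, shape, raw in entries:
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [offset, offset + len(raw)],
        }
        offset += len(raw)
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    # Pad the JSON header with spaces so the tensor data starts 8-byte aligned.
    header_bytes += b" " * (-len(header_bytes) % 8)
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        for _, _, _, raw in entries:
            f.write(raw)


def merge_shards(shard_paths, out_path):
    """Concatenate the tensors of several shards into a single file."""
    entries, seen = [], set()
    for path in shard_paths:
        header, payload = read_safetensors(path)
        for name, info in header.items():
            if name in seen:
                raise ValueError(f"duplicate tensor name: {name}")
            seen.add(name)
            begin, end = info["data_offsets"]
            entries.append((name, info["dtype"], info["shape"], payload[begin:end]))
    write_safetensors(out_path, entries)
```

Usage would be something like `merge_shards(sorted(Path("transformer").glob("*.safetensors")), "merged.safetensors")`, where both paths are placeholders. Dtypes are copied as-is, so an f32 checkpoint stays f32.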
>>107336610please do the needful saar
>>107336612WHY do the model bakers do this?? is it so fucking hard for them to just publish a single fucking file???
>>107336612
>how do you merge multiple safetensors files though?
and this shit is in f32... bruh :(
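For a rough idea of what f32 means for download size and VRAM, a back-of-the-envelope sketch. It assumes the 6B figure from the thread covers the transformer weights only, so text encoder, VAE and activations come on top. The dtype labels are just names picked for this example.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"f32": 4, "bf16": 2, "fp16": 2, "fp8": 1}


def weight_size_gb(n_params: float, dtype: str) -> float:
    """Approximate size of the raw weights in GB (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


N_PARAMS = 6e9  # "6b", as quoted in the thread

for dtype in ("f32", "bf16", "fp8"):
    print(f"{dtype}: ~{weight_size_gb(N_PARAMS, dtype):.0f} GB")
```

So the f32 upload is roughly twice the size of what most people would actually run in bf16, and a 12gb card only fits the weights once they are quantized further.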
Fresh when ready
>>107336625
>>107336625
>>107336625
Fresh when ready
>>107336520
ty. it does look interesting enough.
>>107336542
have there even been any comparable z-image images? to me it looked like the images posted were all realistic, not anime-comic 2d-3d 1girl
>>107336595 >>107336592
Great timing.
>>107336655
>to me it looked like the images posted were all realistic, not anime-comic 2d-3d 1girl
look at the previous thread, there are some anime images from z-image
>>107336655wtf is that tiling
Question, if I may:
Why are y'all posting all this lame shit here, while the technology is way better these days? Look at what this dude posted (mildly NSFW): >>107336734
>>107336681retard is using some shit lora or bad lightning lora x qwen combo
>>107336980This thread is for GPU-poor weabus.
time
to
go
What's the new anime meta?
>>107336525y