/g/ - Technology






I'm planning to get picrel, will be using it mostly for inference but I'll eventually get into training and development.
Requesting /g/'s opinion on this. I'd greatly appreciate it if anons who own this hardware chime in and tell me about their experiences. Thanks in advance.
inb4
>anon that's not for inference, it's slower than a 4090
I know the memory bandwidth is lower, I'm ok with that
>>
>>108537357
if you're going this far just spend more to future proof yourself
>>
>>108537376
please elaborate on how I can future proof myself
>>
>>108537381
spend more money
>>
>>108537384
yeah but spend more money in what? 2x 5090?
>>
>>108537389
if you are serious about trying to self host LLMs you should look into getting a decommed server, don't bother with this mac mini shit that's designed to sit in a datacenter for 2 years before becoming irrelevant
>>
>>108537461
that's really no different from getting a high end workstation board with a lot of memory, so I'm not interested in that. I'm requesting experiences with this specific device
>>
>>108537357
Last I heard this thing thermal throttles the shit out of itself to the point of being a waste of money.
>>
>>108537479
Interesting, do you know if the ASUS one also thermal throttles?
>>
>>108537357
my understanding of those is that they're for prototyping before rolling out onto enterprise level nvidia based systems, so they're maximized for that rather than brute force inference power. that said, nvidia seems to be prioritizing its own products over the hardware it sells to vendors, so the price is probably going to be more stable.

i understand the last batch of updates improved it for gaming. might be good.
>>
>>108537496
Thanks for the insight anon. I'm really more interested in the perf per watt of this device compared to the 3090 I currently have for inference. The fact that I could easily carry this mini PC with me and gen wherever I move is really appealing.
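A back-of-the-envelope version of that perf-per-watt comparison, for reference; the throughput and wattage figures below are placeholder assumptions, not benchmarks:

    # Napkin math for tokens/s per watt. All numbers are placeholder
    # assumptions; substitute your own measured results.
    systems = {
        "RTX 3090":  {"tok_per_s": 35.0, "watts": 320.0},
        "DGX Spark": {"tok_per_s": 20.0, "watts": 160.0},
    }

    for name, s in systems.items():
        print(f"{name}: {s['tok_per_s'] / s['watts']:.3f} tok/s per W")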
>>
Unless you're already a data center admin who is going to be managing dgx servers, or you need this to learn the OS for a cert/job, just spend that 4k on tokens.
The Spark will be deprecated in a year when Rubin comes out, and you can always have the latest model/hardware in the cloud instead.

Unless you're going to be running it for the next 2 years straight 24/7 (you won't) then it's not economical.
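Rough math behind that claim, assuming a hypothetical API price (the dollars-per-million-tokens figure is made up for illustration):

    # How many tokens $4,000 buys at an assumed API rate.
    # The price per million tokens is a placeholder, not a real quote.
    hardware_cost_usd = 4_000
    usd_per_million_tokens = 0.50
    tokens = hardware_cost_usd / usd_per_million_tokens * 1_000_000
    print(f"~{tokens / 1e9:.0f} billion tokens before breaking even")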
>>
>>108537516
>perf per watt
well i understand they're arm systems with a custom linux distro stuck on top? i'm not sure how much power that will save with a fucking blackwell welded into it, but there it is for what it's worth.

in case you missed it
https://www.tomshardware.com/pc-components/gpus/nvidia-dgx-spark-review
>>
>>108537523
>suggesting to buy AI tokens
opinion discarded
>>108537607
>they're arm systems with a custom linux distro stuck on top?
It's plain ubuntu with nvidia drivers built in and likely some other library / optimizations added
Thanks for the article. This idles at 30W (which seems to be related to the high speed NICs it has onboard) and draws 160W under GPU load. Still, that's half the power consumption of my 3090.
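For anyone who wants to reproduce those numbers, the GPU's reported draw can be polled through NVML (pip install nvidia-ml-py); a minimal sketch, with the caveat that on a unified-memory box NVML may not reflect whole-system wall power:

    # Poll the GPU's self-reported power draw once per second.
    import time
    import pynvml

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    for _ in range(10):
        mw = pynvml.nvmlDeviceGetPowerUsage(handle)  # milliwatts
        print(f"{mw / 1000:.1f} W")
        time.sleep(1)
    pynvml.nvmlShutdown()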
>>
>>108537649
>still that's half the power consumption of my 3090
well then you're winning. apparently it comes with all the software to network your generations with your x86 as well.
>>
>>108537649
It only has 128gb of ram, so any model you could fit on it is cheap as fuck via API, and the money would last forever if you're paying for the tokens.
>>
>>108537357
i bought 2x asus ascent gx10 boxes
haven't set them up yet bc i've been dealing with migraines, but i will probably post about them in a week or two once i finally get them all set up
>>
>>108537357
is there ever a reason to actually use this or the strix halo things for local AI over a macbook pro with 5x the unified ram?
>>
>>108537703
please create a thread with your observations anon, and let me know any keywords you plan to use so I can monitor for it.

>>108537721
porn sloppa gen
>>
>>108537721
Not unless you need it specifically for work
>>
>>108537721
cuda. the dgx spark is meant as a workstation for prototyping enterprise shit that uses the nvidia ecosystem. it comes with a lot of networking hardware and software to facilitate that use case.
>>
>>108537721
oops, I meant mac studio, forgot the macbooks only go to 128gb too apparently. but yeah, the studio seems like the direct competitor, and if you want to run the big models like deepseek/kimi/glm it'd be one of those vs 4+ of anything else, assuming you're going for these low power all in one devices
>>
>>108537743
that makes sense, I guess you have to go nvidia for a lot of use cases outside of standard inference
>>
>>108537726
i'll probably just make sure "asus ascent gx10" is in the thread OP
most likely it will be the weekend of the 10th, or if not then, the weekend of the 17th
>>
>>108537761
yeah pretty much. even if all this shit falls apart, nvidia would still be standing as a major software vendor by that alone. their software runs in most non-ai data centers as well and it all comes with long term support contracts. jensen isn't stupid.
>>
>>108537770
thanks anon, appreciate it
>>108537687
I don't care, I still want to run locally even if it's expensive
>>
>>108537748
>>108537761
basically if this means something to you then you're in the pimps know crowd
>onboard ConnectX 7 NIC running at up to 200 Gbps.
>>
>>108537865
not only that but it comes with 10Gbit copper networking as well
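If anyone wants to sanity-check such a link without installing iperf3 (the proper tool for this), a crude TCP throughput sketch; the port and transfer size are arbitrary placeholders:

    # Crude one-way TCP throughput test. Run "python net.py server" on
    # one box and "python net.py client <host>" on the other.
    import socket, sys, time

    PORT, CHUNK, TOTAL = 5201, 1 << 20, 1 << 30  # 1 MiB chunks, 1 GiB total

    if sys.argv[1] == "server":
        srv = socket.create_server(("", PORT))
        conn, _ = srv.accept()
        received, start = 0, time.time()
        while received < TOTAL:
            data = conn.recv(CHUNK)
            if not data:
                break
            received += len(data)
        elapsed = time.time() - start
        print(f"{received * 8 / elapsed / 1e9:.2f} Gbit/s")
    else:
        cli = socket.create_connection((sys.argv[2], PORT))
        buf = b"\x00" * CHUNK
        sent = 0
        while sent < TOTAL:
            cli.sendall(buf)
            sent += CHUNK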
>>
Not to go too far off topic, but I was looking into an entry level solution before going for something more expensive like the DGX Spark. I've been running some small LLMs locally on different 8 to 12GB VRAM GPUs (Nvidia and AMD) and want something a bit better. Basically it's either a cheaper ($1000-$1600) AMD or Mac box with unified memory for VRAM maxxing, or a used 3090 on ebay. I mostly use LLMs for logic questions, translating, and code, but would like to be able to train and have it use tools for web searching or random shit I think of.
>>
>>108537726
I have a Spark, currently running some 15 services in a mix of GPU and traditional home server applications.
No complaints - it’s small, quiet, ideal to run 24/7 at home as a personal server.
On AI, I have a fine tuned llama3 LLM that knows a lot about my personal finances and provides investment advisory. Also ComfyUI for some elaborate, custom pr0n.
The loud, power-hungry old x86 tower is now a gaming machine that's only on when running a game.
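For the self-hosted setup described above, assuming the fine-tuned model is exposed through an OpenAI-compatible endpoint (as llama.cpp's server or Ollama can provide), querying it from another machine is a few lines; the hostname and model name here are made up:

    # Query a locally hosted model over an OpenAI-compatible API.
    # URL and model name are placeholders for whatever you serve.
    import requests

    resp = requests.post(
        "http://spark.local:8080/v1/chat/completions",
        json={
            "model": "llama3-finance-ft",
            "messages": [{"role": "user", "content": "Summarize my ETF allocation."}],
        },
        timeout=60,
    )
    print(resp.json()["choices"][0]["message"]["content"])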
>>
>>108537990
Thank you anon. How's the performance from your (subjective) perspective, does it feel snappy?
>>
>>108537357
https://old.reddit.com/r/LocalLLaMA/comments/1scf1x8/dont_buy_the_dgx_spark_nvfp4_still_missing_after/
>>
>>108538803
seems NVFP4 works on the Spark
https://build.nvidia.com/spark/nvfp4-quantization/overview
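For reference, NVFP4 post-training quantization through NVIDIA's TensorRT Model Optimizer looks roughly like the sketch below; treat the config name, API, and model choice as assumptions to verify against the linked docs:

    # Sketch of NVFP4 post-training quantization with nvidia-modelopt.
    import modelopt.torch.quantization as mtq
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "meta-llama/Llama-3.1-8B"  # placeholder model
    model = AutoModelForCausalLM.from_pretrained(name)
    tok = AutoTokenizer.from_pretrained(name)

    def forward_loop(m):
        # Calibration pass over a handful of prompts (placeholder data).
        for prompt in ["Hello", "The quick brown fox"]:
            m(**tok(prompt, return_tensors="pt"))

    model = mtq.quantize(model, mtq.NVFP4_DEFAULT_CFG, forward_loop)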
>>
>>108537357
I have the Jetson nano. It works, but the last version of Linux for Tegra (L4T) they officially support is based on Ubuntu 20.04. You have to go with Armbian or unofficial images to be able to run 24.04 or later.

From what I read online the DGX spark situation seems a bit better but Nvidia's tooling is absolute dogshit and I'm certain you're going to have a little bit of a bad time. Watch out.
>>
>>108542257
thanks anon, interesting feedback, is the nano fully functional with armbian / unofficial images?
>>
>>108542597
It is functional but you're going to miss out on some modern software. For example I was unable to run plezy on it because the GLIBC it supports is too old.
>>
I have the Spark (Asus GX10 version). I can't recommend it, but it's hard for me to judge fairly because I have a TR workstation with 384GB.

My company bought the Spark for testing, but it was too slow for the particular application we were working on, and now it's idle. I thought it might be good to use it to run local agents that would test code on my workstation, leaving the workstation's memory available, but I haven't been able to find any useful models in the 128GB class. The focus is on either much smaller models that would work on a regular GPU, or much larger ones that want a cluster.

I also have a Strix Halo board that I'd take over the Spark, although it's not very good either. It's somewhat slower than the Spark, especially at PP (prompt processing), but they're both slow enough that it's hard to care. The Strix Halo, though, is much cheaper, can have a faster GPU added rather easily, and is x86 (the ARM processor in the Spark complicates things).
>>
>>108537357
I have a gx10 and it works great. This thing is essentially a 5070 with big vram. I've used it for testing stuff like flux2 dev and multi agent setups. If I had the money I would definitely buy 2.
As for power efficiency, there's not much you can do to undervolt or power cap. The only config worth using is the PowerMizer mode (something like that, I forget the name) which lets the gpu frequency drop when idle (the default is max even when idle).
An underrated use is running multiple services together. If you rarely run them simultaneously then the spark is a good fit because the models are already in vram. MoE runs great, too.
And no, I tried overclocking the vram but it's actually system ram, so you can't overclock it with nvidia-smi. The BIOS doesn't expose vram or system ram overclocking settings either.
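To confirm that idle clock drop actually kicks in, the graphics clock can be polled through NVML (pip install nvidia-ml-py); a quick sketch:

    # Watch the GPU graphics clock to verify it drops at idle.
    import time
    import pynvml

    pynvml.nvmlInit()
    h = pynvml.nvmlDeviceGetHandleByIndex(0)
    for _ in range(10):
        mhz = pynvml.nvmlDeviceGetClockInfo(h, pynvml.NVML_CLOCK_GRAPHICS)
        print(f"graphics clock: {mhz} MHz")
        time.sleep(2)
    pynvml.nvmlShutdown()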
>>
>>108542883
I don't understand, how is it missing modern software if it's using an unofficial/alternative OS? Or is it locked to a particular glibc / kernel version and doesn't work with anything else?
>>
File: maxresdefault.jpg (94 KB, 1280x720)
>>108537357
>Requesting /g/'s opinion on this
idk about DGX Spark but I have DJI Spark
>>
>>108543015
I see, yours is a particular scenario but interesting for sure. I wish I could just get an AMD and be done with it.
>>108543107
Interesting, will take a note of the frequency setting. Fucking nvidia makes it so hard to fine-tune their hardware
>>108543245
>DJI Spark
Sounds jeet coded anon, I'm sorry
>>
I suppose if you don't do training then mac studio + nvidia egpu is an option worth considering
>>
>>108543228
NVIDIA officially supports it up to ubuntu 18.04 (!). I found prebuilt images for 20.04, and if you want a 24.04 image you have to build it yourself.

GLIBC on the 20.04 image is ancient and thus I can't compile the dependencies modern plezy uses. You need to upgrade the OS to do that.
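A quick way to check what a given image ships before trying to compile anything; platform.libc_ver() is in the Python standard library:

    # Report the glibc version this Python interpreter was linked against.
    import platform
    print(platform.libc_ver())  # e.g. ('glibc', '2.31') on Ubuntu 20.04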
>>
>>108543353
I'm thinking you'll need cross compiling, interesting
>>
>>108537357
Spark owner here as well. I got the Lenovo variant. I used it mainly to build processing pipelines and for experimenting. The 128gb of memory allows you to load multiple models concurrently, but it's a tight squeeze and the unified memory is somewhat of a foot-gun if you don't size things properly.
>>
>>108543373
I actually just failed to build the 24.04 image due to some limitations. Trying to use this: https://github.com/pythops/jetson-image

It seems it's ubuntu 20.04 tops. It's pretty much ewaste now; it wouldn't be if Manjaro ARM were available.
>>
>>108537389
just buy a superpod
>>
>>108543683
are you sure your board is orin nano? I just built the 24.04 image successfully out of curiosity
>>
>>108544233
I have the regular B01 jetson nano. No good.
>>
At its $3000 price tag it's hard to beat for value.
>>
completely pointless product, waste of ram
>>
>>108549789
not all of us are gaymers anon



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.