/g/ - Technology


Thread archived.
You cannot reply anymore.




File: utils.png (679 KB, 1000x2000)
a solid computer system needs a solid set of utilities, so which implementation is the most optimal and minimal? here is a comparison of how many lines of code it takes for the cat command in each toolset (note: loc is not a definitive measurement of optimization, but it can give a good idea). all implementations are written in C unless stated otherwise
gnu coreutils - 684 loc
freebsd utils - 470 loc
openbsd utils - 219 loc
busybox - 200 loc
rustybox - 136 loc (in rust)
toybox - 59 loc
sbase - 46 loc
plan 9/9front/9base - 36 loc
so does plan9 win?
>>
>>101526441
I could rewrite "cat" in like 1 or 2 lines in a sufficiently high level language.
LOC means absolutely fucking nothing.
>>
File: xa.png (179 KB, 798x763)
>>101526546
>中出し
aight
>>
>>101526546
ok but isn't the point of these programs that they are written in a lower level language for the purpose of performance and optimisation? because if you mean some shit like python (whose default implementation, CPython, is itself written in C and interprets bytecode) the whole thing is like 40x slower at least
>>
>>101526635
The point of these programs is to perform a task or function.
And here's the thing you fail to note: the GNU coreutils version of cat does way more than the plan9 version, which accounts for the increased LOC.
If you want to argue that plan9 is more "pure" or "true to the unix philosophy", that's fine and dandy, but the LOC is literally irrelevant to the discussion.
>>
>>101526675
well do you need the extra flags? trying for a bare-minimum use case here
>>
>muh lines of code
gnu is consistently faster than BSD despite more code
>>
back in 1980s passage light port switch triggered events were fashionable by average business

bigger events were configured to happen after about 48 days inactivity
>>
>>101526441
These all have fewer lines, but compare their performance instead
You will see the bigger ones all perform better on speed, because they use more complicated algorithms with better asymptotic growth

A simpler example is big int multiplication
If you have a big number with 1000 words, it will take:
>long multiplication: a million steps
>karatsuba: around 55 thousand steps
>toom-cook: around 24 thousand steps
>schönhage–strassen: around 33 thousand steps going by n log n log log n with base-2 logs, so it only pulls ahead at much bigger sizes
This last one is complicated as hell as it uses even fucking FFT, but it's fast as fuck
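A rough way to sanity-check the first two figures above is to count single-word multiplications. The sketch below assumes the textbook recurrences (schoolbook does n*n word products, Karatsuba does three half-size products per level) and ignores additions and constant factors; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Schoolbook: every word of a meets every word of b. */
static uint64_t schoolbook_mults(uint64_t n)
{
    return n * n;
}

/* Karatsuba: three half-size products per level, down to single words.
 * Odd sizes are rounded up, so this is an upper bound on word products. */
static uint64_t karatsuba_mults(uint64_t n)
{
    assert(n > 0);
    if (n == 1)
        return 1;
    return 3 * karatsuba_mults((n + 1) / 2);
}
```

For n = 1000 this gives 1,000,000 versus 59,049 word products (ten halvings, 3^10), the same ballpark as the numbers quoted above.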

Number of lines is just an excuse from bad programmers who don't want to accept they don't know stuff and could learn if they were less arrogant
>>
>>101526675
usecase for the extra features?
>>
>>101526441
are you really comparing the quality of implementation by the least amount of lines? Mods have always been loose on 4chan but please ban retards like this.
>>
>>101526841
this, I don't care if emacs has all these features, I'm not using them!
>>
>>101526830
you're right about having a proper benchmark set up, but if it's just for outputting text, there's no way some fancy algorithm is going to make something as simple as that faster.
>>
>>101526936
GNU cat takes 12sec for a 15GB file to /dev/null here
Busybox cat takes 24sec

It's not just outputting text: you have buffering and optimized buffer sizes, you have the overhead of using FILE* instead of the Linux fd directly, and there's a function whose name I forgot that lets you move data between fds without copying it through userspace

There's always more to the picture, and they could learn all that if they weren't arrogant
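To make the buffering point concrete, here is a minimal sketch of a cat-style copy loop that works on raw file descriptors with a caller-chosen buffer size. It is not taken from GNU or busybox; `copy_fd` and its signature are hypothetical, and the right buffer size is something you would have to measure on your own system.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Copy everything from `in` to `out` through one heap buffer.
 * Returns the number of bytes copied, or -1 on error. */
static long long copy_fd(int in, int out, size_t bufsize)
{
    assert(bufsize > 0);
    char *buf = malloc(bufsize);
    if (buf == NULL)
        return -1;

    long long total = 0;
    for (;;) {
        ssize_t n = read(in, buf, bufsize);
        if (n == 0)
            break;                      /* EOF */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, retry the read */
            free(buf);
            return -1;
        }
        ssize_t off = 0;
        while (off < n) {               /* write() may be partial */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                free(buf);
                return -1;
            }
            off += w;
        }
        total += n;
    }
    free(buf);
    return total;
}
```

Calling `copy_fd(0, 1, 128 * 1024)` would behave like a flagless cat on stdin; the 128 KiB is only a plausible starting value, and fewer, larger syscalls is the whole trick.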
>>
>>101526441
>optimal and minimal?
These are orthogonal features. What's more, "optimality" varies from case to case. If you have use case A and the busybox version of cat doesn't support it, the fact that it might be faster than GNU in other cases isn't relevant.
For example, I have the GNU version of `diff` on my system. I can use `diff -q DIR1 DIR2` to compare the files in two directories passed to it, and have it tell me which file pairs differ. It also, however, tells me which filenames only exist in one or the other dir, which I often don't care about. But GNU diff doesn't offer an option to suppress that additional functionality, and I would switch it out for one that did if I performed this task often enough, regardless of whether it performed 20% slower than the GNU version in other situations.
>>
>>101526441
depends on the license. Avoid BSD
>>
>>101526441
LoC means nothing in a vacuum, show benchmarks and compare the feature set.
>>
>>101527394
>benchmarks
they mean nothing
show user feedback
>>
i use toybox on my linux distro
it is best imo
>>
>>101527163
on my system, I found out that memchr has a high startup cost and my custom sse2 memchr beats it consistently if character is guaranteed to be within 64, probably 48 bytes more realistically when searching from the middle of string which causes unaligned accesses, but by the time it gets into hundreds and thousands, memchr is around 60% faster and the gap keeps growing.
I also have a for loop for bottom line and even when char is guaranteed to be first byte in the string, dumb for loop, aka muh simplicity, is 1-2ns slower than a sse2 load.
This destroys nocodetrannies, I actually now use both glibc memchr and my own depending on context, and my program is noticeably even faster now, despite having more code due to having custom memchr for small searches.
I didn't bother checking why memchr beats my code though, because I can just use it when my own is inadequate.
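The dispatch described above could look roughly like the following. This is a sketch, not the poster's code: `small_memchr`, `find_byte`, and the 64-byte cutoff are assumptions (the cutoff is borrowed from the numbers in the post), `__builtin_ctz` assumes GCC or Clang, and on non-SSE2 targets the small path degrades to a plain byte loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#define SMALL_LIMIT 64  /* assumed crossover; measure on your own machine */

/* Scan 16 bytes per step with SSE2, then finish the tail byte by byte. */
static const void *small_memchr(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i needle = _mm_set1_epi8((char)c);
    for (; i + 16 <= n; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(p + i));
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));
        if (mask != 0)                  /* lowest set bit = first match */
            return p + i + (size_t)__builtin_ctz((unsigned)mask);
    }
#endif
    for (; i < n; i++)
        if (p[i] == (unsigned char)c)
            return p + i;
    return NULL;
}

/* Short searches take the low-startup path, long ones go to libc. */
static const void *find_byte(const void *s, int c, size_t n)
{
    assert(s != NULL || n == 0);
    return n <= SMALL_LIMIT ? small_memchr(s, c, n) : memchr(s, c, n);
}
```

More code than a bare `memchr` call, which is the point being made: the extra branch buys speed on the short searches.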
>>
>>101530640
That's good, those algorithms tend to be slower with a small n, so custom versions tend to help when they're used repeatedly on small amounts of data

Slower algorithms are sometimes used in cryptography as well, since they run in constant time regardless of the data and don't leak the key through cpu/power/time side-channels
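The side-channel point above is easiest to see with a comparison routine. A normal `memcmp` can return at the first mismatch, so its run time hints at how many leading bytes were right; the sketch below always touches every byte. `ct_equal` is a made-up name, and real crypto libraries ship their own vetted versions.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the two n-byte buffers match, 0 otherwise.
 * The loop never exits early, so timing does not depend on
 * where the first difference is. */
static int ct_equal(const void *a, const void *b, size_t n)
{
    assert((a != NULL && b != NULL) || n == 0);
    const volatile unsigned char *pa = a;
    const volatile unsigned char *pb = b;
    unsigned char diff = 0;
    for (size_t i = 0; i < n; i++)
        diff |= (unsigned char)(pa[i] ^ pb[i]);  /* accumulate any difference */
    return diff == 0;
}
```

It is slower than an early-exit compare on mismatching input, and that is deliberate.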

There's really lots of reasons to write more code for a task, I think those people just try to reduce the easiest metric as gaining performance or ram usage is much harder
>>
>>101526830
>>101527163
>>101530731
blue archive poster actually talking about technology
groundbreaking
>>
>>101530731
It's easy to reduce RAM usage, and it has same problem as reducing lines of code and writing only "simple code".
And this is very common among Dunning K&Rugers for some reason.
>>
>>101530748
Well, some cases are easy like not buffering, it's the time-space tradeoff: optimizing speed can be easily done as well (with humongous lookup tables)

Hard to judge software by just one metric, I guess these people's problem is that they don't want to accept better solutions exist, and instead of simply promoting their software as their vision for a problem, they invent some weird ideologies to explain why their software is actually the best
>>
>>101530799
Huge lookup tables are slower than just computing things on demand on modern hardware. I tried to beat std::to_chars with a lookup table that contains numbers up to USHRT_MAX and I got btfo even on the table[0] case, because doing a few checks on the number is faster than fetching memory.

The "modern" hardware is Sandy Bridge from 2011, literally behaves the same as any newer CPU.
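For a sense of what "a few checks on the number" can mean, here is a small C sketch that formats a 16-bit value by picking the digit count with range checks and then peeling digits off. It is not how std::to_chars is implemented, and `u16_to_dec` is a hypothetical name; it only illustrates the compute-on-demand side of the tradeoff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the decimal digits of v into out (no NUL terminator).
 * out must have room for 5 bytes. Returns the digit count. */
static size_t u16_to_dec(uint16_t v, char *out)
{
    assert(out != NULL);
    /* A handful of compares decides the length up front. */
    size_t len = v >= 10000 ? 5 : v >= 1000 ? 4 : v >= 100 ? 3 : v >= 10 ? 2 : 1;
    for (size_t i = len; i-- > 0; ) {
        out[i] = (char)('0' + v % 10);  /* fill from the last digit backwards */
        v /= 10;
    }
    return len;
}
```

Everything here stays in registers plus at most five stores, whereas a table indexed by the full value has 65536 entries and will not sit in L1 for long.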
>>
>>101526546
He is right on this point. However, at the same level of abstraction it is usually good to have fewer LOC (maintainability/readability is a good criterion).
>>101526675
Although that's true (a fair comparison would include the same flags), GNU is known to not care much about resource usage (in general).
>>101527163
Warm-cache comparison:
for i in cat 9base-6/cat/cat sbase/cat ecore/src/cat; do
> echo $i
> time $i data.bin >/dev/null
> done
cat
0m00.37s real 0m00.04s user 0m00.33s system
9base-6/cat/cat
0m00.30s real 0m00.02s user 0m00.23s system
sbase/cat
0m01.01s real 0m00.21s user 0m00.79s system
ecore/src/cat
0m00.04s real 0m00.01s user 0m00.02s system

>>101526441
A comparison (using (ellcc) ecc -Os -static):
$ file 9base-6/cat/cat sbase/cat ecore/src/cat
9base-6/cat/cat: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=8299124db6a62fd254102c0b9f03a665d5180a15, for GNU/Linux 4.4.0, stripped
sbase/cat: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=e4e24a9aacf20e79397a18dedf59ca0ae356bb5d, stripped
ecore/src/cat: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=aed358e6c73d5bbf08f2367302ccb4ee315346c1, stripped
$ size 9base-6/cat/cat sbase/cat ecore/src/cat
   text    data     bss     dec     hex filename
 689673   27432   24392  741497   b5079 9base-6/cat/cat
  17384     232     656   18272    4760 sbase/cat
  15466     464   10064   25994    658a ecore/src/cat
$ wc -l 9base-6/cat/cat.c sbase/cat.c ecore/src/cat.c
36 9base-6/cat/cat.c
52 sbase/cat.c
42 ecore/src/cat.c
130 total
>>
File: linux_gaming2.png (172 KB, 501x333)
>>101526441
I think the desktop attracted too many end-users and gamers. The user-space
has turned into garbage for that audience. Windows as the Unix/Linux
Desktop was popular for 30 years. Now these casual "Enthusiasts" who are
paid agents force their 3rd-world-OS ideas into everyones ass.

The future is *BSD as a server only operating system. *BSD's strength is
that nobody cares about it.

Linux on the desktop is about to be replaced by WSL because Windows just is
the better Linux Desktop. Always has been. In the end convenience trumps
ideology. Why should someone install Linux with a Desktop that isn't the
Windows Desktop when you can have the Windows Desktop and WSL? Soon the
only people using a Linux Desktop will be the same people who are developing
it. Just leave them behind.

And who is still using Desktops anyways? It is either servers (Linux and
*BSD) or Apps on Smart Devices or dedicated Video Games hardware. Apps
don't need a full blown Desktop to be launched. Games don't need a full
blown Desktop to be launched. Desktop is just a workplace for sad people.
>>
>>101530847
>muh maintainability and readability
literal tranny dogwhistle for "I'm a fucking retard"
>>
>>101530845
That's true, large lookup tables tend not to fit in cache, so they end up slower than native processor operations in many cases

>>101530847
What about GNU and BSD ones?
And this ecore one seems way too fast, can you share the code?
>>
>>101530878
in my benchmarks, I don't even bother simulating thrashing cache, std::to_chars is still faster in so many cases when everything is in cache that LUT is a pessimization.
>>
>>101530847
>ecore
are you him
>>
>>101530908
Didn't use C++ much, but I know it has compile time optimizations like std::move, so it can be hard to beat the default implementations when the compiler knows what it has to do
>>
>>101530946
std::move is a cast to rvalue and using move constructors is always slower than not using any.
When I write code for myself not to be used by retard wagies, I delete my move constructors, literally 0 usecase, structure your program better.
>>
>>101526441
An optimal implementation of cat should use the splice(2) syscall on platforms that support it (since then the kernel will take care of it without copying memory to userspace). But it will still need another code path for cases that splice doesn't support, so it will be strictly less minimal in return.
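A minimal sketch of the two code paths described above, assuming Linux with glibc: try `splice(2)` first and drop to a read/write loop when the kernel rejects the descriptor pair with EINVAL. The function names and the 1 MiB request size are hypothetical, and a real cat would need more careful error reporting.

```c
#define _GNU_SOURCE             /* splice() on glibc; must precede the includes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Plain read/write loop, used where splice is missing or refuses the fds. */
static int copy_fallback(int in, int out)
{
    char buf[65536];
    for (;;) {
        ssize_t n = read(in, buf, sizeof buf);
        if (n == 0)
            return 0;                           /* EOF */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        for (ssize_t off = 0; off < n; ) {      /* handle partial writes */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
    }
}

/* Returns 0 on success, -1 on error. */
static int copy_spliced(int in, int out)
{
    assert(in >= 0 && out >= 0);
#ifdef __linux__
    for (;;) {
        /* The kernel moves the data; nothing is copied into this process. */
        ssize_t n = splice(in, NULL, out, NULL, (size_t)1 << 20, 0);
        if (n == 0)
            return 0;                           /* EOF */
        if (n > 0 || errno == EINTR)
            continue;
        if (errno == EINVAL)
            break;                              /* fds not spliceable: fall back */
        return -1;
    }
#endif
    return copy_fallback(in, out);
}
```

The fast path is only a dozen lines, but the fallback has to exist anyway, so the total is more code than a bare loop, which is the tradeoff the post is describing.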
>>
>>101531054
>splice
>linux only
>one end must be a pipe
>doesn't support all filesystems
meh
>>
>>101530878
>*1000
brainlet: *512 (+* 488)
genius: *128 (-* 3) *8
computationally 488+9 operations against 128+3*2+2 operations
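Spelled out in C, the shift-and-subtract decomposition is (128x - 3x) * 8 = 125x * 8 = 1000x. The sketch below is only an illustration of that identity; `mul1000` is a made-up name, and an optimizing compiler will normally pick a sequence like this on its own from a plain `x * 1000`.

```c
#include <assert.h>
#include <stdint.h>

/* 1000 = (128 - 2 - 1) * 8, so three shifts and two subtractions suffice. */
static uint64_t mul1000(uint64_t x)
{
    assert(x <= UINT64_MAX / 1000);     /* result must fit in 64 bits */
    return ((x << 7) - (x << 1) - x) << 3;
}
```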
>>
>>101526441
anything beyond this is bloat
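.global _start

.common buf, 4096           // 4 KiB I/O buffer

.text

_start:
adrp x19, buf               // x19 = address of buf
add x19, x19, :lo12:buf
.loop:
mov w0, 0                   // fd 0 (stdin)
mov x1, x19                 // buffer
mov w2, 4096                // max bytes per read
mov w8, 63                  // syscall: read
svc 0
cmp w0, 0
b.le .end                   // stop on EOF (0) or error (negative)

mov w2, w0                  // write exactly what was read
mov w0, 1                   // fd 1 (stdout), x1 still points at buf
mov w8, 64                  // syscall: write
svc 0
b .loop

.end:
mov w0, 0                   // exit status 0
mov w8, 93                  // syscall: exit
svc 0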

$ as -o cat.o cat.s
$ ld -static -o cat cat.o
$ strip cat
$ ls -l cat
-rwx------ 1 anon anon 528 Jul 23 11:05 cat
$ time head -c 100000000 /dev/urandom | ./cat >/dev/null

real 0m1.646s
user 0m0.082s
sys 0m1.564s
$ time head -c 100000000 /dev/urandom | cat >/dev/null

real 0m1.713s
user 0m0.093s
sys 0m1.645s
>>
>>101527163
ah yes i too daily cat 15 gig files, totally worth the +10000 loc!!!!
>>
>>101531487
Fast NVMe storage costs about 5 cents per gigabyte. Just how fucking poor are you?
>>
>>101531537
what does this have to do with anything?
>>
>>101531356
I don't get why ARM is considered as a RISC architecture, when the ARMv8 manual is 12000 pages longer than the latest Intel 64 manual
>>
>>101526441
>caring about how much lines of code a program has
mental illness
>>
>>101531776
What does loc have to do with anything?
>>
>>101532002
i dont want the base programs on my system to be backdoor hell, FAGGOT
less loc = less bugs
>>
>>101532011
>i dont want the base programs on my system to be backdoor hell, FAGGOT
Doesn't make the slightest difference unless you manually review the code yourself.
>>
>>101532025
guess what i fucking do moron?
its so easy to read toybox source code anyone can do it in 20 minutes while debugging everything
>>
>>101532049
And the kernel?
>>
>>101532052
my own microkernel with linux compatibility
>>
File: cattest.png (11 KB, 677x397)
>>101527163
>GNU cat takes 12sec for a 15GB file to /dev/null here
>Busybox cat takes 24sec
you just made it up, didn't you?
by the way with 100M files, which I would guess is a more common case, busybox cat is twice as fast as GNU cat with 35ms vs 70ms
>>
I like busybox. Comes with everything and a kitchen sink and yet it's small and lightweight.
>>
>>101532311
its some avatarfag, ignore him
most of his posts are made up garbage
https://desuarchive.org/g/search/filename/sample_%2A/
>>
>>101532492
this one's got a buggy awk
toybox ships a better one
>>
>>101532877
What's buggy about it? I think it doesn't handle NULL well but other than that it's OK.
>>
musl
>>
>>101532940
you can't pass arrays by reference, for example
$ busybox awk 'BEGIN{a[5]; f(a)} function f(b){print length(b)}'
0

this works everywhere else
>>
>>101532877
>>101532940
>>101533636
that's why you should just use perl for scripting
don't @ me, i'm /threading
>>
File: threading.jpg (29 KB, 619x495)
>>101533663
this you?
>>
>>101531487
OP is testing cat, but cp and mv would be similarly performant (and widely used for large files)

>>101532311
I didn't, works on my machine
In my case I used MSYS2 and a single 15gb file. Like I mentioned before, it's probably the fact that complicated algorithms start to be more efficient on larger data, where speed actually matters
>>
>>101534258
how often do you use cp?
>>
>>101534298
Copy performance makes a difference when copying to a different device, and it matters more than how many lines of code went into the compiled binary
>>
>>101534258
NIGGER NIGGER NIGGER KYS AVATARFAG
>>
>>101526441
LoC has nothing to do with performance or quality of code.

I want all nocoders off my board.
>>
>>101534344
>coders
its programming
>LoC has nothing to do with performance
yes it does considering userspace is just syscalls, sse should be used for rest
>>
>>101534344
It's probably because it's the easy way of praising your projects: you make a thing, it's slower than the rest, it has fewer features than the rest, so it's better because....it has less code!
>>
File: perl-pepe.png (1.1 MB, 1200x1200)
>>101533663
Based Perlchad
>>
c go perl
>>
>>101526441
busybox + openssh is the smallest usable option.
>>
>>101526675
>does way more
what you fail to understand is this is the violation of the unix philosophy in the first place

keep it simple, retard
>>
>>101534258
>a single 15gb file
UUOC
>>
>>101533636
Interesting. Is that in POSIX or is it just convention? busybox tries to be POSIX compliant but nothing more. It does support some extensions though.
>>
>>101530943
> him
who
>>
>>101532057
share it or it doesn't exist
>>
>>101536715
minix
>>
File: plan 9.jpg (46 KB, 800x450)
>>101526441
Plan 9 always wins. Unix design schemes are outdated for a post-mainframe world.
>>
>>101536488
>Is that in POSIX
Yes
>>
>>101536488
"one true awk" is the best implementation for testing if something should be supported across implementations (if not that's a bug on them)


