previous: >>107957203

#define __NR_pipe 22

https://man7.org/linux/man-pages/man2/pipe.2.html
https://man7.org/linux/man-pages/man7/pipe.7.html
this is probably one of the most useful syscalls of all time. it's honestly not all that complicated, but the utility of a pipe and fork is really unparalleled.
well... i guess actually after looking at man 7 pipe, it can be a little complicated. but this is actually one of the topics where i don't feel the need to get discussion started all that much, since i actually see /g/ post about pipes a lot. so hopefully this will be a good thread!
relevant resources:
man man
man syscalls
https://man7.org/linux/man-pages/
https://linux.die.net/man/
https://elixir.bootlin.com/linux/
https://elixir.bootlin.com/musl/
https://elixir.bootlin.com/glibc/
>man pipe
>>107965557
lel
bampu
no interest in pipes ( ._.)
>>107965557
man makes every command funny
>>107968140
it's a nice bonus
>>107964764
I shall effortpost.
Is there a way to have a pipe which stays open?
Specifically, I want a named pipe file, and I want to have an epoll event on it waiting for writes.
I want programs to be able to arbitrarily write to the pipe.
The problem is that once a program finishes writing and closes its end, the pipe is now closed, and epoll_wait always immediately returns EPOLLHUP.
The solution I have now is to re-create the pipe, but that's kind of retarded.
I just want a way to have an open, named "port" between processes on a local machine (no networking), which any process can write to, and the application can indefinitely read from.
Am I going about this the wrong way?
>>107970519
Actually, this is for mkfifo, not pipe. My bad.
Still, if anyone knows the solution, I'd appreciate it.
Should pipe be used for high throughput or latency sensitive applications, or should you use shared memory instead? I would think the syscalls involved for reading and writing make it only useful for messages, rather than data movement.
How does bash | work with Linux pipes? Does it create a pipe fd, then pass that into the next program? How does it pass it to the program? How does the program know which fd to read from?
>>107970549
>How does the program know which fd to read from?
Each program starts with 3 fds open:
0 stdin
1 stdout
2 stderr
and thus you always spit out your output to stdout and get input from stdin.
>How does it pass it to the program?
That's where the magic of the fork-into-exec combo comes into play. bash creates a pipe, gives the write side to the program on the left of | and the read side to the program on the right.
>but anon you just said programs always write/read to/from stdout and stdin and the file descriptors for those seem to be constant so how does that make any sense
that's where dup2 comes into play. it allows you to change where stdout and stdin go to / come from, and with it bash can change stdin to come from the pipe and not from the keyboard.
so bash does something like
>create a pipe
>fork -> dup2(pipe.write, STDOUT_FILENO) -> exec(left_program)
>fork -> dup2(pipe.read, STDIN_FILENO) -> exec(right_program)
'>' works similarly, you just dup2 into some file and not a program.
you should try to write your own shell or just 'system' from the stdlib, it's pretty cute
remember to open your pipes and files with O_CLOEXEC
>>107970660
>thats where dup2 comes into play it allows you to change where the stdout and stdin goes to/from and bash can change with it stdin to come from the pipe and not from the keyboard
neat
>>107970519
Open it for both reading and writing in the polling process
>>107970519
named unix domain sockets?
>>107970539
pipes with basic write/read are kind of slow because data constantly has to be copied a bunch of times:
write has to copy data from userspace into the kernel, and read has to copy data from the kernel into userspace.
so io_uring / shared memory is probably better, but ultimately it depends on what you are doing
>>107970719
I'll try that, thanks.
>>107970549
#include <unistd.h>

int
main(void)
{
    int fds[2];
    pipe(fds);
    if (fork() == 0) {
        close(fds[1]);
        dup2(fds[0], 0);
        execlp("wc", "wc", NULL);
    } else {
        close(fds[0]);
        dup2(fds[1], 1);
        execlp("ls", "ls", NULL);
    }
}

ls | wc
I understand make -j 9 uses pipes to keep the child processes from getting out of hand. It makes a pipe and keeps hold of both ends. Then it writes 9 bytes onto the pipe, and each child reads a byte before spawning a process, and then writes a byte back when it reaps.
Reading the man page I see mention of O_NOTIFICATION_PIPE, which I had never heard of. You might look at ioctl_pipe(2), but the man pages are inadequate on this. I found https://www.kernel.org/doc/Documentation/watch_queue.rst and <linux/watch_queue.h>. It appears to be a generic mechanism that is in fact only used for "keys and keyrings". I don't understand why they can't just use a regular pipe... Okay, here's some defense, but I'm still unimpressed by it... couldn't the kernel just have written to a page shared with userspace and, IDK, sent a signal or written a byte to an empty pipe...? https://kernsec.org/pipermail/linux-security-module-archive/2019-November/016990.html
>>107964764
I really like pipes, they're a simple yet really effective way of doing IPC, I just wish they had some type semantics. It's a little annoying to hand-roll your own binary format every time you use them; a distinction between "byte-oriented" and "C-typed" pipes would've been nice.
>Is there a way to have a pipe which stays open?
stays open compared to what? pipes will stay open until both sides are closed. named pipes (mkfifo) can be used to create pipes that persist even after being closed.
>>107970529
>Still, if anyone knows the solution, I'd appreciate it.
just have your host process create a fifo and open it for reading, then epoll the pipe until someone has written to it. What's hindering you?
>>107970726
I hate the shm_ API with a passion. Luckily I've never had to write performance-critical applications with that sort of IPC, lol
>>107972870
>What's hindering you?
I'm using mkfifo. When the writer finishes writing, epoll now simply returns EPOLLHUP on the fifo rather than waiting for another writer.
The other suggestion was to have the reader also open it for writing, so that the kernel can't fully close the pipe. I'll try that.
>>107970660
>>107970776
That's a horrible design. Any kind of shared data should be passed in explicitly to the new process, instead of randomly sharing arbitrary parts. Fork is also a really bad design.
>>107974065
>windows baby seething
also it is explicitly passed via stdin
>>107974065
it is being shared explicitly, though. all programs expect to read from stdin, which it is
>>107974065
You pass them to the new process by not using O_CLOEXEC.
This is suboptimal, but that's backward compatibility for you.
>>107974151
using pidfd_getfd is so unwieldy, though
>>107974268
Just inverting it so that FDs are only inherited if you add a flag, rather than the other way around, would already be a step up. And making the standard file descriptors special, because opening a file and accidentally having it become stderr and logging into it is really stupid.
Those are bandaid fixes; if you were breaking backward compatibility anyway you could do better. But I can dream.
>>107974091
>>windows baby seething
Windows does it right.
>also it is explicitly passed via stdin
"Explicitly passed" would mean that "execlp" would take it as an argument (and maybe "inherit them from the creating process" is an option, but still you have the choice instead of randomly leaking information to the new process).
>>107974095
Stdin is just a convention and it's not being shared explicitly. It's copied implicitly by fork and kept by execlp. The kernel arbitrarily decides that some things belong to the new process and not others. What if file I/O were implemented in a dynamic library instead of in the kernel? Fork breaks a lot of things. The better way is for the create-process operation to take all the information that the new process needs, instead of implicit copying and replacing. Fork shouldn't exist.
>>107974351
i actually fully agree with your first point, but honestly, i am more inclined to the opinion that inheriting process memory after a fork is inherently bugprone due to the non-deterministic nature of thread death and global variable state.
well actually, i take that back. forking is fine. i don't like the concept of threads. it shouldn't be possible to share memory like that. i think if two separate control flows need access to the same memory region, it ought to be accomplished via some dedicated shared memory scheme. it's too easy to shoot yourself in the foot with threads otherwise.
>>107974357
part of the above applies to your post, too
>>107974402
>well actually, i take that back. forking is fine. i don't like the concept of threads. it shouldn't be possible to share memory like that. i think if two separate control flows need access to the same memory region, it ought to be accomplished via some dedicated shared memory scheme. it's too easy to shoot yourself in the foot with threads otherwise
Threads are a good idea. Actually the original "fork" that Dennis Ritchie stole from was more like threads, and sharing variables is fundamental in most programming languages. Unix "fork" did it wrong because he didn't understand how it worked.
>>107974450
i just don't like threads. i really am inclined to believe that if you feel like you need to share variables between threads, your program's architecture is probably flawed. for the areas where you really do need to share for whatever reason, maybe graphics stuff or that sort of thing, a dedicated shared memory scheme is still going to be the best way to do it.
i guess if you wanted threads, then i would lean more towards all writable data being thread-local/COW, and anything shared again needs to go through the dedicated shared memory scheme.
>>107974521
Separate "address spaces" are fictional. The ideal OS design has one address space shared by everything, so every process is like a thread. This reduces a lot of overhead and makes everything a lot simpler. That's also how the computer is actually designed to be used. This is another flaw of Unix fork: it prevents you from designing your OS like that.
>>107974628
i really strongly disagree with that, lol. abstractions exist for a reason. in this case, allowing a "new" address space, one which is effectively unlimited on 64-bit systems, is extremely useful from the standpoint of program design because it massively simplifies the logic necessary. you don't have to care what other programs are doing, what the actual disk looks like, etc. honestly, this is part of why i dislike threads. having multiple threads violates this assumption, because things can get changed out from under you. a single dedicated address space per single control flow would fix so many logic errors. by forcing program designers to explicitly specify when they want to interact with memory which is shared across processes, it forces them not to make assumptions about the safety of that memory.
>>107974521
holy, these are peak nocoder/dunning-kruger posts
>all writable data being thread-local/COW
>lets make everything fucking slow and complex for no reason
>>107974628
>The ideal OS design has one address space shared by everything
>one stupid/malicious program can bring down the entire system
cool os design bro
>>107974663
>>107974734
Single address space operating systems are a lot more secure than mainstream operating systems.
>>107974734
i mean, i respectfully disagree. i have a pretty deeply technical job, and i work with low-level systems every day. can you explain to me why you think my ideas are bad, rather than just calling me a nocoder? i am not even saying my idea *has to* be correct. i just really have encountered a lot of issues with the multi-threaded programming paradigm and so would be in favor of an alternative.
to briefly address your points, i don't think what i want would inherently make things any slower or more complicated than the current design. you could still have dedicated multi-processing. so if you, for example, wanted to perform chunked cryptography, you'd have multiple processes, each with the specific goal of performing a given operation on a chunk of data. it's pretty much fundamentally the same concept as threads, just removing the volatility of memory which is implicitly (rather than explicitly) shared.
>>107964764
>posting syscalls every day on /g/
This kind of autism is why I really like /g/.
>>107965557
This is the other reason.
>>107974402
Meh, I enjoy threading in Rust. It doesn't solve all the footguns but it does OK. And other languages that ban user-mode shared mutability seem to do even better.
Even granting that threads are a bad tool for most use cases, I don't know that that means they shouldn't be available. If removing threads could provide additional fundamental system-level guarantees then maybe, but otherwise you could instead nudge people toward the less-sharp solutions and allow them to make their own tradeoffs.
>>107974450
Many parts of Unix are bad, but not everything is a strictly inferior ripoff of its inspirations.
>>107974628
Computers haven't been designed for that for a very long time. Multics actually did execute programs all in the same process, more like subroutine calls (not even threads), but it still had oodles of memory mapping with half a dozen rings.
>>107974760
i don't know whether that's true. could you explain the logic behind why you believe this to be the case?
>>107974774
>so if you, for example, wanted to perform chunked cryptography, you'd have multiple processes, each with the specific goal of performing a given operation on a chunk of data
first of all, how the fuck do you have problems with "memory volatility" in a problem like that? each thread gets its own arena to put the result in, a scratch arena for temp allocs, and a read-only chunk to operate on. woah, memory volatility solved.
but anyway:
it's slow in comparison to just doing that in a single process
it's a nightmare to debug (is there even a way to pause all the forks?)
it's not portable (emulating fork on windows is slow, so you would have to end up using threads)
sure, you removed the "volatility" of the memory, but that didn't come for free, e.g. since you use fork a lot more, you have a higher chance of fork bombing yourself.
you are not gaining (possibly even losing), but giving up a lot of perf and debug capability, so what is the point? the alternative is better design, better tooling, better codebases.
>>107975043
every call from each thread into a standard library is doing any number of indeterminate things with global variables (there's a whole thing in the man pages about mt and async safety, for example), so if for whatever reason this gets interrupted, you could face quite a lot of issues.
>slow in comparison to just doing that in a single process
how so?
>nightmare to debug (is there even a way to pause all the forks?)
not really. if you're doing it the "right way", you should only need to debug a single implementation of the code, since it's running the same instructions, just on different data. but if you really needed to stop them all, you could just send them all SIGSTOPs, yeah.
>not portable
oh, yeah. i would want this to be an OS-level architecture decision. portability wouldn't apply here.
>you have a higher chance of fork bombing yourself
i mean, wouldn't thread bombing effectively be the same thing in this scenario?
>>107975096
>every call from each thread into a standard library
simple, stop using the stdlib, it's dogshit in every single way. if you're serious about your work you already have your own C codebase.
>how so?
how do you share the data in multiprocess and ensure memory volatility without making copies all over the place / putting down synchronization mechanisms / having to involve the kernel in every single exchange between processes? in multithread you just memmove data into some spot, signal some condvar, and that's it.
>since it's running the same instructions, just on different data
in your problem statement (which, like i said, is quite simple to do with threads while ensuring safety), what if you have a problem where you need to start communicating and run different instructions? have fun debugging that.
>i mean, wouldn't thread bombing effectively be the same thing in this scenario?
yeah, what i'm saying is that in GENERAL multiprocess isn't a free lunch
>>107975211
>how do you share the data in multiprocess
i mean, the same way you do now, the only difference being you'd have to explicitly mark the mapping as shared. something like https://man7.org/linux/man-pages/man7/shm_overview.7.html
>what if you have a problem when you need to start communicating and run different instructions?
ideally you'd minimize the post-fork logic to only do one thing. basically like throwing chunks off into a queue to be processed, and then receiving the data once it's done.
>>107975405
>something like https://man7.org/linux/man-pages/man7/shm_overview.7.html
so, something that is slower than just memmoving within one address space. also shm has no guarantees when it comes to memory, so processes can still fuck each other up. and you are introducing more dependency on random stuff and thus more points of failure.
>ideally you'd minimize the post-fork logic to only do one thing
cannot be done in most problems