previous: >>107957203

#define __NR_pipe 22

https://man7.org/linux/man-pages/man2/pipe.2.html
https://man7.org/linux/man-pages/man7/pipe.7.html
this is probably one of the most useful syscalls of all time. it's honestly not all that complicated, but the utility of a pipe and fork is really unparalleled.
well... i guess actually after looking at man 7 pipe, it can be a little complicated. but this is actually one of the topics where i don't feel the need to get discussion started all that much, since i actually see /g/ post about pipes a lot. so hopefully this will be a good thread!
relevant resources:
man man
man syscalls
https://man7.org/linux/man-pages/
https://linux.die.net/man/
https://elixir.bootlin.com/linux/
https://elixir.bootlin.com/musl/
https://elixir.bootlin.com/glibc/
>man pipe
>>107965557
lel
bampu
no interest in pipes ( ._.)
>>107965557
man makes every command funny
>>107968140
it's a nice bonus
>>107964764
I shall effortpost.
Is there a way to have a pipe which stays open?
Specifically, I want a named pipe file, and I want to have an epoll event on it waiting for writes.
I want programs to be able to arbitrarily write to the pipe.
The problem is that once a program finishes writing and closes its end, the pipe is now closed, and epoll_wait always immediately returns EPOLLHUP.
The solution I have now is to re-create the pipe, but that's kind of retarded.
I just want a way to have an open, named "port" between processes on a local machine (no networking), which any process can write to, and the application can indefinitely read from.
Am I going about this the wrong way?
>>107970519
Actually, this is for mkfifo, not pipe. My bad.
Still, if anyone knows the solution, I'd appreciate it.
Should pipe be used for high throughput or latency sensitive applications, or should you use shared memory instead? I would think the syscalls involved for reading and writing make it only useful for messages, rather than data movement.
How does bash | work with Linux pipes? Does it create a pipe fd, then pass that into the next program? How does it pass it to the program? How does the program know which fd to read from?
>>107970549
>How does the program know which fd to read from?
Each program starts with 3 fds open:
0 stdin
1 stdout
2 stderr
and thus you always spit out your output to stdout and get input from stdin.
>How does it pass it to the program?
That's where the magic of the fork-into-exec combo comes into play. bash creates a pipe, gives the write side to the program on the left of | and the read side to the program on the right.
>but anon you just said programs always write/read to/from stdout and stdin and the file descriptors for those seem to be constant so how does that make any sense
that's where dup2 comes into play. it allows you to change where stdout and stdin go to / come from, and with it bash can change stdin to come from the pipe and not from the keyboard.
so bash does something like
>create a pipe
>fork -> dup2(pipe.write, STDOUT_FILENO) -> exec(left_program)
>fork -> dup2(pipe.read, STDIN_FILENO) -> exec(right_program)
'>' works similarly, you just dup2 into some file and not a program.
you should try to write your own shell or just 'system' from the stdlib, it's pretty cute
remember to open your pipes and files with O_CLOEXEC
>>107970660
>thats where dup2 comes into play it allows you to change where the stdout and stdin goes to/from and bash can change with it stdin to come from the pipe and not from the keyboard
neat
>>107970519
Open it for both reading and writing in the polling process
>>107970519
named unix domain sockets?
>>107970539
pipes with basic write/read are kind of slow because data constantly has to be copied a bunch of times:
write has to copy data from userspace into the kernel, and read has to copy data from the kernel into userspace.
so io_uring / shared memory is probably better, but ultimately it depends on what you are doing
>>107970719
I'll try that, thanks.
>>107970549
#include <unistd.h>

int
main(void)
{
    int fds[2];
    pipe(fds);
    if (fork() == 0) {
        close(fds[1]);
        dup2(fds[0], 0);
        execlp("wc", "wc", NULL);
    } else {
        close(fds[0]);
        dup2(fds[1], 1);
        execlp("ls", "ls", NULL);
    }
}

ls | wc
I understand make -j 9 uses pipes to keep the child processes from getting out of hand. It makes a pipe and keeps hold of both ends. Then it writes 9 bytes onto the pipe, and each child reads a byte before spawning a process, and then writes a byte back when it reaps.
Reading the man page I see mention of O_NOTIFICATION_PIPE, which I had never heard of. You might look at ioctl_pipe(2), but the man pages are inadequate on this. I found https://www.kernel.org/doc/Documentation/watch_queue.rst and <linux/watch_queue.h>. It appears to be a generic mechanism that is in fact only used for "keys and keyrings". I don't understand why they can't just use a regular pipe... Okay, here's some defense, but I'm still unimpressed by it... couldn't the kernel just have written to a page shared with userspace and, IDK, sent a signal or written a byte to an empty pipe...? https://kernsec.org/pipermail/linux-security-module-archive/2019-November/016990.html
>>107964764
I really like pipes, they're a simple yet really effective way of doing IPC, I just wish they had some type semantics. It's a little annoying to hand-roll your own binary format every time you use them; a distinction between "byte-oriented" and "C-typed" pipes would've been nice.
>Is there a way to have a pipe which stays open?
stays open compared to what? pipes will stay open until both sides are closed. named pipes (mkfifo) can be used to create pipes that persist even after being closed.
>>107970529
>Still, if anyone knows the solution, I'd appreciate it.
just have your host process create a fifo and open it for reading, then epoll the pipe until someone has written to it. What's hindering you?
>>107970726
I hate the shm_ API with a passion. Luckily I've never had to write performance-critical applications with that sort of IPC, lol
>>107972870
>What's hindering you?
I'm using mkfifo. When the writer finishes writing, epoll now simply returns EPOLLHUP on the fifo rather than waiting for another writer.
The other suggestion was to have the reader also open it for writing, so that the kernel can't fully close the pipe. I'll try that.
>>107970660
>>107970776
That's a horrible design. Any kind of shared data should be passed in explicitly to the new process, instead of randomly sharing arbitrary parts. Fork is also a really bad design.
>>107974065
>windows baby seething
also it is explicitly passed via stdin
>>107974065
it is being shared explicitly, though. all programs expect to read from stdin, which it is
>>107974065
You pass them to the new process by not using O_CLOEXEC.
This is suboptimal, but that's backward compatibility for you.
>>107974151
using pidfd_getfd is so unwieldy, though
>>107974268
Just inverting it so that FDs are only inherited if you add a flag, rather than the other way around, would already be a step up. And making the standard file descriptors special, because opening a file and accidentally having it become stderr and logging into it is really stupid.
Those are bandaid fixes; if you were breaking backward compatibility anyway you could do better. But I can dream.
>>107974091
>>windows baby seething
Windows does it right.
>also it is explicitly passed via stdin
"Explicitly passed" would mean that "execlp" would take it as an argument (and maybe "inherit them from the creating process" is an option, but still you have the choice instead of randomly leaking information to the new process).
>>107974095
Stdin is just a convention and it's not being shared explicitly. It's copied implicitly by fork and kept by execlp. The kernel arbitrarily decides that some things belong to the new process and not others. What if file I/O were implemented in a dynamic library instead of in the kernel? Fork breaks a lot of things. The better way is for the create-process operation to take all the information that the new process needs, instead of implicit copying and replacing. Fork shouldn't exist.
>>107974351
i actually fully agree with your first point, but honestly, i am more inclined to the opinion that inheriting process memory after a fork is inherently bugprone due to the non-deterministic nature of thread death and global variable state.
well actually, i take that back. forking is fine. i don't like the concept of threads. it shouldn't be possible to share memory like that. i think if two separate control flows need access to the same memory region, it ought to be accomplished via some dedicated shared memory scheme. it's too easy to shoot yourself in the foot with threads otherwise.
>>107974357
part of the above applies to your post, too
>>107974402
>well actually, i take that back. forking is fine. i don't like the concept of threads. it shouldn't be possible to share memory like that. i think if two separate control flows need access to the same memory region, it ought to be accomplished via some dedicated shared memory scheme. it's too easy to shoot yourself in the foot with threads otherwise
Threads are a good idea. Actually the original "fork" that Dennis Ritchie stole from was more like threads, and sharing variables is fundamental in most programming languages. Unix "fork" did it wrong because he didn't understand how it worked.
>>107974450
i just don't like threads. i really am inclined to believe that if you feel like you need to share variables between threads, your program's architecture is probably flawed. for the areas where you really do need to share for whatever reason, maybe graphics stuff or that sort of thing, a dedicated shared memory scheme is still going to be the best way to do it.
i guess if you wanted threads, then i would lean more towards all writable data being thread-local/COW, and anything shared again needs to go through the dedicated shared memory scheme.
>>107974521
Separate "address spaces" are fictional. The ideal OS design has one address space shared by everything, so every process is like a thread. This reduces a lot of overhead and makes everything a lot simpler. That's also how the computer is actually designed to be used. This is another flaw of Unix fork: it prevents you from designing your OS like that.
>>107974628
i really strongly disagree with that, lol. abstractions exist for a reason. in this case, allowing a "new" address space, one which is effectively unlimited on 64-bit systems, is extremely useful from the standpoint of program design because it massively simplifies the logic necessary. you don't have to care what other programs are doing, what the actual disk looks like, etc. honestly, this is part of why i dislike threads. having multiple threads violates this assumption, because things can get changed out from under you. a single dedicated address space per single control flow would fix so many logic errors. by forcing program designers to explicitly specify when they want to interact with memory which is shared across processes, it forces them not to make assumptions about the safety of that memory.
>>107974521
holy, these are peak nocoder/dunning-kruger posts
>all writable data being thread-local/COW
>lets make everything fucking slow and complex for no reason
>>107974628
>The ideal OS design has one address space shared by everything
>one stupid/malicious program can bring down the entire system
cool os design bro
>>107974663
>>107974734
Single address space operating systems are a lot more secure than mainstream operating systems.
>>107974734
i mean, i respectfully disagree. i have a pretty deeply technical job, and i work with low-level systems every day. can you explain to me why you think my ideas are bad, rather than just calling me a nocoder? i am not even saying my idea *has to* be correct. i just really have encountered a lot of issues with the multi-threaded programming paradigm and so would be in favor of an alternative.
to briefly address your points, i don't think what i want would inherently make things any slower or more complicated than the current design. you could still have dedicated multi-processing. so if you, for example, wanted to perform chunked cryptography, you'd have multiple processes, each with the specific goal of performing a given operation on a chunk of data. it's pretty much fundamentally the same concept as threads, just removing the volatility of memory which is implicitly (rather than explicitly) shared.
>>107964764
>posting syscalls every day on /g/
This kind of autism is why I really like /g/.
>>107965557
This is the other reason.
>>107974402
Meh, I enjoy threading in Rust. It doesn't solve all the footguns but it does OK. And other languages that ban user-mode shared mutability seem to do even better.
Even granting that threads are a bad tool for most use cases, I don't know that that means they shouldn't be available. If removing threads could provide additional fundamental system-level guarantees then maybe, but otherwise you could instead nudge people toward the less-sharp solutions and allow them to make their own tradeoffs.
>>107974450
Many parts of Unix are bad, but not everything is a strictly inferior ripoff of its inspirations.
>>107974628
Computers haven't been designed for that for a very long time. Multics actually did execute programs all in the same process, more like subroutine calls (not even threads), but it still had oodles of memory mapping with half a dozen rings.
>>107974760
i don't know whether that's true. could you explain the logic behind why you believe this to be the case?
>>107974774
>so if you, for example, wanted to perform chunked cryptography, you'd have multiple processes, each with the specific goal of performing a given operation on a chunk of data
first of all, how the fuck do you have problems with "memory volatility" in a problem like that? each thread gets its own arena to put the result in, a scratch arena for temp allocs, and a read-only chunk to operate on. woah, memory volatility solved.
but anyway:
it's slow in comparison to just doing that in a single process
it's a nightmare to debug (is there even a way to pause all the forks?)
it's not portable (emulating fork on windows is slow, so you would have to end up using threads)
sure, you removed the "volatility" of the memory, but that didn't come for free, e.g. since you use fork a lot more, you have a higher chance of fork bombing yourself.
you are not gaining (possibly even losing), but giving up a lot of perf and debug capability, so what is the point? the alternative is better design, better tooling, better codebases.
>>107975043
every call from each thread into a standard library is doing any number of indeterminate things with global variables (there's a whole thing in the man pages about mt and async safety, for example), so if for whatever reason this gets interrupted, you could face quite a lot of issues.
>slow in comparison to just doing that in a single process
how so?
>nightmare to debug (is there even a way to pause all the forks?)
not really. if you're doing it the "right way", you should only need to debug a single implementation of the code, since it's running the same instructions, just on different data. but if you really needed to stop them all, you could just send them all SIGSTOPs, yeah.
>not portable
oh, yeah. i would want this to be an OS-level architecture decision. portability wouldn't apply here.
>you have a higher chance of fork bombing yourself
i mean, wouldn't thread bombing effectively be the same thing in this scenario?
>>107975096
>every call from each thread into a standard library
simple, stop using the stdlib, it's dogshit in every single way. if you're serious about your work you already have your own C codebase.
>how so?
how do you share the data in multiprocess and ensure memory volatility without making copies all over the place / putting down synchronization mechanisms / having to involve the kernel in every single exchange between processes? in multithread you just memmove data into some spot, signal some condvar, and that's it.
>since it's running the same instructions, just on different data
in your problem statement (which, like i said, is quite simple to do with threads while ensuring safety), what if you have a problem where you need to start communicating and run different instructions? have fun debugging that.
>i mean, wouldn't thread bombing effectively be the same thing in this scenario?
yeah, what i'm saying is that in GENERAL multiprocess isn't a free lunch
>>107975211
>how do you share the data in multiprocess
i mean, the same way you do now, the only difference being you'd have to explicitly mark the mapping as shared. something like https://man7.org/linux/man-pages/man7/shm_overview.7.html
>what if you have a problem when you need to start communicating and run different instructions?
ideally you'd minimize the post-fork logic to only do one thing. basically like throwing chunks off into a queue to be processed, and then receiving the data once it's done.
>>107975405
>something like https://man7.org/linux/man-pages/man7/shm_overview.7.html
so, something that is slower than just memmoving within one address space. also shm has no guarantees when it comes to memory, so processes can still fuck each other up. and you are introducing more dependency on random stuff and thus more points of failure.
>ideally you'd minimize the post-fork logic to only do one thing
cannot be done in most problems