Shell Null Termination / Separation
Null (or zero) termination or separation means using a null byte to delimit records instead of the newline usually used in shell pipelines.
Separating records with newlines is usually good enough and easy to debug, but even a small shell pipeline like ls | sort can blow up when it encounters unexpected data such as a filename that contains a newline character.
Fun fact: Most filesystems actually allow you to use all kinds of characters from emoji to non-printables and control characters. Including newlines. One could even put ASCII-art in a filename (I've already done that)!
Using a null byte saves you here, as it is the only byte guaranteed never to occur in a file path on Unix-like systems. This becomes even more important with other sources of (untrusted) input.
The improved version would be ls --zero | sort -z (the --zero option needs GNU coreutils 9.0 or newer), followed by | tr '\0' '\n' to convert back to newlines for the human who still wants the illusion of line separation.
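To make the failure mode concrete, here is a small sketch that counts records both ways. It assumes GNU or BSD userland (sort -z, find -print0) and uses find -print0 as a stand-in for ls --zero, since the latter is only available in recent coreutils. The directory and file names are made up for the demonstration.

```shell
# Scratch directory with two files; one name contains a newline.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/evil$(printf '\nname').txt"

# Newline-separated: the file with the embedded newline is split in two,
# so two files are counted as three records.
newline_records=$(ls "$demo_dir" | sort | wc -l)

# Null-separated: one NUL byte per file, so counting NULs counts files.
null_records=$(find "$demo_dir" -type f -print0 | sort -z | tr -cd '\0' | wc -c)

echo "newline records: $newline_records, null records: $null_records"
rm -rf "$demo_dir"
```

The newline count is off by one per embedded newline, while the null count matches the number of files.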
Examples of inputs you shouldn't trust:
- The human in front of the machine (I know myself)
- Your filesystem (because USB sticks, downloads, the human, etc.)
- Anything that comes in over the network (Not even your own service, not the service you are paying for, …)
Translating between worlds
To translate between the newline-separated and the null-separated world there is a little utility called tr:
- null to newline
tr '\0' '\n'
- newline to null
tr '\n' '\0'
- swap newline and null
tr '\n\0' '\0\n'
- remove potential nulls
tr -d '\0'
Make sure to actually remove the characters you assume are not in your input!
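The translations above can be tried on a few literal inputs. This is only a sketch with made-up sample data; it counts bytes and lines rather than storing null bytes, because shell variables cannot hold them.

```shell
# newline to null: three lines become three NUL-terminated records.
nul_count=$(printf 'alpha\nbeta\ngamma\n' | tr '\n' '\0' | tr -cd '\0' | wc -c)

# swap: the two NUL-terminated records become two lines, and the newline
# embedded in the first record is parked as a NUL until you swap back.
swapped_lines=$(printf 'one\ntwo\0three\0' | tr '\n\0' '\0\n' | wc -l)

# remove potential nulls before handing data to a line-based tool.
cleaned=$(printf 'al\0pha\n' | tr -d '\0')

echo "$nul_count $swapped_lines $cleaned"
```

The swap variant is handy when a line-based tool has to sit in the middle of a pipeline: swap, process, swap back.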
Getting a program into null mode
Unfortunately there is no single way to make a program use nulls instead of newlines, so here is a hopefully useful table. Please note that some programs only offer the option in the GNU version, but not in, e.g., the busybox version.
If a program has some kind of printf option, one can use that to make the output null separated.
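As one example of such a printf option, GNU find accepts -printf with a \0 escape in the format string. The sketch below assumes GNU findutils (other find implementations may lack -printf entirely) and uses a throwaway directory with made-up names.

```shell
# Scratch directory; the second name contains a newline.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b$(printf '\nc').txt"

# '%f' is the base name, '\0' terminates each record with a NUL byte.
name_count=$(find "$demo_dir" -type f -printf '%f\0' | tr -cd '\0' | wc -c)

echo "records: $name_count"
rm -rf "$demo_dir"
```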
|gnu, other modern
|--format '…%0' --no-newline
|-r -d ""
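The -r -d "" entry matches the options of the read builtin in bash, where an empty delimiter makes read stop at a null byte. A minimal sketch of the usual loop, assuming bash (the -d option and process substitution are not POSIX sh) and made-up input records:

```shell
#!/usr/bin/env bash
count=0
last=
# IFS= keeps leading/trailing whitespace, -r keeps backslashes literal,
# -d '' makes read use the NUL byte as the record delimiter.
while IFS= read -r -d '' record; do
    count=$((count + 1))
    last=$record
done < <(printf 'first\0second line\nwith newline\0third\0')

echo "read $count records, last one: $last"
```

Feeding the loop through process substitution instead of a pipe keeps count and last visible after the loop ends.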
Note: The best way to automatically find out whether a program supports an option is to grep -q the output of its --help. Just make sure to choose a specific enough regex to avoid false positives. Simply trying the option out also works, but is usually a bit more complex.
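A possible shape for such a check is sketched below. The helper name supports_option is hypothetical, and the regex is just one way to require the option to appear as a whole word; some programs print their help to stderr, hence the 2>&1.

```shell
# Hypothetical helper: succeeds if the program's --help output mentions
# the option. $1 is the program, $2 an extended regex for the option.
supports_option() {
    "$1" --help 2>&1 | grep -Eq -- "$2"
}

# Anchor the option on both sides so a longer option with the same
# prefix does not count as a match.
if supports_option sort '(^|[[:space:],])--zero-terminated([[:space:],=]|$)'; then
    echo "sort has a null mode"
fi
```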