Replace newlines with sed

Sed is a command-line Linux tool for replacing text in a file or an input stream. Sed normally works line by line: a line is read, the expression is applied, then the next line is read. Each line is processed without its trailing line break, so a plain substitution never gets to see a line break. Say we have a file with one word per line, and we want to reconstruct the sentence. How do we replace all line breaks in the file with a space? Simple:

sed "{:q;N;s/\n/ /g;t q}" 

The command ‘s/\n/ /’ says: substitute a line break (\n) with a space. ‘g’ says apply this to every match, not only the first. ‘N’ says append the next input line to the text being processed. Using ‘N’ alone would only join the lines in pairs, i.e., replace every second line break. The rest is a trick to join all lines together: we define the label q (‘:q;’), then we say that if there was a successful substitution, jump back to label q (‘t q’).
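To make the difference between a lone ‘N’ and the loop concrete, here is a small sketch. It assumes GNU sed (other sed implementations may parse the one-line label syntax differently), and the four-word input and the variable names are made up for illustration.

```shell
# Hypothetical sample input: one word per line.
words=$(printf '%s\n' the quick brown fox)

# 'N' alone: each cycle reads one line, appends one more, and joins that pair.
pairs=$(printf '%s\n' "$words" | sed 'N;s/\n/ /')

# The loop: after every successful substitution, 't q' jumps back to ':q',
# so 'N' keeps appending lines until the input is exhausted.
joined=$(printf '%s\n' "$words" | sed '{:q;N;s/\n/ /g;t q}')

printf '%s\n' "$pairs"    # two lines: "the quick" and "brown fox"
printf '%s\n' "$joined"   # one line: "the quick brown fox"
```

The single quotes around the sed script are a choice for the sketch: they keep the shell from touching the backslash, although the double quotes used above behave the same way here.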

Now we have all words on one line. Across sentences! Sentences are separated by an empty line, which leaves two adjacent spaces wherever one sentence ends and the next begins. So easy: replace line breaks with spaces, then replace two adjacent spaces with a line break. This gives you one sentence per line, with words separated by spaces. Voilà:

cat  | sed "{:q;N;s/\n/ /g;t q}" | sed "{s/  /\n/g}"
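A worked run of the two-step pipeline might look like the sketch below. It assumes GNU sed, since writing ‘\n’ in the replacement part of ‘s’ is not guaranteed to produce a line break in every sed. The two sample sentences and the variable name are invented for the example.

```shell
# Hypothetical input: one word per line, an empty line between the two sentences.
sentences=$(printf '%s\n' the cat sleeps '' dogs bark \
  | sed '{:q;N;s/\n/ /g;t q}' \
  | sed 's/  /\n/g')

# After the first sed the text is "the cat sleeps  dogs bark" (note the two spaces);
# the second sed turns that double space back into a line break.
printf '%s\n' "$sentences"
```

With GNU sed this prints "the cat sleeps" and "dogs bark" on separate lines. Note that the trick relies on the words themselves never containing two adjacent spaces.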