Used a typewriter lately? No? Well, who cares… except when you encounter stupidities left over from the early days of computing, when people were still used to typewriters. Because going to a new line on a typewriter took two separate motions (returning the carriage and feeding the paper), ASCII knows two characters for representing the newline:
- LF (line feed, German Zeilenvorschub), represented as Unicode code point 0x0A, ASCII 00001010 and escape character \n
- CR (carriage return, German Wagenrücklauf), represented as Unicode code point 0x0D, ASCII 00001101 and escape character \r
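If you don't trust hex and binary values copied from a blog (you shouldn't), here is a minimal Python sketch to check them yourself. The helper name `describe` is made up for this illustration; `ord` and the format specifiers are standard Python.

```python
LF = "\n"  # line feed
CR = "\r"  # carriage return


def describe(ch: str) -> str:
    """Return the hex code point and 8-bit pattern of a single character.

    Hypothetical helper, only for illustration.
    """
    code = ord(ch)
    return f"0x{code:02X} {code:08b}"


print(describe(LF))  # 0x0A 00001010
print(describe(CR))  # 0x0D 00001101
```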
ASCII was one of the earliest standardized encodings for representing text in bits. It’s from the 1960s, and at the time someone probably thought it was a good idea to have two characters for the concept of a new line. We’d think "who cares about stuff from the 1960s", it’s 2017, right? But unfortunately many later encodings are based on ASCII, most notably those from the Unicode family, e.g., the widely used UTF-8. So – thank you, 1960s! /sarcasm
Two characters for a new line would not be too bad if they were used consistently, but that is where the fun begins. Of course they are not! Different operating systems use different conventions to mark the end of a line:
- Linux and Mac OS X use LF
- Windows uses CR LF
- (and to make the chaos complete, Mac OS before version X used CR)
So have fun reading "plain text" files! /sarcasm
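To take some of the "fun" out of it, here is a rough sketch of how one might guess which convention a file uses and normalize everything to LF. The function names `detect_line_ending` and `normalize_to_lf` are invented for this example, and the detection is deliberately naive: it assumes the file sticks to one convention and does not handle mixed endings.

```python
def detect_line_ending(data: bytes) -> str:
    """Guess the newline convention of raw file contents (naive, hypothetical helper)."""
    if b"\r\n" in data:
        return "CRLF"  # Windows
    if b"\r" in data:
        return "CR"    # Mac OS before version X
    if b"\n" in data:
        return "LF"    # Linux, Mac OS X
    return "none"      # no line break found at all


def normalize_to_lf(data: bytes) -> bytes:
    """Convert CR LF and lone CR to LF."""
    # Replace CR LF first, otherwise each CR LF would turn into two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

When reading text in Python, `open(path, newline=None)` (the default for text mode) already translates all three conventions to `\n`, so a helper like this is mostly useful when you work with raw bytes.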