Find file names with invalid encoding on Linux

I have files copied from Windows computers in ancient times. The filenames contain special characters and they have been messed up somewhere along the way. For example I got a file named 9.5.2 Modelo de aceptaci??n (espa??ol).doc in the folder 9 Garant??a del Estado.

First, I want to find and list these files. Stackexchange tells us how to do that:

LC_ALL=C find . -name '*[! -~]*'

This will find all names that have non-ASCII letters, not only those that are broken. But in my case I have folders where ALL of the names are broken, so I don’t mind.

Second, I want to fix the names. I did it manually, but for future reference, if I ever were to do anything like that again, I might use one of the solutions proposed in this thread on serverfault.com.

This entry was posted in Linux and tagged , , , , by swk. Bookmark the permalink.

About swk

I am a software developr, data scientist, computational linguist, teacher of computer science and above all a huge fan of LaTeX. I use LaTeX for everything, including things you never wanted to do with LaTeX. My latest love is lilypond, aka LaTeX for music. I'll post at irregular intervals about cool stuff, stupid hacks and annoying settings I want to remember for the future.