I am now totally sold that the only way input validation will ever be secure is by explicitly listing safe characters and not be listing unsafe characters. I was on the fence on this issue. I thought that as long as you used well published open source functions to check for unsafe characters you were pretty secure, but then I saw this bug for mysql_real_escape_string(). This took a year to be fixed as well.
Worst of all is that I’ve tested applications with MySQL backends with many security tools that look for SQL injection, including security tools that costs thousands of dollars per run, and none of these tools found this bug. I read in many places that unicode has been a big recurring headache for software security. So, I would that would be the second place to look after the obvious SQL injection attacks.
This is why I now think everyone should program explicitly listing the safe characters and input lengths, even if this hurts the future flexibiity of the program. The solution is obvious for things like names, zip codes, etc. I know the solution is clearly not obvious for multilingual sites. And binary files are still tricky to validate. The best I can think of for this is to use Base64 encoding.