Archive for the ‘Programming’ Category

Ruby Sucks

Thursday, March 19th, 2009

Ruby cares nothing about backwards compatibility.
This code on lenny:
irb(main):012:0> tmpFile = Tempfile.new('_archive_')
=> #
irb(main):013:0> tmpFile.close(true)
=> nil
irb(main):014:0> tmpFile.path
=> nil

Used to do this on sarge:
irb(main):012:0> tmpFile = Tempfile.new('_archive_')
=> #
irb(main):013:0> tmpFile.close(true)
=> nil
irb(main):014:0> tmpFile.path
=> "/tmp/_archive_20090319-12089-1dphb46-0"


And sadly there are many other such examples.

Java File Descriptor Leak

Thursday, May 31st, 2007

I’ve encountered issues similar to this bug:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6246565
and have found some other bug entries in the past that seem to cover the same problem.

If I’m running a lot of Runtime.getRuntime().exec() or Axis SOAP calls, Java opens several pipes for each exec or SOAP call, and these can take a very long time to be garbage collected, especially on a heavily loaded system. This has led to code hitting the default limit of 1024 open files when only maybe 20 or 30 files should be open at a time. The solution thus far has been to be more careful about opening sockets and pipes in Java, but I have yet to find a better one. I’m still running Java 5 (JDK 1.5), and many of the bug entries seem to recommend upgrading to Java 6.

Pessimistic Programming

Sunday, June 4th, 2006

I am now totally sold that the only way input validation will ever be secure is by explicitly listing safe characters, not by listing unsafe characters. I was on the fence on this issue. I thought that as long as you used well-published open source functions to check for unsafe characters you were pretty secure, but then I saw this bug for mysql_real_escape_string(). This took a year to be fixed as well.

Worst of all, I’ve tested applications with MySQL backends using many security tools that look for SQL injection, including tools that cost thousands of dollars per run, and none of them found this bug. I have read in many places that Unicode has been a big recurring headache for software security. So I would guess that is the second place to look after the obvious SQL injection attacks.
This is why I now think everyone should program by explicitly listing the safe characters and input lengths, even if this hurts the future flexibility of the program. The solution is obvious for things like names, zip codes, etc. I know the solution is clearly not obvious for multilingual sites. And binary files are still tricky to validate. The best I can think of for this is to use Base64 encoding.
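
For the easy cases, explicit allowlisting might look something like the Perl sketch below. The sub names is_valid_zip and is_valid_name, the US-style zip format, and the 64-character cap on names are all assumptions made for illustration; they are not taken from any particular library.

```perl
use strict;
use warnings;

# Hypothetical validators: accept only characters that are explicitly
# listed as safe, and bound the length as part of the same pattern.

# Five ASCII digits, optionally followed by a hyphen and four more digits.
sub is_valid_zip {
    my ($input) = @_;
    return 0 unless defined $input;
    # \A and \z anchor to the real start and end of the string, so a
    # trailing newline is rejected as well.
    return $input =~ /\A[0-9]{5}(?:-[0-9]{4})?\z/ ? 1 : 0;
}

# One ASCII letter, then up to 63 letters, spaces, apostrophes or hyphens.
sub is_valid_name {
    my ($input) = @_;
    return 0 unless defined $input;
    return $input =~ /\A[A-Za-z][A-Za-z '\-]{0,63}\z/ ? 1 : 0;
}
```

Anything that falls outside the listed characters is rejected outright instead of being escaped, which is the whole point: nothing depends on an escaping function knowing about every dangerous byte sequence.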

Zero day exploits

Saturday, May 27th, 2006

Whoever you are, I have always depended on the kindness of strangers.  — Blanche DuBois, A Streetcar Named Desire

As far as I can see there is no solution to the zero day exploit problem. Someone can always see what software you are running and wait until the moment a vulnerability is discovered and immediately attack you before you can patch. It is true however that most people will not be targeted in this way since there are many more benevolent people than malicious people (see above quote). In general websites are not targeted until vulnerable sites can be found with a web search engine, making it easy for a few people to find a large number of vulnerable sites.

Don’t know where I’m going with this post. Just that we are doomed to not be able to instantly patch.

Microsoft Announces Meta OS

Thursday, November 3rd, 2005

So this post was inspired by a Slashdot article, though I wasn’t able to view Microsoft’s site on Singularity because it was already slashdotted.

Anyway, the idea appears to be making an OS where the only code you can run is managed code, which is Microsoft-speak for interpreted code as opposed to native machine code. The only thing I think this gives you is buffer overflow prevention. Garbage collection is a bit of an unachievable goal, because you can still write memory leaks in code with garbage collectors just by writing bad code that keeps a pointer to an object longer than you need to. And you can still write a program that asks to allocate all memory, like while(true) { $a[$i++] = "some string"; }

And I’m sure there are still interpreted viruses… I think they are called Visual Basic macros, or Office Add-ins.

Overall though I like interpreted languages more than native binary compiled languages, despite the occasional speed/memory issue. So I’m all for it.

Falling Behind

Tuesday, November 1st, 2005

Damn, Ravi outnumbers me in posts 2 to 1… and I’m only level 36 in WoW. I’m not sure exactly what we might be competing for, but I think Ravi is winning.

I’ve become sold on the whole Pragmatic Programmer site, philosophy, whatever. Automated build and testing systems, for which the proper term is Continuous Integration, seem especially cool. Check out the DamageControl CI comparison site. I wanted to try out Tinderbox 2.0, which is what Mozilla is using (not Tinderbox 3.0), but I can’t find anywhere to download it. I only see how to view the interface to it on Mozilla’s site. Right now I’m liking DamageControl and BuildBot. I don’t really want to run a Java system, since I don’t use Java much except when I have to for school, and I also don’t have a Windows server to run CruiseControl.NET.

In other programming news in my life, WiX caused me a bit of pain because I thought it was more mature than it appears to be. Inno Setup kicks ass though for a Windows installer.

Hmm… well more later… just felt pressured to post :)

Some more reasons that Perl sucks

Monday, September 26th, 2005

So Tom did quite a bang-up job of explaining why the switch statement is so retarded in Perl. But there are other reasons that Perl sucks.

The scalar data type is an extremely leaky abstraction
As Joel Spolsky notes in the Law of Leaky Abstractions, you have to understand the theory that an abstraction is, well, abstracting, in order to use it properly because, for any abstraction, there are situations which cause the abstraction to ‘leak’ and reveal the underlying complexity. The better an abstraction is, the further it has to be pushed before it starts to leak.

Testing for equality should not be the kind of Xtr333m action that causes an abstraction to leak, but in Perl, that’s exactly what will happen.

my $foo = function_to_get_input();
if ($foo == 123) {
print "Wooha\n";
}

This seems like a pretty innocent snippet, right? Too bad it will fuck up your day – unless you spend far too much work patching the holes in the abstraction. If function_to_get_input() returned a number, everything’s fine. But if it returned a string, the comparison $foo == 123 will generate a warning, because == expects a number and $foo is a string. So clearly the abstraction is leaky: Perl only provides the scalar data type, because the user ‘doesn’t need to know’ whether the scalar is a number or a string. Unfortunately, the user DOES need to know, because the interpreter is making the user responsible for matching the data types.

Warnings aside, strings in Perl also evaluate to 0 in a numeric context – unless they happen to represent a number. This means that == will tell you that 0, “0”, and “My lord, there is talk of cake” are all the same thing. C has NaNs. Perl is written in C. Why does Perl think 0 is a more accurate description of “Mary Poppins is the antichrist, I have proof” than Not a Number?

The string comparison operator, eq, is a bit more generous in its argument coercion; it will automatically convert a number to its string representation. Unfortunately you end up with the same ‘smushing’ effect where 1 and ‘1’ are the same thing.

Most scripting languages will have functions like is_numeric() and is_string(), which return 1 if the object is of the listed type and 0 otherwise. But Perl doesn’t have anything like that; in order to get the true representation of a scalar, you’ve got to install a module, written in C, which violates the encapsulation of the scalar to read an internal variable.

The long and short of it is that writing a block like

my $foo = function_to_get_input();
if ($foo eq "Do it") {
print "Wooha\n";
} elsif ($foo == 1) {
print "Blarg\n";
} elsif ($foo eq "1") {
print "My eyes are bleeding help help I can't see\n";
}

ends up being an entirely non-trivial affair.
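
One way to patch the hole, sketched here with the looks_like_number function from the Scalar::Util module, is to only reach for == once the value is known to parse as a number. The sub name classify and its return strings are made up for illustration. Note that looks_like_number does not reveal how the scalar is stored internally, so 1 and "1" still land in the same branch.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: pick a branch for a scalar without triggering
# "isn't numeric" warnings on non-numeric strings.
sub classify {
    my ($foo) = @_;
    # String comparison is safe for any defined scalar.
    return "Wooha" if $foo eq "Do it";
    # Only compare numerically when the value actually parses as a number.
    return "Blarg" if looks_like_number($foo) && $foo == 1;
    return "no match";
}
```

With this guard, a string like "My lord, there is talk of cake" never reaches the numeric comparison, so it neither warns nor gets treated as 0.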

Input to regular expressions can fuck the interpreter
$foo =~ s/meh/bah/;
works just fine. But suppose we want to replace the right-hand side with the variable $gee. Let’s see what happens:
$foo =~ s/meh/$gee/;
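Harmless, right? On the replacement side it actually is: Perl finds the delimiters at compile time and interpolates $gee once at run time, so even a value of “/” is substituted literally. The interpreter only throws an error and exits when the variable lands on the pattern side, where something like an unbalanced “(” is a fatal run-time error. You can solve that with the \Q and \E (start quotation, end quotation) metacharacters, but quoting the replacement buys you nothing, and either way backreferences inside the variable won’t work. That is to say, if $gee is assigned the value “\$1” (which Perl stores as a literal dollar sign followed by the numeral one), then the code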

$foo = "I know kung fu";
$gee = "\$1";
$foo =~ s/know (.*)/\Q$gee\E/;
print "$foo\n";

will print the line “I \$1” instead of “I kung fu.” (\Q escapes the dollar sign, and the backreference is never expanded.)

Evaluating the right-hand side as code (adding e modifiers after the regex) can solve this problem, sort of. A single e just evaluates $gee and hands back the string “$1”, so it takes two:

$foo = "I know kung fu";
$gee = "\$1";
$foo =~ s/know (.*)/$gee/ee;
print "$foo\n";

That gets “I kung fu”. But
$foo = "I know kung fu";
$gee = "\$1";
$foo =~ s/know (.*)/think $gee is awesome/e;
print "$foo\n";

will explode, because the interpreter can evaluate “$1” as a command, but can’t evaluate “think $1 is awesome” as a command. In order to do that, we’re going to need an extra level of indirection:

$foo = "I know kung fu";
$gee = "\$1";
$foo =~ s/know (.*)/"\"think $gee is awesome\""/ee;
print "$foo\n";

Notice the two e’s at the end of that regex; they tell Perl to evaluate the right-hand side as code, then evaluate the output of that as code, then use the result as the replacement.
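
If the replacement template comes from outside the program, the double-e trick means running that input through eval. A possible alternative, sketched here under the assumption that the template only ever needs $1-style placeholders, is to expand them by hand. The helper name expand_template is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: expand $1, $2, ... in a template string using the
# captures passed in, without handing the template to eval.
sub expand_template {
    my ($template, @captures) = @_;   # copy the arguments before matching
    $template =~ s{\$(\d+)}{
        ($1 >= 1 && defined $captures[$1 - 1]) ? $captures[$1 - 1] : ''
    }ge;
    return $template;
}

my $foo = "I know kung fu";
my $gee = 'think $1 is awesome';   # single quotes keep the $1 literal

# One e is enough here: the replacement is ordinary code calling the helper.
$foo =~ s/know (.*)/expand_template($gee, $1)/e;
print "$foo\n";   # prints "I think kung fu is awesome"
```

Placeholders that refer to a capture that does not exist expand to the empty string, and nothing in $gee is ever executed as Perl.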

Perl sucks. And don’t tell me any crap about Perl 6; the Promised Land has been on the horizon for several years now, and PHP and Python and Ruby and a million other languages are way ahead of where Perl was supposed to be. I can’t believe I thought this language was awesome. What a noob.

Author’s note: the original version of this document contained some boneheaded errors. I fixed them. Perl still sucks.

Perl Switch considered painfully slow

Sunday, September 25th, 2005

The Switch Perl module is much slower than an if/elsif/else block. The benchmark code below puts it at about 40 times slower.

I used this program to generate some test data.
#!/usr/bin/perl

@options = ('a','b','c','d','e','f','g','h','i','j');

for($i=0; $i<1000000; $i++) {
# rand(@options) covers all ten indices; rand($#options) would never pick 'j'.
print $options[int(rand(@options))] . "\n";
}

Then I timed these two:
#!/usr/bin/perl

my $a = 0;
while($line = <>) {
chomp $line;
if($line eq 'a') { $a = $a+1; }
elsif($line eq 'b') { $a = $a+2}
elsif($line eq 'c') { $a = $a+3; }
elsif($line eq 'd') { $a = $a+4; }
elsif($line eq 'e') { $a = $a+5; }
elsif($line eq 'f') { $a = $a+6; }
elsif($line eq 'g') { $a = $a+7; }
elsif($line eq 'h') { $a = $a+8; }
elsif($line eq 'i') { $a = $a+9; }
elsif($line eq 'j') { $a = $a+10; }
}

print $a . "\n";

And:
#!/usr/bin/perl

use Switch;

my $a = 0;
while($line = <>) {
chomp $line;
switch($line) {
case 'a' { $a = $a+1; }
case 'b' { $a = $a+2}
case 'c' { $a = $a+3; }
case 'd' { $a = $a+4; }
case 'e' { $a = $a+5; }
case 'f' { $a = $a+6; }
case 'g' { $a = $a+7; }
case 'h' { $a = $a+8; }
case 'i' { $a = $a+9; }
case 'j' { $a = $a+10; }
}
}

print $a . "\n";
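
For comparison, a third variant that might be worth timing is a plain hash lookup. This is only a sketch and has not been benchmarked here; the names %increment and total_for_lines are made up for illustration.

```perl
use strict;
use warnings;

# Map each option to the amount the if/elsif version adds for it:
# 'a' => 1, 'b' => 2, ... 'j' => 10.
my @options = ('a' .. 'j');
my %increment;
@increment{@options} = (1 .. 10);

# Hypothetical helper: sum the increments for a list of input lines.
sub total_for_lines {
    my (@lines) = @_;
    my $total = 0;
    for my $line (@lines) {
        chomp $line;
        # Unknown lines are skipped, matching the if/elsif version,
        # which has no final else branch.
        $total += $increment{$line} if exists $increment{$line};
    }
    return $total;
}
```

To run it against the same test data as the two scripts above, the helper could be called as print total_for_lines(<>), "\n"; at the end of the file.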