Tuesday, December 14

low-competence data collection

So three or four semesters ago, my good friend Levi (at that time the guy across the hall from me) asked if I could do a simple web-based survey for him. He'd foolishly believed me when I said I knew a little Perl, and he wanted to exploit some undergraduates for behavioral data. I said sure and didn't think about it much until the night before he actually wanted to run the thing.

What I then slopped together from equal parts of CGI.pm and fervent profanity was probably the most appalling mechanism I have ever committed to the use of actual human beings, but it worked. The Perfect, I told myself, is the enemy of the Good - and in three hours, after consuming that much yerba mate and using a keyboard the approximate size of the little flat box that Chiclets come in, the Good is the enemy of the Barely Functional.

    $id_num = int(rand(100000));

Anyway, since then, we've used basically the same framework to grab stuff from three or four hundred people. It has continued to work, just barely. So when Levi wanted to run that very first survey again with a larger group, I figured no problem. And it wasn't, except for the part where I dumped half of the data into overlapping files by forgetting to check whether the random numbers I was assigning had already been used.

    # draw (and pad) at least once, then redraw while the ID is taken.
    do {
        $id_num = int(rand(100000));
        # make sure it's six digits by padding w/ 1's.
        $id_num = "1" x ( 6 - length( $id_num ) ) . $id_num;
    } while ( -e "data/$id_num" );
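
For what it's worth, checking with -e and then writing the file later still leaves a small window where two respondents could draw the same number at the same moment. A minimal sketch of one way to close that window, assuming the data files live in a single directory: create the file with O_EXCL, so the existence check and the claim happen in one step. The helper name claim_id, the retry limit of 1000, and the choice to draw directly from 100000..999999 instead of padding with 1's are all assumptions made for illustration, not what the survey script actually did.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

# Hypothetical helper: claim an unused six-digit ID by creating its file in $dir.
# Returns the ID and an open write handle to the new, empty file.
sub claim_id {
    my ($dir) = @_;
    for my $attempt ( 1 .. 1000 ) {
        # Drawing from 100000..999999 gives six digits without any padding.
        my $id_num = 100000 + int( rand(900000) );
        # O_EXCL makes sysopen fail if the file already exists,
        # so nobody else can grab the same ID between check and create.
        if ( sysopen( my $fh, "$dir/$id_num", O_WRONLY | O_CREAT | O_EXCL ) ) {
            return ( $id_num, $fh );
        }
        # A collision is fine (just redraw); anything else is a real error.
        die "can't create $dir/$id_num: $!" unless $!{EEXIST};
    }
    die "no free ID found in $dir after 1000 tries";
}
```

Calling it would look something like my ($id_num, $fh) = claim_id("data"); after which the survey answers get printed to $fh and the handle closed. The data directory has to exist already.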

I'm not a hacker, but sometimes I play one in conversation.