I thought I’d mention the sort of database-related work I’ve done on contract, in case any of you might be needing some help, or just so you can be awed by my mad skills. I haven’t done a lot of contract work, but there are a lot of blogs and CMS’s out in the world, and sometimes things go wrong.
I just did some work for my friend Kimli, partially because it seemed like a good challenge, but mostly because she was trying to manually type in 2449 blog entries (omfg! 2449!) from a Windows-based blogging software she’d used from 2001 to 2006 into WordPress. This is the sort of thing that makes me weep: why do anything tedious by hand that you can automate? I demanded that she stop and send me her backup file of blog posts, which turned out to be a “CSV” (comma-separated values) file with entries similar to this:
219, “It was a dark and stormy night. The captain of the ship yelled, “”Tell me a story!”" and the story goes like this:
It was a dark and stormy night. The captain of the ship yelled…”, “A silly recursive joke”, 2001-11-01 10:54:33
This is why I double-quoted CSV: this ain’t no proper CSV, because an entry can span more than one line. Not to mention, all the “” within the post text are not going to make MySQL happy when it reads it, since double-quotes are supposed to determine the start and end of the field.
There was also a Windows-to-not-Windows text issue, where some posts had content like so:
On tonightís menu: chicken strips with a sweet yet zesty red chili sauce, sautÈed garlic mushroom caps, roasted potatoes.
The “í” here is supposed to be a ‘ (single-quote), but instead it’s some weird-looking i. And that “È” should be an “é”, and that’s not just a case issue since the accent’s pointing the wrong way. This problem was everywhere, because Kimli seems to like using French words a lot. Damn her and her hoity-toity vocabulary. Unfortunately, the dos2unix utility did not help me here, and I’ve dealt with this problem before.
So, by the wonders of regular expressions (Everybody stand back!) I fixed all of this: all the in-the-middle-of-the-post line breaks became <br />, all the “” became \”, and all the improper characters were mapped back to what they were supposed to be (which required the most time at all, since I had to google what a “Caga TiÛ” might possibly be if it were spelled correctly). There were some other issues, too, but these were most of them.
So I ended up with a file that looked something like this:
219, “It was a dark and stormy night. The captain of the ship yelled, \”Tell me a story!\” and the story goes like this: <br />It was a dark and stormy night. The captain of the ship yelled…”, “A silly recursive joke”, 2001-11-01 10:54:33
Now I had to map these values to blog entries in the WordPress database schema. Time passes. Once I got that figured out, the command I ran to import them into the database was this:
load data infile "fixedbackupfile.csv" into table wp_posts
fields terminated by ","
optionally enclosed by "\""
escaped by "\\"
lines terminated by '\n'
ignore 1 lines
(@var1, post_content, post_title, post_date)
set post_excerpt = left(100, post_content), post_author = 1;
What this all means is irrelevant; doesn’t it look complicated? There was some other stuff to be done to make WordPress work right with the new data, and add categories based on the post year, but it wasn’t much. And what you have is delicious juice dot archives, as Kimli wanted.
Now, none of this is brilliant (though if you want to be impressed, feel free), but it’s the sort of work that requires DBA or coder skills plus a good knowledge of MySQL. And I set up a testing environment to do all this in before pushing it live (woo, look at me, my own home testing environment). And I’ve certainly had more difficult tasks in my full-time work, but they’re even more boring to read about, believe me.
So this is one of the things DBAs work on. It’s often referred to as magical (Kimli agrees). A friend jokes that I need a “DBA hat”, which he insists is like a wizard hat, but Harry Potter doesn’t wear his anymore (I guess because it doesn’t look cool) so I don’t see why I have to.
I am indeed impressed, and very grateful! As if I needed MORE of a girl crush on you ..
Testing before deployment? Wimp
Regular expressions and data manipulation for the win!
I salute you, Madam.
Partially because my employer is heavily encouraging me to learn a *little* php/mysql because our flagship product (and all future products) require it, and theoretically I run the support department, so I should know how to use them, right?
*sigh*
Well, I managed to get it to make an archive-by-month section. blargh. And for my next trick, watch me pull a rabbit out of my hat! With MySQL!
You’re pretty sexay, all right. And XKCD FTW!
Ever consider contracting in Seattle? I don’t know how hard that is, but my friend is a DBA from Montreal who moved here to contract at Microsoft and other Seattle companies, and is now making $90k as a full-timer. She’s only been at it for, like, three years. She didn’t even study comp sci – her degree was in hotel management. It doesn’t seem hard to find work here.
In any case, good luck!
Apparently we all like it when you talk dirty like this.
Your regular expression are anything but regular.
They’re real and they’re spectacular.
So that’s what DBAs do. I always wondered. I mean, it’s not that hard to type “SELECT * FROM foo”
@Garth: DBAs are the people who make sure that developers can type “SELECT * FROM foo”. It’s a thankless job.
You ma’am, are (as they say in these parts) ‘teh awesome’.
Regexps ftw!
Wow, someone who not only knows about, but also knows how to use, LOAD DATA INFILE. Eight years of MySQL wrangling and I’ve never had the stomach. Impressive.
With MySQL skills and regex skills, you’re bound to find a job soon enough. Hmm… I wonder if we’re still hiring. Do you have a resume?
@Dan So far nobody I know has been able to convince their bosses that they need a DBA, or even a DBA/coder. And yes I have a resume, it’s sort of what got me this far.
[...] With big help from Gillian (who’s a bright, shining star, and a database administrator looking for work) and an assist from Ask Metafilter, I got them done. I’ll explain a little more about how [...]