Since the announcement of the Schema.org standards for microdata back in the summer of 2011, I have wanted to incorporate the new conventions into WordPress. At first I considered writing a plugin that would add information dynamically, but this didn’t seem to be a very efficient route. Instead, I have decided to extend the default Twenty Eleven theme that is already gorgeous and well-defined, and create a child theme that builds the microdata standard directly into the template.

Adding microdata to your site has several benefits. First and foremost, you contribute to machine-readable data everywhere. The Internet is a wonderful place for humans to browse, but we can make it more accessible and more consumable if we let computers figure out as much of it as they can. Second, search engines can use this data to get a better understanding of each page that they index, and hopefully provide more relevant search results. (Notice that I am not saying you are going to get an SEO boost for doing this. You may, you may not, I have no idea. But if everyone included this data on their sites, the results would be better.) There are really no downsides to simply plugging the data in.

If you are using Twenty Eleven as your theme and would like to add Schema.org microdata to your site without any effort on your part, give this child theme a try. You can download it for now from my Twenty Eleven Schema.org Child Theme page. Eventually I hope to add it to the WordPress Theme repository, but it needs some testing before it’s ready to head over there.

Hope it helps!

Presenting a little tool to the world that others may find handy: my LinkedIn Birthday Reminders web app. It hooks into the LinkedIn API, grabs a list of your contacts, and generates an iCal file that you can import into your calendaring program and receive reminders throughout the year. (Can be imported into Outlook, Google Calendar, OS X’s iCal, etc.)

Screenshot of LinkedIn Birthday Reminders

The motivation behind creating this? First, LinkedIn gives you no easy way of exporting the data yourself. Second, I needed an excuse to learn Node.js. A few hours and an entire RFC later, I had a nice working prototype.

How does it work? It begins with the official LinkedIn API and its ability to do an OAuth sign-in from any site. When you click the sign-in button (and don’t worry, I never gain access to your credentials), LinkedIn authorizes the request, and then some JavaScript extracts a list of your contacts’ names and birthdays. This list is sent via AJAX to a script on my Node server, which cycles through everyone who has a usable birth date and generates an .ics file to download. The link to the file is passed back to the browser and presented as a download button, and after the download is complete, the file is scrubbed from the server. Fairly simple stuff, and when I get around to it, I’ll put the source on GitHub. If you spot any bugs, be sure to let me know!
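For the curious, the .ics generation step can be sketched in a few lines of shell. This is not the app's Node code, just a rough illustration of the iCalendar format: the "Name|YYYY-MM-DD" input format, the function names, and the example.invalid UID domain are all assumptions made for the example.

```bash
#!/bin/bash
# Sketch only: write a minimal iCalendar file with one yearly, all-day event
# per contact. Input lines look like "Name|YYYY-MM-DD" (an assumed format).

# Print one content line terminated by CRLF, as the iCalendar format expects.
ics_line() {
	printf '%s\r\n' "$1"
}

# Read contacts on stdin, write the calendar to stdout.
birthdays_to_ics() {
	stamp=$(date -u +%Y%m%dT%H%M%SZ)
	count=0
	ics_line "BEGIN:VCALENDAR"
	ics_line "VERSION:2.0"
	ics_line "PRODID:-//example//birthday-sketch//EN"   # placeholder product id
	while IFS='|' read -r name bday || [ -n "$name" ]; do
		# Skip anyone without a usable birth date
		case $bday in
			[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
			*) continue ;;
		esac
		count=$((count + 1))
		ymd=$(printf '%s' "$bday" | tr -d '-')
		ics_line "BEGIN:VEVENT"
		ics_line "UID:$ymd-$count@example.invalid"
		ics_line "DTSTAMP:$stamp"
		ics_line "DTSTART;VALUE=DATE:$ymd"
		ics_line "RRULE:FREQ=YEARLY"
		# Names containing commas or semicolons would need escaping here
		ics_line "SUMMARY:${name}'s Birthday"
		ics_line "END:VEVENT"
	done
	ics_line "END:VCALENDAR"
}
```

Something like `birthdays_to_ics < contacts.txt > birthdays.ics` would then produce a file that calendar programs can import, with each birthday repeating every year.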

I invite you to try it out and grab the downloadable .ics for your contacts, and then make everyone’s birthday a little bit brighter by sending them some special day wishes!

Resources used:

Small Aside

Want a reason why I think the OS X operating system is fantastic? Check out the icon for this .ics file that I downloaded using my app:

iCal Parsed icon on OS X showing the date and title of the first event

The fact that the date shown is Oct 7 and the text says “Christian’s Birthday” is no coincidence—that is the first event in the .ics file! Now how cool is that? :)

Google went and took away my “Reader” link in the nav bar again today. Well, what happens when Google does that? I make an extension to get it back! I’ve updated my amazingly complex (not) “Back to the Reader” extension, available in the Chrome Web Store, to account for the new navigation bar and changes.

It works again, and will stay activated in my browser until Google decides to put it back. For more information, see my page about it.

A sad thing happened to my mother-in-law the other day. Somehow, one of her 1 TB hard drives that stores mountains of images (she is a professional photographer) got reformatted to a nice, fresh state, which was certainly not what she wanted. She has a client who needs the images in just a couple of days, and so the race was on to see what I could do (remotely, using LogMeIn no less) to recover the data.

I turned to PhotoRec, an absolutely fantastic command line program that does a hard core scavenge of any disk, looking for known file headers, and tries to reconstruct each file based on its findings. Since the drive had simply been reformatted and not zeroed, with only one file placed on it since the formatting, I felt that the chances of finding the files were pretty good. Using another external hard drive as storage for recovered files, 15 hours later it had recovered 51,407 .JPGs, .NEFs, and .PSDs, woohoo!

Now one unfortunate byproduct of PhotoRec recovery is that, while much of each file’s metadata is still in place (modified/accessed timestamps, EXIF data, etc.), it doesn’t recover filenames or folder structure. So I had over 51,000 images with random filenames in 130 different “recup” folders, mostly in chunks of the same creation date, but not entirely.

To solve this, I wrote a small bash script to sort files into folders by modified date. I’m not super skilled at bash scripting, so it took me longer than it probably should have and is probably not very elegant, but it worked quite well in my case! I pointed the script at the recovered directories, ran it, and 3 minutes later it had created over 1300 folders based on the modified dates of the files, moved the files into the folders, and bam! I was pretty impressed with myself. :)

If anyone is looking for a similar script (I couldn’t find one after an initial Google search last night) then feel free to take this one and modify it to suit your needs. Obviously I can’t guarantee anything with it, but it shouldn’t eat your files if you set up the directories correctly. If you are nervous, change the “mv” command on line 27 to “cp” and you should be safer (although the script will of necessity run much slower). Note that the stat call uses the BSD syntax found on OS X, so it will need adjusting on Linux. There are certainly some things that could be improved about the script (not hardcoding the total in the “Sorted ## of ##” progress message, not cd’ing into the source directory, etc.) but it worked in a pinch and didn’t need to be elegant. Hope it helps!

#!/bin/bash
# Sort files into folders based on their modification date.
#
 
# Indicate a path to the source folder of data to be sorted, and a destination folder where you want the sorted folders to be.
# Note: this will work better if the sorted directory is NOT inside the source directory, otherwise recursion might occur and the world might blow up.
SOURCE_DIR=~/Desktop/testsrc/
SORTED_DIR=~/Desktop/sorted/
sorted_count=0
 
# go into the source folder (and stop if that fails, so nothing is moved from the wrong place)
cd "$SOURCE_DIR" || exit 1

# for each file in this folder and its subfolders
find . -type f -print | while IFS= read -r file; do

	# Get the modification date of the item (BSD/OS X stat syntax) and store it in a variable
	file_date=$(stat -f '%Sm' -t "%F" "$file")

	# Does a folder with this date already exist?
	if [ ! -d "$SORTED_DIR/$file_date" ]; then
		# If no, create the folder
		mkdir -p "$SORTED_DIR/$file_date"
	fi

	# Now move the file to the folder
	mv "$file" "$SORTED_DIR/$file_date/"

	# Provide some feedback to the user on our progress
	sorted_count=$((sorted_count + 1))
	if (( sorted_count % 100 == 0 )); then
		echo "Sorted $sorted_count of 51,407" # Obviously the "of" count is hard coded
	fi
 
done
 
echo "Finished."
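
Two of the rough edges mentioned above (the hardcoded total and the OS X-only stat call) could be smoothed over along these lines. This is only a sketch and not part of the script I actually ran: the function names file_mdate and plan_sort are made up for the example, and the dry run only prints what would be moved.

```bash
#!/bin/bash
# Sketch: helpers for a more cautious run of a date-sorting script.

# Print a file's modification date as YYYY-MM-DD on GNU or BSD systems.
file_mdate() {
	if stat --version >/dev/null 2>&1; then
		date -r "$1" +%F              # GNU coreutils (most Linux systems)
	else
		stat -f '%Sm' -t '%F' "$1"    # BSD stat (OS X)
	fi
}

# Dry run: print "source -> destination folder" for every file and move
# nothing. The body runs in a subshell so its variables stay private.
plan_sort() (
	src=$1
	dest=${2%/}
	# Count the files up front instead of hardcoding the total
	total=$(find "$src" -type f | wc -l | tr -d ' ')
	count=0
	find "$src" -type f | while IFS= read -r file; do
		count=$((count + 1))
		echo "[$count/$total] $file -> $dest/$(file_mdate "$file")/"
	done
)
```

Running `plan_sort ~/Desktop/testsrc ~/Desktop/sorted` first gives a chance to eyeball the planned folders before letting the real script loose on the files.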

I have recently been moving several clients off of a basic email server to a Google Apps account (Free or Business, depending on the client), and several have had up to a couple of years’ worth of email. I’ve tried several different techniques, but recently found some great command line kung fu that made the process extremely easy and much more accurate than past attempts.

There are several different methods available for migrating email from various servers to Google Apps, including POP transfer from within each account’s settings, dragging and dropping via IMAP with a mail client like Thunderbird, or using the official email migration API from Google. However, each of these methods has its downsides: POP is slow and kludgy, dragging and dropping only works for low volume accounts, and the email migration API requires writing code and having a for-pay Apps account. What’s a geek to do? Turn to free and open source solutions!

A little bit of searching turns up that a command line tool known as imapsync does a fantastic job of, well, syncing IMAP accounts. I was tipped off by a blog post on Marius Ducea’s blog (lots of good stuff on that site by the way), which contains a basic introduction to imapsync and how to use it. The author of imapsync, Gilles Lamiral, continues to develop the software and is taking donations for different features, and also charges for the source and for support (visit Gilles’ site here). However, thanks to the licensing of imapsync (NSWF), it is also available for free from other hosted repositories. I was using CentOS, and got my copy from the following:

https://fedorahosted.org/imapsync/

After getting a copy of the Perl script, I ran perl -c imapsync to check dependencies as Ducea instructs, and found I was missing the Mail::IMAPClient module. That was a quick fix with

sudo cpan Mail::IMAPClient

Voila! The script ran just fine at that point. Ducea provides some good templates for running a sync between two servers, and it worked almost 100% right off the bat. However, I was getting issues with sent messages and deleted messages ending up in Google Apps with labels such as “[IMAP]/Sent Messages,” which obviously was not what I wanted. (Note: most such errors can be avoided by appending --dry --justfolders to the command and checking the output to see what imapsync discovers as folders/labels that already exist, and those which it plans on creating.)

A little more searching turned up Mark’s Pages of Stuff’s post about moving to Google Apps, and with a little modification there as well, I was able to get messages moved properly from the IMAP server’s sent folder into Google Apps’s sent folder, trash, etc. In his post, Mark uses folder labels like “[Google Mail] Bin,” but he notes that Google changes the names of labels based on locale. He just happens to be in England. However, running imapsync with --dry --justfolders will show you all the folders that exist on both the origin and target servers, so make sure you inspect these values when planning your own migration. For me, based in the US, the folders used the [Gmail] prefix and Trash instead of Bin.

Finally, after several test runs, I got the command line down to exactly what I needed, and I was off to the races. I set up a small bash script that would iterate over the several mailboxes I needed to migrate, pressed enter, and got about 30 messages in when the process died saying “Out of memory.” Out of memory, really? I was confused for a bit until I checked the message that it was dying on each time, and found that it had a rather massive attachment on it. The origin server, a small, shared hosting situation, must have capped the memory below the size of the attachment.

Luckily, imapsync has a --maxsize option, which makes it skip messages larger than a given number of bytes. A little bit of experimenting on the server showed that I could only work with messages up to about 2 MB in size, which was rather small, but doable in this situation. I added the --maxsize switch to the command, and ran through all of the migrations again.
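
Since --maxsize is given in bytes, a tiny helper can save some mental arithmetic. This is just a sketch (the name mb_to_bytes is made up), and it treats one MB as 1024 × 1024 bytes:

```bash
# Convert a whole number of megabytes to bytes for use with --maxsize.
mb_to_bytes() {
	echo $(( $1 * 1024 * 1024 ))
}

mb_to_bytes 2    # prints 2097152
```

The value in the final command further down, 2194304, sits slightly above that figure.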

The final command that I used to migrate each account was as follows:

imapsync --syncinternaldates --host1 <ORIGIN MAIL SERVER> --port1 993 --ssl1 --user1 <ORIGIN USERNAME> --password1 <ORIGIN USER PASSWORD> \
--host2 imap.gmail.com --port2 993 --ssl2 --user2 <TARGET EMAIL ADDRESS> --password2 <TARGET USER PASSWORD> \
--useheader 'Message-Id' --skipsize --noauthmd5 --reconnectretry1 1 --reconnectretry2 1 --maxsize 2194304 \
   --folder "INBOX.Sent Messages" --prefix2 '[Gmail]/' --regextrans2 's/Sent Messages$/Sent Mail/g' \
   --folder "INBOX.Deleted Messages" --prefix2 '[Gmail]/' --regextrans2 's/Deleted Messages$/Trash/' \
   --folder "INBOX.Drafts" --prefix2 '[Gmail]/' --regextrans2 's/INBOX\.Drafts$/Drafts/'

(The three folder lines were unique to the IMAP configuration on the origin server, and will likely change for you. Also don’t forget --dry --justfolders while testing!)
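
The small bash script that iterated over the mailboxes could look something like the sketch below. It is a reconstruction for illustration, not the script I ran: the account file format (four whitespace-separated fields per line), the function names, and the mail.example.com host are all assumptions, and it only prints the commands rather than running them. The folder-specific options from the command above are left out and can be passed as extra arguments.

```bash
#!/bin/bash
# Sketch: build one imapsync command per mailbox listed in a file.
# Assumed file format, one account per line:
#   origin_user origin_password target_address target_password

ORIGIN_HOST="mail.example.com"   # placeholder for <ORIGIN MAIL SERVER>

# Print the imapsync command for one account. Any extra arguments (for
# example --dry --justfolders, or --maxsize 2194304) are appended as given.
imapsync_cmd() {
	u1=$1 p1=$2 u2=$3 p2=$4
	shift 4
	echo imapsync --syncinternaldates \
		--host1 "$ORIGIN_HOST" --port1 993 --ssl1 --user1 "$u1" --password1 "$p1" \
		--host2 imap.gmail.com --port2 993 --ssl2 --user2 "$u2" --password2 "$p2" \
		--useheader Message-Id --skipsize "$@"
}

# Print (not run) the command for every account in the list file.
plan_migration() {
	list=$1
	shift
	while read -r a b c d || [ -n "$a" ]; do
		# Skip blank lines and comments
		case $a in ''|'#'*) continue ;; esac
		imapsync_cmd "$a" "$b" "$c" "$d" "$@"
	done < "$list"
}
```

Calling `plan_migration accounts.txt --dry --justfolders` prints one test command per account to review, and removing the echo in imapsync_cmd would turn the plan into the real thing.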

After running the migrations on the underpowered origin server, I switched over to a VPS and ran the commands again, but this time without the --maxsize switch. No memory issues were encountered on the robust VPS, and I was also able to catch the last few newer emails that had arrived since the first migration.

Finally, after each account was migrated, I placed a forward on the email account on the origin server to redirect all new mail to <account>@<domain>.test-google-a.com, which acts as a secondary address for all Google Apps accounts before you switch over MX records. This allowed me to do a cutover from the old server to the new server without losing any new messages, and it worked flawlessly.

All in all, the process took about 5 hours to transfer all of the data, and about an hour of research and playing around. Much better than dragging and dropping, or writing an entire mini-application to utilize the email migration API. Thanks to some great resources like imapsync and several helpful blogs, this was a much more pain-free experience than in the past!