XKCD-Get

September 20th, 2010 by

I love the XKCD webcomics. They are absolutely brilliant and have become part of my daily routine. I thought it would be nice to have an archive of the comics, so I fired up my good friend Bash and asked it to create one for me. This script uses wget to download all the comics. The comics are licensed under a “Creative Commons Attribution-NonCommercial 2.5 License,” which should mean a personal, non-commercial archive is allowed. Please keep the sleep value at 15 seconds or greater. We don’t want to DoS one of the greatest websites ever. That would be like giving someone a hug so hard it knocks ’em out. Don’t do it.
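If you want the 15-second floor enforced rather than left to good manners, something like the following could work. This is only a sketch: the `clamp_delay` function is a made-up helper and is not part of the script below. It takes a requested delay and never returns less than 15.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script:
# echoes the requested delay, raised to a 15-second minimum.
clamp_delay() {
    local requested="${1:-15}"
    local minimum=15
    # Anything that is not a plain non-negative integer falls back to the minimum.
    case "$requested" in
        ''|*[!0-9]*) requested=$minimum ;;
    esac
    if [ "$requested" -lt "$minimum" ]; then
        requested=$minimum
    fi
    echo "$requested"
}

clamp_delay 5    # prints 15
clamp_delay 30   # prints 30
```

In a download loop you would then call `sleep "$(clamp_delay "$delay")"` instead of a bare `sleep`, where `delay` is whatever value the user asked for.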

[START CODE]
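#!/bin/bash
# Simple XKCD comic download tool coded by kno

# Get the most recent comic number from the front page
wget -q -nd -U Mozilla -O index.html http://xkcd.com
recent=$(grep 'Permanent link to this comic:' index.html | cut -d '/' -f 4)
rm -f index.html

# Stop early if the front page could not be fetched or parsed
if [ -z "$recent" ]
	then
		echo "[-] Could not determine the latest comic number"
		exit 1
fi

# Resume after the comic recorded in last.log, if it exists
if [ -f last.log ]
	then
		start=$(( $(cat last.log) + 1 ))
	else
		start=1
fi

# If the archive is not up to date, get the missing comics
if [ "$start" -gt "$recent" ]
	then
		echo "[+] You are up to date"
	else
		# Get each comic
		for i in $(seq "$start" "$recent")
		do
			echo "[+] Getting comic $i"
			wget -q -nd -U Mozilla -O "$i" "http://xkcd.com/$i"
			# Pull the image URL out of the hotlinking line on the comic page
			img=$(grep '(for hotlinking/embedding):' "$i" | cut -d ' ' -f 5 | cut -d '<' -f 1)
			# Skip pages with no image URL (for example, a number with no comic page)
			[ -n "$img" ] && wget -q -nd -U Mozilla "$img"
			rm -f "$i"
			echo "$i" > last.log
			sleep 15
		done
fi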

[END CODE]
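The script above finds the latest comic number by splitting the permalink line on `/` and taking the fourth field. Here is that step on its own, run against a sample line. The sample is my assumption about what the page markup looks like, not a captured response, and `extract_comic_number` is just a name I made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read page text on stdin, print the comic number.
extract_comic_number() {
    # Splitting on '/' gives: "...http:" / "" / "xkcd.com" / "796" / ...
    # so the comic number is field 4.
    grep 'Permanent link to this comic:' | cut -d '/' -f 4
}

# Assumed sample of the permalink line (not real captured output).
sample='Permanent link to this comic: http://xkcd.com/796/<br />'
printf '%s\n' "$sample" | extract_comic_number   # prints 796
```

This approach is fragile by design: if the page layout or the permalink wording changes, the grep matches nothing and the result is empty, so it is worth checking for an empty value before using it.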
  1. Runa says:

    Instead of setting i to be less than 796 (which means that you will get everything up till, and including, today’s comic), you could have your script check the id of the latest comic and use that value + 1. Just a tip :)

  2. any1 says:

    can u write one for phdcomics.com?

  3. Trenton says:

    @Runa – Good idea. I just made the change.
