Jun 24 2013

As a freelance tech contractor, I get a lot of emails every day from recruiters about prospective jobs. Many of these are unsolicited, but this is a good thing, as over time, quite a large list of contacts can be built up. I don’t have the time to reply to each one – particularly those that don’t suit my skillset, or if I’m happily indentured – but I do file these emails into a separate Gmail Mailbox which I’ve labelled “Employment”.

I’m not going to add each recruiter to my Contacts as their email arrives; that would be time-consuming to start and difficult to maintain. But when it comes time to make a great big recruiter contact list, I’ve written the Python script below to scrape the entire mailbox and output each unique email address.


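# Written for Python 3
import imaplib
import sys
import email
import re

IMAP_HOST = 'imap.gmail.com'  # Change this according to your provider
LOGIN = 'you@example.com'     # Placeholder: your email login
PASSWORD = 'your-password'    # Placeholder: your password
MAILBOX = 'Employment'        # The mailbox (Gmail label) to scrape
# LOGIN, PASSWORD, MAILBOX = sys.argv[1:4]  # Uncomment to pass these as arguments

email_list = []  # Unique sender addresses seen so far

mail = imaplib.IMAP4_SSL(IMAP_HOST)
mail.login(LOGIN, PASSWORD)
mail.select(MAILBOX, readonly=True)  # A mailbox must be selected before searching

result, data = mail.search(None, 'ALL')
ids = data[0]
id_list = ids.split()
for i in id_list:
    typ, data = mail.fetch(i, '(RFC822)')
    for response_part in data:
        if isinstance(response_part, tuple):
            msg = email.message_from_bytes(response_part[1])
            tokens = str(msg['from'] or '').split()
            if not tokens:
                continue  # No usable From header
            # The address is the last token, e.g. "<jane@agency.example>"
            address = re.sub(r'[<>]', '', tokens[-1])
            # Ignore any occurrences of own email address and add to list
            if not re.search(re.escape(LOGIN), address) and address not in email_list:
                email_list.append(address)
                print(address)

mail.logout()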
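
The script takes the last whitespace-separated token of the From header as the address. As a possible alternative, here is a minimal sketch that uses the standard library's `email.utils.parseaddr` and a set for de-duplication. The function name `unique_senders` and the sample headers are made up for illustration, and unlike the script it compares against your own address exactly (case-insensitively) rather than by substring.

```python
from email.utils import parseaddr


def unique_senders(from_headers, own_address):
    """Return unique sender addresses in first-seen order, skipping our own."""
    seen = set()
    unique = []
    for header in from_headers:
        # parseaddr returns (display_name, address); address is '' if unparseable
        _, address = parseaddr(header)
        address = address.lower()
        if not address or address == own_address.lower() or address in seen:
            continue
        seen.add(address)
        unique.append(address)
    return unique


# Hypothetical From headers, standing in for msg['from'] values
headers = [
    'Jane Recruiter <jane@agency.example>',
    'jane@agency.example',
    '"Smith, Bob" <BOB@jobs.example>',
    'Me <me@example.com>',
]
print(unique_senders(headers, 'me@example.com'))
```

Lower-casing before comparison means the same recruiter is not listed twice just because one mail client capitalised the address differently.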

I’ve hard-coded my email login, password and the mailbox name, although it’s easy enough to modify the script to accept them as arguments (I’ve commented out a line demonstrating this).
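
For anyone who would rather not hard-code these values, one way to take them as arguments is sketched below with the standard `argparse` module. The option names (`--mailbox`, `--host`) and the `parse_args` helper are assumptions for illustration, not part of the script above.

```python
import argparse


def parse_args(argv=None):
    """Parse the login, mailbox and host from the command line."""
    parser = argparse.ArgumentParser(
        description='List unique sender addresses in an IMAP mailbox.')
    parser.add_argument('login', help='email login, e.g. you@example.com')
    parser.add_argument('--mailbox', default='Employment',
                        help='mailbox (Gmail label) to scrape')
    parser.add_argument('--host', default='imap.gmail.com',
                        help='IMAP server hostname')
    # argv=None makes argparse read sys.argv[1:]
    return parser.parse_args(argv)


# Example with an explicit argument list instead of the real command line
args = parse_args(['you@example.com', '--mailbox', 'Employment'])
print(args.login, args.mailbox, args.host)
```

The password is deliberately left out of the arguments so that it does not end up in the shell history; `getpass.getpass()` from the standard library can prompt for it at run time instead.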

In a later post, I’m going to discuss how I use this script for job-seeking.

Matt Parsons is a freelance Linux specialist who has designed, built and supported Unix and Linux systems in the finance, telecommunications and media industries.

He lives and works in London.