Webscraper will not work

Use BeautifulSoup

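from urllib.request import urlopen  # Python 3; on Python 2 this was urllib2.urlopen

from bs4 import BeautifulSoup

url = "http://www.emergencyassistanceuk.co.uk/list-of-uk-police-stations.html"
html = urlopen(url).read()

# Name the parser explicitly so bs4 does not have to guess one.
bs = BeautifulSoup(html, "html.parser")

# Print the href of the first <a> inside each span with class "listlink-police".
for tag in bs.find_all("span", {"class": "listlink-police"}):
    print(tag.a["href"])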
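
To try the same idea without hitting the network, here is a minimal sketch that runs on an inline snippet. The markup in SAMPLE_HTML and the helper name extract_links are made up for illustration; the real page may be structured differently. It also skips spans that contain no link, which the loop above would crash on.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for illustration; the real page may differ.
SAMPLE_HTML = """
<ul>
  <li><span class="listlink-police"><a href="/stations/a.html">Station A</a></span></li>
  <li><span class="listlink-police"><a href="/stations/b.html">Station B</a></span></li>
  <li><span class="listlink-police">No link here</span></li>
  <li><span class="other"><a href="/ignored.html">Ignored</a></span></li>
</ul>
"""


def extract_links(html):
    """Return the href of the first <a> in every span.listlink-police."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for span in soup.find_all("span", class_="listlink-police"):
        anchor = span.find("a", href=True)  # None when the span has no link
        if anchor is not None:
            links.append(anchor["href"])
    return links


links = extract_links(SAMPLE_HTML)
print(len(links), links)
```

Collecting the links into a list first also makes it easy to check how many you got with len(links) before deciding something is broken.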

You are using regex to parse HTML. You shouldn't, because you end up with just this type of problem. For a start, the .* wildcard will match as much text as it can. But once you fix that, you will pluck another fruit from the Tree of Frustration. Use a proper HTML parser instead.
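
To see the greedy-wildcard problem in isolation, here is a small sketch. The pattern and the one-line HTML string are stand-ins invented for illustration, not the asker's actual code.

```python
import re

# Hypothetical one-line snippet with two links, invented for illustration.
html = '<a href="/a.html">A</a> <a href="/b.html">B</a>'

# Greedy: .* runs on to the LAST double quote in the string.
greedy = re.findall(r'href="(.*)"', html)

# Non-greedy: .*? stops at the first double quote after each href=".
lazy = re.findall(r'href="(.*?)"', html)

print(greedy)  # one long, wrong match spanning both tags
print(lazy)    # ['/a.html', '/b.html']
```

Even the non-greedy version is fragile: it misses single-quoted attributes and anything else that deviates from the exact spelling in the pattern, which is why a parser is the safer choice.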


There are over 1.6k links with that class on the page.

I think it's working correctly... what makes you think it's not working?

And you should definitely use Beautiful Soup; it's stupid simple and extremely usable.