extract links from craigslist rdf feeds with python
note the use of urlfetch — i’m using this in an app engine application.
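    from google.appengine.api import urlfetch
    from BeautifulSoup import BeautifulSoup

    class RDF(dict):
        def __init__(self, url):
            try:
                # the response body lives in .content (not .contents)
                self['contents'] = BeautifulSoup(urlfetch.fetch(url).content)
            except Exception:
                # fall back to an empty document so links() still works
                self['contents'] = BeautifulSoup('')

        def links(self):
            # each <item> carries its url in the rdf:about attribute
            return [item['rdf:about'] for item in self['contents'].findAll('item')]

    if __name__ == '__main__':
        from __main__ import RDF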
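for trying the same idea outside app engine, here is a minimal sketch that uses only the python standard library instead of urlfetch and BeautifulSoup. it assumes the feed is RSS 1.0 (RDF), where each item carries its link in an rdf:about attribute. the names extract_links and SAMPLE and the example.org urls are made up for illustration and are not part of the class above.

```python
import xml.etree.ElementTree as ET

# standard namespaces used by RSS 1.0 (RDF) feeds
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS_NS = "http://purl.org/rss/1.0/"


def extract_links(feed_xml):
    """Return the rdf:about value of every <item> in an RSS 1.0 feed string."""
    try:
        root = ET.fromstring(feed_xml)
    except ET.ParseError:
        # mirror the original behaviour: a bad feed yields no links
        return []
    about = "{%s}about" % RDF_NS
    item_tag = "{%s}item" % RSS_NS
    # skip items that have no rdf:about attribute at all
    return [item.get(about) for item in root.iter(item_tag) if item.get(about)]


# hypothetical sample feed, only for illustration
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <item rdf:about="http://example.org/post/1.html"><title>first</title></item>
  <item rdf:about="http://example.org/post/2.html"><title>second</title></item>
</rdf:RDF>"""

if __name__ == "__main__":
    for link in extract_links(SAMPLE):
        print(link)
```

to run it against a live feed you would fetch the xml yourself (with urlfetch on app engine, or any http client elsewhere) and pass the text to extract_links.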
problems with google app engine
if you’re just getting started with app engine, you’ll encounter limitations, some more debilitating than others. i’ll keep a list of those i encounter.
- currently, you can’t import urllib2. you’re limited to using urlfetch. urlfetch is decent, but you’d have to rewrite or tweak the parts of existing libraries that rely on the ubiquitous urllib2. for example, say goodbye to using the Universal Feed Parser without significant hacking. in the interim, i’m able to use