
Thanks to Andy C, I’ve finished the WordPress import script for Banjo. That article gives instructions for loading a somewhat different blog database directly with SQL. I couldn’t use it as-is for Banjo, partly because I’m through with MySQL. I just can’t take all the UTF problems any more, so I’ve gone all PostgreSQL, all the time. My solution therefore needed to read from MySQL and then create objects directly in Django. That way it can be used with any target database.
Loading a WordPress database into Banjo/Django
What I did was to load my WordPress 2.3 database into a local MySQL server. Then I wrote a script which splits the import into three stages: 1) create the entries, 2) add categories to them, and 3) attach comments. First, I fetch all the posts, then create and save an Entry for each one, keyed by its WordPress id so the later stages can find it.
querydict = {'prefix' : prefix}
sql = """select id, post_name, post_title, post_date_gmt, post_modified_gmt,
    post_content, post_excerpt, post_status, ping_status, comment_status
    from %(prefix)sposts where post_type='post'""" % querydict
c = conn.cursor()
c.execute(sql)
# Map WordPress post id -> saved Entry, for the tag and comment stages.
entries = {}
# Names are unpacked in the same order as the selected columns.
for post_id, name, title, date_gmt, modified_gmt, content, excerpt, wp_status, wp_ping, wp_comment in c.fetchall():
    e = Entry(
        # details here - skipped
    )
    e.save()
    entries[post_id] = e
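To make the first stage concrete, here is a minimal, self-contained sketch of the same idea: query the posts table and keep the created objects in a dictionary keyed by WordPress id. It is only an illustration under assumptions, not the Banjo script itself: it uses an in-memory sqlite3 table in place of MySQL, a namedtuple stands in for the Entry model, `load_entries` is a hypothetical helper name, and the column list is shortened.

```python
import sqlite3
from collections import namedtuple

# Stand-in for the Django Entry model; the real one has more fields.
Entry = namedtuple('Entry', ['slug', 'title'])

def load_entries(conn, prefix):
    """Return {wordpress_post_id: Entry} for every row with post_type='post'."""
    # The prefix becomes part of the table name, which cannot be a bound
    # parameter, so it must come from trusted configuration.
    sql = ("select id, post_name, post_title "
           "from %(prefix)sposts where post_type='post'" % {'prefix': prefix})
    entries = {}
    for post_id, name, title in conn.execute(sql):
        entries[post_id] = Entry(slug=name, title=title)
    return entries

# Tiny in-memory table shaped like wp_posts (sqlite3 instead of MySQL).
conn = sqlite3.connect(':memory:')
conn.execute("create table wp_posts "
             "(id integer, post_name text, post_title text, post_type text)")
conn.executemany("insert into wp_posts values (?, ?, ?, ?)", [
    (1, 'hello-world', 'Hello world', 'post'),
    (2, 'about', 'About', 'page'),
    (5, 'second-post', 'Second post', 'post'),
])
entries = load_entries(conn, 'wp_')
```

Only the two rows with post_type 'post' end up in the dictionary, which is why the later stages have to cope with ids that are missing from it.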
Then I collect the tags for each post, join them into a space-delimited string, and assign that to the Entry object. The Entry object uses the django-tagging app to parse these tags conveniently.
sql = """select %(prefix)sterm_relationships.object_id, %(prefix)sterms.slug
    from %(prefix)sterm_relationships, %(prefix)sterm_taxonomy, %(prefix)sterms
    where %(prefix)sterm_relationships.term_taxonomy_id = %(prefix)sterm_taxonomy.term_taxonomy_id
    and %(prefix)sterm_taxonomy.term_id = %(prefix)sterms.term_id
    order by %(prefix)sterm_relationships.object_id""" % querydict
c.execute(sql)
cats = []
curr = -1
# Rows are ordered by post id, so all tags for one post arrive together.
for post_id, cat in c.fetchall():
    if post_id != curr:
        if cats:
            # Flush the tags collected for the previous post (curr),
            # not the post that this row starts.
            try:
                e = entries[curr]
                e.tags = " ".join(cats)
                e.save()
            except KeyError:
                # Not imported as an Entry; skip it.
                pass
        curr = post_id
        cats = []
    cats.append(cat)
# Flush the tags of the last post.
if cats:
    try:
        e = entries[curr]
        e.tags = " ".join(cats)
        e.save()
    except KeyError:
        pass
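The loop above relies on all rows for one post arriving one after another. As an alternative sketch, the rows can be grouped in a dictionary first, so the order in which the database returns them no longer matters. `group_tags` is a hypothetical helper name, and the rows here are made-up sample data rather than output from a real WordPress database.

```python
from collections import defaultdict

def group_tags(rows):
    """Collapse (post_id, slug) rows into {post_id: 'slug1 slug2 ...'}."""
    grouped = defaultdict(list)
    for post_id, slug in rows:
        grouped[post_id].append(slug)
    # Join each post's slugs in the order they were seen.
    return dict((post_id, " ".join(slugs)) for post_id, slugs in grouped.items())

# Sample rows, deliberately interleaved to show that order is irrelevant.
rows = [(1, 'django'), (2, 'python'), (1, 'wordpress')]
tags_by_post = group_tags(rows)
```

Each value could then be assigned to the tags attribute of the matching entry, skipping ids that are not in the entries dictionary.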
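Last, I build the comments. The only trick here is that, because of “auto_now_add” on the comment object, the submit date is overwritten with the current time when the comment is first saved. So I need to set the date again after the comment is initially created, and save a second time. Silly, but effective.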
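# Import paths as of the pre-1.0 Django this was written against.
from django.contrib.contenttypes.models import ContentType
from django.contrib.comments.models import FreeComment
from django.utils.encoding import force_unicode

entrytype = ContentType.objects.get_for_model(Entry)
sql = """select comment_content, comment_post_id, comment_author,
    comment_date_gmt, comment_author_ip, comment_approved
    from %(prefix)scomments where comment_approved = '1'
    """ % querydict
c.execute(sql)
for content, post_id, comment_author, comment_date, author_ip, approved in c.fetchall():
    try:
        e = entries[post_id]
        comment = FreeComment(
            content_type=entrytype,
            object_id=e.id,
            comment=force_unicode(content, errors='replace'),
            # Truncate the author name to 50 characters.
            person_name=force_unicode(comment_author[:50], errors='replace'),
            submit_date=comment_date,
            is_public=True,
            ip_address=author_ip,
            approved=True,
            site=blog.site
        )
        comment.save()
        # auto_now_add replaced submit_date on the first save; restore it.
        comment.submit_date = comment_date
        comment.save()
    except KeyError:
        # The comment's post was not imported as an Entry; skip it.
        pass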
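The double save is easier to see with a toy model. The sketch below is not Django code: `FakeComment` is a hypothetical plain Python class that only imitates the relevant behaviour, namely that the first save stamps the current time over whatever date was passed in.

```python
import datetime

class FakeComment(object):
    """Toy model: like auto_now_add, the first save() stamps the current time."""

    def __init__(self, submit_date=None):
        self.submit_date = submit_date
        self.saved = False

    def save(self):
        if not self.saved:
            # First save: the date passed to the constructor is overwritten.
            self.submit_date = datetime.datetime.now()
            self.saved = True
        # Later saves keep whatever submit_date currently holds.

original = datetime.datetime(2007, 3, 1, 12, 0, 0)
comment = FakeComment(submit_date=original)
comment.save()
date_after_first_save = comment.submit_date   # the import time, not the original
comment.submit_date = original
comment.save()
date_after_second_save = comment.submit_date  # the original date, restored
```

Setting the date in the constructor alone is not enough in this model, which mirrors why the import script assigns submit_date again and saves twice.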