JENS MALMGREN I create, that is my hobby.

Porting my blog for the second time, load a post

This is post #4 of my series about how I port this blog from Blogengine.NET 2.5 ASPX on a Windows Server 2003 to a Linux Ubuntu server, Apache2, MySQL and PHP. A so called LAMP. The introduction to this project can be found in this blog post https://www.malmgren.nl/post/Porting-my-blog-for-the-second-time-Project-can-start.aspx.

Here below is an example of the post files. It has the filename 0b4983f6-8a40-4aed-9832-477e507c63c2.xml. I shortened the content tag but for the rest this is how my test case of a post file looks like.

≺?xml version="1.0" encoding="utf-8" standalone="yes"?≻
≺post≻
  ≺author≻Admin≺/author≻
  ≺title≻Valeria II≺/title≻
  ≺description /≻
  ≺content≻Text...≺/content≻
  ≺ispublished≻True≺/ispublished≻
  ≺isdeleted≻False≺/isdeleted≻
  ≺iscommentsenabled≻True≺/iscommentsenabled≻
  ≺pubDate≻2011-10-16 21:16:00≺/pubDate≻
  ≺lastModified≻2011-12-20 22:15:50≺/lastModified≻
  ≺raters≻0≺/raters≻
  ≺rating≻0≺/rating≻
  ≺slug≻Valeria-II≺/slug≻
  ≺tags /≻
  ≺comments /≻
  ≺categories≻
    ≺category≻15042b04-4b20-4ad2-9f4e-72d984172406≺/category≻
    ≺category≻ac7d1fda-b384-48eb-8207-7fa92d9894fe≺/category≻
    ≺category≻f221f647-8fe5-44e8-8a9b-fbfb9477d86d≺/category≻
    ≺category≻0dc7f3de-5d3a-4ba7-a10b-3cd937bf318a≺/category≻
    ≺category≻de5b8b80-3220-4315-8180-bdeed2b9b560≺/category≻
  ≺/categories≻
  ≺notifications /≻
≺/post≻

As you can see there are several fields containing text, such as author|title|description|slug|content. Then there are fields containing boolean data such as ispublished|isdeleted|iscommentsenabled. Then there are fields with datetime values such as pubDate|lastModified. I figured that I loop over all fields and for each field I handle it depending on what type it is of and then store the value in a hashtable for each type and then when I looped over all fields then I have the data converted and ready at hand stored in various tables. Then at that point I can create a new record and insert all the fields from the stored tables.

my $filenamePost = "/usr/local/bin/jensblog/App_Data/posts/0b4983f6-8a40-4aed-9832-477e507c63c2.xml";
$xmldoc = $parser-≻parse_file($filenamePost);

my %dictStringFieldToValue = ();
my %dictBoolFieldToValue = ();
my %dictDateFieldToValue = ();

for my $node ($xmldoc-≻findnodes("/post/*"))
{
	if ($node -≻ nodeName() =~ /author|title|description|slug|content/)
	{
		$dictStringFieldToValue{$node -≻ nodeName()} = $node-≻textContent();
	}
	elsif ($node -≻ nodeName() =~ /ispublished|isdeleted|iscommentsenabled/)
	{
		my $boolValue = 0;
		if ($node-≻textContent() eq "True")
		{
			$boolValue = 1;
		}
		$dictBoolFieldToValue{$node -≻ nodeName()} = $boolValue;
	}
	elsif ($node -≻ nodeName() =~ /pubDate|lastModified/)
	{
		$dictDateFieldToValue{$node -≻ nodeName()} = $node-≻textContent();
	}
	else
	{
		print "Unhandled: ". $node -≻ nodeName() . "\n";
	}
}

There are more fields but this will do for now for testing creating a record in a table. So it is time to create a table for storing this data.

print "Creating table: Post\n";
$dbh -≻ do('create table Post ' .
	'('.
		'ID int auto_increment primary key,' .
		'Author varchar(15) not null,' .
		'Title varchar(256) not null,' .
		'Description varchar(256) not null,' .
		'Content blob null,' .
		'IsPublished tinyint(1) not null,' .
		'IsDeleted tinyint(1) not null,' .
		'IsCommentsEnabled tinyint(1) not null,' .
		'PublishedOn datetime not null,' .
		'ModifiedOn datetime not null,' .
		'Slug varchar(80) not null unique' .
	')');

In the Perl program I create the table from scratch. Then I know I have a new fresch table every time I run the program. Otherwise I have the database and it need not necessary be in sync on what I figure out in the Perl program. This is better. Please note that also the ID in this table is auto_increment. Every time I insert a new record I will automatically get a new ID. So how do I insert a new record using the data stored in my hash tables?

$dbh-≻do(
	'INSERT INTO Post (Author,Title,Description,Slug,Content,' .
	'IsPublished,IsDeleted,IsCommentsEnabled,' .
	'PublishedOn,ModifiedOn) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)', undef,
	$dictStringFieldToValue{"author"},
	$dictStringFieldToValue{"title"},
	$dictStringFieldToValue{"description"},
	$dictStringFieldToValue{"slug"},
	$dictStringFieldToValue{"content"},
	$dictBoolFieldToValue{"ispublished"},
	$dictBoolFieldToValue{"isdeleted"},
	$dictBoolFieldToValue{"iscommentsenabled"},
	$dictDateFieldToValue{"pubDate"},
	$dictDateFieldToValue{"lastModified"}
	);

After the loop reading the fields I issue an INSERT INTO query to the database. In there I have all my stored data ready at hand. This worked very well and right after insert I can pick up the ID of the inserted record using this code my $postID = $dbh->{mysql_insertid};. Earlier I told about how I imported the categories into Category records. Now it is time to figure out how to connect a Post with Category records.

One Post can have several Categories. One Category can be listed in several Posts. This is the kind of database analyse jargon you need to mumble to yourself to figure out that here we have a "many to many" relation between a Post and a Category. The way to solve this is by creating a PostCategory connection table.

print "Creating table: PostCategory\n";
$dbh -≻ do('create table PostCategory ' .
	'('.
		'PostID int,' .
		'CategoryID int,' .
		'primary key (PostID, CategoryID)'.
	')');

So here in my Perl program I create a new table that stored the ID of a Post and the ID of a Category. Now we need to read the category data of our sample file and try to set up these connections.

my $postID = $dbh-≻{mysql_insertid};
print "Post ID was: " . $postID . "\n";
for my $node ($xmldoc-≻findnodes("/post/categories/category"))
{
	print "Category: " . $node-≻textContent() . " " . $dictGUIDToID{$node-≻textContent()} . "\n";
	$dbh-≻do('INSERT INTO PostCategory (PostID,CategoryID) VALUES (?, ?)' , undef, $postID, $dictGUIDToID{$node-≻textContent()} );
}

It was some time ago when I worked on the $dictGUIDToID hash table but here it is again. It contains a conversion between a GUID to the ID of a category. Here below I list the entire program so far:

#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML; # apt-get install libxml-libxml-perl
use DBI; # apt-get install libdbd-mysql-perl
use Switch; #apt-get install libswitch-perl

my $dbh = DBI-≻connect('dbi:mysql:database=jensblog;#host=localhost','randomuser','Verysecurepasswordbutnotthis',{AutoCommit=≻1,RaiseError=≻1,PrintError=≻0});

my $dbinfo = $dbh-≻table_info();
while( my($qualifier,$owner,$name,$type,$remarks) = $dbinfo-≻fetchrow_array())
{
	foreach ($qualifier,$owner,$name,$type,$remarks)
	{
	  $_ = '' unless defined $_;
	}
	if ($type ne "TABLE")
	{
		next;
	}
	print "Removing table: $name\n";
	$dbh-≻do('DROP TABLE ' . $name);
}

print "Creating table: Category\n";
$dbh -≻ do('create table Category ( ID int auto_increment primary key, Name varchar(80) not null unique )');

my $filename = "/usr/local/bin/jensblog/App_Data/categories.xml";
my $parser = XML::LibXML-≻new();
my $xmldoc = $parser-≻parse_file($filename);

my %dictGUIDToID = ();

for my $category_node ($xmldoc-≻findnodes("//category"))
{
	my $strGuidId = $category_node -≻ findnodes('@id') -≻ to_literal;
	my $strCategoryName = $category_node-≻textContent;
	$dbh-≻do('INSERT INTO Category (Name) VALUES (?)', undef, $strCategoryName);
	my $intId = $dbh-≻{mysql_insertid};
	$dictGUIDToID{ $strGuidId } = $intId;
	print $strGuidId . " " . $strCategoryName. " " . $intId . "\n";
}


print "Creating table: PostCategory\n";
$dbh -≻ do('create table PostCategory ' .
	'('.
		'PostID int,' .
		'CategoryID int,' .
		'primary key (PostID, CategoryID)'.
	')');


print "Creating table: Post\n";
$dbh -≻ do('create table Post ' .
	'('.
		'ID int auto_increment primary key,' .
		'Author varchar(15) not null,' .
		'Title varchar(256) not null,' .
		'Description varchar(256) not null,' .
		'Content blob null,' .
		'IsPublished tinyint(1) not null,' .
		'IsDeleted tinyint(1) not null,' .
		'IsCommentsEnabled tinyint(1) not null,' .
		'PublishedOn datetime not null,' .
		'ModifiedOn datetime not null,' .
		'Slug varchar(80) not null unique' .
	')');		


my $filenamePost = "/usr/local/bin/jensblog/App_Data/posts/0b4983f6-8a40-4aed-9832-477e507c63c2.xml";
$xmldoc = $parser-≻parse_file($filenamePost);

my %dictStringFieldToValue = ();
my %dictBoolFieldToValue = ();
my %dictDateFieldToValue = ();

for my $node ($xmldoc-≻findnodes("/post/*"))
{
	if ($node -≻ nodeName() =~ /author|title|description|slug|content/)
	{
		$dictStringFieldToValue{$node -≻ nodeName()} = $node-≻textContent();
	}
	elsif ($node -≻ nodeName() =~ /ispublished|isdeleted|iscommentsenabled/)
	{
		my $boolValue = 0;
		if ($node-≻textContent() eq "True")
		{
			$boolValue = 1;
		}
		$dictBoolFieldToValue{$node -≻ nodeName()} = $boolValue;
	}
	elsif ($node -≻ nodeName() =~ /pubDate|lastModified/)
	{
		$dictDateFieldToValue{$node -≻ nodeName()} = $node-≻textContent();
	}
	else
	{
		print "Unhandled: ". $node -≻ nodeName() . "\n";
	}
}

for my $stringFieldKey (keys %dictStringFieldToValue)
{
	print "String: " . $stringFieldKey . "\n";
}
for my $boolFieldKey (keys %dictBoolFieldToValue)
{
	print "Bool: " . $boolFieldKey . "\n";
}
for my $dateFieldKey (keys %dictDateFieldToValue)
{
	print "DateTime: " . $dateFieldKey . "\n";
}

$dbh-≻do(
	'INSERT INTO Post (Author,Title,Description,Slug,Content,' .
	'IsPublished,IsDeleted,IsCommentsEnabled,' .
	'PublishedOn,ModifiedOn) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)', undef,
	$dictStringFieldToValue{"author"},
	$dictStringFieldToValue{"title"},
	$dictStringFieldToValue{"description"},
	$dictStringFieldToValue{"slug"},
	$dictStringFieldToValue{"content"},
	$dictBoolFieldToValue{"ispublished"},
	$dictBoolFieldToValue{"isdeleted"},
	$dictBoolFieldToValue{"iscommentsenabled"},
	$dictDateFieldToValue{"pubDate"},
	$dictDateFieldToValue{"lastModified"}
	);
my $postID = $dbh-≻{mysql_insertid};
print "Post ID was: " . $postID . "\n";
for my $node ($xmldoc-≻findnodes("/post/categories/category"))
{
	print "Category: " . $node-≻textContent() . " " . $dictGUIDToID{$node-≻textContent()} . "\n";
	$dbh-≻do('INSERT INTO PostCategory (PostID,CategoryID) VALUES (?, ?)' , undef, $postID, $dictGUIDToID{$node-≻textContent()} );
}

Here I decided to rename the tables and the field names to use initial uppercase letter. This is it for now. Next time I will continue with adding more fields. Then I will search for other examples of more complex post xml files with comments etc.


I was born 1967 in Stockholm, Sweden. I grew up in the small village Vågdalen in north Sweden. 1989 I moved to Umeå to study Computer Science at University of Umeå. 1995 I moved to the Netherlands where I live in Almere not far from Amsterdam.

Here on this site I let you see my creations.

I create, that is my hobby.