Example Theme for Simplified Saaze: Wendt

· klm's blog

A theme for Simplified Saaze based on the blog of Alexander Wendt, PublicoMag

Original post is here: eklausmeier.goip.de

This is another theme for Simplified Saaze, called "Wendt". You can inspect it here.

It offers the following features:

  1. Responsive with media breaks for large and small screens, and for printing.
  2. Top menu with submenus.
  3. Two-column layout using CSS grid ("Holy Grail Layout").
  4. Multiple blogs:
    • Each category has its own blog by using filtering.
    • Each author has its own blog by using filtering.
    • Aggregate blog, i.e., the combination of the above.
  5. Using the <!--more--> tag to showcase the initial content of a blog post.
  6. Sitemap in HTML and XML, RSS feed.
  7. WebAssembly-based search using Pagefind.
  8. No cookies, therefore no annoying cookie banner required.

The theme looks like this:

This theme is modeled after the blog of Alexander Wendt. That blog is powered by WordPress and hosted on Cloudflare. I have written about the PublicoMag website before: Performance Remarks on PublicoMag Website. Alexander Wendt started the blog in October 2017. The number of posts per year is given in the table below. The year 2024 is not complete, so its counts will grow as time passes.

Year        17    18    19    20    21    22    23    24
#posts      50   237   191   190   179   177   168    43
#comments  721  3999  3211  2973  2480  1300  1115   230

The number of comments was counted like this, with the file glob varying from 2017* to 2024*:

perl -ne 'if (/^(\d+) Kommentare <\/h5>/) { $s+=$1; printf("%d\t%d\t%s\n",$1,$s,$ARGV); }' 2017*
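For readers less familiar with Perl, the same counting can be sketched in Python. This is only an illustration, not part of the author's tooling: the function name count_comments and the sample lines are made up. It sums the leading number of every line that begins like "12 Kommentare </h5>".

```python
import re

# Same pattern as the Perl one-liner: a line starting with a number,
# followed by " Kommentare </h5>".
COMMENT_RE = re.compile(r"^(\d+) Kommentare </h5>")

def count_comments(lines):
    """Sum the comment counts found in an iterable of HTML lines."""
    total = 0
    for line in lines:
        m = COMMENT_RE.match(line)
        if m:
            total += int(m.group(1))
    return total

# Hypothetical sample lines, only to show the call.
sample = [
    "12 Kommentare </h5>",
    "<p>Some other markup</p>",
    "3 Kommentare </h5>",
]
print(count_comments(sample))
```

In practice one would pass the lines of all downloaded files of one year, e.g. by chaining open file handles.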

1. Installation

The installation has two parts.

1. Install the theme, including its content, and the Simplified Saaze static site generator using composer:

$ composer create-project eklausme/saaze-wendt
Creating a "eklausme/saaze-wendt" project at "./saaze-wendt"
Installing eklausme/saaze-wendt (v1.0)
  - Downloading eklausme/saaze-wendt (v1.0)
  - Installing eklausme/saaze-wendt (v1.0): Extracting archive
Created project in /tmp/T/saaze-wendt
Loading composer repositories with package information
Updating dependencies
Lock file operations: 1 install, 0 updates, 0 removals
  - Locking eklausme/saaze (v2.2)
Writing lock file
Installing dependencies from lock file (including require-dev)
Package operations: 1 install, 0 updates, 0 removals
  - Downloading eklausme/saaze (v2.2)
  - Installing eklausme/saaze (v2.2): Extracting archive
Generating optimized autoload files
No security vulnerability advisories found.
        real 3.08s
        user 0.48s
        sys 0
        swapped 0
        total space 0

2. Installing Simplified Saaze itself is described in Simplified Saaze. That post documents how to check the PHP version, YAML parsing, FFI, the MD4C extension, etc.

Once everything is installed, just run php saaze -mor.

2. Downloading all WordPress content

We need a list of all post URLs.

The following approach did not work: using the monthly archive lists in WordPress.

for i in `seq 2018 2023`; do for j in `seq -w 01 12`; do curl https://www.publicomag.com/$i/$j/ > m$i-$j.html; done; done

Special cases for 2017 and 2024:

curl https://www.publicomag.com/2017/10/ -o m2017-10.html
curl https://www.publicomag.com/2017/11/ -o m2017-11.html
curl https://www.publicomag.com/2017/12/ -o m2017-12.html
...
curl https://www.publicomag.com/2024/03/ -o m2024-03.html

It turned out that the monthly lists lack links. To be exact, they lack more than 466 URLs.

This approach fetches all links:

$ curl https://www.publicomag.com/ -o wendt-p1.html
$ time ( for i in `seq 2 124`; do
    curl https://www.publicomag.com/page/$i/ -o wendt-p${i}.html;
  done )

This creates 124 files:

$ ls -alFt | head
total 25580
drwxr-xr-x 2 klm klm   4096 Apr  2 11:34 ./
drwxr-xr-x 4 klm klm   4096 Apr  2 11:33 ../
-rw-r--r-- 1 klm klm 208194 Apr  2 11:28 wendt-p1.html
-rw-r--r-- 1 klm klm 187908 Apr  2 11:27 wendt-p124.html
-rw-r--r-- 1 klm klm 203575 Apr  2 11:27 wendt-p123.html
-rw-r--r-- 1 klm klm 206497 Apr  2 11:27 wendt-p122.html
-rw-r--r-- 1 klm klm 207572 Apr  2 11:27 wendt-p121.html
-rw-r--r-- 1 klm klm 207970 Apr  2 11:27 wendt-p120.html
-rw-r--r-- 1 klm klm 206010 Apr  2 11:27 wendt-p119.html
...

The list of URLs is extracted from these files:

perl -ne 'print $1."\n" if /<h2 class="post-title"><a href="([^"]+)"/' wendt-p*.html > allURL
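The extraction step can also be sketched in Python. This is a minimal sketch using the same regular expression; the function name extract_post_urls and the sample markup (including the post URL) are hypothetical and only shaped like what the pattern expects.

```python
import re

# Same pattern as the Perl one-liner: the link inside the post-title heading.
POST_LINK_RE = re.compile(r'<h2 class="post-title"><a href="([^"]+)"')

def extract_post_urls(html_lines):
    """Return the first post link of each matching line, in input order."""
    urls = []
    for line in html_lines:
        m = POST_LINK_RE.search(line)
        if m:
            urls.append(m.group(1))
    return urls

# Hypothetical markup, only to show the call.
sample = [
    '<h2 class="post-title"><a href="https://www.publicomag.com/2024/03/beispiel/">Beispiel</a></h2>',
    '<p>no link here</p>',
]
for url in extract_post_urls(sample):
    print(url)
```

Like the one-liner, this takes at most one link per line, which is sufficient as long as every post title sits on its own line.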

All posts are downloaded with the Perl script blogwendtcurl below:

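#!/bin/perl -W
# Download content from www.publicomag.com (Alexander Wendt) given a list of URLs
# Elmar Klausmeier, 05-Mar-2024

use strict;
my $fn;
my @F;

while (<>) {
	chomp;
	@F = split('/');	# for https://host/YYYY/MM/slug/: $F[3]=YYYY, $F[4]=MM, $F[5]=slug
	$F[5] =~ s/a%cc%88/ä/;	# URL-encoded "a" + combining diaeresis
	$fn = $F[3] . '-' . $F[4] . '-' . $F[5] . '.html';
	print $fn . "\n";	# print instead of printf: the file name may contain '%'
	system('curl', $_, '-o', $fn);	# list form keeps the URL away from the shell
}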

This creates a list of HTML files:

$ ls -alFt | head
total 175856
drwxr-xr-x 3 klm klm   4096 Mar  7 19:16 ../
drwxr-xr-x 2 klm klm  69632 Mar  5 19:53 ./
-rw-r--r-- 1 klm klm 203580 Mar  5 19:53 2024-03-18471.html
-rw-r--r-- 1 klm klm 252784 Mar  5 19:53 2024-03-wenn-die-zukunft-ans-fenster-des-gruenen-hauses-klopft.html
-rw-r--r-- 1 klm klm 203765 Mar  5 19:53 2024-03-zeller-der-woche-niedere-gruende.html
-rw-r--r-- 1 klm klm 203337 Mar  5 19:53 2024-02-zeller-der-woche-widerstaendler.html
-rw-r--r-- 1 klm klm 231904 Mar  5 19:52 2024-02-das-nie-wieder-deutschland-und-seine-millionen-fuer-judenhasser.html
...
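The mapping from a post URL to a file name in blogwendtcurl can be illustrated with a small Python sketch. It assumes post URLs of the form https://www.publicomag.com/YYYY/MM/slug/, which is inferred from the file names in the listing above; the function name url_to_filename and both example URLs are assumptions for illustration.

```python
def url_to_filename(url):
    """Build 'YYYY-MM-slug.html' from a post URL, mirroring blogwendtcurl."""
    parts = url.rstrip("\n").split("/")
    # For 'https://host/YYYY/MM/slug/' the parts are:
    # ['https:', '', 'host', 'YYYY', 'MM', 'slug', '']
    year, month, slug = parts[3], parts[4], parts[5]
    # URL-encoded 'a' + combining diaeresis becomes 'ä' (first occurrence only,
    # like the Perl substitution without the /g flag).
    slug = slug.replace("a%cc%88", "ä", 1)
    return f"{year}-{month}-{slug}.html"

# Assumed URL, reconstructed from a file name in the listing above.
print(url_to_filename("https://www.publicomag.com/2024/03/zeller-der-woche-niedere-gruende/"))
```

Because year and month are part of the file name, a later step can sort posts into yearly directories without opening the files.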

3. Analyzing content types

1. Fonts.

  1. Logo: Shadows Into Light Two; the original uses an image instead. Another contender could be Croissant One.
  2. Text: Playfair Display

2. Categories. The categories across all posts are as follows:

$ perl -ne 'print $1."\n" if / hentry category-([-\w]+)/' *.html | sort | uniq -c | sort -rn
    595 spreu-weizen
    486 politik-gesellschaft
    122 medien-kritik
     28 fake-news
      3 hausbesuch
      1 film

A single post can have multiple categories. However, the majority of posts have only a single category attached.
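The counting pipeline (sort | uniq -c | sort -rn) corresponds to a frequency table. A minimal Python sketch with the same regular expression follows; the function name count_categories and the sample class attributes are made up for illustration. As in the one-liner, only the first category following " hentry" on a line is counted.

```python
import re
from collections import Counter

# Same pattern as the Perl one-liner.
CATEGORY_RE = re.compile(r" hentry category-([-\w]+)")

def count_categories(lines):
    """Count category slugs, most frequent first."""
    counts = Counter()
    for line in lines:
        m = CATEGORY_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    # most_common() sorts by count, descending; ties keep insertion order,
    # which may differ from the tie order produced by sort -rn.
    return counts.most_common()

# Hypothetical class attributes, only to show the call.
sample = [
    '<article class="post type-post hentry category-spreu-weizen">',
    '<article class="post type-post hentry category-fake-news">',
    '<article class="post type-post hentry category-spreu-weizen">',
]
print(count_categories(sample))
```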

The above list contains no category "alte-weise". I added this category.

We want to convert images in "Alte-Weise" to text. That way loading those pages should be way quicker. Therefore we need to download those images and convert them with tesseract.

3. URLs. The Perl one-liner below produces a list of URLs for the images.

perl -ne 'print "$1$2\n" if (/^<meta property="og:image"\s+content="(https:\/\/www\.publicomag\.com\/wp-content\/uploads\/\d+\/\d+\/)(Alte-Weis[^"]+|AlteWeise[^"]+|AlteuWeise[^"]+|auw-[^" ]+|aub_[^"]+|auw_[^"]+|AuW_[^"]+|AW_[^"]+|OW[^"]+)"/)' *.html | sort > ../allAlte-WeiseURL

Downloading these images:

perl -ane 'chomp; @F=split(/\//); `curl $_ -o $F[7]`' ../allAlte-WeiseURL
curl https://www.publicomag.com/wp-content/uploads/2023/01/Alte-Weise_C.Wright-Mills-1011x715.jpg -o Alte-Weise_Wright_Mills-scaled.jpg

4. JavaScript. A huge number of JavaScript libraries are loaded. We will get rid of them all.

  1. Google Analytics
  2. JQuery Minimal
  3. JQuery Migrate
  4. WordPress User Avatar
  5. Buzzblog Hercules Likes
  6. Borlabs Cookies Prioritize
  7. WordPress GDPR Compliance
  8. Comment Reply
  9. Contact Form
  10. JQuery Easing for Buzzblog
  11. JQuery MagnificPopup for Buzzblog
  12. JQuery Plugins for Buzzblog
  13. JQuery JustifiedGallery for Buzzblog
  14. Buzzblog Bootstrap
  15. Owl Carousel for Buzzblog
  16. Buzzblog AnimatedHeader
  17. Shariff
  18. MailPoet
  19. Akismet
  20. Borlabs Cookies Minimal

4. Reducing number of images

An easy target is the logo: this was replaced with plain text. This saves one roundtrip to the web-server.

1. For the category "alte-weise" the entire image with text is converted to two elements:

  1. An image
  2. The actual text

The image is scanned with tesseract.

That way the text can be searched via Pagefind. Also, the required bandwidth is reduced.

Old:

New:

The new approach is to use a blockquote, where the CSS puts an image on top:

blockquote blockquote {
    background: transparent no-repeat top/30% url('/img/Alte-Weise-Kopf.svg');
    text-align:center;
    padding-left:2rem;
    padding-right:2rem;
    padding-top:12rem;
    padding-bottom:1rem;
    background-color:#b6c7c8;
    border-radius:2.5rem
}

The actual text in Markdown is then:

>> „Zweifel ist nicht das Gegenteil, sondern ein Element des Glaubens.“
>>
>> Paul Tillich

That way the ordinary blockquote in Markdown (single >) is left free to be used for citations.

Obviously, entering the text in >> is way easier than producing an image for each epigram.

2. Care was taken to reduce the number of images needed for the social media icons.

Old:

New:

That saves loading eight images. However, you need to load some font glyphs instead.

<a style="background-color:SkyBlue; color:white" href="https://telegram.me/share/url?url=<?=$urlEncoded?>&text=<?=$titleEncoded?>"
   title="Teilen auf Telegram" target=_blank>&nbsp;<span class=symbols>&#x01fbb0;</span>&nbsp;Telegram&nbsp;</a>

In particular this symbol U+1fbb0 is %F0%9F%AE%B0 when URL encoded:

@import url('https://fonts.googleapis.com/css2?family=Noto+Sans+Symbols+2&text=%F0%9F%97%8F%F0%9F%AE%B0%F0%9F%96%82%F0%9F%96%A8');

Similarly, symbol U+1f5cf is %F0%9F%97%8F when URL encoded.
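The URL-encoded form is simply the percent-encoded UTF-8 byte sequence of the code point. The following Python sketch reproduces the values; the helper name url_encode_codepoint is made up. Decoding the text parameter of the font URL shows that it requests four glyphs: U+1F5CF, U+1FBB0, U+1F582, and U+1F5A8.

```python
from urllib.parse import quote

def url_encode_codepoint(codepoint):
    """Percent-encode the UTF-8 bytes of a single Unicode code point."""
    return quote(chr(codepoint), safe="")

# The four symbols requested in the text parameter of the font URL.
glyphs = (0x1F5CF, 0x1FBB0, 0x1F582, 0x1F5A8)
for cp in glyphs:
    print(f"U+{cp:X} -> {url_encode_codepoint(cp)}")

# Concatenated, this is the value to pass as the text parameter.
text_param = "".join(url_encode_codepoint(cp) for cp in glyphs)
print(text_param)
```

Restricting the font download to these glyphs via the text parameter keeps the font payload small, which is the point of replacing the icon images.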

5. Converting WordPress HTML to Markdown

Perl script blogwendtmd is used to convert a single HTML file to Markdown.

$ time ( for i in *.html; do blogwendtmd $i; done )
        real 94.95s
        user 136.51s
        sys 0
        swapped 0
        total space 0

The long runtime is exclusively for running tesseract, i.e., the conversion from image to text. Once all WordPress posts are converted to Markdown, this script no longer needs to be run, obviously.

blogwendtmd is 180 lines of Perl code.

The following lists all authors and their corresponding directories:

$ perl -ne 'print $1."\n" if /\/author\/([^\/]+)\//' 2*.html | sort -u
alexander
archi-bechlenberg
bernd-zeller
cora-stephan
david-berger
hansjoerg-mueller
joerg-friedrich
matthias-matussek
redaktion
samuel-horn
wolfram-ackner

Each of these authors has a separate index beneath /author/.

Generating all yearly overviews:

for i in *; do ( echo $i; cd $i; blogwendtdate -gy$i *.md > index.md ) done

The Perl script blogwendtdate generates a Markdown file that contains all articles for the corresponding year. The script first has to store all posts of one year in a list, and then sorts that list by the date in the frontmatter.

my @L;	# list of posts in a year, in the beginning not necessarily sorted

sub markdownfile(@) {
	my $f = $_[0];
	my ($flag,$title,$date,$draft) = (0,"","",0);
	open(F,"<$f") || die("Cannot open $f");
	while (<F>) {
		if (/^\-\-\-\s*$/) {
			last if (++$flag >= 2);
        . . .
	}
	if ($draft == 0  &&  length($title) > 0  &&  length($date) > 0) {
		push(@L, sprintf("%s: [%s](%s%s)",$date,$title,$prefix,substr($f,0,-3)) );
	}
	close(F) || die("Cannot close $f");
}

while (<@ARGV>) {
	#printf("ARGV=|%s|\n",$_);
	next if (substr($_,-8) eq "index.md");
	markdownfile($_);
}

for (sort @L) {
	printf("%d. %s\n",++$cnt,$_);
}
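Since the excerpt above elides the frontmatter parsing, here is a rough Python sketch of the same idea: read title, date, and draft from the frontmatter, build one line per post, sort, and number. It is not a port of blogwendtdate. The simple "key: value" frontmatter format, the draft value "true", and all function names and sample posts are assumptions for illustration.

```python
def parse_frontmatter(text):
    """Return simple 'key: value' pairs found between the first two '---' lines."""
    data = {}
    fences = 0
    for line in text.splitlines():
        if line.strip() == "---":
            fences += 1
            if fences >= 2:
                break           # end of frontmatter
            continue
        if fences == 1 and ":" in line:
            key, value = line.split(":", 1)
            data[key.strip()] = value.strip().strip('"')
    return data

def yearly_overview(posts, prefix=""):
    """posts maps file name -> Markdown text; returns numbered, date-sorted lines."""
    entries = []
    for name, text in posts.items():
        if name.endswith("index.md"):
            continue            # skip the overview file itself
        fm = parse_frontmatter(text)
        if fm.get("draft", "false") == "true":
            continue            # drafts are left out (assumed convention)
        if fm.get("title") and fm.get("date"):
            entries.append(f"{fm['date']}: [{fm['title']}]({prefix}{name[:-3]})")
    # Each line starts with the date, so sorting the strings sorts by date.
    return [f"{i}. {e}" for i, e in enumerate(sorted(entries), start=1)]

# Hypothetical posts, only to show the call.
posts = {
    "b.md": '---\ntitle: "Second"\ndate: "2024-02-01 10:00:00"\n---\nText',
    "a.md": '---\ntitle: "First"\ndate: "2024-01-15 08:00:00"\n---\nText',
    "index.md": "old overview",
}
for line in yearly_overview(posts, prefix="/2024/"):
    print(line)
```

Sorting the finished lines instead of separate date keys only works because the date comes first in each line and is written in a lexicographically sortable format.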

Many HTML errors reported by the Nu Html Checker were corrected. See for example das-magische-sprechen-schafft-macht-fuer-den-augenblick.

6. Handling comments

The Publico blog contains comments, where readers have left their thoughts. In the Perl script blogwendtmd we detect comments by checking for an <h5> tag that marks their beginning, and for the pinglist that marks the end of all comments.

if (/^<ul class="pinglist">/) { $flag = 0; next; }
elsif (/<h5 class="comments-h">/) {
    ...
    $flag = 1;
}
next if ($flag == 0);
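The flag logic is a small two-state machine: outside or inside the comment section. A Python sketch of just this part follows; the function name comment_lines and the sample lines are hypothetical. Whether the heading line itself is kept depends on the elided part of the Perl code; in this sketch it is kept.

```python
def comment_lines(html_lines):
    """Yield the lines from the comments heading up to the pinglist."""
    inside = False
    for line in html_lines:
        if line.startswith('<ul class="pinglist">'):
            inside = False      # end of the comment section
            continue
        if '<h5 class="comments-h">' in line:
            inside = True       # start of the comment section
        if inside:
            yield line

# Hypothetical page fragment, only to show the call.
sample = [
    "<p>Article text</p>",
    '<h5 class="comments-h">',
    "2 Kommentare </h5>",
    "<li>First comment</li>",
    "<li>Second comment</li>",
    '<ul class="pinglist">',
    "<footer>Footer</footer>",
]
for line in comment_lines(sample):
    print(line)
```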

We refrained from integrating the commenting system HashOver. It is not difficult, as we have already demonstrated in the Lemire theme. However, for a political blog a comment system is rather "dangerous", as it can attract unwelcome contributions. Under German law the host of these comments becomes liable. Essentially, you therefore must check every comment manually:

... since all comments have to be reviewed, and the editorial team still consists of the founder Alexander Wendt and one part-time editor, they cannot go online immediately.

In light of the high volume of comments HashOver should most probably be added.

7. Running static site generator

In serial mode it takes less than 3 seconds to build 19 collections without comments. With comments it takes less than 6 seconds to process 23 thousand pages, see below. This build time can be almost halved by using parallelisation with -p16.

$ time php saaze -morb /tmp/build
Building static site in /tmp/build...
    execute(): filePath=./content/alexander.yml, nSIentries=770, totalPages=39, entries_per_page=20
    execute(): filePath=./content/alte-weise.yml, nSIentries=131, totalPages=7, entries_per_page=20
    execute(): filePath=./content/archi-bechlenberg.yml, nSIentries=5, totalPages=1, entries_per_page=20
    execute(): filePath=./content/bernd-zeller.yml, nSIentries=332, totalPages=17, entries_per_page=20
    execute(): filePath=./content/cora-stephan.yml, nSIentries=1, totalPages=1, entries_per_page=20
    execute(): filePath=./content/david-berger.yml, nSIentries=1, totalPages=1, entries_per_page=20
    execute(): filePath=./content/fake-news.yml, nSIentries=28, totalPages=2, entries_per_page=20
    execute(): filePath=./content/film.yml, nSIentries=1, totalPages=1, entries_per_page=20
    execute(): filePath=./content/hansjoerg-mueller.yml, nSIentries=2, totalPages=1, entries_per_page=20
    execute(): filePath=./content/hausbesuch.yml, nSIentries=2, totalPages=1, entries_per_page=20
    execute(): filePath=./content/joerg-friedrich.yml, nSIentries=2, totalPages=1, entries_per_page=20
    execute(): filePath=./content/mag.yml, nSIentries=1235, totalPages=62, entries_per_page=20
    execute(): filePath=./content/matthias-matussek.yml, nSIentries=1, totalPages=1, entries_per_page=20
    execute(): filePath=./content/medien-kritik.yml, nSIentries=123, totalPages=7, entries_per_page=20
    execute(): filePath=./content/politik-gesellschaft.yml, nSIentries=486, totalPages=25, entries_per_page=20
    execute(): filePath=./content/redaktion.yml, nSIentries=112, totalPages=6, entries_per_page=20
    execute(): filePath=./content/samuel-horn.yml, nSIentries=3, totalPages=1, entries_per_page=20
    execute(): filePath=./content/spreu-weizen.yml, nSIentries=596, totalPages=30, entries_per_page=20
    execute(): filePath=./content/wolfram-ackner.yml, nSIentries=6, totalPages=1, entries_per_page=20
Finished creating 19 collections, 19 with index, and 1248 entries (2.58 secs / 809.47MB)
#collections=19, parseEntry=0.7290/23712-19, md2html=1.1983, toHtml=1.2839/23712, renderEntry=0.1562/1248, renderCollection=0.0403/224, content=23712/0
    real 5.16s
    user 4.36s
    sys 0
    swapped 0
    total space 0
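The totalPages values in the log are consistent with a ceiling division of nSIentries by entries_per_page. This is an observation from the log output, not a statement about how Simplified Saaze computes it internally. A short Python check, with the function name total_pages made up:

```python
import math

def total_pages(n_entries, entries_per_page=20):
    """Number of index pages needed to list n_entries posts."""
    return math.ceil(n_entries / entries_per_page)

# (collection, nSIentries, totalPages) copied from the build log above.
log = [
    ("alexander", 770, 39),
    ("alte-weise", 131, 7),
    ("mag", 1235, 62),
    ("spreu-weizen", 596, 30),
    ("film", 1, 1),
]
for name, n, expected in log:
    print(name, total_pages(n), total_pages(n) == expected)
```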

Running Pagefind, i.e., indexing all keywords for the WebAssembly-based search functionality:

$ time pagefind -s . --exclude-selectors aside --exclude-selectors footer --force-language=de

Running Pagefind v1.0.4
Running from: "/tmp/buildwendt"
Source:       ""
Output:       "pagefind"

[Walking source directory]
Found 1473 files matching **/*.{html}

[Parsing files]
Did not find a data-pagefind-body element on the site.
↳ Indexing all <body> elements on the site.

[Reading languages]
Discovered 1 language: de

[Building search indexes]
Total:
  Indexed 1 language
  Indexed 1473 pages
  Indexed 133261 words
  Indexed 0 filters
  Indexed 0 sorts

Finished in 19.644 seconds
        real 19.87s
        user 18.28s
        sys 0
        swapped 0
        total space 0

It would take 11 seconds without comments, i.e., indexing 77168 words.

8. Collections

There are quite a number of collections at play in this theme. The most important one is mag (short for magazine). This directory contains all the blog posts. All the other collections are just symbolic links to mag, i.e., they do not contain additional content.

total 96
drwxr-xr-x  4 klm klm 4096 Apr 27 17:11 ./
drwxr-xr-x  7 klm klm 4096 May 13 13:00 ../
lrwxrwxrwx  1 klm klm    3 Mar 26 21:48 alexander -> mag/
-rw-r--r--  1 klm klm  273 Apr  2 18:56 alexander.yml
lrwxrwxrwx  1 klm klm    3 Apr 27 17:11 alte-weise -> mag/
-rw-r--r--  1 klm klm  225 Apr 27 17:10 alte-weise.yml
lrwxrwxrwx  1 klm klm    3 Mar 31 17:22 archi-bechlenberg -> mag/
-rw-r--r--  1 klm klm  495 Apr  2 18:58 archi-bechlenberg.yml
lrwxrwxrwx  1 klm klm    3 Mar 31 17:17 bernd-zeller -> mag/
-rw-r--r--  1 klm klm  213 Apr  2 18:01 bernd-zeller.yml
lrwxrwxrwx  1 klm klm    3 Apr  2 15:18 cora-stephan -> mag/
-rw-r--r--  1 klm klm  707 Apr  2 19:01 cora-stephan.yml
lrwxrwxrwx  1 klm klm    3 Apr  2 15:17 david-berger -> mag/
-rw-r--r--  1 klm klm  761 Apr  2 19:06 david-berger.yml
drwxr-xr-x  2 klm klm 4096 Apr  2 16:24 error/
-rw-r--r--  1 klm klm   88 Apr  2 16:21 error.not_used_yml
lrwxrwxrwx  1 klm klm    3 Apr  2 19:25 fake-news -> mag/
-rw-r--r--  1 klm klm  216 Apr  2 19:42 fake-news.yml
lrwxrwxrwx  1 klm klm    3 Apr  2 19:25 film -> mag/
-rw-r--r--  1 klm klm  201 Apr  2 19:43 film.yml
lrwxrwxrwx  1 klm klm    3 Mar 31 17:22 hansjoerg-mueller -> mag/
-rw-r--r--  1 klm klm  318 Apr  2 18:56 hansjoerg-mueller.yml
lrwxrwxrwx  1 klm klm    3 Apr  2 19:25 hausbesuch -> mag/
-rw-r--r--  1 klm klm  219 Apr  2 19:42 hausbesuch.yml
lrwxrwxrwx  1 klm klm    3 Apr  2 15:18 joerg-friedrich -> mag/
-rw-r--r--  1 klm klm  222 Apr  2 18:01 joerg-friedrich.yml
drwxr-xr-x 10 klm klm 4096 May 12 20:56 mag/
-rw-r--r--  1 klm klm  110 Apr  1 22:25 mag.yml
lrwxrwxrwx  1 klm klm    3 Mar 31 17:22 matthias-matussek -> mag/
-rw-r--r--  1 klm klm  228 Apr  2 18:02 matthias-matussek.yml
lrwxrwxrwx  1 klm klm    3 Apr  2 19:25 medien-kritik -> mag/
-rw-r--r--  1 klm klm  234 Apr  2 19:27 medien-kritik.yml
lrwxrwxrwx  1 klm klm    3 Apr  2 17:47 politik-gesellschaft -> mag/
-rw-r--r--  1 klm klm  255 Apr  2 17:59 politik-gesellschaft.yml
lrwxrwxrwx  1 klm klm    3 Mar 31 17:16 redaktion -> mag/
-rw-r--r--  1 klm klm  202 Apr  2 18:03 redaktion.yml
lrwxrwxrwx  1 klm klm    3 Mar 31 17:21 samuel-horn -> mag/
-rw-r--r--  1 klm klm  259 Apr  2 19:03 samuel-horn.yml
lrwxrwxrwx  1 klm klm    3 Apr  2 19:25 spreu-weizen -> mag/
-rw-r--r--  1 klm klm  231 Apr  2 19:27 spreu-weizen.yml
lrwxrwxrwx  1 klm klm    3 Mar 31 17:22 wolfram-ackner -> mag/
-rw-r--r--  1 klm klm  542 Apr  2 19:05 wolfram-ackner.yml

The collection yaml files look like this. First mag.yml:

title: Publico
sort_field: date
sort_direction: desc
index_route: /
entry_route: /{slug}
more: true
rss: true

Now alexander.yml, which filters for author:

title: Publico - Autor Alexander Wendt
subtitle: "Alexander Wendt ist Herausgeber von Publico."
sort_field: date
sort_direction: desc
index_route: /author/alexander
entry: false
entry_route: /{slug}
more: true
filter: return ($entry->data['author'] === 'Alexander Wendt');

Similarly, alte-weise.yml, which filters for categories:

title: Publico - Alte &amp; Weise
sort_field: date
sort_direction: desc
index_route: /alte-weise
entry: false
entry_route: /{slug}
more: true
filter: return (array_search('alte-weise',$entry->data['categories']) !== false);

Except mag.yml, all other yaml files set rss: false.
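The filter lines in the yaml files are PHP predicates evaluated per entry. Note the strict comparison !== false: array_search returns the index of the match, and index 0 would otherwise count as "not found". The idea of deriving several blogs from one pool of posts can be sketched in Python as follows; the function names and sample entries (including the author value "Bernd Zeller") are made up for illustration and do not reflect how Simplified Saaze is implemented.

```python
def author_filter(author):
    """Keep entries written by the given author (compare alexander.yml)."""
    return lambda entry: entry.get("author") == author

def category_filter(category):
    """Keep entries that list the given category (compare alte-weise.yml)."""
    return lambda entry: category in entry.get("categories", [])

def build_collection(entries, keep):
    """Return the titles of all entries accepted by the predicate."""
    return [e["title"] for e in entries if keep(e)]

# Hypothetical entries standing in for the posts in mag/.
entries = [
    {"title": "A", "author": "Alexander Wendt", "categories": ["politik-gesellschaft"]},
    {"title": "B", "author": "Bernd Zeller", "categories": ["spreu-weizen"]},
    {"title": "C", "author": "Alexander Wendt", "categories": ["alte-weise", "spreu-weizen"]},
]
print(build_collection(entries, author_filter("Alexander Wendt")))
print(build_collection(entries, category_filter("alte-weise")))
```

Because every filtered collection reads the same directory through a symbolic link, a post is stored once and can still appear in the author blog, one or more category blogs, and the aggregate blog.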

9. Templates

This theme uses the following PHP template files:

  1. bottom-layout.php: commonalities for the bottom part
  2. entry.php: template for the entry, i.e., the usual blog post
  3. error.php: 404 page, or other error conditions
  4. head.php: HTML for the first few lines for all HTML files
  5. index.php: template for the index, i.e., the listing of posts
  6. overview.php: HTML sitemap
  7. rss.php: RSS feed
  8. sitemap.php: XML sitemap
  9. top-layout.php: commonalities for the top part

I use the following hierarchy of PHP files for my entry-template, i.e., the template for a blog post:

  • entry.php
    • top-layout.php
    • head.php
    • Actual content: $entry['content']
    • bottom-layout.php

The following hierarchy is used for the index-template, i.e., the template for showing a reverse-date sorted list of blog posts:

  • index.php
    • top-layout.php
    • head.php
    • for-loop over entry-excerpts
    • bottom-layout.php