Monday, April 9
App::WRT v4.3.0: schwartzian transforms, long-term projects
I should have been doing other things, but I spent a couple of hours over the weekend making wrt, the static site generator I use for p1k3.com, a bit more capable.
I decided I wanted to make the feed wrt generates (like this one) contain the most recent n entries instead of just the entries for the most recent month. For example, instead of just rendering a feed with the entries for this April, I wanted it to contain the last 30 days for which I’d written something.
If wrt entries lived in, say, an SQL database of some sort, this would just be a matter of changing a query to select a different set of entries. Since they’re just flatfiles in a directory tree without a lot of abstractions around them, it was a bit trickier but also more interesting.
Simplified a lot, the wrt repository for this site looks something like this:
/home/brennen/p1k3/
▾ archives/
▾ 2018/
▸ 1/
▸ 2/
▸ 3/
▾ 4/
▸ 5/
▸ 8/
▸ 9/
The basic idea is that a file 3 deep in the hierarchy of numerical entries (like 2018/4/9 for this entry) represents a day, inside a month, inside a year. If I wanted to put the last 30 entries into the feed, I’d need to flatten this structure out into a sorted list.
I remembered that I was already getting a list of all the entries for the wrt render-all script that renders the whole site at once, so it seemed simple enough to reuse that list, but there was a catch: Doing a simple reversed sort on that list gave me results like these:
...
2016/10/16
2016/10/14
2016/10/12
2016/10
2016/1/5
2016/1/3
2016/1/28
...
…because in a string comparison, 2018/10 follows 2018/1, not 2018/9.
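To see the problem in isolation, here is a minimal sketch using a handful of made-up entry paths (they are illustrative, not pulled from the real archive). A plain string sort compares character by character, so the month 10 lands right after month 1:

```perl
use strict;
use warnings;

# Sample entry paths, invented for illustration.
my @entries = ('2018/1', '2018/9', '2018/10', '2018/2');

# Plain string sort: compares one character at a time, left to right.
my @by_string = sort { $a cmp $b } @entries;

# Prints: 2018/1, 2018/10, 2018/2, 2018/9
print join(', ', @by_string), "\n";
```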
If I’d decided to pad the months with 0s, like 2018/01, a while back, this would have been less of a problem, but it seemed pretty solvable. I just needed to convert the entry paths to a different format and sort by that.
I wound up reading the Wikipedia entry for the Schwartzian transform, and writing something like the following:
sub get_date_entries_by_depth {
  my $self = shift;
  my ($depth) = @_;

  # Match given $depth:
  my @particles;
  for (my $i = 0; $i < $depth; $i++) {
    push @particles, '\d+';
  }
  my $pattern = join '/', @particles;

  # Sort matching entries by sortable_date_from_entry()
  return map  { $_->[0] }
         sort { $a->[1] cmp $b->[1] }
         map  { [$_, sortable_date_from_entry($_)] }
         grep m{^ $pattern $}x, $self->get_all_source_files();
}
sub sortable_date_from_entry {
  my ($entry) = @_;
  # Zero-pad each path component to 4 digits, so that
  # 2018/4/9 becomes 201800040009:
  my @parts = map { sprintf("%04d", $_) } split '/', $entry;
  return join '', @parts;
}
First, this builds a regular expression to match entries that are at a certain depth in the hierarchy (1 is a year, 2 is a month, 3 is a day).
Then it:
- greps the list returned by get_all_source_files() for entries matching the pattern.
- maps the matching entries to a list of two-element arrays where the 0th element is the original path to the entry (2018/4/9), and the 1st element is a format returned by sortable_date_from_entry() that will sort correctly using string comparison (201800040009).
- Sorts the overall list by comparing the formatted values.
- Re-maps the list to the original format stored in the 0th element.
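The steps above can be tried outside of wrt with a standalone sketch. Everything here is an assumption made for illustration: @all_files stands in for whatever get_all_source_files() returns, sortable_key() is a hypothetical zero-padding helper, and the sample paths are invented.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the list of source files.
my @all_files = (
    '2018', '2018/4', '2018/10/2', '2018/4/9', '2018/1/28', '2018/4/index',
);

# Build a pattern with one numeric component per level of depth.
my $depth   = 3;
my $pattern = join '/', ('\d+') x $depth;

# Zero-pad each component so string comparison agrees with date order.
sub sortable_key {
    my ($entry) = @_;
    return join '', map { sprintf('%04d', $_) } split m{/}, $entry;
}

my @sorted =
    map  { $_->[0] }                      # 4. unwrap the original path
    sort { $a->[1] cmp $b->[1] }          # 3. compare the computed keys
    map  { [ $_, sortable_key($_) ] }     # 2. pair each path with its key
    grep { m{\A $pattern \z}x }           # 1. keep only day-level entries
    @all_files;

# Prints 2018/1/28, 2018/4/9, 2018/10/2, one per line.
print "$_\n" for @sorted;
```

The point of wrapping each path with its key is that the key is computed once per entry, rather than on every comparison inside sort.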
So now, in order to get the list of entries to turn into a feed, I can just call:
my @entries = reverse $self->get_date_entries_by_depth(3);
…and take the first 30 or so.
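One way to take the first 30 is an array slice. This is only a sketch under assumed data: @sorted is a made-up list of 45 day entries, oldest first, and $limit is a hypothetical name for the feed size.

```perl
use strict;
use warnings;

# Hypothetical data: 45 already-sorted day entries, oldest first.
my @sorted = map { "2018/1/$_" } 1 .. 31;
push @sorted, map { "2018/2/$_" } 1 .. 14;

my $limit        = 30;
my @newest_first = reverse @sorted;

# Guard against archives with fewer than $limit entries.
my $last = $#newest_first < $limit - 1 ? $#newest_first : $limit - 1;
my @feed = @newest_first[0 .. $last];

print scalar(@feed), " entries, starting with $feed[0]\n";
```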
❁
Once I had the feed done, I decided to apply the same idea to the set of entries on the front page, and once I’d done that I realized that I could also use the same sorted lists to generate next/previous links for any given node in the date tree.
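As a rough illustration of the next/previous idea (not necessarily how wrt does it), a sorted list makes neighbors a matter of looking one index to either side. The neighbors() helper and the sample days below are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: given one entry and a sorted list (oldest first),
# return the entries on either side of it, or undef at the ends.
sub neighbors {
    my ($entry, @sorted) = @_;
    for my $i (0 .. $#sorted) {
        next unless $sorted[$i] eq $entry;
        my $prev = $i > 0        ? $sorted[$i - 1] : undef;
        my $next = $i < $#sorted ? $sorted[$i + 1] : undef;
        return ($prev, $next);
    }
    return (undef, undef);    # entry is not in the list
}

# Sample days, invented for illustration.
my @days = ('2018/3/30', '2018/4/5', '2018/4/8', '2018/4/9');
my ($prev, $next) = neighbors('2018/4/8', @days);

# Prints: prev: 2018/4/5, next: 2018/4/9
print "prev: $prev, next: $next\n";
```

A linear scan is fine for a list this size; for a much larger archive, a hash from entry to index built once would avoid rescanning for every page.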
This was an interesting way to kill some time, both because I revisited an algorithm I’d forgotten about, and because every time I hack on a project like this I’m in a dialog with basic decisions I made before I knew how to write software at all. And maybe, by the same token, looking with fresh eyes at norms that I’d take for granted in any more modern context. wrt isn’t a good piece of software by any contemporary standard, and the approach it represents isn’t one I’d use for anything bigger than a trivial shell script at my day job, but there’s a curious durability to it all the same.
Every few years I revisit some facet of this tiny, mundane tool and apply a bit of understanding I lacked when it was first written, and some structure comes a little clearer that lives in the space between my ignorance at 20 and my experience, such as it is, at whatever age I’ve reached.
Everyone should have a few long-term projects, however small and unremarkable.