
October 4 2006

WareLogging, DisplayCode.

<Alan> well, it definitely needs better blogging software ;)

<Alan> Your site is to me a mysterious thing. I can skim Wala.pm and still not quite get how the thing is laid out. Is it a blog or is it a wiki? I wasn't totally aware of the distinction between the blog (date-based as you say) subsystem and the wiki subsystem until you pointed it out in this post. In short, it's nebulous, but part of me likes the nebulousness. The whole site feels minimalist and yet magical.

Maybe you should marry the two subsystems. Like so: make the blog a view of the wiki diff stream. To post to the blog you just find the appropriate wiki place, or create a special purpose entry, and just start typing. Then the temporal component--which arguably diminishes in importance if you're saying anything truly durable--is not intrinsically tied to the content, but rather generated from it.

Here I go giving away all my wiki / blog fusion ideas. Not like I'm making much implementation progress on them myself. ;)

<Brennen> It's funny. There're tricky bits - with versioning, the choice of markup, namespaces - but wikis and blogs should be fairly straightforward critters. Somehow, though, they offer a lot to think about.

I've thought about doing what you describe, more or less: Edit the documents in an arbitrary, wiki-like namespace, and then just look at modification times. The front page becomes a verbose RecentChanges, or maybe more a verbose reversed PageChangeTimes. (Though, subthought, a big part of the problem with blogs in general is the emphasis on temporal stuff. This obsession with novelty cripples us.)

What's stopped me so far is that when I started putting entries under a stack of directories named by date, I'd actually thought out all of these reasons it was a good idea: I would make days the structural units of my expression in the same way that stanzas or lines or feet or what have you are the scaffolding of verse. It's a certain kind of rigid, but you can hang things on it without worrying about where they belong. It's really just a sequence but it also self organizes into a useful hierarchy. It maps nicely onto a short URL. I was originally just gonna write a shell script wrapper around cat. etc.

Then of course I went and stuck a wiki in the background, which might negate a lot of that, and which probably accounts for that nebulousness you mention.

<Brennen> Also, "minimalist and yet magical" gives me the warm fuzzies.

<Alan> Stack of directories...I see. Maybe you can turn /archives into a cgi that just does time slicing on entries that are elsewhere. You can accomplish an awful lot with the readdir, stat, grep and sort builtins. Probably the majority of the work would be finding a placement algorithm for all those "purely temporal" thoughts of yours ;)

Let me throw out another idea. You don't like CamelCase. I find it a little too DWIM myself; it's not that much effort to resolve all the potential ambiguities yourself with simple bracketing.

Yet there is something to be said for automatic cross-referencing. So maybe a cool extra wiki view mode would be something I'll call "autoxref." Before presenting a page it trolls all word sequences up to length n and automatically linkifies anything that corresponds to an entry.

You could adapt the Knuth-Morris-Pratt algorithm or Boyer-Moore to do this reasonably efficiently.
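A rough sketch of the "autoxref" idea, for experimenting. It does not use Knuth-Morris-Pratt or Boyer-Moore; it simply looks up every run of up to n words in a hash of entry names, longest run first. The function name, the titles file (one entry name per line) and the [[...]] link syntax are all assumptions made for illustration, not anything this wiki is known to use.

```shell
# autoxref TITLES_FILE [MAXWORDS]: read page text on stdin and wrap any run
# of up to MAXWORDS words that exactly matches a line of TITLES_FILE in
# [[...]] (a hypothetical link syntax). Longest match wins.
# Caveats: runs of whitespace collapse to single spaces, punctuation glued
# to a word prevents a match, and TITLES_FILE must not be empty.
autoxref() {
  awk -v maxn="${2:-3}" '
    NR == FNR { titles[$0] = 1; next }   # first file: load the entry names
    {
      out = ""
      i = 1
      while (i <= NF) {
        matched = 0
        for (len = maxn; len >= 1 && !matched; len--) {
          if (i + len - 1 > NF) continue
          phrase = $i
          for (j = i + 1; j < i + len; j++) phrase = phrase " " $j
          if (phrase in titles) {
            out = out (out == "" ? "" : " ") "[[" phrase "]]"
            i += len
            matched = 1
          }
        }
        if (!matched) {
          out = out (out == "" ? "" : " ") $i
          i++
        }
      }
      print out
    }
  ' "$1" -
}
```

Something like `autoxref titles.txt 3 < SomePage` would then print the page with known entry names bracketed. A smarter string-matching algorithm would only start to matter once the list of entries gets large.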

<Brennen> "time slicing on entries that are elsewhere ... placement algorithm" - Might be simple enough to say that an entry in the date hierarchy is always the first thing listed for a given date (range of mod times within a date), and then just intermingle other entries with the stream by their last modification time. The /2006/10/4 style URLs return the same as now, + views on the entries in that time slice. Over time, the further back you go in the range of available dates the fewer entries you encounter that aren't in the date hierarchy. Unless I preserve creation + subsequent mod times to leave a trail of activity and preserve context for everything... Now where did my minimalism wander off to?

<Alan> By placement algorithm I meant what wiki entry to put each currently time-anchored post under. Categorization algorithm, whatever.

Here's some time slicing for you to experiment with.

use Date::Manip qw/DateCalc UnixDate/;
use Fcntl qw(:mode);
use strict;

# As a CGI, slice on the path info (/2006/10/4); otherwise on the arguments.
if (exists $ENV{REQUEST_METHOD}) {
  require CGI;
  my $cgi = CGI->new;
  print $cgi->header;
  timeslice('.', [ grep length, split m{/}, $cgi->path_info ], 'mtime');
} else {
  timeslice('.', \@ARGV, 'mtime');
}

sub timeslice {
  my @tfields = qw(y m d h mn s);
  my $dir = shift;
  my $slicefrom = shift || [];
  $slicefrom = @$slicefrom ? $slicefrom : [ UnixDate('today', '%Y') ];
  # Index into the stat() list for the chosen timestamp.
  my $sliceby = {qw(atime 8 mtime 9 ctime 10)}->{shift || 'ctime'};

  # The slice runs from the given date to one unit (year, month, day...) later.
  my $date1 = join '', map { s/\D//g; sprintf('%02d', $_) } @$slicefrom;
  my $date2 = DateCalc('+1'.$tfields[@$slicefrom-1], $date1);
  my ($t1, $t2) = map { UnixDate($_, '%s') } ($date1, $date2);

  my %entry;

  opendir my $dh, "$dir" or
    die "Error opening $dir: $!";

  while (my $file = readdir $dh) {
    my @stat = stat "$dir/$file" or next;
    next unless S_ISREG($stat[2]);
    next unless $t1 <= $stat[$sliceby] && $stat[$sliceby] <= $t2;
    $entry{$file} = $stat[$sliceby];
  }

  closedir $dh or
    die "Error closing $dir: $!";

  # Oldest first.
  print $_.$/ for sort { $entry{$a} <=> $entry{$b} } keys %entry;
}


<Brennen> This is nifty. It took much longer than I really want to think about to know roughly what was going on here. I ''am'' going to play with this.

It's driving me a little nuts that I don't know offhand how to construct a shell one-liner that will make find slice on a specific date, because it seems like there ought to be options to date that say "how many days ago was this?". (I did just have a nightmare vision of feeding the output from cal for a range of months to grep to wc to find, interspersing cut as (un)necessary.)
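One way to make find slice on a specific date without any date arithmetic at all is to give it two reference files whose modification times bracket the day, and compare against those with -newer. The sketch below assumes that approach; the function name and variable names are made up for illustration, and it only needs touch -t, find -newer and mktemp.

```shell
# slice_day DIR YYYYMMDD: list regular files under DIR last modified on
# that day (local time). Two scratch files stamped at the start and end of
# the day serve as reference points for find.
# Boundary caveat: -newer is a strict comparison, so a file stamped at
# exactly 00:00:00 is missed, and so is anything in the last fraction of a
# second after 23:59:59.
slice_day() {
  sd_tmp=$(mktemp -d) || return 1
  touch -t "${2}0000.00" "$sd_tmp/start"
  touch -t "${2}2359.59" "$sd_tmp/end"
  find "$1" -type f -newer "$sd_tmp/start" ! -newer "$sd_tmp/end"
  rm -rf "$sd_tmp"
}
```

For example, `slice_day . 20060301` lists whatever was last touched on March 1, 2006. Slicing by month or year would still need to know where the interval ends, which is where the interval arithmetic comes back in.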

<Alan> I agree, I'm also bothered. I would have done it in shell too, using date(1), sort(1), cut(1) and stat(1), but as you can see here the majority of the logic is date-time conversion / interval arithmetic. I don't know of any core unix utils that do interval arithmetic on dates, do you?

And trust me, I looked at every other module in the Date:: namespace...before digging up the lodestone of dark power that is Date::Manip. Someone should compute McCabe complexity on that module man.

<Brennen> Hrm. Offhand (well, offhand and after half an hour's digging), no, but I had expected to find that at least the GNU date would display an arbitrary date. Waitaminute.

bbearnes@wendigo:~$ date -d 2006/3/1
Wed Mar 1 00:00:00 MST 2006
$ date -d 2006/3/1 +%s
1141196400
$ date -d today +%s
1160158510
$ echo "(1160158510 - 1141196400) / 86400" | bc
219
$ find . -mtime 219
./work/tfox/old_scrapes/boston/boston_nodupes.tgz
$ stat ./work/tfox/old_scrapes/boston/boston_nodupes.tgz
  File: `./work/tfox/old_scrapes/boston/boston_nodupes.tgz'
  Size: 213853 Blocks: 432 IO Block: 4096 regular file
Device: 306h/774d Inode: 489682 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/bbearnes) Gid: ( 1000/bbearnes)
Access: 2006-10-05 10:33:21.000000000 -0600
Modify: 2006-02-28 22:46:07.000000000 -0700
Change: 2006-02-28 22:46:07.000000000 -0700
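The same arithmetic can be wrapped up as a small function, sketched here on the assumption that date is GNU date (the -d option is not portable). The function name is hypothetical, and it works in UTC so that daylight saving changes do not shave an hour off the difference.

```shell
# days_between START END: whole days from START to END, both given as
# YYYY-MM-DD. Assumes GNU date for -d; -u keeps every day 86400 seconds long.
days_between() {
  db_start=$(date -u -d "$1" +%s) || return 1
  db_end=$(date -u -d "$2" +%s) || return 1
  echo $(( (db_end - db_start) / 86400 ))
}
```

The result can be handed to find, as in `find . -mtime "$(days_between 2006-03-01 "$(date -u +%Y-%m-%d)")"`, keeping in mind that find -mtime counts 24-hour periods back from the moment it runs rather than calendar days.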

<Alan> Problem with gnu date(1) is that it doesn't do explicit human -> epoch conversion. You can specify output format with +whatever, but there's no corresponding "input format is %Y%M%d%h" sort of thing. I bet 2006, 2006/3, and 2006/3/1 will all behave as expected though, and maybe this is all you need.

In this case all you need is the interval arithmetic. Maybe this is where your creative use of cal(1) comes in...how else do you know #days in a given year or #days in a given month/year?

<Alan> okay, your nightmare vision is slowly taking form...

$ cal 2006 | tr ' ' '\n' | grep -E '[[:digit:]]+' | grep -v 2006 | wc -l
365

$ cal 2000 | tr ' ' '\n' | grep -E '[[:digit:]]+' | grep -v 2000 | wc -l # leap year
366

$ cal 9 2006 | tr ' ' '\n' | grep -E '[[:digit:]]+' | grep -v 2006 | wc -l
30

<Brennen> Actually 2006 & 2006/3 appear to be interpreted differently, 2006 as today (?) and 2006/3 as... Well, I'm not sure what: Sat Dec 27 17:31:44 MST 2036. Odd.

a quote from "info coreutils date":

:Our units of temporal measurement, from seconds on up to months, are so complicated, asymmetrical and disjunctive so as to make coherent mental reckoning in time all but impossible. Indeed, had some tyrannical god contrived to enslave our minds to time, to make it all but impossible for us to escape subjection to sodden routines and unpleasant surprises, he could hardly have done better than handing down our present system. It is like a set of trapezoidal building blocks, with no vertical or horizontal surfaces, like a language in which the simplest thought demands ornate constructions, useless particles and lengthy circumlocutions. Unlike the more successful patterns of language and science, which enable us to face experience boldly or at least level-headedly, our system of temporal calculation silently and persistently encourages our terror of time.

:... It is as though architects had to measure length in feet, width in meters and height in ells; as though basic instruction manuals demanded a knowledge of five different languages. It is no wonder then that we often look into our own immediate past or future, last Tuesday or a week from Sunday, with feelings of helpless confusion. ...

:-- Robert Grudin, `Time and the Art of Living'.

Maybe you could do special cases for each month and check for leap years. We are no longer in one-liner territory at that point. I have not really done any math in the shell other than piped to bc; would this be miserable?

'''* Brennen hits post and recoils in horror at the post that has appeared in the meanwhile...'''

<Alan> yeah i love the quotes in coreutils. i think the tsort background is pretty interesting. if only linkers were so simple.

<Brennen> This is one reason I like the notion of unix-as-literature. Or quasi-oral tradition. (I almost said "hypothesis" instead of "notion", but it's not like you could really falsify the idea.)

<Alan> bash shell arithmetic is not miserable if you use the $(( )) construct. shelling out to expr(1) is, on the other hand, miserable.
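For what it's worth, the special cases for each month plus a leap year check come out fairly short with $(( )). This is only a sketch: the function names are invented here, and it sticks to POSIX arithmetic so it should not need bash specifically.

```shell
# is_leap YEAR: succeeds if YEAR is a leap year (Gregorian rules).
is_leap() {
  [ $(( ($1 % 4 == 0 && $1 % 100 != 0) || $1 % 400 == 0 )) -eq 1 ]
}

# days_in_month MONTH YEAR: MONTH may be written 9 or 09. Anything that is
# not one of the short months falls through to 31, so input is not validated.
days_in_month() {
  case ${1#0} in
    4|6|9|11) echo 30 ;;
    2) if is_leap "$2"; then echo 29; else echo 28; fi ;;
    *) echo 31 ;;
  esac
}

# days_in_year YEAR
days_in_year() {
  if is_leap "$1"; then echo 366; else echo 365; fi
}
```

`days_in_year 2006`, `days_in_year 2000` and `days_in_month 9 2006` give the same 365, 366 and 30 as the cal pipelines, without spawning five processes apiece.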

<Alan> either way, you're only going to compute the time interval once per request--looping and statting entries will be the slowdown.

<Brennen> The performance smart thing to do would, I suppose, be dumping out a static set of pages and updating only on a change.

<Alan> sure. add make(1) to the list.
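Short of writing an actual Makefile, the update-only-on-change idea can be approximated in shell by comparing modification times, which is all make does anyway. In this sketch the render function is a stand-in (it just wraps the file in pre tags), and the function names and the .html naming are assumptions for illustration.

```shell
# render FILE: hypothetical placeholder for whatever turns an entry into HTML.
render() {
  printf '<pre>\n'
  cat "$1"
  printf '</pre>\n'
}

# rebuild_stale SRC_DIR OUT_DIR: regenerate OUT_DIR/name.html only when it
# is missing or SRC_DIR/name has been modified since it was written.
rebuild_stale() {
  for rs_src in "$1"/*; do
    [ -f "$rs_src" ] || continue
    rs_out="$2/$(basename "$rs_src").html"
    if [ ! -e "$rs_out" ] || [ -n "$(find "$rs_src" -newer "$rs_out")" ]; then
      render "$rs_src" > "$rs_out"
      echo "rebuilt $rs_out"
    fi
  done
}
```

Run from cron or after each edit, this leaves the per-request work at serving a static file; the looping and statting happen once per change instead.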

<Alan> btw, at the rate we're going here, you're going to need an irc / wiki fusion before a wiki / blog one. :)

<Brennen> I'll just run out and grab one of those fat AJAX books, then...

AFAICR, this wiki was written because someone voiced the thought that we ought to make wiki more like IRC, with a little low-cost input box at the bottom...

<Brennen> (I didn't write it, so maybe I'm getting motivations backwards.)

<Alan> AJAX and core unix utilities are antiparticles. this page will probably explode now.

<Brennen> i suspect it's being held together largely by the roundabout intellectual parity of the ideas of implementing your publishing platform in bash and implementing, um, just about anything in a big heap consisting of fat "thin" client, display language(s), client scripting language, amorphous gluey markup language, server, server side language(s), and database (did i leave anything out?).

<Alan> while we're at it brennen, let's just fuse all forms of online communication into one big system that is a blog, a wiki, a chat server and an email server.

after all, one could argue that these forms all forked via a now rather arbitrary discretization along the axes of realtime-ness and access control. i.e. 2-way chat is like fast email. blog is like email readable to the world. wiki is like blog organized by idea space instead of by idea time. wiki is like an open-invite chat room except not real-time.

<Brennen> ...or wiki is like a blog with different access control, e-mail is a targeted feed of a blog, you could fake it all in talk(1) anyway, and you could basically wedge any of it in a database...

maybe we could wrap it all up to look like a shitty e-mail client and sell it to IBM.

<Brennen> (i take back my earlier remark about intellectual parity, by the by. bash is objectively more sane.)

<Alan> i heard "bash" and "sane" in the same sentence...explosion of this page is now a certainty

last edited July 6, 2007