Calendars Aren't Just Plain Data
I've spent the past three months working on and off on calico, a Rust implementation of RFC 5545 and its various cousins, after finding that the existing crates were either incomplete or extremely stringly-typed. Having now largely finished the parser implementation, I thought it would be useful to stop and write down some of my thoughts.
Briefly: iCalendar
An iCalendar (.ics) file is really just a list of nested records. The grammar isn't recursive, and instead defines a fixed hierarchy of record types roughly as follows:
- An .ics file contains zero or more calendar records.
- Each calendar record is a component with some properties and one or more subcomponents.
- A component is a record belonging to a calendar which may contain some properties and zero or more subcomponents.
- A property is a record containing three elements:
  - A name.
  - Some optional property parameters.
  - A value.
- A property parameter is a key-value pair.
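The hierarchy above can be written down almost directly as Rust types. The following is only a minimal, deliberately stringly-typed sketch: the names (`Parameter`, `Property`, `Component`, `Calendar`, `IcsFile`, `property_count`) are hypothetical and are not calico's actual API, and the self-referential `Component` is looser than the fixed hierarchy the grammar defines.

```rust
/// A property parameter: a key-value pair.
#[derive(Debug, Clone, PartialEq)]
pub struct Parameter {
    pub key: String,
    pub value: String,
}

/// A property: a name, some optional parameters, and a value.
#[derive(Debug, Clone, PartialEq)]
pub struct Property {
    pub name: String,
    pub params: Vec<Parameter>,
    pub value: String,
}

/// A component: some properties and zero or more subcomponents.
/// (Assumption: nesting depth is not restricted here, unlike the real grammar.)
#[derive(Debug, Clone, PartialEq)]
pub struct Component {
    pub kind: String,
    pub properties: Vec<Property>,
    pub subcomponents: Vec<Component>,
}

/// A calendar record: some properties and its components.
/// The "one or more" requirement is not enforced by this type.
#[derive(Debug, Clone, PartialEq)]
pub struct Calendar {
    pub properties: Vec<Property>,
    pub components: Vec<Component>,
}

/// An .ics file: zero or more calendar records.
pub type IcsFile = Vec<Calendar>;

/// Counts the properties of a component, including those of its subcomponents.
pub fn property_count(component: &Component) -> usize {
    component.properties.len()
        + component
            .subcomponents
            .iter()
            .map(property_count)
            .sum::<usize>()
}
```

Nothing in these types stops you from putting a nonsensical value in a property or a forbidden property in a component, which is exactly the gap the next paragraph is about.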
What makes the standard difficult to implement is the fairly complex relationship between various elements of this hierarchy, based on information that may be statically or dynamically known.[1] For example, the permissible parameters and value of a property depend on its name, and the value may also then depend on the particular arrangement of parameters. Similarly, the permissible properties and subcomponents of a component depend on its type and the presence or absence of other properties and subcomponents.
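To make the name-and-parameter dependence concrete, here is a small sketch covering just two properties. It leans on two facts from RFC 5545: SUMMARY takes a TEXT value, while DTSTART defaults to DATE-TIME but may be switched to DATE with a VALUE=DATE parameter. Everything else (`ValueKind`, `KindError`, `value_kind`, and the decision to report other properties as unsupported) is an assumption made for illustration, not calico's design.

```rust
/// The value types this sketch knows about.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValueKind {
    Text,
    Date,
    DateTime,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum KindError {
    /// The sketch only handles SUMMARY and DTSTART.
    UnsupportedProperty(String),
    /// The VALUE parameter named a type this property does not accept.
    DisallowedValueType { property: String, value_type: String },
}

fn disallowed(property: &str, value_type: &str) -> KindError {
    KindError::DisallowedValueType {
        property: property.to_string(),
        value_type: value_type.to_string(),
    }
}

/// Decides how a property's value should be read, given the property name
/// and its parameters as (key, value) pairs.
pub fn value_kind(name: &str, params: &[(&str, &str)]) -> Result<ValueKind, KindError> {
    // An explicit VALUE parameter overrides the property's default value type.
    let explicit = params
        .iter()
        .find(|(key, _)| key.eq_ignore_ascii_case("VALUE"))
        .map(|(_, value)| value.to_ascii_uppercase());

    match name.to_ascii_uppercase().as_str() {
        "SUMMARY" => match explicit.as_deref() {
            None | Some("TEXT") => Ok(ValueKind::Text),
            Some(other) => Err(disallowed(name, other)),
        },
        "DTSTART" => match explicit.as_deref() {
            None | Some("DATE-TIME") => Ok(ValueKind::DateTime),
            Some("DATE") => Ok(ValueKind::Date),
            Some(other) => Err(disallowed(name, other)),
        },
        _ => Err(KindError::UnsupportedProperty(name.to_string())),
    }
}
```

Even at this scale, the value cannot be parsed until both the name and the parameters have been read, and the set of legal parameters itself depends on the name.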
"Plain" Data
The majority of the data stored in a calendar is exactly what you would expect: simple declarations for events, TODOs, timezones, and so on. This is the raw, plain data, and is exactly the kind of thing that you could build a nice DSL for. Frankly, given that we're just concerned with records, the poor man's DSL could just be JSON or YAML.
But there are other aspects of the data that aren't really plain at all. When you define an event component, it requires both a unique identifier (UID) and a timestamp (DTSTAMP).[2] These values don't exist to record information that any user will see, but instead serve to manage interactions with other calendaring systems. Other values exist to track e.g. the states of attendees, whether or not they have agreed to attend, and any responses they might have sent. In all, a decent chunk of the data in an .ics file is used to negotiate interactions with external systems and record information about itself, and changes too frequently to be reasonably managed by hand.
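The UID and DTSTAMP requirement is the simplest case of a component's type dictating its properties, and it can be sketched as a lookup. The function name `missing_required` is hypothetical, and the table only covers the two event properties mentioned above; other component types are treated as having no requirements, which is a simplification rather than what the RFC says.

```rust
/// Returns the required property names that are absent from `present`,
/// comparing names case-insensitively.
///
/// Assumption: only VEVENT's UID and DTSTAMP are modelled here.
pub fn missing_required(kind: &str, present: &[&str]) -> Vec<&'static str> {
    let required: &[&'static str] = if kind.eq_ignore_ascii_case("VEVENT") {
        &["UID", "DTSTAMP"]
    } else {
        &[]
    };

    required
        .iter()
        .copied()
        .filter(|name| !present.iter().any(|p| p.eq_ignore_ascii_case(name)))
        .collect()
}
```

A hand-written event with only a summary and a start time would fail this check, even though it already says everything a human reader cares about.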
To paraphrase, a calendar really contains two histories: the primary, nominal history it represents, and a second history about that primary history. A metahistory? More modern RFCs have introduced additional metahistory data, especially RFCs 9073 and 9074.
Why Does This Matter?
I originally began this project because I had a much more interesting goal in mind: designing a plaintext calendaring language.[3] I had assumed that the interesting challenge would be designing something that could convey the primary history, with maybe some allowances for pulling in external data directly from .ics files. But it has become increasingly clear to me that the actual hard problem is in the second history: how do you neatly capture the metahistory of a file in the file itself? Can you do it without tooling support? Without metadata files?
It strikes me that people use calendars to communicate with other people just as much as (if not more than) they do to record information for themselves. I don't think a calendaring system is usable if it can't record and interact with the necessary metahistory to make those communications possible, and so I'm now pretty uncertain as to where the project goes from here.
Is there a potential user base for a simpler, plaintext calendar that eschews much of the value provided by metahistory data? Do I even fall into it myself? I genuinely don't know.
[1] I would be remiss not to point out that the parsing requirements described in RFC 5545 are utterly bizarre, and most other implementations just give up and reallocate every line of the input to deal with the line-folding rule (calico avoids this).
[2] What does the timestamp mean? Good question! It depends on whether or not the CREATED and LAST-MODIFIED properties are also present, as well as whether or not the parent calendar has a METHOD property; as a shorthand you can think of it as the time the component was last revised.
[3] Alongside @fake.koldinium.com and @cwonus.org, of course.