Top-level specification

Assumptions

An assumption (A-…) is a requirement for the intended behaviour of dlb that cannot be checked at runtime in a reliable and efficient way.

For every assumption at set of acceptable effects of its violation is given:

repair
The working tree needs manual repair.
obscure-fail
A build may fail in an obscure way. The diagnostic messages do neither clearly indicate the problem nor a way to fix it.
vulnerable
The build becomes vulnerable to attacks. Running tool instances might overwrite any filesystem object the process has permission to, potentially with information generated during the build or stored in the working tree.
redo-miss
A redo miss in the current or a future run of dlb can occur.
graceful-fail
A build may fail in a meaningful way. The diagnostic messages clearly indicate the problem or a way to fix it.

Modification of the working tree

A-A1 (access to management tree)

Except for modifications by dlb internals, the management tree is not modified while dlb is running and only as suggested by diagnostic messages of dlb.

Changing the absolute path of the working tree’s root is considered a modification.

Acceptable when violated:

A-A2 (access to managed tree)

While a tool instance in running, the managed tree is modified only by running tool instances.

Changing the absolute path of the working tree’s root is considered a modification.

Acceptable when violated:

A-A3 (manual modification of mtime)

Except from modifications requested by a running tool instance, every modification of the mtime of a filesystem object in the working tree is a mtime update. [2]

Acceptable when violated:

A-A4

No part of the filesystem outside of the working tree is modified while a tool instance t is running, unless it cannot affect the behaviour of t.

Acceptable when violated:

Filesystems behaviour of working tree

A-F1 (one filesystem)

Every filesystem object w/p, where w is the path of the working tree’s root and p is a relative path without .. components, resides on the same (local or remote) file system.

Acceptable when violated:

A-F2 (mtime update at creation)

Every creation of a filesystem object in the working tree updates its mtime. [4]

Acceptable when violated:

A-F3 (mtime update at write to regular file)

Every write to regular file in the working tree updates its mtime as soon as it is completed. [4] [1]

Between start and completion of a write, a reader of the file may observe an intermediate state of the file’s content.

[--------------] content change

^              ^
start          mtime update (write complete)

-------------------> ideal time

Acceptable when violated:

A-F4 (mtime update for directory)

Every creation, removal, renaming and attribute change of a filesystem object in the working tree updates the mtime of the (directly) containing directory. [4]

Acceptable when violated:

A-F5 (moving is atomic)

Moving a regular file, a directory or a symbolic link in the working tree to a different directory in the working tree is possible in an reasonably secure, efficient and atomic operation that does not affect the moved object’s filesystem attributes (including mtime in full resolution).

Acceptable when violated:

A-F6 (moving makes invisible)

Immediately after a regular file, a directory or a symbolic link in the working tree has been successfully moved to a different directory within the same working tree, no other process “sees” it in the original directory.

Acceptable when violated:

A-F7 (no corruption)

A filesystem object in working tree is never corrupted (e.g. by failure of software, memory or power).

Acceptable when violated:

Timing and concurrency

A-T1 (working tree time exists)

The mtime of every filesystem object in the working tree is updated from the same system time (local or remote), the working tree’s system time. Whenever a mtime update occurs for a filesystem object p after one occurs for filesystem object q, where p and q are in the working tree, the mtime of p is not later than the mtime of q. [3]

Acceptable when violated:

A-T2 (working tree time mostly monotonically increasing)

With the exception of rare backward jumps, the working tree time is monotonically increasing.

The time between consecutive backward jumps is longer than the duration of a run of dlb.

Acceptable when violated:

A-T3 (effective mtime resolution)

The regular file o in the management tree has an effective mtime resolution no coarser than 100 ms.

Acceptable when violated:

A-T4 (working tree time of true input dependencies in the past)

The mtime of every filesystem object in the managed tree that is an true input dependency of a tool instance t is earlier than the time t starts running.

Acceptable when violated:

Dependencies

A-D1 (regular files)
Most of the filesystem objects in the managed tree that serve as input dependency of tool instances are regular files.
A-D2 (no links to input dependencies)

A filesystem object in the managed tree that serves as an input dependency of a tool instance has no hard link or symbolic link pointing to it in the managed tree.

Acceptable when violated:

A-D3 (no implicit symbolic links in paths)

A filesystem object in the managed tree that serves as a dependency of a tool instance t does not have a parent directory p in its path that is a symbolic link, unless p is an input dependency of t and in the working tree.

Acceptable when violated:

Guarantees

A guarantee (G-…) is a specification of behaviour observable by the user.

Dependencies

G-D1 (no redo miss when working tree time monotonic)

A benign managed tree modification is redo-safe, provided the assumptions A-F1, A-F2, A-F3, A-F4 hold and the working tree time is monotonically increasing (at least since the oldest mtime of all filesystem objects that are true input dependencies of a tool instance).

This is true even when assumption A-A2 is violated.

G-D2 (no redo miss when file size changes)
Modifying the content of a regular file in the managed tree while a tool instance is running (in violation of A-A2) is redo-safe if it also changes the size of the regular file.
G-D3 (redo miss unlikely when modification intervals relatively long)

A benign managed tree modification is likely to be redo-safe, provided the assumptions A-F1, A-F2, A-F3, A-F4 hold and the “modification intervals are relatively long” for every filesystem object that is a true input dependency of a tool instance.

Here, a modification interval of a filesystem object p is considered to be relatively long if it is unlikely that the working tree time at the ideal time t is the same as at t + T, where T is the ideal time between two consecutive mtime updates of p.

This is true even when assumption A-A2 is violated.

G-D4
When assumption A-T2 is violated at a certain time at the start of the “redo check” phase of a running tool instance, the build is aborted with a meaningful diagnostic message. [6]

Timing and concurrency

G-T1 (active context exit)
An active context is not left as long as a tool instance is running in it.
G-T2 (root context exit)
A root context is not left other than by a raised exception before there has been a time window with the following property: The mtime of a regular file o in the management tree would have been different from the mtime of the last filesystem object modified by a running tool instance.
G-T3 (multiple dlb processes)
When multiple scripts are run by different processes on the same working tree, at most one of them is in an active context at the same time.
G-T4 (threads)
dlb does not create threads.
[1]The update of mtime for an mmap’d file conforming to ISO 1003.1-2008 after a write to the mapped memory is only guaranteed via msync(). Therefore such a write operation is not considered complete before the next call of msync() (which may never happen). Actual behaviour of mmap on different operating systems (2018): https://apenwarr.ca/log/20181113.
[2]Especially, mtime is not manually set with touch -t or any tool that uses a coarser time resolution than the effective mtime resolution. See touch and utimensat().
[3]Linux currently (2020) uses ktime_get_coarse_real_ts64() as time source for its (optional) mtime updates, which which returns the system-wide realtime clock at the last tick.
[4](1, 2, 3) Some filesystems support mount options to sacrifice this guaranteed for performance. Example: Ext4 with mount option lazytime.
[5]adjtime() is not covered by ISO 1003.1-2008. It originated in 4.3BSD and System V. For many operating systems it states “the clock is always monotonically increasing” (Linux, OpenBSD, FreeBSD).
[6]Resolution for this situation: Try later (delay should be part of the message) or correct the mtime of the affected filesystem objects (list should be part of the message).