The global state of a Fossil repository is kept simple so that it can endure in useful form for decades or centuries. A Fossil repository is intended to be readable, searchable, and extensible by people not yet born.

The global state of a Fossil repository is an unordered set of artifacts. An artifact might be a source code file, the text of a wiki page, part of a trouble ticket, a description of a check-in including all the files in that check-in with the check-in comment, and so forth. Artifacts are broadly grouped into two types: content artifacts and structural artifacts. Content artifacts are the raw project source-code files that are checked into the repository. Structural artifacts have special formatting rules and are used to show the relationships between other artifacts in the repository. It is possible for an artifact to be both a structural artifact and a content artifact, though this is rare. Artifacts can be text or binary.

In addition to the global state, each Fossil repository also contains local state. The local state consists of web-page formatting preferences, authorized users, ticket display and reporting formats, and so forth. The global state is shared in common among all repositories for the same project, whereas the local state is often different in separate repositories. The local state is not versioned and is not synchronized with the global state. The local state is not composed of artifacts and is not intended to be enduring. This document is concerned with global state only. Local state is only mentioned here in order to distinguish it from global state.

1.0 Artifact Names

Each artifact in the repository is named by a hash of its content. No prefixes, suffixes, or other information is added to an artifact before the hash is computed. The artifact name is just the (lower-case hexadecimal) hash of the raw artifact.

Fossil currently computes artifact names using either SHA1 or SHA3-256. It is relatively easy to add new algorithms in the future, but there are no plans to do so at this time.
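The naming rule above can be sketched in a few lines of Python. This is an illustration only, not Fossil's own code; the function name artifact_name and its algorithm parameter are hypothetical.

```python
import hashlib

def artifact_name(content: bytes, algorithm: str = "sha3_256") -> str:
    """Return the lower-case hexadecimal hash of the raw artifact bytes.

    Nothing is prepended or appended to the content before hashing.
    """
    if algorithm not in ("sha1", "sha3_256"):
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    return hashlib.new(algorithm, content).hexdigest()
```

A SHA1 name is 40 hexadecimal characters long and a SHA3-256 name is 64, so the two kinds of names can be told apart by length alone.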

When referring to artifacts using tty commands or webpage URLs, it is sufficient to specify a unique prefix of the artifact name. If the input prefix is not unique, Fossil will show an error. Within a structural artifact, however, all references to other artifacts must be the complete hash.
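Prefix lookup can be pictured as follows. This is a minimal sketch over an in-memory list of names; the helper resolve_prefix is hypothetical and says nothing about how Fossil performs the lookup internally.

```python
from typing import Iterable

def resolve_prefix(prefix: str, names: Iterable[str]) -> str:
    """Return the one artifact name that starts with prefix.

    Raises LookupError when the prefix matches no name or more than one.
    """
    matches = [name for name in names if name.startswith(prefix)]
    if not matches:
        raise LookupError(f"no artifact matches prefix {prefix!r}")
    if len(matches) > 1:
        raise LookupError(f"prefix {prefix!r} is ambiguous ({len(matches)} matches)")
    return matches[0]
```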

Prior to Fossil version 2.0, all names were formed from the SHA1 hash of the artifact. The key innovation in Fossil 2.0 was adding support for alternative hash algorithms.

2.0 Structural Artifacts

A structural artifact is an artifact with a particular format that is used to define the relationships between other artifacts in the repository. Fossil recognizes the following kinds of structural artifacts:

These eight structural artifact types are described in subsections below.

Structural artifacts are ASCII text. The artifact may be PGP clearsigned. After removal of the PGP clearsign header and suffix (if any) a structural artifact consists of one or more 'cards' separated by a single newline (ASCII: 0x0a) character. Each card begins with a single character 'card type'. Zero or more arguments may follow the card type. All arguments are separated from each other and from the card-type character by a single space character. There is no surplus white space between arguments and no leading or trailing whitespace except for the newline character that acts as the card separator. All cards must be in strict lexicographical order. There may not be any duplicate cards.
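The card rules above can be checked mechanically. The following Python sketch is illustrative only: parse_cards is a hypothetical helper, it assumes any PGP wrapper has already been removed, it tolerates one trailing newline after the last card (an assumption), and it does not handle cards that carry a multi-line payload.

```python
from typing import List, Tuple

def parse_cards(text: str) -> List[Tuple[str, List[str]]]:
    """Split a structural artifact into (card_type, arguments) pairs."""
    # Assumption: a single newline after the final card is accepted.
    lines = text[:-1].split("\n") if text.endswith("\n") else text.split("\n")
    cards = []
    for line in lines:
        fields = line.split(" ")
        # An empty field means a blank line or a doubled, leading,
        # or trailing space; the card type must be one character.
        if "" in fields or len(fields[0]) != 1:
            raise ValueError(f"malformed card: {line!r}")
        cards.append((fields[0], fields[1:]))
    # Strictly increasing order also rules out duplicate cards.
    for prev, cur in zip(lines, lines[1:]):
        if not prev < cur:
            raise ValueError(f"card out of order or duplicated: {cur!r}")
    return cards
```

Comparing whole lines as strings is enough here because the text is ASCII, so string order and byte order agree.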

In the current implementation (as of 2017-02-27) the artifacts that make up a Fossil repository are stored as delta- and zlib-compressed blobs in an SQLite database. This is an implementation detail and might change in a future release. For the purpose of this article 'file format' means the format of the artifacts, not how the artifacts are stored on disk. It is the artifact format that is intended to be enduring. The specifics of how artifacts are stored on disk, though stable, are not intended to live as long as the artifact format.

2.1 The Manifest

A manifest defines a check-in. A manifest contains a list of artifacts for each file in the project and the corresponding filenames, as well as information such as parent check-ins, the username of the programmer who created the check-in, the date and time when the check-in was created, and any check-in comments associated with the check-in.

Allowed cards in the manifest are as follows:

F filename ?hash? ?permissions? ?old-name?
Q (+|-)artifact-hash ?artifact-hash?
T (+|-|*)tag-name * ?value?

A manifest may optionally have a single B card. The B card specifies another manifest that serves as the 'baseline' for this manifest. A manifest that has a B card is called a delta-manifest and a manifest that omits the B card is a baseline-manifest. The other manifest identified by the argument of the B card must be a baseline-manifest. A baseline-manifest records the complete contents of a check-in. A delta-manifest records only changes from its baseline.
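The relationship between a delta-manifest and its baseline can be sketched as an overlay. The example below is an illustration under simplifying assumptions: the file lists are already parsed into plain Python values, the name effective_files is hypothetical, and a delta entry whose hash is None stands for an F card with the hash omitted, which the F card description in this section defines as a deletion relative to the baseline.

```python
from typing import Dict, Iterable, Optional, Tuple

def effective_files(
    baseline: Dict[str, str],
    delta: Iterable[Tuple[str, Optional[str]]],
) -> Dict[str, str]:
    """Return the full path -> hash mapping of a delta-manifest check-in."""
    files = dict(baseline)            # start from the complete baseline
    for path, file_hash in delta:
        if file_hash is None:
            files.pop(path, None)     # file deleted relative to the baseline
        else:
            files[path] = file_hash   # file added or changed
    return files
```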

A manifest must have exactly one C card. The sole argument to the C card is a check-in comment that describes the check-in that the manifest defines. The check-in comment is text. The following escape sequences are applied to the text: A space (ASCII 0x20) is represented as '\s' (ASCII 0x5C, 0x73). A newline (ASCII 0x0a) is '\n' (ASCII 0x5C, 0x6E). A backslash (ASCII 0x5C) is represented as two backslashes '\\'. Apart from space and newline, no other whitespace characters are allowed in the check-in comment. Nor are any unprintable characters allowed in the comment.
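The three escape sequences can be illustrated with a pair of helper functions. This is a sketch covering only the escapes named above; encode_text and decode_text are hypothetical names, and rejecting unknown escapes is a choice made for this example, not documented behavior.

```python
def encode_text(text: str) -> str:
    """Escape backslash, space, and newline for use as a card argument."""
    # Backslashes must be doubled first so that the backslashes
    # introduced by the next two replacements are not doubled again.
    return text.replace("\\", "\\\\").replace(" ", "\\s").replace("\n", "\\n")

def decode_text(encoded: str) -> str:
    """Reverse encode_text; raises ValueError on an unknown escape."""
    escapes = {"s": " ", "n": "\n", "\\": "\\"}
    out = []
    i = 0
    while i < len(encoded):
        ch = encoded[i]
        if ch == "\\":
            nxt = encoded[i + 1 : i + 2]
            if nxt not in escapes:
                raise ValueError(f"bad escape at offset {i}")
            out.append(escapes[nxt])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

The encoded form contains no spaces or newlines, which is what allows a whole comment to travel as a single card argument.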

A manifest must have exactly one D card. The sole argument to the D card is a date-time stamp in the ISO8601 format. The date and time should be in coordinated universal time (UTC). The format is one of:


A manifest has zero or more F cards. Each F card identifies a file that is part of the check-in. There are one, two, three, or four arguments. The first argument is the pathname of the file in the check-in relative to the root of the project file hierarchy. No '..' or '.' directories are allowed within the filename. Space characters are escaped as in C card comment text. Backslash characters and newlines are not allowed within filenames. The directory separator character is a forward slash (ASCII 0x2F). The second argument to the F card is the lower-case hexadecimal artifact hash of the content artifact. The second argument is required for baseline manifests but is optional for delta manifests. When the second argument to the F card is omitted, it means that the file has been deleted relative to the baseline (a file removed in a baseline manifest is simply not listed as an F card). The optional 3rd argument defines any special access permissions associated with the file. This can be defined as 'x' to mean that the file is executable or 'l' (small letter ell) to mean a symlink. All files are always readable and writable. This can be expressed by 'w' permission if desired but is optional. The file format might be extended with new permission letters in the future. The optional 4th argument is the name of the same file as it existed in the parent check-in. If the name of the file is unchanged from its parent, then the 4th argument is omitted.
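A single F card can be taken apart as follows. This is a hedged sketch rather than Fossil's parser: FileCard and parse_f_card are hypothetical names, the hash is not validated, and only the space escape is decoded because backslashes and newlines cannot occur in filenames.

```python
from typing import NamedTuple, Optional

class FileCard(NamedTuple):
    path: str                  # pathname relative to the project root
    hash: Optional[str]        # None means deleted relative to the baseline
    perms: str                 # "" when no permission argument is present
    old_path: Optional[str]    # previous name, if the file was renamed

def parse_f_card(line: str) -> FileCard:
    """Parse one F card line (without its trailing newline)."""
    fields = line.split(" ")
    if fields[0] != "F" or not 2 <= len(fields) <= 5:
        raise ValueError(f"not a valid F card: {line!r}")
    path = fields[1].replace("\\s", " ")
    # '.' and '..' path components are not allowed.
    if any(part in (".", "..") for part in path.split("/")):
        raise ValueError(f"illegal directory in path: {path!r}")
    file_hash = fields[2] if len(fields) > 2 else None
    perms = fields[3] if len(fields) > 3 else ""
    old_path = fields[4].replace("\\s", " ") if len(fields) > 4 else None
    return FileCard(path, file_hash, perms, old_path)
```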

A manifest has zero or one N cards. The N card specifies the mimetype for the text in the comment of the C card. If the N card is omitted, a default mimetype is used.

A manifest has zero or one P cards. Most manifests have one P card. The P card has a varying number of arguments that define other manifests from which the current manifest is derived. Each argument is a lowercase hexadecimal artifact hash of a predecessor manifest. All arguments to the P card must be unique within that card. The first argument is the artifact hash of the direct ancestor of the manifest. Other arguments define manifests with which the first was merged to yield the current manifest. Most manifests have a P card with a single argument. The first manifest in the project has no ancestors and thus has no P card or (depending on the Fossil version) an empty P card (no arguments).

A manifest has zero or more Q cards. A Q card is similar to a P card in that it defines a predecessor to the current check-in. But whereas a P card defines the immediate ancestor or a merge ancestor, the Q card is used to identify a single check-in or a small range of check-ins which were cherry-picked for inclusion in or exclusion from the current manifest. The first argument of the Q card is the artifact ID of another manifest (the 'target') which has had its changes included or excluded in the current manifest. The target is preceded by '+' or '-' to show inclusion or exclusion, respectively. The optional second argument to the Q card is another manifest artifact ID which is the 'baseline' for the cherry-pick. If omitted, the baseline is the primary parent of the target. The changes included or excluded consist of all changes moving from the baseline to the target.

The Q card was added to the interface specification on 2011-02-26. Older versions of Fossil will reject manifests that contain Q cards.

A manifest may optionally have a single R card. The R card has a single argument which is the MD5 checksum of all files in the check-in except the manifest itself. The checksum is expressed as 32 characters of lowercase hexadecimal. The checksum is computed as follows: For each file in the check-in (except for the manifest itself) in strict sorted lexicographical order, take the pathname of the file relative to the root of the repository, append a single space (ASCII 0x20), the size of the file in ASCII decimal, a single newline character (ASCII 0x0A), and the complete text of the file. Compute the MD5 checksum of the result.
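The R card computation described above can be sketched in Python. The function r_card_checksum is a hypothetical name, and the sketch makes two assumptions: file contents are supplied as an in-memory mapping of unescaped pathname to bytes, and "lexicographical order" is read as byte order of the UTF-8 pathnames.

```python
import hashlib
from typing import Dict

def r_card_checksum(files: Dict[str, bytes]) -> str:
    """Return the 32-character lowercase hex MD5 used as the R card argument."""
    md5 = hashlib.md5()
    # Assumption: sort by the raw bytes of each pathname.
    for path in sorted(files, key=lambda p: p.encode("utf-8")):
        content = files[path]
        # "<pathname> <size in decimal>\n" followed by the file content.
        header = path.encode("utf-8") + b" " + str(len(content)).encode("ascii") + b"\n"
        md5.update(header)
        md5.update(content)
    return md5.hexdigest()
```

Feeding the pieces to one running MD5 object gives the same result as hashing their concatenation, without building the whole check-in in memory at once.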

A manifest might contain one or more T cards used to set tags or properties on the check-in. The format of the T card is the same as described in the Control Artifacts section below, except that the second argument is the single character '*' instead of an artifact ID. The * in place of the artifact ID indicates that the tag or property applies to the current artifact. It is not possible to encode the current artifact ID as part of an artifact, since the act of inserting the artifact ID would change the artifact ID, hence a * is used to represent 'self'. T cards are typically added to manifests in order to set the branch property and a symbolic name when the check-in is intended to start a new branch.

Each manifest has a single U card. The argument to the U card is the login of the user who created the manifest. The login name is encoded using the same character escapes as are used for the check-in comment argument to the C card.

A manifest must have a single Z card as its last line. The argument to the Z card is a 32-character lowercase hexadecimal MD5 hash of all prior lines of the manifest up to and including the newline character that immediately precedes the 'Z', excluding any PGP clear-signing prefix. The Z card is a sanity check to prove that the manifest is well-formed and consistent.
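The Z card check can be sketched as below. Both function names are hypothetical, and the sketch assumes that any PGP clear-signing wrapper has already been stripped and that the Z card line is terminated by a newline.

```python
import hashlib

def append_z_card(body: bytes) -> bytes:
    """Append a Z card to body, which must already end with a newline."""
    return body + b"Z " + hashlib.md5(body).hexdigest().encode("ascii") + b"\n"

def z_card_is_valid(artifact: bytes) -> bool:
    """Check the final Z card against the MD5 of everything before it."""
    idx = artifact.rfind(b"\nZ ")
    if idx < 0:
        return False
    body = artifact[: idx + 1]                   # includes the newline before 'Z'
    claimed = artifact[idx + 3 :].rstrip(b"\n")  # hex digits after "Z "
    return hashlib.md5(body).hexdigest().encode("ascii") == claimed
```

Searching from the end of the artifact means earlier text that happens to contain a line starting with "Z " does not confuse the check.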

A sample manifest from Fossil itself can be seen here.

2.2 Clusters

A cluster is an artifact that declares the existence of other artifacts. Clusters are used during repository synchronization to help reduce network traffic. As such, clusters are an optimization and may be removed from a repository without loss or damage to the underlying project code.

Allowed cards in the cluster are as follows:


A cluster contains one or more M cards followed by a single Z card. Each M card has a single argument which is the artifact ID of another artifact in the repository. The Z card works exactly like the Z card of a manifest. The argument to the Z card is the lower-case hexadecimal representation of the MD5 checksum of all prior cards in the cluster. The Z card is required.
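Assembling a cluster from a set of artifact IDs might look like the following. This is an illustrative sketch, not Fossil's implementation: build_cluster is a hypothetical name, the IDs are sorted so that the M cards satisfy the general card-ordering rule, and the newline after the Z card is an assumption.

```python
import hashlib
from typing import Iterable

def build_cluster(artifact_ids: Iterable[str]) -> str:
    """Return cluster text: sorted M cards followed by one Z card."""
    ids = sorted(set(artifact_ids))   # no duplicates, lexicographical order
    if not ids:
        raise ValueError("a cluster needs at least one M card")
    body = "".join(f"M {artifact_id}\n" for artifact_id in ids)
    checksum = hashlib.md5(body.encode("ascii")).hexdigest()
    return body + f"Z {checksum}\n"
```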

An example cluster from Fossil can be seen here.

2.3 Control Artifacts

Control artifacts are used to assign properties to other artifacts within the repository. Allowed cards in a control artifact are as follows:

T (+|-|*)tag-name artifact-id ?value?

A control artifact must have one D card, one U card, one Z card and one or more T cards. No other cards or other text is allowed in a control artifact. Control artifacts might be PGP clearsigned.

The D card and the Z card of a control artifact are the same as in a manifest.

The T card represents a tag or property that is applied to some other artifact. The T card has two or three arguments. The first argument is the tag name, whose first character is either '+', '-', or '*'. The '+' means the tag should be added to the artifact. The '-' means the tag should be removed. The '*' character means the tag should be added to the artifact and all direct descendants (but not descendants through a merge) down to but not including the first descendant that contains a more recent '-', '*', or '+' tag with the same name. The second argument is the lowercase artifact ID of the artifact to which the tag is to be applied. The optional third argument is the value of the tag. A tag without a value is a Boolean.
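The pieces of a T card can be separated as shown in this sketch. TagCard and parse_t_card are hypothetical names; the value, if present, is returned in its encoded form, and the target is passed through unchecked so that the '*' used inside manifests is also accepted.

```python
from typing import NamedTuple, Optional

class TagCard(NamedTuple):
    kind: str              # '+' add, '-' remove, '*' add and propagate
    name: str              # tag name without the leading kind character
    target: str            # artifact ID, or '*' inside a manifest
    value: Optional[str]   # None for a Boolean tag

def parse_t_card(line: str) -> TagCard:
    """Parse one T card line (without its trailing newline)."""
    fields = line.split(" ")
    if fields[0] != "T" or len(fields) not in (3, 4):
        raise ValueError(f"not a valid T card: {line!r}")
    tag = fields[1]
    if len(tag) < 2 or tag[0] not in ("+", "-", "*"):
        raise ValueError(f"tag must start with '+', '-', or '*': {tag!r}")
    value = fields[3] if len(fields) == 4 else None
    return TagCard(tag[0], tag[1:], fields[2], value)
```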

When two or more tags with the same name are applied to the same artifact, the tag with the latest (most recent) date is used.

Some tags have special meaning. The 'comment' tag when applied to a check-in will override the check-in comment of that check-in for display purposes. The 'user' tag overrides the name of the check-in user. The 'date' tag overrides the check-in date. The 'branch' tag sets the name of the branch that a check-in belongs to. Symbolic tags begin with the 'sym-' prefix.

The U card is the name of the user that created the control artifact. The Z card is the usual required artifact checksum.

An example control artifact can be seen here.

2.4 Wiki Pages

A wiki artifact defines a single version of a single wiki page. Wiki artifacts accept the following card types:


The D card is the date and time when the wiki page was edited. The P card specifies the parent wiki pages, if any. The L card gives the name of the wiki page. The optional N card specifies the mimetype of the wiki text. If the N card is omitted, the mimetype is assumed to be text/x-fossil-wiki. The U card specifies the login of the user who made this edit to the wiki page. The Z card is the usual checksum over the entire artifact and is required.

The W card is used to specify the text of the wiki page. The argument to the W card is an integer which is the number of bytes of text in the wiki page. That text follows the newline character that terminates the W card. The wiki text is always followed by one extra newline.

The C card on a wiki page is optional. The argument is a comment that explains why the change was made. The ability to have a C card on a wiki page artifact was added on 2019-12-02 at the suggestion of user George Krivov and is not currently used or generated by the implementation. Older versions of Fossil will reject a wiki-page artifact that includes a C card.

An example wiki artifact can be seen here.

3.0 Card Summary

The following table summarizes the various kinds of cards that appear on Fossil artifacts. A blank entry means that combination of card and artifact is not legal. A number or range of numbers indicates the number of times a card may (or must) appear in the corresponding artifact type. For example, a value of 1 indicates a required unique card and 1+ indicates that one or more such cards are required.

Card Format                                 Used By
A filename target ?source?                  1
E technote-time technote-id                 1
F filename ?uuid? ?permissions? ?oldname?   0+
J name ?value?                              1+
P uuid ..                                   0-1  0-1  0-1  0-1
Q (+|-)uuid ?uuid?                          0+
T (+|*|-)tagname uuid ?value?               0+  1+  0+

4.0 Addenda

This section contains additional information which may be useful when implementing algorithms described above.

4.1 R-Card Hash Calculation

Given a manifest file named MF, the following Bash shell code demonstrates how to compute the value of the R card in that manifest. This example uses manifest [28987096ac]. Lines starting with # are shell input and other lines are output. This demonstration assumes that the file versions represented by the input manifest are checked out under the current directory.

Minor caveats: the above demonstration will work only when none of the filenames in the manifest are 'fossilized' (encoded) because they contain spaces. If any are, the shell-generated hash will differ because the stat calls will fail to find such files (which are output in encoded form here). That approach also won't work for delta manifests. Calculating the R card for delta manifests requires traversing both the delta and its baseline in lexical order of the files, preferring the delta's copy if both contain a given file.

