Cryptosphere Identity Format (CIF)
Pronounced "sif" like the beginning of "sift"
A certificate format for the Cryptosphere. We have elected not
to use ASN.1-derived formats like X.509, and instead use a novel certificate
format (cue the obligatory XKCD comic).
This repository provides both the home of the format and a reference
implementation in Ruby.
Rationale
The existing public key infrastructure has a number of known issues:
The goal of a new certificate format should be to address all of these points,
with special attention paid to the third: designing a format that satisfies
security concerns at a linguistic level.
Our design will consider the Security Applications of Formal Language Theory.
Improvements
We propose the following to address the above problems:
- A simple design that builds on existing standards (including JSON)
- A human-readable format that can be viewed in any file viewer or editor
- A format that learns the lessons of LANGSEC, with a formal grammar
that is unambiguous and easy to implement

Linguistic Underpinnings
To understand the design choices of CIF from a linguistic perspective, we have
to examine one of the most fundamental parts of language theory, the
Chomsky Hierarchy. Languages, be they the natural languages we speak, the
programming languages humans use, or the instruction set architectures that our
CPUs execute, fall into four fundamental categories:
- Regular: regular expressions. Can understand sequential patterns. Can't count
- Context-free: can understand tree structures, but can't use symbols within
what it's processing to help further understand what's being described
- Context-sensitive: interprets portions of what's being processed to control
subsequent processing
- Recursively enumerable (Turing complete): capable of unbounded computation
We will select a format that is context-sensitive. At first glance this
might not satisfy LANGSEC's requirements:

We will not be building a "weird machine", however. We will use a very simple
format with built-in restrictions that will hopefully make even the most
skeptical LANGSEC scrutinizer happy.
Our grammar will be context-sensitive because it includes a length prefix.
That's the weirdest part about it. The length prefix will also be bounded,
providing a maximum message length, and thus a guaranteed end to any
computation. Some may see a maximum length on input documents as a weakness. We
see it as a strength.
Even better, we're not going to invent anything new. We're merely going to
synthesize existing ideas.
Self-Delimiting Strings
A self-delimiting string is a simple idea: you read a length prefix, then read
a string of exactly that many bytes, which may contain any data you want. When
you're done, you can interpret the remaining data however you wish.
Some examples of self-delimiting strings are:
- netstrings: Dan Bernstein's string format. Uses a decimal prefix
of unbounded size, supporting arbitrary-length documents
- git pkt-lines: Format used by the git protocol. Uses a fixed
4-byte prefix of hex digits, representing a 16-bit value. The length counts the
prefix itself, so a pkt-line can be at most 65520 bytes in total (65516 bytes
of payload plus the 4-byte prefix).
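To make the idea of a self-delimiting string concrete, here is a minimal Ruby sketch of a netstring reader and writer. It is an illustration only and not part of the cif gem; the method names `encode_netstring` and `parse_netstring` are hypothetical. It assumes the usual netstring layout of a decimal length, a colon, the payload bytes, and a trailing comma.

```ruby
# Illustrative sketch only: these helpers are hypothetical and are not
# part of the cif gem's API.

# Encode +data+ as a netstring: "<decimal byte length>:<bytes>,".
def encode_netstring(data)
  "#{data.bytesize}:".b + data.b + ",".b
end

# Parse one netstring from the front of +input+.
# Returns [payload, rest], where +rest+ is whatever follows the netstring.
# Raises ArgumentError if the input is malformed or truncated.
def parse_netstring(input)
  input = input.b # work on raw bytes, not characters

  # Decimal length with no leading zeros, followed by a colon.
  match = /\A(0|[1-9][0-9]*):/.match(input)
  raise ArgumentError, "missing length prefix" unless match

  length = Integer(match[1], 10)
  start  = match.end(0)

  payload = input.byteslice(start, length)
  if payload.nil? || payload.bytesize != length
    raise ArgumentError, "truncated payload"
  end
  unless input.byteslice(start + length, 1) == ","
    raise ArgumentError, "missing trailing comma"
  end

  [payload, input.byteslice(start + length + 1, input.bytesize)]
end

payload, rest = parse_netstring("12:hello world!,rest")
# payload is "hello world!" and rest is "rest"
```

Note that nothing in this sketch limits how large the decimal prefix may be, which is exactly the unbounded property described above.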
We will be using git pkt-lines to frame our certificates. The size
limitation presents some problems, but we will work around them, and hopefully
end up in a better place for doing so from a language-theoretic perspective.
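As a rough illustration of pkt-line framing, the Ruby sketch below encodes and decodes a single pkt-line. It is not the cif gem's API; the constants and the method names `encode_pkt_line` and `decode_pkt_line` are hypothetical. It assumes the limits given in git's current protocol documentation, where the hex length counts the 4-byte prefix itself and may not exceed 65520, and it does not handle git's special packets such as the flush packet `0000`.

```ruby
# Illustrative sketch only: names here are hypothetical, not the cif gem's API.
PKT_PREFIX_SIZE = 4
PKT_MAX_TOTAL   = 65_520 # assumed limit, taken from git's protocol documentation
PKT_MAX_PAYLOAD = PKT_MAX_TOTAL - PKT_PREFIX_SIZE

# Frame +payload+ as one pkt-line: 4 hex digits (total length, prefix
# included) followed by the payload bytes.
def encode_pkt_line(payload)
  payload = payload.b
  if payload.bytesize > PKT_MAX_PAYLOAD
    raise ArgumentError, "payload exceeds #{PKT_MAX_PAYLOAD} bytes"
  end
  format("%04x", payload.bytesize + PKT_PREFIX_SIZE).b + payload
end

# Read one pkt-line from the front of +input+.
# Returns [payload, rest]. Raises ArgumentError on malformed input.
def decode_pkt_line(input)
  input  = input.b
  prefix = input.byteslice(0, PKT_PREFIX_SIZE)

  # This sketch accepts lowercase hex digits only.
  unless prefix.match?(/\A[0-9a-f]{4}\z/)
    raise ArgumentError, "bad length prefix"
  end

  total = prefix.to_i(16)
  # Lengths below 4 are reserved by git for special packets; not handled here.
  raise ArgumentError, "reserved or invalid length" if total < PKT_PREFIX_SIZE
  # The bound is checked before any payload is read.
  raise ArgumentError, "length exceeds maximum" if total > PKT_MAX_TOTAL
  raise ArgumentError, "truncated pkt-line" if input.bytesize < total

  payload = input.byteslice(PKT_PREFIX_SIZE, total - PKT_PREFIX_SIZE)
  [payload, input.byteslice(total, input.bytesize)]
end

line = encode_pkt_line("hello\n")            # "000ahello\n"
payload, rest = decode_pkt_line(line + "rest")
```

Because the prefix has a fixed width and a fixed upper bound, the decoder can reject an oversized message after reading only four bytes.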
Installation
Add this line to your application's Gemfile:
gem 'cif'
And then execute:
$ bundle
Or install it yourself as:
$ gem install cif
Contributing
- Fork this repository on GitHub
- Make your changes and send us a pull request
- If we like them we'll merge them
License
All project documentation is provided under the
Creative Commons Attribution 3.0 Unported
license.
Ruby source code Copyright (c) 2013 Tony Arcieri.
Distributed under the MIT License. See LICENSE.txt for further details.