---
coding: utf-8
title: "IANA Registration of Trustword Lists:
Guide, Template and IANA Considerations"
abbrev: IANA Registration of Trustword Lists
docname: draft-birk-pep-trustwords-05
category: std
stand_alone: yes
pi: [toc, sortrefs, symrefs, comments]
author:
#{::include ../shared/author_tags/volker_birk.mkd}
{::include ../shared/author_tags/bernie_hoeneisen.mkd}
{::include ../shared/author_tags/hernani_marques.mkd}
normative:
  RFC4949:
#  RFC7435:
  RFC8126:
informative:
  I-D.birk-pep:
  RFC1751:
  RFC1760:
  RFC2289:
  RFC3647:
  RFC6117:
  RFC6120:
#  RFC4880:
#  RFC7258:
#  RFC7942:
#  I-D.marques-pep-email:
#  I-D.birk-pep-trustwords:
#  I-D.marques-pep-rating:
  I-D.marques-pep-handshake:
  I-D.hoeneisen-pep-keysync:
  PGP.wl:
    target: https://en.wikipedia.org/w/index.php?title=PGP_word_list&oldid=749481933
    title: PGP word list
    date: 2017-11
  bitcoin.wl:
    target: https://en.bitcoin.it/w/index.php?title=Seed_phrase&oldid=66492#Word_Lists
    title: Seed Phrase
    date: 2019-06
  ISO639:
    target: https://www.iso.org/iso-639-language-codes.html
    title: "Language codes - ISO 639"
{::include ../shared/references/isoc-btn.mkd}
# {::include ../shared/references/implementation-status.mkd}
--- abstract

This document specifies the IANA Registration Guidelines for
Trustwords, describes corresponding registration procedures, and
provides a guideline for creating Trustword list specifications.

Trustwords are common words in a natural language (e.g., English),
which hexadecimal strings are mapped to. Such a mapping makes
verification processes like fingerprint comparisons more practical,
and less prone to misunderstandings.

--- middle

# Introduction

In public-key cryptography, comparing the respective public key
fingerprints for each of the communication partners involved is vital
to ensure that there is no Man-in-the-Middle (MITM) attack on the
communication channel. These fingerprints normally consist of a string
of hexadecimal characters, which are often impractical, cumbersome,
and prone to misunderstandings for end-users.

To mitigate these challenges, several systems offer Trustword
comparison as an alternative to these hexadecimal strings. Trustwords
are common words in a natural language (e.g., English), which these
hexadecimal strings are mapped to. Using Trustwords makes verification
processes like fingerprint comparisons more natural for users.

For example, in pEp's Privacy by Default proposition {{I-D.birk-pep}}
Trustwords are used to facilitate easy contact verification for
end-to-end encryption. Trustword comparison is offered after the peers
have opportunistically exchanged public keys. Examples of Trustword
lists used by current pEp implementations can be found here in CSV
format:

https://pep.foundation/dev/repos/pEpEngine/file/tip/db.

In addition to contact verification, Trustwords are also used for
other purposes, such as Human-Readable 128-bit Keys {{RFC1751}},
One-Time Passwords (OTP) {{RFC1760}} {{RFC2289}}, SSH host-key
verification, VPN server certificate verification, deriving private
keys in blockchain applications for cryptocurrencies, and importing or
synchronizing secret keys across multiple devices owned by a single
user {{I-D.hoeneisen-pep-keysync}}. Further ideas include the use of
Trustwords for private key recovery in case of loss, contact
verification in Extensible Messaging and Presence Protocol (XMPP)
{{RFC6120}}, or for X.509 certificate verification in browsers
{{RFC3647}}.

{::include ../shared/text-blocks/key-words-rfc2119.mkd}
{::include ../shared/text-blocks/terms-intro.mkd}
{::include ../shared/text-blocks/handshake.mkd}
<!-- {::include ../shared/text-blocks/trustwords.mkd} -->
<!-- {::include ../shared/text-blocks/tofu.mkd} -->
{::include ../shared/text-blocks/mitm.mkd}

# The Concept of Trustword Mapping

## Example

As already discussed, fingerprints normally consist of a string
of hexadecimal characters. A typical fingerprint looks like this:

> F482 E952 2F48 618B 01BC 31DC 5428 D7FA ACDC 3F13

Instead of the hexadecimal string, Trustwords allow users to
compare ten common words of a language of their choosing. For example,
the above fingerprint, mapped to English Trustwords, might appear as:

> dog house brother town fat bath school banana kite task

The same fingerprint might appear in German Trustwords as:

> klima gelb lappen weg trinken alles kaputt rasen rucksack durch

Note: These examples are for illustration purposes only, and are not
derived from any published Trustword list.
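
The mapping from a fingerprint to ten words can be sketched as follows. This is a minimal illustration, not the pEp implementation: the function name `fingerprint_to_trustwords`, the left-to-right (most significant chunk first) ordering, and the placeholder list whose entry i is the string "word<i>" are all assumptions made for this example.

```python
def fingerprint_to_trustwords(fingerprint_hex, wordlist, bitsize=16):
    """Map a hexadecimal fingerprint to one word per bitsize-bit chunk."""
    if len(wordlist) != 2 ** bitsize:
        raise ValueError("wordlist must contain 2 ** bitsize entries")
    # Drop the spaces used for display grouping.
    digits = "".join(fingerprint_hex.split())
    total_bits = len(digits) * 4
    if total_bits % bitsize != 0:
        raise ValueError("fingerprint length is not a multiple of bitsize")
    value = int(digits, 16)
    mask = (1 << bitsize) - 1
    words = []
    # Assumption: chunks are read from most to least significant.
    for shift in range(total_bits - bitsize, -1, -bitsize):
        words.append(wordlist[(value >> shift) & mask])
    return words


# Hypothetical placeholder list, not a published Trustword list.
placeholder_list = [f"word{i}" for i in range(2 ** 16)]
fingerprint = "F482 E952 2F48 618B 01BC 31DC 5428 D7FA ACDC 3F13"
trustwords = fingerprint_to_trustwords(fingerprint, placeholder_list)
print(len(trustwords), trustwords[0])  # 10 word62594
```

The 160-bit fingerprint yields ten 16-bit keys, which is why ten words are compared in the example above.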

## Previous work

The basic concept of Trustword mapping - also known as a biometric
word list - for fingerprint comparison is well-documented. Examples of
this concept are used with One-Time Passwords (OTP) {{RFC1751}}
{{RFC1760}} {{RFC2289}}, as well as the PGP Word List ("Pretty Good
Privacy word list") {{PGP.wl}}. Furthermore, cryptocurrencies use a
similar concept for deriving private keys {{bitcoin.wl}}.

\[\[ TODO: Explain each previous usage a bit further and synchronize
with section {{introduction}}. \]\]

Regarding today's needs, previous proposals have the following
shortcomings:

* Small/limited word lists, which generally result in more words to
  compare

* Existing word lists are usually only available in English, which
  limits their usefulness for non-English speakers

Furthermore, there are differences in the basic concept:

* The Trustword concept suggested herein intends to improve usability
  and security for all users, instead of only the technically-savvy.

* In many use cases, Trustwords are only read (aloud) during the
  comparison process, rather than being written or typed. For
  example, two users might compare their respective Trustwords during
  a phone call. Verbal comparison reduces the need to keep the actual
  Trustwords short. The use of longer Trustwords increases the
  entropy within the system, as it allows for a larger dictionary, and
  thus reduces the likelihood of phonetic collisions.


## Number of Trustwords for a language

If the number of Trustwords in a dictionary is low, only shorter
parts of the original string (e.g., fingerprint) can be mapped to a
single Trustword. Thus, many Trustwords will need to be compared,
which results in a potentially cumbersome process for users and leads
to reduced usability.
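
The effect of the dictionary size on the number of words to compare can be illustrated with a short calculation. This is a sketch only; the helper name `words_to_compare` is hypothetical, and the 160-bit length matches the 40-hex-digit fingerprint used in the example above.

```python
import math


def words_to_compare(fingerprint_bits, bitsize):
    """Number of Trustwords needed when each word encodes bitsize bits."""
    return math.ceil(fingerprint_bits / bitsize)


# A 256-word list (8 bits) versus a 65536-word list (16 bits).
print(words_to_compare(160, 8))   # 20
print(words_to_compare(160, 16))  # 10
```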
To reduce the number of Trustwords that need to be compared, pEp's
Privacy by Default proposition {{I-D.birk-pep}} calls for 16-bit
scalars to be mapped to natural language words. Therefore, the size
(by number of key-value pairs) of any key-value pair structure
is 65536. However, the number of unique values to be used in a
language may be smaller than this number. This discrepancy can be
addressed by using the same value, or Trustword, for more than one
key. In such cases, the entropy of the representation is slightly
reduced. For example, a Trustword list of 42000 words still allows
for an entropy of log_2(42000), which is roughly 15.36 bits in 16-bit
mappings. As a consequence such Trustword lists are not bijective.
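
The figures above can be reproduced with a small sketch. The function names and the policy of cycling through the unique words to fill all 65536 keys are assumptions of this example; this document does not prescribe how redundant words are distributed over the keys.

```python
import math


def max_bits_per_word(unique_words):
    """Upper bound on the entropy carried by one Trustword, in bits."""
    return math.log2(unique_words)


def fill_keys(unique_words, bitsize=16):
    """Assign a word to every key by cycling through the unique words.

    The cycling policy is an assumption made for illustration only.
    """
    count = len(unique_words)
    return [unique_words[key % count] for key in range(2 ** bitsize)]


# Hypothetical placeholder words standing in for natural language words.
unique = [f"word{i}" for i in range(42000)]
mapping = fill_keys(unique)

print(round(max_bits_per_word(len(unique)), 2))  # 15.36
print(len(mapping), len(set(mapping)))           # 65536 42000
print(mapping[0] == mapping[42000])              # True
```

Because keys 0 and 42000 share a word, the word cannot be mapped back to a single key, which is what makes such a list non-bijective.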

On the other hand, small Trustword lists can be restricted to short
words (the number of short words in a natural language is normally
limited). Short words are easier to use in implementations where
Trustwords have to be typed or written, such as in OTP applications.

Note: This specification allows for registration of variable numbers
of Trustwords per dictionary.

## Language

Although English is used around the world, the vast majority of the
global population is not English-speaking. For an application to be
useful to as wide a user base as possible, localization is
essential. Therefore, this specification allows for registration of
Trustword lists in different languages.

In applications where two humans are attempting to establish
secure communications, it is likely that they share a common language.
At this time, no real-world use cases for Trustword list translation
capability have been identified. Because the translation process
inherently - and drastically - increases complexity from an IANA
registration standpoint, the topic of Trustword translation is beyond
the scope of this document.

## The nature of the words

Every Trustword list SHOULD be clear of offensive language (i.e.,
swear/curse words, slurs, derogatory language, etc.). This review
SHOULD be performed by native speakers of each respective language.

# Security Considerations

There are no specific security considerations.

# Privacy Considerations

\[\[ TODO \]\]

# IANA Considerations

Each natural language requires a different set of Trustwords. To allow
implementers to use identical Trustword lists, an IANA registry is to
be established. The IANA registration policy according to {{RFC8126}}
is "Expert Review" and "Specification Required".

\[\[ Note: Further details of the IANA registry and requirements for
the expert to assess the specification are for further study. An
approach similar to the one used in {{RFC6117}} is likely to be
followed. \]\]

## Registration Template (XML chunk)

~~~~~~~~~~
<record>
  <languagecode>
    <!-- ISO 639-3 (e.g. eng, deu, ...) -->
  </languagecode>
  <bitsize>
    <!-- How many bits can be mapped with this list
         (e.g. 8, 16, ...) -->
  </bitsize>
  <numberofuniquewords>
    <!-- number of unique words registered
         (e.g. 256, 65536, ...) -->
  </numberofuniquewords>
  <bijective>
    <!-- whether or not the list allows for a two-way-mapping
         (e.g. yes, no) -->
  </bijective>
  <version>
    <!-- version number within language
         (e.g. b.1.2, n.0.1, ...) -->
  </version>
  <registrationdocs>
    <!-- Change accordingly -->
    <xref type="rfc" data="rfc2551"/>
  </registrationdocs>
  <requesters>
    <!-- Change accordingly -->
    <xref type="person" data="John_Doe"/>
    <xref type="person" data="Jane_Dale"/>
  </requesters>
  <additionalinfo>
    <paragraph>
      <!-- Text with additional information about
           the Wordlist to be registered -->
    </paragraph>
    <artwork>
      <!-- There can be artwork sections, too -->
    </artwork>
  </additionalinfo>
  <wordlist>
    <!-- Change accordingly -->
    <w0>first</w0>
    <w1>second</w1>
    [...]
    <w65535>last</w65535>
  </wordlist>
</record>

<people>
  <person id="John_Doe">
    <name> <!-- Firstname Lastname --> </name>
    <org> <!-- Organization Name --> </org>
    <uri> <!-- mailto: or http: URI --> </uri>
    <updated> <!-- date format YYYY-MM-DD --> </updated>
  </person>
  <!-- repeat person section for each person -->
</people>
~~~~~~~~~~

Authors of a Wordlist are encouraged to use these
XML chunks as a template to create the IANA Registration Template.
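
As an illustration of how such a chunk might be produced programmatically, the following sketch builds a partial record with Python's standard `xml.etree.ElementTree` module. The helper `build_record` is hypothetical, the four-word list is a toy example, and the registrationdocs, requesters, and additionalinfo elements are omitted for brevity.

```python
import xml.etree.ElementTree as ET


def build_record(languagecode, bitsize, version, words):
    """Build a partial <record> element from a list of words."""
    if len(words) != 2 ** bitsize:
        raise ValueError("number of words must be 2 ** bitsize")
    unique = len(set(words))
    record = ET.Element("record")
    fields = [
        ("languagecode", languagecode),
        ("bitsize", str(bitsize)),
        ("numberofuniquewords", str(unique)),
        # The list is bijective only if no word is used for two keys.
        ("bijective", "yes" if unique == len(words) else "no"),
        ("version", version),
    ]
    for tag, text in fields:
        ET.SubElement(record, tag).text = text
    wordlist = ET.SubElement(record, "wordlist")
    # Element names w0, w1, ... serve as the keys, starting at 0.
    for key, word in enumerate(words):
        ET.SubElement(wordlist, f"w{key}").text = word
    return record


# Toy 2-bit list for illustration; real lists are far larger.
record = build_record("eng", 2, "n.0.1", ["ant", "bee", "cat", "dog"])
print(ET.tostring(record, encoding="unicode"))
```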

## IANA Registration

An IANA registration will contain the following elements:

### Language Code (\<languagecode\>)

The language code follows the ISO 639-3 specification {{ISO639}},
e.g., eng, deu.

\[\[ Note: Which of the ISO 639 specifications is most suitable to
address the Trustwords' challenge is for further study. \]\]

Example usage for German:

~~~~~~~~~~
  e.g. <languagecode>deu</languagecode>
~~~~~~~~~~

### Bit Size (\<bitsize\>)

The bit size is the number of bits that can be mapped with the
Wordlist. The number of registered words in a word list MUST be
2 ^ `(<bitsize>)`.

Example usage for a 16-bit Wordlist:

~~~~~~~~~~
  e.g. <bitsize>16</bitsize>
~~~~~~~~~~

### Number Of Unique Words (\<numberofuniquewords\>)

The number of unique words that are registered.

~~~~~~~~~~
  e.g. <numberofuniquewords>65536</numberofuniquewords>
~~~~~~~~~~

### Bijectivity (\<bijective\>)

Whether the registered Wordlist has a one-to-one mapping, meaning the
number of unique words registered equals 2 ^ `(<bitsize>)`.

Valid content: ( yes \| no )

~~~~~~~~~~
e.g. <bijective>yes</bijective>
~~~~~~~~~~
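
The relationship between the bit size, the number of unique words, and bijectivity can be checked mechanically. The following is a non-normative sketch; the function `check_record` and the toy 2-bit lists are assumptions made for illustration.

```python
def check_record(bitsize, numberofuniquewords, bijective, words):
    """Return a list of inconsistencies found in a registration."""
    problems = []
    expected = 2 ** bitsize
    if len(words) != expected:
        problems.append(f"expected {expected} words, found {len(words)}")
    unique = len(set(words))
    if unique != numberofuniquewords:
        problems.append(
            f"numberofuniquewords is {numberofuniquewords}, "
            f"list has {unique}"
        )
    if bijective not in ("yes", "no"):
        problems.append("bijective must be 'yes' or 'no'")
    elif (bijective == "yes") != (unique == expected):
        problems.append("bijective flag does not match the word counts")
    return problems


# Toy 2-bit lists, small enough to read at a glance.
print(check_record(2, 4, "yes", ["ant", "bee", "cat", "dog"]))  # []
print(check_record(2, 3, "yes", ["ant", "bee", "cat", "ant"]))
```

The second call reports a mismatch, because a list that repeats a word cannot be bijective.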

### Version (\<version\>)

The version of the Wordlist MUST be unique within a language code.

\[\[ Note: Requirements for a "smart" composition of the version number
are for further study. \]\]

~~~~~~~~~~
  e.g. <version>b.1.2</version>
~~~~~~~~~~

### Registration Document(s) (\<registrationdocs\>)

Reference(s) to the Document(s) containing the Wordlist.

~~~~~~~~~~
  e.g. <registrationdocs>
         <xref type="rfc" data="rfc4979"/>
       </registrationdocs>

  e.g. <registrationdocs>
         <xref type="rfc" data="rfc8888"/> (obsoleted by RFC 9999)
         <xref type="rfc" data="rfc9999"/>
       </registrationdocs>

  e.g. <registrationdocs>
         [International Telecommunications Union,
         "Wordlist for Foobar application",
         ITU-F Recommendation B.193, Release 73, Mar 2009.]
       </registrationdocs>
~~~~~~~~~~

### Requesters (\<requesters\>)

The persons requesting the registration of the Wordlist. Usually
these are the authors of the Wordlist.

~~~~~~~~~~
  e.g. <requesters>
         <xref type="person" data="John_Doe"/>
       </requesters>

       <people>
         <person id="John_Doe">
           <name>John Doe</name>
           <org>Example Inc.</org>
           <uri>mailto:john.doe@example.com</uri>
           <updated>2018-06-20</updated>
         </person>
       </people>
~~~~~~~~~~

Note: If there is more than one requester, there must be one \<xref\>
element per requester in the \<requesters\> element, and one
\<person\> chunk per requester in the \<people\> element.

### Further Information (\<additionalinfo\>)

Any other information the authors deem interesting.

~~~~~~~~~~
  e.g. <additionalinfo>
         <paragraph>more info goes here</paragraph>
       </additionalinfo>
~~~~~~~~~~

Note: If there is no such additional information, then the
\<additionalinfo\> element is omitted.

### Wordlist (\<wordlist\>)

The full Wordlist to be registered. The number of words MUST be a
power of 2 as specified above. The element names serve as keys used
for enumeration of the Trustwords (starting at 0), and the elements
contain the values, which are individual natural language words in the
respective language.

~~~~~~~~~~
  e.g. <wordlist>
         <w0>first</w0>
         <w1>second</w1>
         [...]
         <w65535>last</w65535>
       </wordlist>
~~~~~~~~~~

\[\[ Note: The exact representation of the Wordlist is for further study.
\]\]
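
A consumer of a registered Wordlist might read the element names back into an ordered list. The sketch below assumes the XML representation shown above (which is itself still under study) and uses Python's standard `xml.etree.ElementTree` module; the function `parse_wordlist` and the four-word sample are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def parse_wordlist(xml_text):
    """Return the words of a <wordlist> element, ordered by key."""
    root = ET.fromstring(xml_text)
    if root.tag != "wordlist":
        raise ValueError("expected a <wordlist> element")
    entries = {}
    for element in root:
        # Element names carry the key: w0, w1, w2, ...
        match = re.fullmatch(r"w(\d+)", element.tag)
        if match is None:
            raise ValueError(f"unexpected element <{element.tag}>")
        key = int(match.group(1))
        if key in entries:
            raise ValueError(f"duplicate key {key}")
        entries[key] = (element.text or "").strip()
    count = len(entries)
    if sorted(entries) != list(range(count)):
        raise ValueError("keys must run from 0 without gaps")
    if count == 0 or (count & (count - 1)) != 0:
        raise ValueError("number of words must be a power of 2")
    return [entries[key] for key in range(count)]


sample = (
    "<wordlist><w0>first</w0><w1>second</w1>"
    "<w2>third</w2><w3>last</w3></wordlist>"
)
print(parse_wordlist(sample))  # ['first', 'second', 'third', 'last']
```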

# Acknowledgments

The authors would like to thank the following people who have provided
feedback or significant contributions to the development of this
document: Andrew Sullivan, Claudio Luck, Daniel Kahn Gilmore, Kelly
Bristol, Michael Richardson, Rich Salz, Volker Birk, and Yoav Nir.

This work was initially created by pEp Foundation, and then reviewed
and extended with funding by the Internet Society's Beyond the Net
Programme on standardizing pEp {{ISOC.bnet}}.

--- back

# IANA XML Template Example

This section contains a non-normative example of the IANA Registration
Template XML chunk.

~~~~~~~~~~
<record>
  <languagecode>lat</languagecode>
  <bitsize>16</bitsize>
  <numberofuniquewords>57337</numberofuniquewords>
  <bijective>no</bijective>
  <version>n.0.1</version>
  <registrationdocs>
    <xref type="rfc" data="rfc2551"/>
  </registrationdocs>
  <requesters>
    <xref type="person" data="Julius_Caesar"/>
  </requesters>
  <additionalinfo>
    <paragraph>
      This Wordlist has been optimized for
      the Roman Standards Process.
    </paragraph>
  </additionalinfo>
  <wordlist>
    <w0>errare</w0>
    <w1>humanum</w1>
    [...]
    <w65535>est</w65535>
  </wordlist>
</record>

<people>
  <person id="Julius_Caesar">
    <name>Julius Caesar</name>
    <org>Curia Romana</org>
    <uri>mailto:julius.cesar@example.com</uri>
    <updated>1999-12-31</updated>
  </person>
</people>
~~~~~~~~~~

# Document Changelog

\[\[ RFC Editor: This section is to be removed before publication \]\]

* draft-birk-pep-trustwords-04:
  * Add Privacy Considerations section
  * Swapped Security and IANA Consideration Sections
  * Corrected typo in ISO references
  * Updated Introduction, Terms and concept Sections

* draft-birk-pep-trustwords-03:
  * Update references
  * Minor edits

* draft-birk-pep-trustwords-02:
  * Minor editorial changes and bug fixes
  * Added more items to Open Issues
  * Add usage example

* draft-birk-pep-trustwords-01:
  * Included feedback from mailing list and IETF-101 SECDISPATCH WG,
    e.g.
    * Added more explanatory text / less focused on the main use case
    * Bit size as parameter
    * Explicitly stated translations are out-of-scope for this document
  * Added draft IANA XML Registration template,
    considerations, explanation and examples
  * Added Changelog to Appendix
  * Added Open Issue section to Appendix

# Open Issues

\[\[ RFC Editor: This section should be empty and is to be removed
before publication. \]\]

* Better explain previous work on Trustwords

* More explanatory text for Trustword use cases, properties and
  requirements

* Further details of the IANA registry and requirements for the expert
  to assess the specification

* Decide which ISO language code to use: ISO 639-1 (e.g., ca, de, en,
  ...), as currently used in pEp implementations (running code), or
  ISO 639-3 (e.g., eng, deu, ita, ...)

* Adjust exact representation of wordlists
  * e.g. XML, CSV, ...

* Syntax for non-ASCII letters or language symbols (UTF-8) in
  Wordlists

* Need for optional entropy value assigned to words, to account for
  similar phonetics among words in the same wordlist?

* Need for an additional field, to define what a wordlist is optimized
  for, e.g., "entropy", "minimize word lengths", ...?

* Work out (requirements for) "smart" composition of the version
  number

* Decide whether in non-bijective Wordlists the redundant words need
  to be repeated in the IANA Registration

* Register only a hash over the wordlist with IANA?

* Does it make sense to open registrations for other patterns than
  just words, e.g., images?

<!-- LocalWords: utf docname toc sortrefs symrefs hoeneisen wl ACDC
-->
<!-- LocalWords: oldid blockchain cryptocurrencies klima gelb weg
-->
<!-- LocalWords: lappen trinken alles kaputt rasen durch eng deu WG
-->
<!-- LocalWords: languagecode bitsize numberofuniquewords wordlist
-->
<!-- LocalWords: registrationdocs requesters additionalinfo uri ITU
-->
<!-- LocalWords: Firstname Lastname mailto http YYYY Bijectivity de
-->
<!-- LocalWords: Kahn Salz Yoav Nir ISOC bnet errare humanum Romana
-->
<!-- LocalWords: Changelog SECDISPATCH ita wordlists
-->