about summary refs log tree commit diff
path: root/presentations/bootstrapping-2018/result.pdfpc
blob: b0fa6c9a0ef8b5db343e5a740a4a4a78e22efae3 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
[file]
result
[last_saved_slide]
10
[font_size]
20000
[notes]
### 1
- previous discussions of hardware trust (e.g. purism presentation)
- people leap to "now I'm on my trusted Debian!"
- unless you built it from scratch (spoiler: you haven't) you're *trusting* someone

Agenda: Implications of trust with focus on bootstrap paths and reproducibility, plus how you can help.### 2
self-hosting:
- C-family: GCC pre/post 4.7, Clang
- Common Lisp: Sunshine land! (with SBCL)
- rustc: Bootstrap based on previous versions (C++ transpiler underway!)
- many other languages also work this way!

(Noteable counterexample: Clojure is written in Java!)### 3

- compilers are just one bit, the various runtimes exist, too!### 4

Could this be exploited?

People don't think about where their compiler comes from.

Even if they do, they may only go so far as to say "I'll just recompile it using <other compiler>".

Unfortunately, spoiler alert, life isn't that easy in the computer world and yes, exploitation is possible.### 5

- describe what a quine is
- classic Lisp quine
- explain demo quine
- demo demo quine

- this is interesting, but not useful - can quines do more than that?### 6

- quine-relay: "art project" with 128-language circular quine

- show source of quine-relay

- (demo quine relay?)

- side-note: this program is very, very trustworthy!### 7

Ken Thompson (designer of UNIX and a couple other things!) received Turing award in 1983, and described attack in speech.

- figure out how to detect self-compilation
- make that modification a quine
- insert modification into new compiler
- add attack code to modification
- remove attack from source, distributed binary will still be compromised! it's like evolution :)### 8

damage potential is basically infinite:

- classic "login" attack
=> also applicable to other credentials

- attack (weaken) crypto algorithms

- you can probably think of more!### 10

idea being: potential vulnerability would have to work across compilers:

the more compilers we can introduce (e.g. more architectures, different versions, different compilers), the harder it gets for a vulnerability to survive all of those

The more compilers, the merrier! Lisps are pretty good at this.### 11

if we get a bit-mismatch after DDC, not all hope is lost: Maybe the thing just isn't reproducible!

- many reasons for failures
- timestamps are a classic! artifacts can be build logs, metadata in ZIP-files or whatever
- non-determinism is the devil
- sometimes people actively introduce build-randomness (NaCl)### 12

- Does that binary download on the project's website really match the source?

- Your Linux packages are signed by someone - cool - but what does that mean?### 13

Two things should be achieved - gross oversimplification - to get to the ideal "desired state of the union":

1. full-source bootstrap: without ever introducing any binaries, go from nothing to a full Linux distribution

2. when packages are distributed, we should be able to know the expected output of a source package beforehand

=> suddenly binary distributions become a cache! But more on Nix later.### 14

- Debian project does not seem as concerned with bootstrapping as with reproducibility
- Debian mostly bootstraps on new architectures (using cross-compilation and similar techniques, from an existing binary base)
- core bootstrap (GCC & friends) is performed with previous Debian version and depending on GCC### 15

... however! Debian cares about reproducibility.

- automated testing of reproducibility
- information about the status of all packages is made available in repos
- Over 90% packages of packages are reproducible!

< show reproducible builds website >

Debian is still fundamentally a binary distribution though, but it doesn't have to be that way.### 16

Nix - a purely functional package manager

It's not a new project (10+ years), been discussed here before, has multiple components: package manager, language, NixOS.

Instead of describing *how* to build a thing, Nix describes *what* to build:### 17
### 19

In Nix, it's impossible to say "GCC is the result of applying GCC to the GCC source", because that happens to be infinite recursion.

Bootstrapping in Nix works by introducing a binary pinned by its full-hash, which was built on some previous Nix version.

Unfortunately also just a magic binary blob ... ### 20

NixOS is not actively porting all of Debian's reproducibility patches, but builds are fully repeatable:

- introducing a malicious compiler would produce a different input hash -> different package

Future slide: hope is not lost! Things are underway.### 21

- bootstrappable.org (demo?) is an umbrella page for several projects working on bootstrappability

- stage0 is an important piece: manually, small, auditable Hex programs to get to a Hex macro expander

- end goal is a full-source bootrap, but pieces are missing### 22

MES is out of the GuixSD circles (explain Guix, GNU Hurd joke)

- idea being that once you have a Lisp, you have all of computing (as Alan Key said)

- includes MesCC in Scheme -> can *almost* make a working tinyCC -> can *almost* make a working gcc 4.7

- minimal Scheme interpreter, currently built in C to get the higher-level stuff to work, goal is rewrite in hex
- bootstrapping Guix is the end goal### 23

- userspace in Darwin has a Nix project
- unsure about other BSDs, but if anyone knows - input welcome!
- F-Droid has reproducible Android packages, but that's also userspace only
- All other mobile platforms are a lost cause

Generally, all closed-source software is impossible to trust.