about summary refs log tree commit diff
path: root/presentations/bootstrapping-2018/notes.org
blob: 363d75352e62cf696c0ad6c07d8f44bae77a22b1 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
#+TITLE: Bootstrapping, reproducibility, etc.
#+AUTHOR: Vincent Ambo
#+DATE: <2018-03-10 Sat>

* Compiler bootstrapping
  This section contains notes about compiler bootstrapping, the
  history thereof, which compilers need it - and so on:

** C

** Haskell
   - self-hosted compiler (GHC)

** Common Lisp
   CL is fairly interesting in this space because it is a language
   that is defined via an ANSI standard that compiler implementations
   normally actually follow!

   CL has several ecosystem components that focus on making
   abstracting away implementation-specific calls and if a self-hosted
   compiler is written in CL using those components it can be
   cross-bootstrapped.

** Python

* A note on runtimes
  Sometimes the compiler just isn't enough ...

** LLVM
** JVM

* References
  https://github.com/mame/quine-relay
  https://manishearth.github.io/blog/2016/12/02/reflections-on-rusting-trust/
  https://tests.reproducible-builds.org/debian/reproducible.html

* Slide thoughts:
  1. Hardware trust has been discussed here a bunch, most recently
     during the puri.sm talk. Hardware trust is important, as we see
     with IME, but it's striking that people often take a leap to "I'm
     now on my trusted Debian with free software".

     Unless you built it yourself from scratch (Spoiler: you haven't)
     you're placing trust in what is basically foreign binary blobs.

     Agenda: Implications/attack vectors of this, state of the chicken
     & egg, the topic of reproducibility, what can you do? (Nix!)

  2. Chicken-and-egg issue

     It's an important milestone for a language to become self-hosted:
     You begin doing a kind of dogfeeding, you begin to enforce
     reliability & consistency guarantees to avoid having to redo your
     own codebase constantly and so on.

     However, the implication is now that you need your own compiler
     to compile itself.

     Common examples:
     - C/C++ compilers needed to build C/C++ compilers:

       GCC 4.7 was the last version of GCC that could be built with a
       standard C-compiler, nowadays it is mostly written in C++.

       Certain versions of GCC can be built with LLVM/Clang.

       Clang/LLVM can be compiled by itself and also GCC.

     - Rust was originally written in OCAML but moved to being
       self-hosted in 2011. Currently rustc-releases are always built
       with a copy of the previous release.

       It's relatively new so we can build the chain all the way.

     Notable exceptions: Some popular languages are not self-hosted,
     for example Clojure. Languages also have runtimes, which may be
     written in something else (e.g. Haskell -> C runtime)
* How to help:
  Most of this advice is about reproducible builds, not bootstrapping,
  as that is a much harder project.

  - fix reproducibility issues listed in Debian's issue tracker (focus
    on non-Debian specific ones though)
  - experiment with NixOS / GuixSD to get a better grasp on the
    problem space of reproducibility

  If you want to contribute to bootstrapping, look at
  bootstrappable.org and their wiki. Several initiatives such as MES
  could need help!