Diffstat (limited to 'web/blog/posts')
-rw-r--r-- | web/blog/posts/best-tools.md | 160 | ||||
-rw-r--r-- | web/blog/posts/emacs-is-underrated.md | 233 | ||||
-rw-r--r-- | web/blog/posts/make-object-t-again.md | 98 | ||||
-rw-r--r-- | web/blog/posts/nixery-layers.md | 272 | ||||
-rw-r--r-- | web/blog/posts/nsa-zettabytes.md | 93 | ||||
-rw-r--r-- | web/blog/posts/reversing-watchguard-vpn.md | 158 | ||||
-rw-r--r-- | web/blog/posts/sick-in-sweden.md | 26 | ||||
-rw-r--r-- | web/blog/posts/the-smu-problem.md | 151 |
8 files changed, 1191 insertions, 0 deletions
diff --git a/web/blog/posts/best-tools.md b/web/blog/posts/best-tools.md new file mode 100644 index 000000000000..e4bad8f4cd07 --- /dev/null +++ b/web/blog/posts/best-tools.md @@ -0,0 +1,160 @@ +In the spirit of various other "Which X do you use?"-pages I thought it would be +fun to have a little post here that describes which tools I've found to work +well for myself. + +When I say "tools" here, it's not about software - it's about real, physical +tools! + +If something goes on this list that's because I think it's seriously a +best-in-class type of product. + +<!-- markdown-toc start - Don't edit this section. Run M-x markdown-toc-refresh-toc --> +- [Media & Tech](#media--tech) + - [Keyboard](#keyboard) + - [Speakers](#speakers) + - [Headphones](#headphones) + - [Earphones](#earphones) + - [Phone](#phone) +- [Other stuff](#other-stuff) + - [Toothbrush](#toothbrush) + - [Shavers](#shavers) + - [Shoulder bag](#shoulder-bag) + - [Wallet](#wallet) +<!-- markdown-toc end --> + +--------- + +# Media & Tech + +## Keyboard + +The best keyboard that money will buy you at the moment is the [Kinesis +Advantage][advantage]. There's a variety of contoured & similarly shaped +keyboards on the market, but the Kinesis is the only one I've tried that has +properly implemented the keywell concept. + +I struggle with RSI issues and the Kinesis actually makes it possible for me to +type for longer periods of time, which always leads to extra discomfort on +laptop keyboards and such. + +Honestly, the Kinesis is probably the best piece of equipment on this entire +list. I own several of them and there will probably be more in the future. They +last forever and your wrists will thank you in the future, even if you do not +suffer from RSI yet. + +[advantage]: https://kinesis-ergo.com/shop/advantage2/ + +## Speakers + +The speakers that I've hooked up to my audio setup (including both record player +& Chromecast / TV) are the [Teufel Motiv 2][motiv-2]. 
I've had these for over a +decade and they're incredibly good, but unfortunately Teufel no longer makes +them. + +It's possible to grab a pair on eBay occasionally, so keep an eye out if you're +interested! + +[motiv-2]: https://www.teufelaudio.com/uk/pc/motiv-2-p167.html + +## Headphones + +I use the [Bose QC35][qc35] (note: link goes to a newer generation than the one +I own) for their outstanding noise cancelling functionality and decent sound. + +When I first bought them I didn't expect them to end up on this list as the +firmware had issues that made them only barely usable, but Bose has managed to +iron these problems out over time. + +I avoid using Bluetooth when outside and fortunately the QC35 come with an +optional cable that you can plug into any good old 3.5mm jack. + +[qc35]: https://www.bose.co.uk/en_gb/products/headphones/over_ear_headphones/quietcomfort-35-wireless-ii.html + +### Earphones + +Actually, to follow up on the above - most of the time I'm not using (over-ear) +headphones, but (in-ear) earphones - specifically the (**wired!!!**) [Apple +EarPods][earpods]. + +Apple will probably stop selling these soon because they've gotten into the +habit of cancelling all of their good products, so I have a stash of these +around. You will usually find no fewer than 3-4 of them lying around in my +flat. + +[earpods]: https://www.apple.com/uk/shop/product/MNHF2ZM/A/earpods-with-35mm-headphone-plug + +## Phone + +The best phone I have used in recent years is the [iPhone SE][se]. It was the +*last* phone that had a reasonable size (up to 4") *and* a 3.5mm headphone jack. + +Unfortunately, it runs iOS. Despite owning a whole bunch of SEs, I have finally +moved on to an Android phone that is only moderately larger (still by an +annoying amount), but does at least have a headphone jack: The [Samsung Galaxy +S10e][s10e]. + +It has pretty good hardware and I can almost reach 70% of the screen, which is +better than other phones out there right now. 
Unfortunately it runs Samsung's
+impossible-to-remove bloatware on top of Android, but that is still less
+annoying to use than iOS.
+
+QUESTION: This is the only item on this list for which I am actively seeking a
+replacement, so if you have any tips about new phones that might fit these
+criteria that I've missed please let me know!
+
+[se]: https://en.wikipedia.org/wiki/IPhone_SE
+[s10e]: https://www.phonearena.com/phones/Samsung-Galaxy-S10e_id11114
+
+# Other stuff
+
+## Toothbrush
+
+The [Philips Sonicare][sonicare] (note: link goes to a newer generation than
+mine) is excellent and well worth the money.
+
+I've had it for a few years and whereas I occasionally had minor dental issues
+before, they seem to be mostly gone now. According to my dentist the state of my
+teeth is now usually pretty good and I draw a direct correlation back to this
+thing.
+
+The newer generations come with flashy features like apps and probably more
+LEDs, but I suspect that those can just be ignored.
+
+[sonicare]: https://www.philips.co.uk/c-m-pe/electric-toothbrushes
+
+## Shavers
+
+The [Philips SensoTouch 3D][sensotouch] is excellent. Super-comfortable close
+face shave in no time and leaves absolutely no mess around, as far as I can
+tell! I've had this for ~5 years and it's not showing any signs of aging yet.
+
+Another bonus is that its battery life is effectively infinite. I've never had
+to worry when bringing it on a longer trip!
+
+[sensotouch]: https://www.philips.co.uk/c-p/1250X_40/norelco-sensotouch-3d-wet-and-dry-electric-razor-with-precision-trimmer
+
+## Shoulder bag
+
+When I moved to London I wanted to stop using backpacks most of the time, as
+those are just annoying to deal with when commuting on the tube.
+
+To work around this I wanted a good shoulder bag with a vertical format (to save
+space), but it turned out that there are very few of those around that reach any
+kind of quality standard.
+ +The one I settled on is the [Waterfield Muzetto][muzetto] leather bag. It's one +of those things that comes with a bit of a price tag attached, but it's well +worth it! + +[muzetto]: https://www.sfbags.com/collections/shoulder-messenger-bags/products/muzetto-leather-bag + +## Wallet + +My wallet is the [Bellroy Slim Sleeve][slim-sleeve]. I don't carry cash unless +I'm attending an event in Germany and this wallet fits that lifestyle perfectly. + +It's near indestructible, looks great, is very slim and fits a ton of cards, +business cards, receipts and whatever else you want to be lugging around with +you! + +[slim-sleeve]: https://bellroy.com/products/slim-sleeve-wallet/default/charcoal diff --git a/web/blog/posts/emacs-is-underrated.md b/web/blog/posts/emacs-is-underrated.md new file mode 100644 index 000000000000..afb8dc889e53 --- /dev/null +++ b/web/blog/posts/emacs-is-underrated.md @@ -0,0 +1,233 @@ +TIP: Hello, and thanks for offering to review my draft! This post +intends to convey to people what the point of Emacs is. Not to convert +them to use it, but at least with opening their minds to the +possibility that it might contain valuable things. I don't know if I'm +on track in the right direction, and your input will help me figure it +out. Thanks! + +TODO(tazjin): Restructure sections: Intro -> Introspectability (and +story) -> text-based UIs (which lead to fluidity, muscle memory across +programs and "translatability" of workflows) -> Outro. It needs more +flow! + +TODO(tazjin): Highlight more that it's not about editing: People can +derive useful things from Emacs by just using magit/org/notmuch/etc.! + +TODO(tazjin): Note that there's value in trying Emacs even if people +don't end up using it, similar to how learning languages like Lisp or +Haskell helps grow as a programmer even without using them day-to-day. + +*Real post starts below!* + +--------- + +There are two kinds of people: Those who use Emacs, and those who +think it is a text editor. 
This post is aimed at those in the second
+category.
+
+Emacs is the most critical piece of software I run. My [Emacs
+configuration][emacs-config] has steadily evolved for almost a decade.
+Emacs is my window manager, mail client, terminal, git client,
+information management system and - perhaps unsurprisingly - text
+editor.
+
+Before going into why I chose to invest so much into this program,
+follow me on a little thought experiment:
+
+----------
+
+Let's say you use a proprietary spreadsheet program. You find that
+there are features in it that *almost, but not quite* do what you
+want.
+
+What can you do? You can file a feature request to the company that
+makes it and hope they listen, but for the likes of Apple and
+Microsoft chances are they won't and there is nothing you can do.
+
+Let's say you are also running an open-source program for image
+manipulation. You again find that some of its features are subtly
+different from what you would want them to do.
+
+Things look a bit different this time - after all, the program is
+open-source! You can go and fetch its source code, figure out its
+internal structure and wrangle various layers of code into submission
+until you find the piece that implements the functionality you want to
+change. If you know the language it is written in, you can modify the
+feature.
+
+Now all that's left is figuring out its build system[^1], building and
+installing it and moving over to the new version.
+
+Realistically you are not going to do this much in the real world. The
+friction of contributing to projects, especially complex ones, is
+often quite high. For minor inconveniences, you might often find
+yourself just shrugging and working around them.
+
+What if it didn't have to be this way?
+
+-------------
+
+One of the core properties of Emacs is that it is *introspective* and
+*self-documenting*.
+
+For example: A few years ago, I had just switched over to using
+[EXWM][], the Emacs X Window Manager.
To launch applications I was +using an Emacs program called Helm that let me select installed +programs interactively and press <kbd>RET</kbd> to execute them. + +This was very useful - until I discovered that if I tried to open a +second terminal window, it would display an error: + + Error: urxvt is already running + +Had this been dmenu, I might have had to go through the whole process +described above to fix the issue. But it wasn't dmenu - it was an +Emacs program, and I did the following things: + +1. I pressed <kbd>C-h k</kbd>[^2] (which means "please tell me what + the following key does"), followed by <kbd>s-d</kbd> (which was my + keybinding for launching programs). + +2. Emacs displayed a new buffer saying, roughly: + + ``` + s-d runs the command helm-run-external-command (found in global-map), + which is an interactive autoloaded compiled Lisp function in + ‘.../helm-external.el’. + + It is bound to s-d. + ``` + + I clicked on the filename. + +3. Emacs opened the file and jumped to the definition of + `helm-run-external-command`. After a few seconds of reading through + the code, I found this snippet: + + ```lisp + (if (get-process proc) + (if helm-raise-command + (shell-command (format helm-raise-command real-com)) + (error "Error: %s is already running" real-com)) + ;; ... the actual code to launch programs followed below ... + ) + ``` + +4. I deleted the outer if-expression which implemented the behaviour I + didn't want, pressed <kbd>C-M-x</kbd> to reload the code and saved + the file. + +The whole process took maybe a minute, and the problem was now gone. + +Emacs isn't just "open-source", it actively encourages the user to +modify it, discover what to modify and experiment while it is running. + +In some sense it is like the experience of the old Lisp machines, a +paradigm that we have completely forgotten. + +--------------- + +Circling back to my opening statement: If Emacs is not a text editor, +then what *is* it? 
+ +The Emacs website says this: + +> [Emacs] is an interpreter for Emacs Lisp, a dialect of the Lisp +> programming language with extensions to support text editing + +The core of Emacs implements the language and the functionality needed +to evaluate and run it, as well as various primitives for user +interface construction such as buffers, windows and frames. + +Every other feature of Emacs is implemented *in Emacs Lisp*. + +The Emacs distribution ships with rudimentary text editing +functionality (and some language-specific support for the most popular +languages), but it also brings with it two IRC clients, a Tetris +implementation, a text-mode web browser, [org-mode][] and many other +tools. + +Outside of the core distribution there is a myriad of available +programs for Emacs: [magit][] (the famous git porcelain), text-based +[HTTP clients][], even interactive [Kubernetes frontends][k8s]. + +What all of these tools have in common is that they use text-based +user interfaces (UI elements like images are used only sparingly in +Emacs), and that they can be introspected and composed like everything +else in Emacs. + +If magit does not expose a git flag I need, it's trivial to add. If I +want a keybinding to jump from a buffer showing me a Kubernetes pod to +a magit buffer for the source code of the container, it only takes a +few lines of Emacs Lisp to implement. + +As proficiency with Emacs Lisp ramps up, the environment becomes +malleable like clay and evolves along with the user's taste and needs. +Muscle memory learned for one program translates seamlessly to others, +and the overall effect is an improvement in *workflow fluidity* that +is difficult to overstate. + +Also, workflows based on Emacs are *stable*. Moving my window +management to Emacs has meant that I'm not subject to the whim of some +third-party developer changing my window layouting features (as they +often do on MacOS). 
+
+To illustrate this: Emacs has a development history going back to the 1970s,
+a continuous git history that survived multiple VCS migrations [since
+1985][first-commit] (that's 20 years before git itself was released!)
+and there is code[^3] implementing interactive functionality that has
+survived unmodified in Emacs *since then*.
+
+---------------
+
+Now, what is the point of this post?
+
+I decided to write this after a recent [tweet][] by @IanColdwater (in
+the context of todo-management apps):
+
+> The fact that it's 2020 and the most viable answer to this appears
+> to be Emacs might be the saddest thing I've ever heard
+
+What bothers me is that people see this as *sad*. Emacs being around
+for this long and still being unparalleled for many of the UX
+paradigms implemented by its programs is, in my book, incredible - and
+not sad.
+
+How many other paradigms have survived this long? How many other tools
+still have fervent followers, amazing [developer tooling][] and a
+[vibrant ecosystem][] at this age?
+
+Steve Yegge [said it best][babel][^5]: Emacs has the Quality Without a
+Name.
+
+What I hope you, the reader, will take away from this post is the
+following:
+
+TODO(tazjin): Figure out what people should take away from this post.
+I need to sleep on it. It's something about not dismissing tools just
+because of their age, urging them to explore paradigms that might seem
+unfamiliar and so on. Ideas welcome.
+
+---------------
+
+[^1]: Wouldn't it be a joy if every project just used Nix? I digress ...
+[^2]: These are keyboard shortcuts written in [Emacs Key Notation][ekn].
+[^3]: For example, [functionality for online memes][studly] that
+    wouldn't be invented for decades to come!
+[^4]: ... and some things wrong, but that is an issue for a separate post!
+[^5]: And I really *do* urge you to read that post's section on Emacs.
+ +[emacs-config]: https://git.tazj.in/tree/tools/emacs +[EXWM]: https://github.com/ch11ng/exwm +[helm]: https://github.com/emacs-helm/helm +[ekn]: https://www.gnu.org/software/emacs/manual/html_node/efaq/Basic-keys.html +[org-mode]: https://orgmode.org/ +[magit]: https://magit.vc +[HTTP clients]: https://github.com/pashky/restclient.el +[k8s]: https://github.com/jypma/kubectl +[first-commit]: http://git.savannah.gnu.org/cgit/emacs.git/commit/?id=ce5584125c44a1a2fbb46e810459c50b227a95e2 +[studly]: http://git.savannah.gnu.org/cgit/emacs.git/commit/?id=47bdd84a0a9d20aab934482a64b84d0db63e7532 +[tweet]: https://twitter.com/IanColdwater/status/1220824466525229056 +[developer tooling]: https://github.com/alphapapa/emacs-package-dev-handbook +[vibrant ecosystem]: https://github.com/emacs-tw/awesome-emacs +[babel]: https://sites.google.com/site/steveyegge2/tour-de-babel#TOC-Lisp diff --git a/web/blog/posts/make-object-t-again.md b/web/blog/posts/make-object-t-again.md new file mode 100644 index 000000000000..420b57c0fde9 --- /dev/null +++ b/web/blog/posts/make-object-t-again.md @@ -0,0 +1,98 @@ +A few minutes ago I found myself debugging a strange Java issue related +to Jackson, one of the most common Java JSON serialization libraries. + +The gist of the issue was that a short wrapper using some types from +[Javaslang](http://www.javaslang.io/) was causing unexpected problems: + +```java +public <T> Try<T> readValue(String json, TypeReference type) { + return Try.of(() -> objectMapper.readValue(json, type)); +} +``` + +The signature of this function was based on the original Jackson +`readValue` type signature: + +```java +public <T> T readValue(String content, TypeReference valueTypeRef) +``` + +While happily using my wrapper function I suddenly got an unexpected +error telling me that `Object` is incompatible with the type I was +asking Jackson to de-serialize, which got me to re-evaluate the above +type signature again. 
+
+Let's look for a second at some code that will *happily compile* if you
+are using Jackson\'s own `readValue`:
+
+```java
+// This shouldn't compile!
+Long l = objectMapper.readValue("\"foo\"", new TypeReference<String>(){});
+```
+
+As you can see there we ask Jackson to decode the JSON into a `String`
+as enclosed in the `TypeReference`, but assign the result to a `Long`.
+And it compiles. And it fails at runtime with
+`java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Long`.
+Huh?
+
+Looking at the Jackson `readValue` implementation it becomes clear
+what\'s going on here:
+
+```java
+@SuppressWarnings({ "unchecked", "rawtypes" })
+public <T> T readValue(String content, TypeReference valueTypeRef)
+    throws IOException, JsonParseException, JsonMappingException
+{
+    return (T) _readMapAndClose(/* whatever */);
+}
+```
+
+The function is parameterised over the type `T`, however the only places
+where `T` occurs in the signature are the type parameter declaration and
+the function return type. Java will happily let you use generic
+functions and types without specifying type parameters:
+
+```java
+// Compiles fine!
+final List myList = List.of(1,2,3);
+
+// Type is now myList : List<Object>
+```
+
+Meaning that those parameters default to `Object`. Now in the code above
+Jackson also explicitly casts the return value of its inner function
+call to `T`.
+
+What ends up happening is that Java infers the expected return type from
+the context of the `readValue` and then happily uses the unchecked cast
+to fit that return type. If the type hints of the context aren\'t strong
+enough we simply get `Object` back.
+
+So what\'s the fix for this? It\'s quite simple:
+
+```java
+public <T> T readValue(String content, TypeReference<T> valueTypeRef)
+```
+
+By also making the parameter appear in the `TypeReference` we \"bind\"
+`T` to the type enclosed in the type reference. The cast can then also
+safely be removed.
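The difference between the two signatures can be reproduced without Jackson. The following is a minimal sketch, not Jackson's code: `TypeRef`, `readUnbound` and `readBound` are hypothetical stand-ins invented for this illustration, and the "decoded" value is simply passed in as an `Object`.

```java
// Hypothetical stand-ins for illustration only; none of this is Jackson's API.
public class TypeBindingDemo {
    /** Minimal type token that carries the target class at runtime. */
    static final class TypeRef<T> {
        final Class<T> type;
        TypeRef(Class<T> type) { this.type = type; }
    }

    // Mirrors the problematic shape: the raw TypeRef parameter does not mention T,
    // so T is inferred from the call site alone and the cast is unchecked.
    @SuppressWarnings({ "unchecked", "rawtypes" })
    static <T> T readUnbound(Object decoded, TypeRef ref) {
        return (T) decoded;
    }

    // Mirrors the fix: T is bound to the type carried by the token.
    static <T> T readBound(Object decoded, TypeRef<T> ref) {
        return ref.type.cast(decoded);
    }

    public static void main(String[] args) {
        TypeRef<String> asString = new TypeRef<>(String.class);

        // Compiles, but the implicit cast at the call site fails at runtime.
        try {
            Long wrong = readUnbound("foo", asString);
            System.out.println("unexpected success: " + wrong);
        } catch (ClassCastException e) {
            System.out.println("unbound variant failed at runtime");
        }

        // Long rejected = readBound("foo", asString); // rejected by the compiler
        String ok = readBound("foo", asString);
        System.out.println("bound variant returned " + ok);
    }
}
```

With the bound signature the mismatch between `String` and `Long` is reported by the compiler, so the commented-out line has to stay commented out for the class to build.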
+ +The cherries on top of this are: + +1. `@SuppressWarnings({ "rawtypes" })` explicitly disables a + warning that would\'ve caught this + +2. the `readValue` implementation using the less powerful `Class` + class to carry the type parameter does this correctly: `public <T> + T readValue(String content, Class<T> valueType)` + +The big question I have about this is *why* does Jackson do it this way? +Obviously the warning did not just appear there by chance, so somebody +must have thought about this? + +If anyone knows what the reason is, I\'d be happy to hear from you. + +PS: Shoutout to David & Lucia for helping me not lose my sanity over +this. diff --git a/web/blog/posts/nixery-layers.md b/web/blog/posts/nixery-layers.md new file mode 100644 index 000000000000..3f25ceadce7b --- /dev/null +++ b/web/blog/posts/nixery-layers.md @@ -0,0 +1,272 @@ +TIP: This blog post was originally published as a design document for +[Nixery][] and is not written in the same style +as other blog posts. + +Thanks to my colleagues at Google and various people from the Nix community for +reviewing this. + +------ + +# Nixery: Improved Layering + +**Authors**: tazjin@ + +**Reviewers**: so...@, en...@, pe...@ + +**Status**: Implemented + +**Last Updated**: 2019-08-10 + +## Introduction + +This document describes a design for an improved image layering method for use +in Nixery. The algorithm [currently used][grhmc] is designed for a slightly +different use-case and we can improve upon it by making use of more of the +available data. + +## Background / Motivation + +Nixery is a service that uses the [Nix package manager][nix] to build container +images (for runtimes such as Docker), that are served on-demand via the +container [registry protocols][]. A demo instance is available at +[nixery.dev][]. + +In practice this means users can simply issue a command such as `docker pull +nixery.dev/shell/git` and receive an image that was built ad-hoc containing a +shell environment and git. 
+ +One of the major advantages of building container images via Nix (as described +for `buildLayeredImage` in [this blog post][grhmc]) is that the +content-addressable nature of container image layers can be used to provide more +efficient caching characteristics (caching based on layer content) than what is +common with Dockerfiles and other image creation methods (caching based on layer +creation method). + +However, this is constrained by the maximum number of layers supported in an +image (125). A naive approach such as putting each included package (any +library, binary, etc.) in its own layer quickly runs into this limitation due to +the large number of dependencies more complex systems tend to have. In addition, +users wanting to extend images created by Nixery (e.g. via `FROM nixery.dev/…`) +share this layer maximum with the created image - limiting extensibility if all +layers are used up by Nixery. + +In theory the layering strategy of `buildLayeredImage` should already provide +good caching characteristics, but in practice we are seeing many images with +significantly more packages than the number of layers configured, leading to +more frequent cache-misses than desired. + +The current implementation of `buildLayeredImage` inspects a graph of image +dependencies and determines the total number of references (direct & indirect) +to any node in the graph. It then sorts all dependencies by this popularity +metric and puts the first `n - 2` (for `n` being the maximum number of layers) +packages in their own layers, all remaining packages in one layer and the image +configuration in the final layer. + +## Design / Proposal + +## (Close-to) ideal layer-layout using more data + +We start out by considering what a close to ideal layout of layers would look +like for a simple use-case. 
+
+![Ideal layout](/static/img/nixery/ideal_layout.webp)
+
+In this example, counting the total number of references to each node in the
+graph yields the following result:
+
+| pkg   | refs |
+|-------|------|
+| E     | 3    |
+| D     | 2    |
+| F     | 2    |
+| A,B,C | 1    |
+
+Assuming we are constrained to 4 layers, the current algorithm would yield these layers:
+
+```
+L1: E
+L2: D
+L3: F
+L4: A, B, C
+```
+
+The initial proposal for this design is that further data should be
+considered alongside the total number of references, in particular a
+distinction should be made between direct and indirect references. Packages that
+are only referenced indirectly should be merged with their parents.
+
+This yields the following table:
+
+| pkg   | direct | indirect |
+|-------|--------|----------|
+| E     | 3      | 3        |
+| D     | 2      | 2        |
+| F     | *1*    | 2        |
+| A,B,C | 1      | 1        |
+
+Despite having two indirect references, F is in fact only being referred to
+once. Assuming that we have no other data available outside of this graph, we
+have no reason to assume that F has any popularity outside of the scope of D.
+This might yield the following layers:
+
+```
+L1: E
+L2: D, F
+L3: A
+L4: B, C
+```
+
+D and F were grouped, while the top-level references (i.e. the packages
+explicitly requested by the user) were split up.
+
+An assumption is introduced here to justify this split: The top-level packages
+are what the user is modifying directly, and those groupings are likely
+unpredictable. Thus it is opportune to not group top-level packages in the same
+layer.
+
+This raises a new question: Can we make better decisions about where to split
+the top-level?
+
+## (Even closer to) ideal layering using (even) more data
+
+So far when deciding layer layouts, only information immediately available in
+the build graph of the image has been considered. We do however have much more
+information available, as we have both the entire nixpkgs-tree and potentially
+other information (such as download statistics).
+
+We can calculate the total number of references to any derivation in nixpkgs and
+use that to rank the popularity of each package. Packages within some percentile
+can then be singled out as good candidates for a separate layer.
+
+When faced with a splitting decision such as in the last section, this data can
+aid the decision. Assume for example that package B in the above is actually
+`openssl`, which is a very popular package. Taking this into account would
+instead yield the following layers:
+
+```
+L1: E
+L2: D, F
+L3: B
+L4: A, C
+```
+
+## Layer budgets and download size considerations
+
+As described in the introduction, there is a finite number of layers available
+for each image (the “layer budget”). When calculating the layer distribution, we
+might end up with the “ideal” list of layers that we would like to create. Using
+our previous example:
+
+```
+L1: E
+L2: D, F
+L3: A
+L4: B
+L5: C
+```
+
+If we only have a layer budget of 4 available, something needs to be merged into
+the same layer. To make a decision here we could consider only the package
+popularity, but there is in fact another piece of information that has not come
+up yet: The actual size of the package.
+
+Presumably a user would not mind downloading a library that is a few kilobytes
+in size repeatedly, but they would if it was a 200 megabyte binary instead.
+
+Conversely if a large binary is successfully cached, but an extremely popular
+small library is not, the total download size might also grow to irritating
+levels.
+
+To avoid this we can calculate a merge rating:
+
+    merge_rating(pkg) = popularity_percentile(pkg) × size(pkg.subtree)
+
+Packages with a low merge rating would be merged together before packages with
+higher merge ratings.
+
+## Implementation
+
+There are two primary components of the implementation:
+
+1. The layering component which, given an image specification, decides the image
+   layers.
+
+2. 
The popularity component which, given the entire nixpkgs-tree, calculates the + popularity of packages. + +## Layering component + +It turns out that graph theory’s concept of [dominator trees][] maps reasonably +well onto the proposed idea of separating direct and indirect dependencies. This +becomes visible when creating the dominator tree of a simple example: + +![Example without extra edges](/static/img/nixery/example_plain.webp) + +Before calculating the dominator tree, we inspect each node and insert extra +edges from the root for packages that match a certain popularity or size +threshold. In this example, G is popular and an extra edge is inserted: + +![Example with extra edges](/static/img/nixery/example_extra.webp) + +Calculating the dominator tree of this graph now yields our ideal layer +distribution: + +![Dominator tree of example](/static/img/nixery/dominator.webp) + +The nodes immediately dominated by the root node can now be “harvested” as image +layers, and merging can be performed as described above until the result fits +into the layer budget. + +To implement this, the layering component uses the [gonum/graph][] library which +supports calculating dominator trees. The program is fed with Nix’s +`exportReferencesGraph` (which contains the runtime dependency graph and runtime +closure size) as well as the popularity data and layer budget. It returns a list +of layers, each specifying the paths it should contain. + +Nix invokes this program and uses the output to create a derivation for each +layer, which is then built and returned to Nixery as usual. + +TIP: This is implemented in [`layers.go`][layers.go] in Nixery. The file starts +with an explanatory comment that talks through the process in detail. + +## Popularity component + +The primary issue in calculating the popularity of each package in the tree is +that we are interested in the runtime dependencies of a derivation, not its +build dependencies. 
+ +To access information about the runtime dependency, the derivation actually +needs to be built by Nix - it can not be inferred because Nix does not know +which store paths will still be referenced by the build output. + +However for packages that are cached in the NixOS cache, we can simply inspect +the `narinfo`-files and use those to determine popularity. + +Not every package in nixpkgs is cached, but we can expect all *popular* packages +to be cached. Relying on the cache should therefore be reasonable and avoids us +having to rebuild/download all packages. + +The implementation will read the `narinfo` for each store path in the cache at a +given commit and create a JSON-file containing the total reference count per +package. + +For the public Nixery instance, these popularity files will be distributed via a +GCS bucket. + +TIP: This is implemented in [popcount][] in Nixery. + +-------- + +Hopefully this detailed design review was useful to you. You can also watch [my +NixCon talk][talk] about Nixery for a review of some of this, and some demos. 
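As a rough illustration of the popularity component described above, the sketch below counts how often each package is referenced across a set of `narinfo` texts. It is not the code from Nixery's popcount: the class and method names are hypothetical, the store hashes are shortened placeholders, and it assumes that each `narinfo` carries a `StorePath:` line and a space-separated `References:` line of store path base names. It counts direct references only and ignores a path's reference to itself.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names and sample data are invented for illustration.
public class PopularitySketch {
    /** Turns a store path base name such as "hash-openssl-1.1.1" into "openssl-1.1.1". */
    static String packageName(String storePathBase) {
        int dash = storePathBase.indexOf('-');
        return dash < 0 ? storePathBase : storePathBase.substring(dash + 1);
    }

    /** Counts direct references per package over all given narinfo texts. */
    static Map<String, Integer> countReferences(List<String> narinfos) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String narinfo : narinfos) {
            String self = null;
            String[] refs = new String[0];
            for (String line : narinfo.split("\n")) {
                if (line.startsWith("StorePath: ")) {
                    String path = line.substring("StorePath: ".length()).trim();
                    self = path.substring(path.lastIndexOf('/') + 1);
                } else if (line.startsWith("References: ")) {
                    String value = line.substring("References: ".length()).trim();
                    if (!value.isEmpty()) {
                        refs = value.split("\\s+");
                    }
                }
            }
            for (String ref : refs) {
                if (ref.equals(self)) {
                    continue; // a path listing itself says nothing about popularity
                }
                counts.merge(packageName(ref), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Placeholder narinfo fragments; real files contain more fields and full hashes.
        List<String> narinfos = List.of(
            "StorePath: /nix/store/aaaa-git-2.23.0\n"
                + "References: bbbb-openssl-1.1.1 cccc-glibc-2.27 aaaa-git-2.23.0\n",
            "StorePath: /nix/store/dddd-curl-7.65.3\n"
                + "References: bbbb-openssl-1.1.1 cccc-glibc-2.27\n",
            "StorePath: /nix/store/cccc-glibc-2.27\n"
                + "References: \n");
        System.out.println(countReferences(narinfos));
    }
}
```

A real implementation would additionally have to fetch the files from the cache and serialise the resulting map to JSON, both of which are left out here.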
+ +[Nixery]: https://github.com/google/nixery +[grhmc]: https://grahamc.com/blog/nix-and-layered-docker-images +[Nix]: https://nixos.org/nix +[registry protocols]: https://github.com/opencontainers/distribution-spec/blob/master/spec.md +[nixery.dev]: https://nixery.dev +[dominator trees]: https://en.wikipedia.org/wiki/Dominator_(graph_theory) +[gonum/graph]: https://godoc.org/gonum.org/v1/gonum/graph +[layers.go]: https://github.com/google/nixery/blob/master/builder/layers.go +[popcount]: https://github.com/google/nixery/tree/master/popcount +[talk]: https://www.youtube.com/watch?v=pOI9H4oeXqA diff --git a/web/blog/posts/nsa-zettabytes.md b/web/blog/posts/nsa-zettabytes.md new file mode 100644 index 000000000000..f8b326f2fb42 --- /dev/null +++ b/web/blog/posts/nsa-zettabytes.md @@ -0,0 +1,93 @@ +I've been reading a few discussions on Reddit about the new NSA data +centre that is being built and stumbled upon [this +post](http://www.reddit.com/r/restorethefourth/comments/1jf6cx/the_guardian_releases_another_leaked_document_nsa/cbe5hnc), +putting its alleged storage capacity at *5 zettabytes*. + +That seems to be a bit much which I tried to explain to that guy, but I +was quickly blocked by the common conspiracy argument that government +technology is somehow far beyond the wildest dreams of us mere mortals - +thus I wrote a very long reply that will most likely never be seen by +anybody. Therefore I've decided to repost it here. + +------------------------------------------------------------------------ + +I feel like I've entered /r/conspiracy. Please have some facts (and do +read them!) + +A one terabyte SSD (I assume that\'s what you meant by flash-drive) +would require 5000000000 of those. That is *five billion* of those flash +drives. Can you visualise how much five billion flash-drives are? + +A single SSD is roughly 2cm\*13cm\*13cm with an approximate weight of +80g. 
That would make 400 000 metric tons of SSDs, a weight equivalent to +*over one thousand Boeing 747 airplanes*. Even if we assume that they +solder the flash chips directly onto some kind of controller (which also +weighs something), the raw material for that would be completely insane. + +Another visualization: If you stacked 5 billion SSDs on top of each +other you would get an SSD tower that is a hundred thousand kilometres +high, that is equivalent to 2.5 x the equatorial circumference of +*Earth* or 62000 miles. + +The volume of those SSDs would be clocking in at 1690000 cubic +metres, more than the Empire State building. Are you still with me? + +Let\'s talk cost. The Samsung SSD that I assume you are referring to will +clock in at \$600; let\'s assume that the NSA gets a discount when buying +*five billion* of those and gets them at the cheap price of \$250. That +makes 1.25 trillion dollars. That would be a significant chunk of the +current US national debt. + +And all of this is just SSDs to stick into servers and storage units, +which need a whole bunch of other equipment as well to support them - +the cost would probably shoot up to something like 8 trillion dollars if +they were to build this. It would with very high certainty be more than +the annual production of SSDs (I can\'t find numbers on that +unfortunately) and take up *slightly* more space than they have in the +Utah data centre (assuming you\'re not going to tell me that it is in +fact attached to an underground base that goes down to the core of the +Earth). + +Let\'s look at the \"But the government has better technologies!\" idea. + +Putting aside the fact that the military *most likely* does not have a +secret base on Mars that deals with advanced science that the rest of us +can only dream of, and doing this under the assumption that they do have +this base, let\'s assume that they build a storage chip that stores 100TB.
+This reduces the amount of needed chips to \"just\" 50 million, let\'s say +they get 10 of those into a server / some kind of specialized storage +unit and we only need 5 million of those specially engineered servers, +with custom connectors, software, chips, storage, most likely also power +sources and whatever - 5 million completely custom units built with +technology that is not available to the market. Google is estimated to +have about a million servers in total, I don\'t know exactly in how many +data centres those are placed but numbers I heard recently said that +it\'s about 40. When Apple assembles a new iPhone model they need +massive factories with thousands of workers and supplies from many +different countries, over several months, to assemble just a few million +units for their launch month. + +You are seriously proposing that the NSA is better than Google and Apple +and the rest of the tech industry, world-wide, combined at designing +*everything* in tech, manufacturing *everything* in tech, without *any* +information about that leaking and without *any* of the science behind +it being known? That\'s not just insane, that\'s outright impossible. + +And we haven\'t even touched upon how they would route the necessary +amounts of bandwidth (crazy insane) to save *the entire internet* into +that data center. + +------------------------------------------------------------------------ + +I\'m not saying that the NSA is not building a data center to store +surveillance information, to have more capacity to spy on people and all +that - I\'m merely making the point that the extent to which conspiracy +sites say they do this vastly overestimates their actual abilities. They +don\'t have magic available to them!
Instead of making up insane figures +like that you should focus on what we actually know about their +operations, because using those figures in a debate with somebody who is +responsible for this (and knows what they\'re talking about) will end +with you being destroyed - nobody will listen to the rest of what +you\'re saying when that happens. + +\"Stick to the facts\" is valid for our side as well. diff --git a/web/blog/posts/reversing-watchguard-vpn.md b/web/blog/posts/reversing-watchguard-vpn.md new file mode 100644 index 000000000000..f1b779d8d993 --- /dev/null +++ b/web/blog/posts/reversing-watchguard-vpn.md @@ -0,0 +1,158 @@ +TIP: WatchGuard has +[responded](https://www.reddit.com/r/netsec/comments/5tg0f9/reverseengineering_watchguard_mobile_vpn/dds6knx/) +to this post on Reddit. If you haven\'t read the post yet I\'d recommend +doing that first before reading the response to have the proper context. + +------------------------------------------------------------------------ + +One of my current clients makes use of +[WatchGuard](http://www.watchguard.com/help/docs/fireware/11/en-US/Content/en-US/mvpn/ssl/mvpn_ssl_client-install_c.html) +Mobile VPN software to provide access to the internal network. + +Currently WatchGuard only provides clients for OS X and Windows, neither +of which I am very fond of. In addition an OpenVPN configuration file is +provided, but it quickly turned out that this was only a piece of the +puzzle. + +The problem is that this VPN setup is secured using 2-factor +authentication (good!), but it does not use OpenVPN\'s default +[challenge/response](https://openvpn.net/index.php/open-source/documentation/miscellaneous/79-management-interface.html) +functionality to negotiate the credentials. + +Connecting with the OpenVPN config that the website supplied caused the +VPN server to send a token to my phone, but I simply couldn\'t figure +out how to supply it back to the server.
In a normal challenge/response +setting the token would be supplied as the password on the second +authentication round, but the VPN server kept rejecting that. + +Other possibilities were various combinations of username&password +(I\'ve seen a lot of those around) so I tried a whole bunch, for example +`$password:$token` or even a `sha1(password, token)` - to no avail. + +At this point it was time to crank out +[Hopper](https://www.hopperapp.com/) and see what\'s actually going on +in the official OS X client - which uses OpenVPN under the hood! + +Diving into the client +---------------------- + +The first surprise came up right after opening the executable: It had +debug symbols in it - and was written in Objective-C! + +![Debug symbols](/static/img/watchblob_1.webp) + +A good first step when looking at an application binary is going through +the strings that are included in it, and the WatchGuard client had a lot +to offer. Among the most interesting were a bunch of URIs that looked +important: + +![Some URIs](/static/img/watchblob_2.webp) + +I started with the first one + + %@?action=sslvpn_download&filename=%@&fw_password=%@&fw_username=%@ + +and just curled it on the VPN host, replacing the username and +password fields with bogus data and the filename field with +`client.wgssl` - another string in the executable that looked like a +filename. + +To my surprise this endpoint immediately responded with a GZIPed file +containing the OpenVPN config, CA certificate, and the client +*certificate and key*, which I previously thought was only accessible +after logging in to the web UI - oh well. + +The next endpoint I tried ended up being a bit more interesting still: + + /?action=sslvpn_logon&fw_username=%@&fw_password=%@&style=fw_logon_progress.xsl&fw_logon_type=logon&fw_domain=Firebox-DB + +Inserting the correct username and password into the query parameters +actually triggered the process that sent a token to my phone. 
The +response was a simple XML blob: + +```xml +<?xml version="1.0" encoding="UTF-8"?> +<resp> + <action>sslvpn_logon</action> + <logon_status>4</logon_status> + <auth-domain-list> + <auth-domain> + <name>RADIUS</name> + </auth-domain> + </auth-domain-list> + <logon_id>441</logon_id> + <chaStr>Enter Your 6 Digit Passcode </chaStr> +</resp> +``` + +Somewhat unsurprisingly that `chaStr` field is actually the challenge +string displayed in the client when logging in. + +This was obviously going in the right direction so I proceeded to the +procedures making use of this string. The first step was a relatively +uninteresting function called `-[VPNController sslvpnLogon]` which +formatted the URL, opened it and checked whether the `logon_status` was +`4` before proceeding with the `logon_id` and `chaStr` contained in the +response. + +*(Code snippets from here on are Hopper\'s pseudo-Objective-C)* + +![sslvpnLogon](/static/img/watchblob_3.webp) + +It proceeded to the function `-[VPNController processTokenPrompt]` which +showed the dialog window into which the user enters the token, sent it +off to the next URL and checked the `logon_status` again: + +(`r12` is the reference to the `VPNController` instance, i.e. `self`). + +![processTokenPrompt](/static/img/watchblob_4.webp) + +If the `logon_status` was `1` (apparently \"success\" here) it proceeded +to do something quite interesting: + +![processTokenPrompt2](/static/img/watchblob_5.webp) + +The user\'s password was overwritten with the (verified) OTP token - +before OpenVPN had even been started! + +Reading a bit more of the code in the subsequent +`-[VPNController doLogin]` method revealed that it shelled out to +`openvpn` and enabled the management socket, which makes it possible to +remotely control an `openvpn` process by sending it commands over TCP. 
+ +It then simply sent the username and the OTP token as the credentials +after configuring OpenVPN with the correct config file: + +![doLogin](/static/img/watchblob_6.webp) + +... and the OpenVPN connection then succeeds. + +TL;DR +----- + +Rather than using OpenVPN\'s built-in challenge/response mechanism, the +WatchGuard client validates user credentials *outside* of the VPN +connection protocol and then passes on the OTP token, which seems to be +temporarily in a \'blessed\' state after verification, as the user\'s +password. + +I didn\'t check to see how much verification of this token is performed +(does it check the source IP against the IP that performed the challenge +validation?), but this certainly seems like a bit of a security issue - +considering that an attacker on the same network would, if they time the +attack right, only need your username and 6-digit OTP token to +authenticate. + +Don\'t roll your own security, folks! + +Bonus +----- + +The whole reason why I set out to do this is so I could connect to this +VPN from Linux, so this blog post wouldn\'t be complete without a +solution for that. + +To make this process really easy I\'ve written a [little +tool](https://github.com/tazjin/watchblob) that performs the steps +mentioned above from the CLI and lets users know when they can +authenticate using their OTP token. diff --git a/web/blog/posts/sick-in-sweden.md b/web/blog/posts/sick-in-sweden.md new file mode 100644 index 000000000000..0c43c5832d73 --- /dev/null +++ b/web/blog/posts/sick-in-sweden.md @@ -0,0 +1,26 @@ +I\'ve been sick more in the two years in Sweden than in the ten years +before that. + +Why? I have a theory about it and after briefly discussing it with one +of my roommates (who is experiencing the same thing) I\'d like to share +it with you: + +Normally when people get sick, are coughing, have a fever and so on they +take a few days off from work and stay at home. 
The reasons are twofold: +You want to rest a bit in order to get rid of the disease and you want +to *avoid infecting your co-workers*. + +In Sweden people will drag themselves into work anyways, because of a +concept called the +[karensdag](https://www.forsakringskassan.se/wps/portal/sjukvard/sjukskrivning_och_sjukpenning/karensdag_och_forstadagsintyg). +The TL;DR of this is \'if you take days off sick you won\'t get paid for +the first day, and only 80% of your salary on the remaining days\'. + +Many people are not willing to take that financial hit. In combination +with Sweden\'s rather mediocre healthcare system you end up constantly +being surrounded by sick people, not just in your own office but also on +public transport and basically all other public places. + +Oh and the best thing about this? Swedish politicians [often ignore +this](https://www.aftonbladet.se/nyheter/article10506886.ab) rule and +just don\'t report their sick days. Nice. diff --git a/web/blog/posts/the-smu-problem.md b/web/blog/posts/the-smu-problem.md new file mode 100644 index 000000000000..f411e3116046 --- /dev/null +++ b/web/blog/posts/the-smu-problem.md @@ -0,0 +1,151 @@ +After having tested countless messaging apps over the years, being +unsatisfied with most of them and finally getting stuck with +[Telegram](https://telegram.org/) I have developed a little theory about +messaging apps. + +SMU stands for *Security*, *Multi-Device* and *Usability*. Quite like +the [CAP-theorem](https://en.wikipedia.org/wiki/CAP_theorem) I believe +that you can - using current models - only solve two out of three things +on this list. Let me elaborate what I mean by the individual points: + +**Security**: This is mainly about encryption of messages, not so much +about hiding identities to third-parties. Commonly some kind of +asymmetric encryption scheme. Verification of keys used must be possible +for the user. 
+ +**Multi-Device**: Messaging-app clients for multiple devices, with +devices being linked to the same identifier, receiving the same messages +and being independent of each other. A nice bonus is also an open +protocol (like Telegram\'s) that would let people write new clients. + +**Usability**: Usability is a bit of a broad term, but what I mean by it +here is handling contacts and identities. It should be easy to create +accounts, give contact information to people and have everything just +work in a somewhat automated fashion. + +Some categorisation of popular messaging apps: + +**SU**: Threema + +**MU**: Telegram, Google Hangouts, iMessage, Facebook Messenger + +**SM**: +[Signal](https://gist.github.com/TheBlueMatt/d2fcfb78d29faca117f5) + +*Side note: The most popular messaging app - WhatsApp - only scores a +single letter (U). This makes it completely uninteresting to me.* + +Let\'s talk about **SM** - which might contain the key to solving SMU. +Two approaches are interesting here. + +The single key model +-------------------- + +In Signal there is a single identity key which can be used to register a +device on the server. There exists a process for sharing this identity +key from a primary device to a secondary one, so that the secondary +device can register itself (see the link above for a description). + +This *almost* breaks M because there is still a dependence on a primary +device and newly onboarded devices can not be used to onboard further +devices. However, for lack of a better SM example I\'ll give it a pass. + +The other thing it obviously breaks is U as the process for setting it +up is annoying and having to rely on the primary device is a SPOF (there +might be a way to recover from a lost primary device, but I didn\'t find +any information so far). + +The multiple key model +---------------------- + +In iMessage every device that a user logs into creates a new key pair +and submits its public key to a per-account key pool. 
Senders fetch all +available public keys for a recipient and encrypt to all of the keys. + +Devices that join can catch up on history by receiving it from other +devices that use its public key. + +This *almost* solves all of SMU, but its compliance with S breaks due to +the fact that the key pool is not auditable, and controlled by a +third-party (Apple). How can you verify that they don\'t go and add +another key to your pool? + +A possible solution +------------------- + +Out of these two approaches I believe the multiple key one looks more +promising. If there was a third-party handling the key pool but in a way +that is verifiable, transparent and auditable that model could be used +to solve SMU. + +The technology I have been thinking about for this is some kind of +blockchain model and here\'s how I think it could work: + +1. Bob installs the app and begins onboarding. The first device + generates its keypair, submits the public key and an account + creation request. + +2. Bob\'s account is created on the messaging apps\' servers and a + unique identifier plus the fingerprint of the first device\'s public + key is written to the chain. + +3. Alice sends a message to Bob, her device asks the messaging service + for Bob\'s account\'s identity and public keys. Her device verifies + the public key fingerprint against the one in the blockchain before + encrypting to it and sending the message. + +4. Bob receives Alice\'s message on his first device. + +5. Bob logs in to his account on a second device. The device generates + a key pair and sends the public key to the service, the service + writes it to the blockchain using its identifier. + +6. The messaging service requests that Bob\'s first device signs the + second device\'s key and triggers a simple confirmation popup. + +7. Bob confirms the second device on his first device. It signs the key + and writes the signature to the chain. + +8. 
Alice sends another message, her device requests Bob\'s current keys + and receives the new key. It verifies that both the messaging + service and one of Bob\'s older devices have confirmed this key in + the chain. It encrypts the message to both keys and sends it on. + +9. Bob receives Alice\'s message on both devices. + +After this the second device can request conversation history from the +first one to synchronise old messages. + +Further devices added to an account can be confirmed by any of the +devices already in the account. + +The messaging service could not add new keys for an account on its own +because it does not control any of the private keys confirmed by the +chain. + +In case all devices were lost, the messaging service could associate the +account with a fresh identity in the blockchain. Message history +synchronisation would of course be impossible. + +Feedback welcome +---------------- + +I would love to hear some input on this idea, especially if anyone knows +of an attempt to implement a similar model already. Possible attack +vectors would also be really interesting. + +Until something like this comes to fruition, I\'ll continue using +Telegram with GPG as the security layer when needed. + +**Update:** WhatsApp has launched an integration with the Signal guys +and added their protocol to the official WhatsApp app. This means +WhatsApp now firmly sits in the SU-category, but it still does not solve +this problem. + +**Update 2:** Facebook Messenger has also integrated with Signal, but +their secret chats do not support multi-device well (it is Signal +after all). This means it scores either SU or MU depending on which mode +you use it in. + +An interesting service I have not yet evaluated properly is +[Matrix](http://matrix.org/).