about summary refs log tree commit diff
path: root/doc/design.rst
diff options
context:
space:
mode:
authorVincent Ambo <mail@tazj.in>2020-07-15T07·20+0100
committerVincent Ambo <mail@tazj.in>2020-07-15T07·20+0100
commit7f19d641647ac4ef313ed88d6b5c140983ce5436 (patch)
tree31b66c81465293da5c093c5dde3e419758c0d6cc /doc/design.rst
Squashed 'third_party/immer/' content from commit ad3e3556d
git-subtree-dir: third_party/immer
git-subtree-split: ad3e3556d38bb75966dd24c61a774970a7c7957e
Diffstat (limited to 'doc/design.rst')
-rw-r--r--doc/design.rst268
1 files changed, 268 insertions, 0 deletions
diff --git a/doc/design.rst b/doc/design.rst
new file mode 100644
index 000000000000..27d29e9e617a
--- /dev/null
+++ b/doc/design.rst
@@ -0,0 +1,268 @@
+.. _design:
+
+Design
+======
+
+This is a library of **immutable containers**.
+
+These containers have all their methods marked ``const``.  Instead of
+mutating them *in place*, they provide manipulation functions that
+*return a new transformed value*, leaving the original value
+unaltered.  In the context of data-structures, this property of
+preserving old values is called **persistence**.
+
+Most of these containers use data-structures in which these operations
+can be done *efficiently*.  In particular, not all data is copied when
+a new value is produced.  Instead, the new values may share,
+internally, common data with other objects.  We sometimes refer to
+this property as **structural sharing**.  This behaviour is
+transparent to the user.
+
+Assigment
+---------
+
+We are sorry, we lied. These containers provide *one mutating
+operation*: **assignment** --- i.e. ``operator=``.
+
+There is a good reason: without ``operator=`` everything becomes
+complicated in C++.  For example, one may not contain non-assignable
+types in many standard containers, assignment would also be disabled
+from your custom types holding immutable containers, and so on and so
+forth.
+
+C++ is a multi-paradigm language with an imperative core.  Thus, it is
+built on the foundation that *variables* can be mutated ---
+i.e. assigned to.  We don't want to ride against this tide.  What we
+want to prevent is in-place *object* manipulation.  Because of C++
+semantics, *variable* assignment is defined in terms of *object
+mutation*, so we have to provide *this very particular mutating
+operation*, but nothing else.  Of course, you are free to mark your
+variables ``const`` to completely forbid assignment.
+
+.. warning::
+
+   **Assignment is not thread safe**.  When a *mutable* variable is
+   shared across multiple threads, protect access using some other
+   mechanism.
+
+   For obvious reasons, all other methods, which are ``const``, are
+   thread-safe.  It is safe to share *immutable* state across multiple
+   threads.
+
+To ``const`` or not to ``const``
+--------------------------------
+
+Many C++ programmers, influenced by functional programming, are trying
+to escape the evils of mutability by using ``const`` whenever
+possible.  We also do it ourselves in many of our examples to
+reinforce the property of immutability.
+
+While in general this is a good practice backed up with very good
+intentions, it has one caveat: *it disables moveability*.  It does so
+even when ``std::move()`` is used.  This makes sense, since moving from
+an object may mutate it, and ``const``, my friends, prevents *all*
+mutations.  For example:
+
+.. literalinclude:: ../example/vector/move.cpp
+   :language: c++
+   :start-after: move-bad/start
+   :end-before:  move-bad/end
+
+One may think that the variable ``v`` is moved into the
+``push_back()`` call.  This is not the case, because the variable
+``v`` is marked ``const``.  Of course, one may enable the move by
+removing it, as in:
+
+.. literalinclude:: ../example/vector/move.cpp
+   :language: c++
+   :start-after: move-good/start
+   :end-before:  move-good/end
+
+So, is it bad style then to use ``const`` as much as possible?  I
+wouldn't say so and it is advisable when ``std::move()`` is not used.
+An alternative style is to not use ``const`` but adopt an `AAA-style
+<aaa>`_ (*Almost Always use Auto*).  This way, it is easy to look for
+mutations by looking for lines that contain ``=`` but no ``auto``.
+Remember that when using our immutable containers ``operator=`` is the
+only way to mutate a variable.
+
+.. _aaa: https://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style-almost-always-auto/
+
+.. admonition:: Why does ``const`` prevent move semantics?
+
+   For those adventurous into the grainy details C++, here is why.
+   ``std::move()`` does not move anything, it is just *a cast* from
+   normal *l-value* references (``T&``) to *r-value* reference
+   (``T&&``).  This is, you pass it a variable, and it returns a
+   reference to its object disguised as an intermediate result.  In
+   exchange, you promise not to do anything with this variable later
+   [#f1]_. It is the role of the thing that *receives the moved-from
+   value* (in the previous example, ``push_back``) to actually do
+   anything interesting with it --- for example, steal its contents
+   😈.
+
+   So if you pass a ``T&`` to ``std::move()`` you get a ``T&&`` and,
+   unsurprisingly, if you pass a ``const T&`` you get a ``const T&&``.
+   But the receivers of the moved-from value (like constructors or our
+   ``push_back()``) maybe be moved-into because they provide an
+   overload that expects ``T&&`` --- without the ``const``!  Since a
+   ``const T&&`` can not be converted into a ``T&&``, the compiler
+   looks up for you another viable overload, and most often finds a
+   copy constructor or something alike that expects a ``const T&`` or
+   just ``T``, to which a ``const T&&`` can be converted.  The code
+   compiles and works correctly, but it is less efficient than we
+   expected.  Our call to ``std::move()`` was fruitless.
+
+   .. [#f1] For the sake of completeness: it is actually allowed to do stuff
+            with the variable *after another value is assigned to it*.
+
+.. _move-semantics:
+
+Leveraging move semantics
+-------------------------
+
+When using :ref:`reference counting<rc>` (which is the default)
+mutating operations can often be faster when operating on *r-value
+references* (temporaries and moved-from values).  Note that this
+removes *persistence*, since one can not access the moved-from value
+anymore!  However, this may be a good idea when doing a chain of
+operations where the intermediate values are not important to us.
+
+For example, let's say we want to write a function that inserts all
+integers in the range :math:`[first, last)` into an immutable vector.
+From the point of view of the caller of the function, this function is
+a *transaction*.  Whatever intermediate vectors are generated inside
+of it can be discarded since the caller can only see the initial
+vector (the one passed in as argument) and the vector with *all* the
+elements.  We may write such function like this:
+
+.. literalinclude:: ../example/vector/iota-move.cpp
+   :language: c++
+   :start-after: myiota/start
+   :end-before:  myiota/end
+
+The intermediate values are *moved* into the next ``push_back()``
+call.  They are going to be discarded anyways, this little
+``std::move`` just makes the whole thing faster, letting ``push_back``
+mutate part of the internal data structure in place when possible.
+
+If you don't like this syntax, :doc:`transients<transients>` may be
+used to obtain similar performance benefits.
+
+.. admonition:: Assigment guarantees
+
+   From the language point of view, the only requirement on moved from
+   values is that they should still be destructible.  We provide the
+   following two additional guarantees:
+
+   - **It is valid to assign to a moved-from variable**.  The variable
+     gets the assigned value and becomes usable again.  This is the
+     behaviour of standard types.
+
+   - **It is valid to assign a moved-from variable to itself**.  For
+     most standard types this is *undefined behaviour*.  However, for our
+     immutable containers types, expressions of the form ``v =
+     std::move(v)`` are well-defined.
+
+Recursive types
+---------------
+
+Most containers will fail to be instantiated with a type of unknown
+size, this is, an *incomplete type*.  This prevents using them for
+building recursive types.  The following code fails to compile:
+
+.. code-block:: c++
+
+  struct my_type
+  {
+      int data;
+      immer::vector<my_type> children;
+  };
+
+However, we can easily workaround this by using an ``immer::box`` to wrap
+the elements in the vector, as in:
+
+.. code-block:: c++
+
+  struct my_type
+  {
+      int data;
+      immer::vector<immer::box<my_type>> children;
+  };
+
+.. admonition:: Standard containers and incomplete types
+
+  While the first example might seem to compile when using some
+  implementations of ``std::vector`` instead of ``immer::vector``, such
+  use is actually forbidden by the standard:
+
+    **17.6.4.8** *Other functions (...)* 2. the effects are undefined in
+    the following cases: (...) In particular---if an incomplete type (3.9)
+    is used as a template argument when instantiating a template
+    component, unless specifically allowed for that component.
+
+.. _batch-update:
+Efficient batch manipulations
+-----------------------------
+
+Sometimes you may write a function that needs to do multiple changes
+to a container.  Like most code you write with this library, this
+function is *pure*: it takes one container value in, and produces a
+new container value out, no side-effects.
+
+Let's say we want to write a function that inserts all integers in the
+range :math:`[first, last)` into an immutable vector:
+
+.. literalinclude:: ../example/vector/iota-slow.cpp
+   :language: c++
+   :start-after: include:myiota/start
+   :end-before:  include:myiota/end
+
+This function works as expected, but it is slower than necessary.
+On every loop iteration, a new value is produced, just to be
+forgotten in the next iteration.
+
+Instead, we can grab a mutable view on the value, a :ref:`transient`.
+Then, we manipulate it *in-place*.  When we are done with it, we
+extract back an immutable value from it.  The code now looks like
+this:
+
+.. _iota-transient:
+
+.. literalinclude:: ../example/vector/iota-transient.cpp
+   :language: c++
+   :start-after: include:myiota/start
+   :end-before:  include:myiota/end
+
+Both conversions are :math:`O(1)`.  Note that calling ``transient()``
+does not break the immutability of the variable it is called on.  The
+new mutable object will adopt its contents, but when a mutation is
+performed, it will copy the data necessary using *copy on write*.
+Subsequent manipulations may hit parts that have already been copied,
+and these changes are done in-place.  Because of this, it does not
+make sense to use transients to do only one change.
+
+.. tip::
+
+   Note that :ref:`move semantics<move-semantics>` can be used instead to
+   support a similar use-case.  However, transients optimise updates
+   even when reference counting is disabled.
+
+.. _std-compat:
+Standard library compatibility
+------------------------------
+
+While the immutable containers provide an interface that follows a
+functional style, this is incompatible with what the standard library
+algorithms sometimes expect. :ref:`transients` try to provide an
+interface as similar as possible to similar standard library
+containers.  Thus, can also be used to interoperate with standard
+library components.
+
+For example the :ref:`myiota() function above<iota-transient>` may as
+well be written using standard library tools:
+
+.. literalinclude:: ../example/vector/iota-transient-std.cpp
+   :language: c++
+   :start-after: include:myiota/start
+   :end-before:  include:myiota/end