author: Vincent Ambo <tazjin@google.com> 2019-08-12T16·47+0100
committer: Vincent Ambo <github@tazj.in> 2019-08-13T23·02+0100
commit: 6035bf36eb93bc30db6ac40739913358e71d1121
tree: 360e479422e04b75c2b5b920ced12886a44f7e74 /tools/nixery/popcount/README.md
parent: 6d718bf2713a7e2209197247976390b878f51313
feat(popcount): Clean up popularity counting script
Adds the script used to generate the popularity information for all of
nixpkgs.

The README lists the (currently somewhat rough) usage instructions.
Diffstat (limited to 'tools/nixery/popcount/README.md')
-rw-r--r-- tools/nixery/popcount/README.md | 39
1 files changed, 39 insertions, 0 deletions
diff --git a/tools/nixery/popcount/README.md b/tools/nixery/popcount/README.md
new file mode 100644
index 0000000000..8485a4d30e
--- /dev/null
+++ b/tools/nixery/popcount/README.md
@@ -0,0 +1,39 @@
+popcount
+========
+
+This script is used to count the popularity for each package in `nixpkgs`, by
+determining how many other packages depend on it.
+
+It skips over all packages that fail to build, are not cached or are unfree -
+but these omissions do not meaningfully affect the statistics.
+
+It currently does not evaluate nested attribute sets (such as
+`haskellPackages`).
+
+## Usage
+
+1. Generate a list of all top-level attributes in `nixpkgs`:
+
+   ```shell
+   nix eval '(with builtins; toJSON (attrNames (import <nixpkgs> {})))' | jq -r | jq > all-top-level.json
+   ```
+
+2. Run `./popcount > all-runtime-deps.txt`
+
+3. Collect and count the results with the following magic incantation:
+
+   ```shell
+   cat all-runtime-deps.txt \
+     | sed -r 's|/nix/store/[a-z0-9]+-||g' \
+     | sort \
+     | uniq -c \
+     | sort -n -r \
+     | awk '{ print "{\"" $2 "\":" $1 "}"}' \
+     | jq -c -s '. | add | with_entries(select(.value > 1))' \
+     > your-output-file
+   ```
+
+   In essence, this trims the `/nix/store/` prefix and hash from each path,
+   counts the occurrences of each package and returns the result as JSON.
+   Packages that have no references other than themselves (a count of 1) are
+   removed from the output.
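
Review note: the trimming and counting in step 3 can be tried on a small input without running `./popcount` first. The sketch below assumes that `all-runtime-deps.txt` holds one store path per line, which the `sort | uniq -c` stage suggests but the README does not state. The sample paths, their shortened hashes, and the helper names `sample_deps` and `count_refs` are made up for illustration, and the `jq` stage is left out so that only standard tools are needed.

```shell
#!/bin/sh
# Hypothetical stand-in for all-runtime-deps.txt. The hashes are shortened
# and invented; real store hashes are longer.
sample_deps() {
  cat <<'EOF'
/nix/store/aaaa1111-glibc-2.27
/nix/store/bbbb2222-openssl-1.0.2s
/nix/store/aaaa1111-glibc-2.27
/nix/store/cccc3333-hello-2.10
/nix/store/aaaa1111-glibc-2.27
/nix/store/bbbb2222-openssl-1.0.2s
EOF
}

# Same stages as step 3 of the README, minus the JSON conversion:
# strip the store prefix and hash, count each name, sort by count
# (highest first), and keep only names that occur more than once.
count_refs() {
  sed -E 's|/nix/store/[a-z0-9]+-||g' \
    | sort \
    | uniq -c \
    | sort -n -r \
    | awk '$1 > 1 { print $2, $1 }'
}

sample_deps | count_refs
```

On this sample the script prints `glibc-2.27 3` followed by `openssl-1.0.2s 2`. `hello-2.10` occurs only once and is dropped, which mirrors the `select(.value > 1)` filter in the `jq` stage.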