§ Stuff I learnt in 2019
I write these retrospective blog posts every year since 2017. I tend to post a
collection of papers, books, and ideas I've stumbled across that year.
Unfortunately, this year, the paper list will be sparser, since I lost some
data along the way to the year, and hence I don't have links to everything I
read. So this is going to be a sparser list, consisting of things that I found
I also re-organised my website, letting the link die, since keeping it up was
taking far too many cycles (In particular, CertBot was far too annoying to
maintain, and the feedback of hugo was also very annoying). I now have a
single file, the
to which I add notes on things I find interesting. I've bound the
(for idea) on all my shells everywhere, to open the
README.md file, wait
for me to write to it, run a
git commit so I can type out a commit, and
then push. This has been massive for what I manage to write down: I feel
like I've managed to write down a lot of one-liners / small facts that I've
picked up which I would not otherwise. I'm attempting to similarly pare down
other friction-inducing parts of my workflow. Suggestions here would be very
If there's a theme of the year (insofar as my scattered reading has a
theme...), it's "lattices and geometry". Geometry in terms of differential
geometry, topology, and topoi. Lattices in the sense of a bunch of abstract
interpretation and semantics.
§ Course work: optimisation theory, quantum computation, statistics
My course work was less interesting to me this time, due to the fact that I had
chosen to study some wild stuff earlier on, and now have to take reasonable stuff
to graduate. However, there were courses that filled in a lot of gaps in my
self-taught knowledge for me, and the ones I listed were the top ones in that
I wound up reading
Boyd on optimisation theory,
Nielsen and Chuang for quantum computation,
where I also
solved a bunch of exercises in Q#
which was very fun and rewarding. I'm beginning to feel that learning quantum
computation is the right route to grokking things like entanglement and
superposition, unlike the physics which is first of all much harder due to
infinite dimensionality, and less accessible since we can't program it.
§ Formal research work: Compilers, Formal verification, Programming languages
My research work is on the above topics, so I try to stay abreast of what's
going on in the field. What I've read over the past year on these topics is:
A^2I: meta-abstract interpretation.This paper extends the theory of abstract interpretation to perform abstractinterpretation on program analyses themselves. I'm not sure how useful thisis going to be, as I still hold on to the belief that AI as a framework istoo general to allow one to prove complex results. But I am still interestedin trying to adapt this to some problems I have at hand. Perhaps it's goingto work.
- Cubicial Agda. This paper introducescubical type theory and its implementation in Agda. It appears to solve manyproblems that I had struggled with during my formalization of loopoptimisations: In particular, dealing with Coinductive types in Coq, and thatof defining quotient types / setoids. Supposedly, cubical Agda makes dealingwith Coinduction far easier. It allows allows the creation of "real" quotienttypes that respect equality, without having to deal with
setoid styleobjects that make for large Gallina terms. I don't fully understand how thetheory works: In particular, as far as I can tell, the synthetic intervaltype
I allows one to only access the start and end points (
1),but not anything in between, so I don't really see how it allows forinterpolation. I also don't understand how this allows us to make Univalencecomputable. I feel I need to practice with this new technology before I'mwell versed, but it's definitely a paper I'm going to read many, many timestill I grok it.
- Naive Cubical type theory. This paperpromises a way to perform informal reasoning with cubical type theory, theway we are able to do so with, say, a polymorphic type theory for lambdacalculus. The section names such as "how do we think of paths","what can we do with paths", inspire confidence
- Call by need is Clairvoyant call by value. This key insight is to notice that call by needis "just" call by value, when we evaluate only those values that areeventually forced, and throw away the rest. Thus, if we had an oracle thattells us which values are eventually forced, we can convert call by need intocall by value, relative to this oracle. This cleans up many proofs in theliterature, and might make it far more intuitive to teach call by need topeople as well. Slick paper, I personally really enjoyed reading this.
- Shift/Reset the Penultimate BackpropagatorThis paper describes how to implement backprop using delimited continuations.Also, supposedly, using staging / building a compiler out of this paradigmallows one to write high performance compilers for backprop without havingto suffer, which is always nice.
- Closed forms for numerical loopsThis paper introduces a new algebra of polynomials with exponentials. It thenstudies the eigenvalues of the matrix that describes the loop, and tries tofind closed forms in terms of polynomials and exponentials. They chooseto only work with rationals, but not extensions of rational numbers(in terms of field extensions of the rationals). Supposedly, this is easierto implement and reason about. Once again, this is a paper I'd like toreimplement to understand fully, but the paper is well-done!
- Composable, sound transformations of Nested recursion and loops.This paper attempts to bring ideas from polyhedral compilationinto working with nested recursion. They create a representation usingmultitape finite automata, using which they provide a representation fornested recursion. I was somewhat disappointed that it does not handlemutual recursion, since my current understanding is that one can alwaysconvert nested recursion into a "reasonable" straight line program bysimply inlining calls and then re-using polyhedral techniques.
§ Internship at Tweag.io over the summer: Hacking on Asterius (Haskell -> WebAssembly compiler)
I really enjoyed my time at Tweag! It was fun, and
was a great mentor. I must admit that I was somewhat distracted, by all the new
and shiny things I was learning thanks to all the cool people there
particular, I wound up bugging
and Matthias Meschede
quite a bit, about type theory, MCMC sampling, and signal processing of storm
I wound up reading a decent chunk of GHC source code, and while I can't link
to specifics here, I understood a lot of the RTS much better than I did before.
It was an enlightening experience, to say the least, and being paid to hack on
a GHC backend was a really fun way to spend the summer.
It also led me to fun discoveries, such as
how does one debug debug info?
I also really loved Paris as a city. My AirBnb host was a charming artist who
suggest spots for me around the city, which I really appreciated. Getting
around was disorienting for the first week or so, due to the fact that I could
not (and still do not) really understand how to decide in which direction to
walk inside the subways to find a particular line
going in a particular direction.
The city has some great spots for quiet work, though! In particular, the
was a really nice place to hang out and grab coffee. The model is great: you
pay for hours spent at the Anticafe, with coffee and snacks free. They also
had a discount for students which I gratefully used.
I bumped into interesting artists, programmers, and students who were open for
conversation there. I highly recommend hanging out there.
§ Probabilistic programming & giving a talk at FunctionalConf
This was the first talk I'd ever given, and it was on probabilistic programming
in haskell. In particular, I explained the
monad-bayes approach of
doing this, and why this was profitable.
The slides are available here.
It was a fun experience giving a talk, and I'd like to do more of it, since I
got a lot out of attempting to explain the ideas to people. I wish I had more
time, and had a clearer idea of who the audience was. I got quite a bit of
help from Michael Snoyman to whip the talk into
shape, which I greatly appreciated.
The major ideas of probabilistic programming as I described it are
from Adam Scibior's thesis:
Along the way, I and others at tweag read the other major papers in the space,
- Church, a language for generative models,which is nice since it describes it's semantics in terms of sampling. This isunlike Adam's thesis, where they define the denotational semantics in termsof measure theory, which is then approximated by sampling.
- Riemann Manifold Langevin and Hamiltonian Monte Carlowhich describes how to perform Hamiltonian Monte Carlo on the information geometry manifold. So, for example, if we are trying to sample from gaussians, we sample from a 2D Riemannian manifold with parameters mean and varince, and metric as the Fisher information metric.This is philosophically the "correct" manifold to sample from, since itrepresents the intrinsic geometry of the space we want to sample from.
- An elementary introduction to Information geometry by Frank Nielsensomething I stumbled onto as I continued reading about sampling fromdistributions. The above description about the "correct" manifold forgaussians comes from this branch of math, but generalises it quite a bitfurther. I've tried to reread it several times as I gradually gained maturityin differential geometry. I can't say I understand it just yet, but I hope todo so in a couple of months. I need more time for sure to meditate on theobjects.
- Reimplementation of
monad-bayes.This repo holds the original implementation on which the talk is based on.I read through the
monad-bayes source code, and then re-implemented thebits I found interesting. It was a nice exercise, and you can seethe git history tell a tale of my numerous mis-understandings of MCMC methods,till I finally got what the hell was going on.
§ Presburger Arithmetic
Since we use a bunch of presburger arithmetic
for polyhedral compilation
which is a large research interest of mine, I've been trying to build a
"complete" understanding of this space. So this time, I wanted to learn
how to build good solvers:
bollu/gutenberger is a decisionprocedure for Presburger arithmetic that exploits their encoding as finiteautomata. One thing that I was experimenting with was that we only usenumbers of finite bit-width, so we can explore the entire state spaceof the automata and then perform NFA reduction usingDFA minimisation. Thereference I used for this was the excellent textbookAutomata theory: An algorithmic approach, Chapter 10
- The taming of the semi-linear setThis uses a different encoding of presburger sets, which allows them to bounda different quantity (the norm) rather than the bitwidth descriptions. This allowsthem to compute exponentially better bounds for some operations thanwere known before, which is quite cool. This is a paper I keep trying toread and failing due to density. I should really find a week away from civilizationto just plonk down and meditate upon this.
§ Open questions for which I want answers
I want better references to being able to regenerate the inequalities
description from a given automata which accepts the presburger set automata.
This will allow one to smoothly switch between the geometric description
and the algebraic description. There are some operations that only work
well on the geometry (such as optimisation), and others that only work well on
the algebraic description (such as state-space minimisation). I have not found
any good results for this, only scattered fragments of partial results.
If nothing else, I would like some kind of intuition for why this is hard.
Having tried my stab at it, the general impression that I have is that the
space of automata is much larger than the things that can be encoded as
presburger sets. Indeed, it was shown that automata accept numbers which
are ultimately periodic.
But yes, it's known that automata accept a language that's broader than just
first order logic + "arithmetic with +", which means it's hard to dis-entangle
the presburger gits from the non-presburger bits of the automata.
- first order logic + "arithmetic with +" + (another operation I cannot recall).I'm going to fill this in once I re-find the reference.
I wanted to get a better understading of how prolog works under the hood, so I began
re-implementing the WAM: warren abstract machine.
It's really weird, this is the only stable reference I can find to implementing
high-performance prolog interpreters. I don't really understand how to chase the
paper-trail in this space, I'd greatly appreciate references. My implementation
I had to give up due to a really hard-to-debug bug.
It's crazy to debug this abstract machine, since the internal representation gets
super convoluted and hard to track, due to the kind of optimised encoding it
uses on the heap.
If anyone has a better/cleaner design for implementing good prologs, I'd love
Another fun paper I found in this space thanks to Edward Kmett was
the Rete matching algorithm,
which allows one to declare many many pattern matches, which are then "fused"
together into an optimal matcher that tries to reuse work across failed
§ General Relativity
This was on my "list of things I want to understand before I die", so I wound
up taking up an Independent Study in university, which basically means that
I study something on my own, and visit a professor once every couple weeks,
and am graded at the end of the term. For GR, I wound up referencing a wide
variety of sources, as well as a bunch of pure math diffgeo books. I've read
everything referenced to various levels. I feel I did take away the core
ideas of differential and Riemannian geometry. I'm much less sure I've grokked
general relativity, but I can at least read the equations and I know all the
terms, so that's something.
- The theoretical minimum by Leonard Susskind.The lectures are breezy in style, building up the minimal theory (and no proofs)for the math, and a bunch of lectures spent analysing the physics. While I wishit were a little more proof heavy, it was a really great reference to learn thebasic theory! I definitely recommend following this and then reading otherbooks to fill in the gaps.
- Gravitation by Misner Thorne and Wheeler)This is an imposing book. I first read through the entire thing (Well, the parts I thought I needed),to be able to get a vague sense of what they're going for. They're rigorous ina very curious way: It has a bunch of great physics perspectives of lookingat things, and that was invaluable to me. Their view of forms as "slot machines"is also fun. In general, I found myself repeatedly consulting this book forthe "true physical" meaning of a thing, such as curvature, parallel transport,the equation of a geodesic, and whatnot.
- Differential Geometry of Curves and Surfaces by do CarmoThis is the best book to intro differential geometry I found. It throws awayall of the high powered definitions that "modern" treatments offer, andstarts from the ground up, building up the theory in 2D and 3D. This is amazing,since it gives you small, computable examples for things like"the Jacobian represents how tangents on a surface are transformed locally".
- Symplectic geometry & classical mechanics by Tobias OsborneThis lecture series was great, since it re-did a lot of the math I'd seenin a more physicist style, especially around vector fields, flows, andLie brackets. Unfortunately for me, I never even got to the classicalmechanics part by the time the semester ended. I begantaking down notes in my repo,which I plan to complete.
- Introduction to Smooth manifolds: John LeeThis was a very well written mathematical introduction to differential geometry.So it gets to the physically important bits (metrics, covariant derivatives)far later, so I mostly used it as a reference for problems and more rigour.
- Einstein's original paper introducing GR, translatedfinally made it click as to whyhe wanted to use tensor equations: tensor equations of the form
T = 0 areinvariant in any coordinate system, since on change of coordinates,
Tchanges by a multiplicative factor! It's a small thing in hindsight, but itwas nice to see it explicitly spelled out, since as I understand, no oneamong the physicists knew tensor calculus at the time, so he had to introduceall of it.
§ Discrete differential geometry
I can't recall how I ran across this: I think it was because I was trying to
get a better understanding of Cohomology, which led me to Google for
"computational differential geometry", that finally led me to Discrete
It's a really nice collection of theories that show us how to discretize
differential geometry in low dimensions, leading to rich intuitions and
a myriad of applications for computer graphics.
- The textbook by Kennan Crane on the topicwhich I read over the summer when I was stuck (more often than I'd like) inthe Paris metro. The book is very accessible, and requires just someimagination to grok. Discretizing differential geometry leads to most thingsbeing linear algebra, which means one can calculate things on paper easily.That's such a blessing.
- Geodesics in Heatexplores a really nice way to discover geodesics by simulating the heatequation for a short time. The intuition is that we should think of the heatequation as describing the evolution of particles that are performing randomwalks. Now, if we simulate this system for a short while and then look at thedistribution, particles that reach a particular location on the graph must have taken the shortest path, since any longer path would not have allowed particles to reach there. Thus, the distribution of particles at time
dt does truly represent distances from a given point. The paperexplores this analogy to find accurate geodesics on complex computationalgrids. This is aided by the use of differential geometry, appropriatelydiscretized.
- The vector heat methodexplores computing the parallel transport of a vector across a discretemanifold efficiently, borrowing techniques from the 'Geodesics in Heat'paper.
- Another paper by Kennan Crane: Lie group integrators for animation and control of vehiclesThis paper describes a general recipe to tailor-make integrators for a systemof constraints, by directly integrating over the lie group of theconfiguration space. This leads to much more stable integrators. I have somemisguided hope that we can perhaps adapt these techniques to build better FRP(functional reactive programming) systems, but I need to meditate on this alot more to say anything definitively.
§ Synthetic differential geometry
It was Arnaud Spiwack
who pointed me to this. It's a nice axiomatic
system of differential geometry, where we can use physicist style proofs of
"taking stuff upto order
dx", and having everything work upto mathematical
The TL;DR is that we want to add a new number called
dx into the reals,
dx^2 = 0. But if such a number did exist, then clearly
dx = 0.
However, the punchline is that to prove that
dx^2 = 0 => dx = 0 requires
the use of contradiction!
So, if we banish the law of excluded middle (and therefore no longer use
proof by contradiction), we are able to postulate the existence of a new
dx, which obeys
dx^2 = 0. Using this, we can build up the
whole theory of differential geometry in a pleasing way, without having to
go through the horror that is real analysis. (I am being hyperbolic, but really,
real analytic proofs are not pleasant).
I began formalizing this in Coq and got a formalism going:
Once I was done with that, I realised I don't know how to exhibit models of
the damn thing! So, reading up on that made me realise that I need around 8
chapters worth of a grad level textbook (the aptly named
Models of Smooth Infinitesimal Analysis).
I was disheartened, so I asked on
(also my first ever question there), where I learnt about tangent categories and
differential lambda calculus. Unfortunately, I don't have the bandwidth to read
another 150-page tome, so this has languished.
§ Optimisation on Manifolds
I began reading
Absil: Optimisation on matrix manifolds
which describes how to perform optimisation / gradient descent on
arbitrary Riemannian manifolds, as well as closed forms for well-known
manifolds. The exposition in this book is really good, since it picks a
concrete manifold and churns out all the basic properties of it manually. The
only problem I had with the books was that there were quite a few gaps (?) in
the proofs -- perhaps I missed a bunch.
This led me to learn Lie theory to some degree, since that was the natural
setting for many of the proofs. I finally saw why anyone gives a shit about
the tangent space at the identity: because it's easier to compute! For a
flavour of this,
consider this question on
math.se by me that asks about computing tangent spaces of .
§ AIRCS workshop
I attended the
AI risk for computer scientists
workshop hosted by
MIRI (Machine intelligence research institute) in
December. Here, a bunch of people were housed at a bed & breakfast for a
week, and we discussed AI risk, why it's potentially the most important thing
to work on, and anything our hearts desired, really. I came away with new
branches of math I wanted to read, a better appreciation of the AI risk
community and a sense of what their "risk timelines" were, and some
explanations about sheaves and number theory that I was sorely lacking. All in
all, it was a great time, and I'd love to go back.
§ P-adic numbers
While I was on a particularly rough flight back from the USA to India when
coming back from the AIRCS workshop, I began to read the textbook
Introduction to p-adic numbers by Fernando Gouvea,
which fascinated me, so I then
wrote up the cool parts introduced in the first two chapters as a blog post.
I wish to learn more about the p-adics and p-adic analysis, since they
seem to be deep objects in number theory.
In particular, a question that I thought might have a somewhat trivial answer
(why do the p-adics use base p in defining norm)
turned out to have answers that were quite deep, which was something
unexpected and joyful!
§ Topology of functional programs
Both of these describe a general method to transfer all topological ideas as
statements about computability in a way that's far more natural (at least for
me, as a computer scientist). The notion of what a continuous function "should
be" (keeping inverse images of open sets open) arises naturally from this
computational viewpoint, and many proofs of topology amount to finding
functional programs that fit a certain type. It's a great read, and I feel gave
me a much better sense of what topology is trying to do.
I've wanted to understand philosophy as a whole for a while now, at least
enough to get a general sense of what happened in each century. The last year,
I meandered through some philosophy of science, which led me to some really
wild ideas (such as that of
Paul Feyerabend's 'science as an anarchic enterprise'
which I really enjoyed).
I also seem to get a lot more out of audio and video than text in general, so
I've been looking for podcasts and video lectures. I've been following:
- The history of philosophy without any gapsfor a detailed exposition on, say, the greeks, or the arabic philosophers.Unfortunately, this podcast focuses on far too much detail for me to have beenable to use it as a way to get a broad idea about philosophy in itself.
I also attempted to read a bunch of philosophers, but the only ones I could
make a good dent on were the ones listed below. I struggled in writing this
section, since it's much harder to sanity check my understanding of philosophy,
versus mathematics, since there seems to be a range of interpretations of the
same philosophical work, and the general imprecise nature of language doesn't
help here at all. So please take all the descriptions below with some salt
- Philosophize This! by Stephen WestIs a good philosophy podcast for a broad overview of different areasof Philosophy. I got a lot out of this, since I was able to get a senseof the progression of ideas in (Western) Philosophy. So I now know whatPhenomenology is,or what Foucault was reacting against.
- Discipline and Punish by Michel FoucaultHere, Foucault traces the history of the criminal justice system of society,and how it began as something performed 'on the body' (punishment),which was then expanded to a control 'of the mind' (reform). As usual,the perspective is fun, and I'm still going through the book.
- Madness and Civilization by Michel Foucaultwhich attempts to chronicle how our view of madness evolved as society did.It describes how madmen, who were on the edges of society, but still"respected" (for exmaple, considered as 'being touched by the gods') werepathologized by the Renaissance, and were seen as requiring treatment. I'mstill reading it, but it's enjoyable so far, as a new perspective for me.
- The value of science by Henri Poincare.Here, he defends the importance of experimentation, as well as the value ofintuition to mathematics, along with the importance of what we considerformal logic. It's a tough read sometimes, but I think I got something out ofit, at least in terms of perspective about science and mathematics.
§ Information theory
I've been on a quest to understand information theory far better than I
currently do. In general, I feel like this might be a much better way to
internalize probability theory, since it feels like it states probabilistic
objects in terms of "couting" / "optimisation of encodings", which is a
perspective I find far more natural.
Towards this aim, I wound up reading:
- Information theory, Learning, and inference algorithmsThis book attempts to provide the holistic view I was hoping for. It hasgreat illustrations of the basic objects of information theory. However,I was hoping that the three topics would be more "unified" in the book,rather than being presented as three separate sections with some amountof back-and-forth-referencing among them. Even so, it was a really fun read.
§ Building intuition for Sheaves, Topoi, Logic
I don't understand the trifecta of sheaves, topoi, geometry, and logic, and
I'm trying to attack this knot from multiple sides at once.
All of these provide geometric viewpoints of what sheaves are, in low-dimensional
examples of graphs which are easy to visualize. I'm also trudging through the
which appears to follow the "correct path" of the algebraic geometers, but this
requires a lot of bandwidth.
This is a hardcore algebraic geometry textbook, and is arguably
great for studying sheaves because of it. Sheaves are Chapter 2, and allows
one to see them be developed in their "true setting" as it were. In that
Grothendeick first invented sheaves for algebraic geometry, so it's good to
see them in the home they were born in. Once again, this is a book I lack
bandwidth for except to breezily read it as I go to bed. I did get something
out from doing this. I'm considering taking this book up as an independent
study, say the first four chapters. I'll need someone who knows algebraic
geometry to supervise me, though, which is hard to find in an institute geared
purely for computer science. (If anyone on the internet is kind enough to
volunteer some of their time to answer questions, I'll be very glad! Please
email me at
§ The attic
This section contains random assortments that I don't recall how I stumbled
across, but too cool to not include on the list. I usually read these in bits
and pieces, or as bedtime reading right before I go to bed to skim. I find
that skimming such things gives me access to knowing about tools I would not
have known otherwise. I like knowing the existence of things, even if I don't
recall the exact thing, since knowing that something like
X exists has saved me
from having to reinvent
X from scratch.