An Advent of Thought

  1. I was intending to write and post one (hypo)thesis [relating to thinking (with an eye toward alignment)] each day this Advent, starting on 24/12/01 and finishing on 24/12/24.
  2. Ok, so that didn’t happen, but whatever — an advent of thought can happen whenever :). I’ll be posting the first 8 notes today (25/03/17). (But much of the writing was done in 24/12/[01–24].)
  3. Most of these notes deal with questions that really deserve to have much more said about them — the brief treatments I give these questions here won’t do them justice. I hope to think and write more about many of these topics in the future.1
  4. I’ve tried to state strong claims. I do (inside view?) believe each individual claim (at maybe typically \(p\geq 0.85\)2), but I certainly feel uneasy about many claims (feel free to imagine a version of the notes with even more “probably”s and “plausibly”s if you’d prefer that style) — I feel like I haven’t thought about many of the questions adequately, and I’m surely missing many important considerations.3 I’m certainly worried I’m wrong about it all! [especially given that, to a significant extent, the claims stand or fall together] [if in a few years I don’t think I was wrong/confused about major things here, I should seriously consider considering myself to have died :)]
  5. Even if you happen to find my theses/arguments/analysis wrong/lacking/confused, I’m hopeful you might find [the hypotheses]/[the questions my notes are trying to make progress on] interesting.
  6. If you reason me out of some claim in this list, I’d find that valuable!4
  7. While many notes can stand alone, there are some dependencies, so I’d recommend reading the notes in the given order. The (hypo)theses present a somewhat unified view, there are occasional overlaps in their contents, and there are positive correlations between their truth values. A table of contents:
    1. notes 1–8: on thought, (its) history (past and future), and alignment, with an aim to make us relate more appropriately to “understanding thinking” and “solving alignment”5
      1. thinking can only be infinitesimally understood
      2. infinitude spreads
      3. math, thinking, and technology are equi-infinite
      4. general intelligence is not that definite
      5. confusion isn’t going away
      6. thinking (ever better) will continue
      7. alignment is infinite
      8. making an AI which is broadly smarter than humanity would be most significant
    2. notes 9–18: on values/valuing, their/its history (past and future), and their/its relation to understanding, with an aim to help us think better about our own values and the values present/dominant in the world if we create an artifact distinct and separate from humanity which is smarter than humanity (not published yet)
    3. notes 19–24: on worthwhile futures just having humanity grow more intelligent and skillful indefinitely (as opposed to ever creating an artifact distinct and separate from us which outgrows us6), with an aim to get us to set our minds tentatively to doing just that (not published yet)

Acknowledgments. I have benefited from and made use of the following people’s unpublished and/or published ideas on these topics: especially Sam Eisenstat; second-most-importantly Tsvi Benson-Tilsen; also: Clem von Stengel, Jake Mendel, Kirke Joamets, Jessica Taylor, Dmitry Vaintrob, Simon Skade, Rio Popper, Lucius Bushnaq, Mariven, Hoagy Cunningham, Hugo Eberhard, Peli Grietzer, Rudolf Laine, Samuel Buteau, Jeremy Gillen, Kaur Aare Saar, Nate Soares, Eliezer Yudkowsky, Hasok Chang, Ian Hacking, Ludwig Wittgenstein, Martin Heidegger and Hubert Dreyfus, Georg Wilhelm Friedrich Hegel and Gregory B. Sadler, various other canonical philosophers, and surely various others I’m currently forgetting.7


  1. In particular, I might improve/rewrite/expand and republish some of the present notes in the future.↩︎

  2. though I expect a bunch of them to eventually come to be of type nonsense↩︎

  3. You know how when people in the room are saying \(X\) and you think sorta-\(X\)-but-sorta-not-\(X\), then you might find yourself arguing for not-\(X\) in this room (but if you were trying to be helpful in a room of not-\(X\)-ers, you’d find yourself arguing for \(X\) in that room), and it’s easy to end up exaggerating your view somewhat in the direction of not-\(X\) in this situation? These notes have an early archeological layer in which I was doing more of that, but I decided later that this was annoying/bad, so this early layer has now largely been covered up in the present palimpsest. The title (hypo)theses are a main exception — to keep them crisp, I’ve kept many of them hyperbolic (but stated my actual position in the body of the note).↩︎

  4. Still, it could happen that I don’t respond to a response; in particular, it could happen that [I won’t find your attempt to reason me out of some position compelling, but I also won’t provide counterarguments], and it could happen that I learn something from your comment but fail to thank you. So, you know, sorry/thanks ahead of time :).↩︎

  5. I use scare quotes throughout these notes to indicate terms/concepts which I consider particularly bad/confused/unreliable/suspect/uncomfortable/unfortunate/in-need-of-a-rework; I do not mean to further indicate sneering. (I also use double quotation marks in the more common way though — i.e., just to denote phrases.)↩︎

  6. or multiple artifacts distinct and separate from us which outgrow us↩︎

  7. I might go through the notes at some point later and add more specific acknowledgments — there are currently a bunch of things which are either fairly directly from someone or in response to someone or developed together with someone. Many things in these notes are really responses to past me(s), though most of the time I don’t take them to have held an exactly contrary (so wrong :)) position, but rather [to have lacked a clear view on] or [to have had only bad ways to think about] matters.↩︎