Secretary problem under value change - optimizing over utility functions. Should you aim for the best possible utility function or not? Can you double back? Do you have ordinal or cardinal knowledge of a function's goodness? Is indexical uncertainty another thing again? Is cluelessness? What are the costs of waiting? What are the chances of rejection - akrasia models this (partly - does it double-count, since akrasia applies to all actions, including changing your values)? What about the fact that you can change the range of values you'll be exposed to - how does that fit into explore/exploit? Does the marginal utility of further exploration decrease? Isn't there an infinite range of functions - countably or uncountably infinite?
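A quick sketch of the classical stopping rule for reference (my own toy simulation; the uniform candidate values and the ~37% success rate are standard, not from these notes):

```python
import math
import random

def secretary_trial(n, rng, cutoff_frac=math.exp(-1)):
    """One run of the classical rule: observe the first ~n/e candidates
    without committing, then take the first later candidate who beats
    everyone seen so far."""
    values = [rng.random() for _ in range(n)]
    best_overall = max(values)
    cutoff = int(n * cutoff_frac)
    best_seen = max(values[:cutoff], default=float("-inf"))
    for v in values[cutoff:]:
        if v > best_seen:
            return v == best_overall  # stopped here: did we get the best?
    return False                      # the best was in the lookahead window

def success_rate(n=100, trials=10000, seed=0):
    rng = random.Random(seed)
    return sum(secretary_trial(n, rng) for _ in range(trials)) / trials
```

The simulated success rate sits near 1/e ≈ 0.37; the questions above are about what breaks when the candidates' values drift, when you can double back, or when you only know ordinal rankings.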

And then how will game-theoretic considerations come into it? I guess utility functions are mostly public goods. Where do different functions fit on the rivalrous-anti-rivalrous and excludable-anti-excludable spectrums? E.g. the benefits and harms of ethical vs economic specialisation. Also ethical specialisation of sub-agents somehow? Schemas? Masters and servants to be deployed?

More variations of the secretary problem, and combinations with other problems?

How do these examples of bounded rationality fit with moral psych and behavioural econ findings?

Advice, age, and death - MAB models show that the near-dead should exploit - different to us - and maybe that exploiting is part of why their lives go better.
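A toy explore-then-exploit sketch of the horizon point (my own; the Bernoulli arm means 0.3/0.5/0.7 and the round-robin strategy are assumptions for illustration): exploration that pays for itself over 300 pulls is wasted when only a dozen remain.

```python
import random

def explore_then_exploit(horizon, explore_steps, arm_means, rng):
    """Pull arms round-robin for explore_steps, then commit to the
    empirically best arm (random tie-break) for the remaining steps."""
    n = len(arm_means)
    counts, sums, total = [0] * n, [0.0] * n, 0.0
    for t in range(horizon):
        if t < explore_steps:
            a = t % n
        else:
            best = max(sums[i] / counts[i] for i in range(n) if counts[i])
            a = rng.choice([i for i in range(n)
                            if counts[i] and sums[i] / counts[i] == best])
        r = 1.0 if rng.random() < arm_means[a] else 0.0
        counts[a] += 1
        sums[a] += r
        total += r
    return total

def avg_reward(horizon, explore_steps, trials=3000, seed=1):
    rng = random.Random(seed)
    arms = [0.3, 0.5, 0.7]
    return sum(explore_then_exploit(horizon, explore_steps, arms, rng)
               for _ in range(trials)) / trials
```

With a short horizon (12 pulls), exploring the whole time does worse than a quick look followed by commitment; with a long horizon (300 pulls), a larger exploration budget wins.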

How does competition affect the trade-off - e.g. Hollywood sequels?

Switching costs, lack of geometric discounting, something about intervals? These make the Gittins index less applicable to ethics.

Will utility functions that discount get weighted more, by hacking the moral uncertainty algorithm?

Upper confidence bound and the Gittins index both beat naive expected-value maximisation in MABs - might they beat it in moral uncertainty too?
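A minimal UCB1 sketch (the standard algorithm; the Bernoulli arms 0.2/0.8 are assumed for illustration). Because regret grows only logarithmically, per-step regret shrinks as the horizon grows - the behaviour the next note's "logarithmic rate of increasing regret" refers to.

```python
import math
import random

def ucb1_run(horizon, arm_means, rng):
    """UCB1: pull the arm maximising empirical mean + sqrt(2 ln t / n_i),
    i.e. optimism in the face of uncertainty."""
    n = len(arm_means)
    counts, sums, total = [0] * n, [0.0] * n, 0.0
    for t in range(1, horizon + 1):
        if t <= n:
            a = t - 1  # initialisation: one pull per arm
        else:
            a = max(range(n), key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2 * math.log(t) / counts[i]))
        r = 1.0 if rng.random() < arm_means[a] else 0.0
        counts[a] += 1
        sums[a] += r
        total += r
    return total

def per_step_regret(horizon, trials=200, seed=2):
    rng = random.Random(seed)
    arms = [0.2, 0.8]
    avg = sum(ucb1_run(horizon, arms, rng) for _ in range(trials)) / trials
    return (max(arms) * horizon - avg) / horizon
```

The exploration bonus sqrt(2 ln t / n_i) is exactly the "justified optimism": poorly sampled arms get the benefit of the doubt, but that benefit decays as evidence accumulates.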

Justifying optimism, and the weighting of the value of information, via the logarithmic rate of regret growth under regret-minimising algorithms.

Why might we exploit late but hire early? Social reasons?

If the restless bandit problem is intractable, does that mean no one is responsible for their actions in restless bandit problems? Or does the computational limit put an upper bound on the threshold for virtuous behaviour somehow? Can perception bypass the computational limits on reasoning about reasons, as Weil or Murdoch say? What about virtues of salience - do they raise the upper bound of virtue thresholds/blameworthiness, since ought implies can (compute)?

Do old people also over-exploit? Does forgetting that you will lose your capacities to exploit (and explore) mean people switch to exploiting too late in life? If so, the over-exploring seen in short-interval experimental tasks may generalise to life as a whole. The interesting point is that different tasks have different intervals for exploring and exploiting, so you have to switch at different times. Additionally, different kinds of exploring will generalise to different extents - should the more generalisable kinds be prioritised? What about meta-exploration?

  • Do back-chaining consequentialist measures or myopic gradient-descent methods work better or worse in explore/exploit phases?

Races vs fights map onto single-metric rankings vs pairwise-comparison sorts. Relevance for dominance hierarchies. Also link to the Applied Divinity Studies post on the social fog of war.

The social fog of war is also interestingly connected to the ru vs fa debate - one could pitch it as division of powers/Legalism vs social fog of war/Confucianism. Apparently Montesquieu recapitulates the Confucian position that a government of rites is better than a government of laws.

Least-recently-used caching = that property where current lifespan correlates with future longevity in cultural artefacts - contra(?) the doomsday argument / Copernican principle. Heritage, NIMBYism, and the long term.
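A minimal LRU cache sketch (the standard technique, my own toy implementation): the eviction rule bets that whatever has gone longest unused will stay unused - the Copernican-style inference from observed lifespan to expected longevity mentioned above.

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity cache that evicts the least-recently-used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._store = OrderedDict()  # insertion order = recency order

    def get(self, key, default=None):
        if key not in self._store:
            return default
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def put(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # drop least recently used
```

E.g. with capacity 2, touching "a" before inserting "c" saves "a" and evicts "b" - recent use is the cache's only evidence about the future.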

Can measures of computational complexity provide criteria for legitimate maxims in Kant (since willing is a limited-complexity activity), or for preference-aggregation algorithms in Rawls-style implicit social contracts (because implicitly consenting is also limited-complexity)?

Social emotions, especially anger, are built-in enforcement mechanisms for contracts, as in mechanism design. If other people expect us to be irrationally angry about theft, we will rarely have to be irrationally angry. So too for love and guilt, e.g. in marriage. We avoid rationally jumping ship from marriages by being driven by love anchored to that particular person, not to any set of their properties. Love is like a mafia boss. Marriage is a prisoner's dilemma where you get to choose your partner. Emotion-driven, involuntary falling in love both spares us recursive overthinking of the other's intentions and changes the payoffs so as to shift us to a better equilibrium; similarly, a capacity for heartbreak makes us a trustworthy partner.

Vickrey auctions and deontological rules (which avoid cluelessness and mind-modelling regresses) are both winning strategies for computationally limited agents. Relevant to cluelessness, alignment stuff? In Vickrey auctions honesty is the best policy - look for that structure elsewhere.
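A toy check of the honesty property (my own sketch; the uniform valuations and three opposing bidders are assumptions): in a second-price sealed-bid auction, shading your bid below your true value never yields more utility than bidding truthfully.

```python
import random

def vickrey_utility(value, bid, other_bids):
    """Second-price auction: highest bid wins and pays the second price
    (here, the highest opposing bid)."""
    top_other = max(other_bids)
    if bid > top_other:
        return value - top_other  # winner pays the second-highest bid
    return 0.0                    # losers get nothing, pay nothing

def count_shading_wins(trials=10000, seed=3):
    """How often does a shaded (below-value) bid strictly beat truth-telling?"""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        value = rng.random()
        others = [rng.random() for _ in range(3)]
        truthful = vickrey_utility(value, value, others)
        shaded = vickrey_utility(value, value * rng.random(), others)
        if shaded > truthful:
            wins += 1
    return wins
```

The count stays at zero: since the price you pay is set by others' bids, lowering your own bid can only change whether you win, never what you pay - the computational point being that no modelling of opponents' minds can improve on honesty.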

Computability of Nash equilibria in bluffing games / the guess-2/3-of-the-average game, and its relevance for the correct model of honesty.
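The 2/3-of-the-average game's iterated best-response logic can be sketched as follows (toy illustration; the 0-100 guess range is the usual setup): each round of elimination caps rational guesses at 2/3 of the previous cap, driving the only equilibrium to zero.

```python
def rational_guess_ceiling(rounds, start=100.0, factor=2 / 3):
    """Iterated elimination of dominated strategies: if everyone guesses
    at most u, the target (2/3 of the average) is at most factor*u, so a
    rational guess is capped at factor*u. Repeat for each level of
    reasoning about others' reasoning."""
    u = start
    for _ in range(rounds):
        u *= factor
    return u
```

Real polls famously stop after a few levels of this reasoning, which is part of why the game bears on bounded honesty and bluffing: the equilibrium is computable, but playing it only pays against opponents who also compute it.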

You can use honesty as a model of philosophical investigation.

Do deontologists have ways of achieving a conservative field using only systems of rules, not utility functions?

Montezuma's Revenge, learnable by humans but not by a vanilla DQN, shows the necessity of curiosity in overcoming sparse rewards. How does this tie in to curiosity and dopamine as intrinsic incentives for exploration?
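A toy chain-world sketch of the sparse-reward point (my own; not a DQN - the 20-state chain and count-based novelty bonus are assumptions for illustration): undirected exploration wanders, while an agent driven toward less-visited states marches straight to the distant reward.

```python
import random

def steps_to_goal(policy, n_states=20, max_steps=10000, seed=5):
    """Chain world: start at state 0; the only reward is at state n-1.
    'random' picks a neighbour uniformly; 'curious' moves toward the
    less-visited neighbour (a count-based exploration bonus)."""
    rng = random.Random(seed)
    visits = [0] * n_states
    pos, steps = 0, 0
    while pos != n_states - 1 and steps < max_steps:
        visits[pos] += 1
        left = max(pos - 1, 0)
        right = min(pos + 1, n_states - 1)
        if policy == "random":
            pos = rng.choice([left, right])
        else:  # "curious"
            pos = right if visits[right] <= visits[left] else left
        steps += 1
    return steps
```

The curious agent reaches the goal in the minimum 19 steps because novelty itself is rewarding long before the environment pays out - a crude stand-in for the dopamine-driven intrinsic motivation question above.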

How different will natural abstractions be across different language games - explanation vs justification, say? Consider the concepts common to all embodied creatures, to non-embodied ones, to asocial, social, and musical creatures, etc. Maybe also Wittgenstein and Sellars?

Psychologists' habit of always incentivising behaviour in experiments prevents them from investigating how organisms behave in response to complex or minimal incentives.

What's the point of writing?

Paul Graham notes

  • Superlinear returns to effort can arise via either exponential processes or sharp cutoffs in outcome value (like win/loss). Often the two factors go together: winning in a sharp-cutoff environment, e.g. winning a battle, makes it more likely you win the next one, leading to exponential growth.
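That compounding can be sketched with a toy model (my own; the strength-proportional win probability and the 1.1/0.9 multipliers are assumptions): each battle is a sharp win/loss cutoff, and each win makes the next win more likely, so a head start grows more than proportionally.

```python
import random

def campaign(initial_strength, battles, rng):
    """Each round is won with probability s/(s+1); winning compounds
    strength multiplicatively, losing erodes it."""
    s = initial_strength
    for _ in range(battles):
        p_win = s / (s + 1.0)  # the stronger side is more likely to win
        if rng.random() < p_win:
            s *= 1.1           # victory compounds the advantage
        else:
            s *= 0.9
    return s

def avg_final(initial, battles=60, trials=2000, seed=4):
    rng = random.Random(seed)
    return sum(campaign(initial, battles, rng)
               for _ in range(trials)) / trials
```

Doubling or quadrupling the starting strength more than doubles or quadruples the expected final strength: the sharp per-battle cutoff feeds the exponential process, which is exactly the pairing the note describes.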