Explore vs. exploit

Whenever you go out for ice cream, visit an Italian restaurant, or go on vacation, why not try something completely new (i.e., explore)? Because the new is unknown and may be disappointing. Better to go for something safe and sure (i.e., exploit). But without exploring, there's nothing to exploit.

The book Algorithms to Live By mentions an effective method for balancing the two: Epsilon-over-N Greedy. This method chooses randomly between exploration and exploitation, with odds that depend on your history. At first, it favors exploration, so you can find new favorites. You'll come across disappointing options along the way, though. That's why, to decrease your average regret over time, it gradually shifts to strongly favoring exploitation of your (newly found) favorites, without ever fully giving up on exploring.
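
To make the idea concrete, here is a minimal Python sketch. It is not necessarily how this site implements it. It assumes the usual reading of the name: on your N-th choice you explore with probability epsilon / N (capped at 1), and otherwise you pick the option with the best average rating so far. The names explore_probability, choose, options, and ratings are made up for this illustration.

```python
import random


def explore_probability(n, epsilon=1.0):
    """Chance of exploring on the n-th choice (n starts at 1)."""
    return min(1.0, epsilon / n)


def choose(options, ratings, epsilon=1.0, rng=random):
    """Pick one option.

    options: list of everything you could try.
    ratings: dict mapping an option to the list of scores you gave it.
    rng: anything with random() and choice(), e.g. the random module.
    """
    # This is choice number n: one more than the ratings recorded so far.
    n = sum(len(scores) for scores in ratings.values()) + 1
    rated = [o for o in options if ratings.get(o)]

    # Explore: with no history yet, or with probability epsilon / n,
    # pick any option at random.
    if not rated or rng.random() < explore_probability(n, epsilon):
        return rng.choice(options)

    # Exploit: pick the option with the highest average score so far.
    return max(rated, key=lambda o: sum(ratings[o]) / len(ratings[o]))
```

For example, choose(["vanilla", "pistachio", "mango"], {"vanilla": [3, 4], "pistachio": [5]}) is your fourth choice, so it explores with probability 1/4 and otherwise returns "pistachio". The exploration chance keeps shrinking as your history grows, but it never reaches zero.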

For more details on the workings of this method, see my blog post.

What would you like to try today? (Or would you like to log in first?)