Blog - Explore vs. exploit

Added on Monday, 2017-08-14 22:52 CEST in category Programming
For my birthday I was gifted the book Algorithms To Live By, which talks about several computer algorithms and how they can be applied in real life. The topics included selling real estate, sorting the papers on your desk in least recently used order, and how to make auctions fairer.

The algorithm that inspired this blog post deals with exploration vs. exploitation. Suppose you go out for some ice cream, go to an Italian restaurant, or go on vacation, why not try something completely new (i.e., explore)? Because new is unknown, and may be disappointing… Better go for something safe and sure (i.e., exploit). But without exploring, there's nothing to exploit.

TL;DR: check out if you should explore something new, or exploit a favorite!

Algorithms To Live By introduces a few methods of finding a balance between the two. You could e.g. decide to pick something new, say, 10% of the time. That way you mainly get to enjoy your favorites, but every once in a while you try something new. This approach yields so-called "linear regret": the number of disappointing options you're likely to encounter, grows linearly over time.

Fortunately, we can do better by using an algorithm the book dubs Epsilon-over-N Greedy, which tackles linear regret by lowering the chances of exploring over time. During the 1st round, your chances of exploring are 1 / 1, i.e., always. During the 2nd round, your chances are 1 / 2, during the 3rd round they're 1 / 3, and so on. Thus, you initially focus on exploration, but over time switch to exploitation of your newly found favorites. Still, you never fully give up on exploring new options either. (Note that if you didn't like what you chose, then that round doesn't count, leaving you to explore a bit more.)

Such a strategy of ever lower chances of exploring on average yields so-called "logarithmic regret". (Worst case you still end up with linear regret, in the very unlucky case of never liking anything new.)

You can try out this algorithm yourself here. Just fill out your name, email address, or any other unique identifier for yourself, and the category you want to explore/exploit. In my experience, exploring has already opened up a few interesting new choices, so make sure to give it a go!