Statistics Slam

Sudoku Slam 2008 Annual Report

Here at Sudoku Slam, we've been collecting statistics on people's usage of the site. We thought you might be interested in getting a look at the data. Note that aside from IP addresses, we do not store any information about users' identity. Protecting your privacy is very important to us.

Table of Contents

  1. Who visits the site?
    1. The browser wars
    2. Puzzles served
    3. Puzzles served by weekday
    4. Return visits
    5. Consecutive puzzles finished
    6. Puzzle frequency
  2. How hard are the puzzles?
    1. Difficulty breakdown
    2. Mode breakdown
    3. Mode and difficulty breakdown
    4. Probability puzzle is finished
    5. Hints used
    6. Effect of hints on solving time
    7. Solving time CDF (traditional)
    8. Solving time CDF (sumo)

Who visits the site?

To date (November 30th, 2008) the site has served over 3,000,000 puzzles. In this report we'll look at the most recent 500,000 puzzles, of which 365,000 were completed. We serve between 3,000 and 4,000 puzzles per day. We are grateful for the 50 donations we have received so far.

We first looked at the kind of browser that people use when they visit the site. Over 40% of puzzles are served to Firefox users. Only 20% of puzzles go to "legacy" browsers like Firefox 2 and IE 6. The unknown category is split between Firefox 1, various Firefox betas, and Opera.

Although we do not advertise in any way, our traffic has increased moderately throughout 2008. Word of mouth may be responsible.

Next we looked at our traffic over the course of a single week. Although the site is recreational, it seems to be used more often during working hours. Unfortunately, since we don't know users' time zones, it's difficult to draw many conclusions from this graph (all times are shown in PST). Nevertheless, the big spikes seem to be before and after lunch and right before work ends. We get less traffic on weekends.

We did a simple analysis to see how many times a given user (really a given IP address) finishes a puzzle. According to the data, 75% of people finish more than one puzzle at the site. The number is likely even higher, since dynamic IP assignment makes tracking actual users difficult.
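As a rough illustration of this kind of count, here is a minimal Python sketch. The function name and the sample IP addresses are made up for the example; our actual log format is not described in this report.

```python
from collections import Counter

def share_with_repeat_finishes(finished_ips):
    """Fraction of distinct IP addresses that finished more than one puzzle.

    finished_ips: one entry per finished puzzle (hypothetical log format).
    """
    counts = Counter(finished_ips)
    if not counts:
        return 0.0
    repeat = sum(1 for n in counts.values() if n > 1)
    return repeat / len(counts)

# Made-up sample: two of the four addresses finished more than one puzzle.
sample = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.3", "10.0.0.4"]
print(share_with_repeat_finishes(sample))
```

Because one person may appear under several addresses, a count like this tends to understate repeat usage.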

Some people really like the site a lot! We looked through the data for "streaks." A user has a streak of length N when they solve N puzzles, possibly taking a break for up to one hour between puzzles. (We track users by IP address, which is admittedly error-prone.) The graph above shows a histogram of the lengths of streaks. Forty percent of streaks have length 1, meaning the user did one puzzle and stopped. About 80% of the time, people stop after four puzzles or fewer. However, about 5% of streaks involve 32 puzzles or more, possibly lasting for many hours. Way to go, guys!
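The streak definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (IP address plus finish time in seconds) is hypothetical, and we assume the one-hour break is measured between consecutive finish times.

```python
from collections import defaultdict

MAX_BREAK_SECONDS = 3600  # "a break for up to one hour between puzzles"

def streak_lengths(finishes, max_break=MAX_BREAK_SECONDS):
    """Return the length of every streak found in the data.

    finishes: iterable of (ip, finish_time_in_seconds) pairs (assumed layout).
    """
    by_ip = defaultdict(list)
    for ip, t in finishes:
        by_ip[ip].append(t)

    lengths = []
    for times in by_ip.values():
        times.sort()
        current = 1
        for prev, nxt in zip(times, times[1:]):
            if nxt - prev <= max_break:
                current += 1             # same streak continues
            else:
                lengths.append(current)  # gap too long: close the streak
                current = 1
        lengths.append(current)          # close the last streak for this IP
    return lengths

# Made-up sample: "a" solves two puzzles, takes a long break, solves one more.
print(streak_lengths([("a", 0), ("a", 600), ("a", 5000), ("b", 100)]))
```

A histogram of the returned lengths gives a plot like the one described here.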

We hate to admit it, but there's a very remote possibility that you'll see the same puzzle twice on Sudoku Slam. The histogram above shows how many times a given puzzle is finished. We have lots of piledriver and easy puzzles available, so they are more likely to be seen. The other difficulties are harder to generate, so we have a smaller set of them to serve up to people. The unexpected spike at 1 is explained by the fact that some puzzles are entered by users and never served up to anyone else.

How hard are the puzzles?

It's very important to us to rank puzzles correctly (easy, medium, hard, etc.). The data below helps to assess the quality of our puzzle rankings.

Since medium puzzles are the default, it's no surprise that most completed puzzles are mediums. Trivial puzzles are seen only when users enter them manually. The data is somewhat surprising to us, since much of our effort in the past has been directed at improving the hard and bodyslam puzzles. However, 60% of the puzzles we serve are of easy or medium difficulty. Only 9% of people choose piledriver puzzles; these are likely to be people who are not satisfied by bodyslam puzzles.

A related statistic is the mode that people use to solve puzzles. Our support for "traditional" mode has always been a weak spot, since hints don't work well. Luckily, most people seem to choose sumo mode. Of course, this may be a case of selection bias: maybe people who prefer traditional mode simply use other sites.

This graph breaks things down a bit more and shows that people who pick easy puzzles use traditional mode, and everyone else prefers sumo mode. To simplify the user interface, it might make sense to select traditional mode by default when a user picks an easy puzzle, and select sumo mode otherwise. Of course it still makes sense to offer a choice in some way, since advanced users may prefer the challenge of using traditional mode on difficult puzzles.

The next graph shows how likely it is that people finish the puzzles they start. On average, people seem to finish a puzzle 3/4 of the time. This seems like a reasonable number, given the likelihood of distractions and boredom. More interesting is the fact that all levels except piledriver have similar probabilities of being finished. One might expect that the harder puzzles would be less likely to be finished. There are two reasons why this might not be true: (1) perhaps people only give up due to external distractions, and if they get frustrated they just start using hints; (2) bodyslam puzzles are harder than easy puzzles, but the people who choose the bodyslam level are likely to be better solvers. The fact that piledriver puzzles actually have a lower probability of completion supports the first hypothesis, since hints don't work as well for piledriver puzzles.
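For readers who want to reproduce this kind of breakdown, here is a small Python sketch. The record layout (difficulty plus a finished flag) and the sample records are invented for illustration; only the overall figures (365,000 completed out of 500,000 served) come from this report.

```python
from collections import defaultdict

def completion_rate_by_level(records):
    """Return {difficulty: fraction of started puzzles that were finished}.

    records: iterable of (difficulty, finished) pairs (assumed layout).
    """
    started = defaultdict(int)
    finished = defaultdict(int)
    for level, done in records:
        started[level] += 1
        if done:
            finished[level] += 1
    return {level: finished[level] / started[level] for level in started}

# Overall rate from the figures quoted in this report: roughly 3/4.
overall = 365_000 / 500_000
print(round(overall, 2))

# Made-up records, just to show the per-level output shape.
records = [("easy", True), ("easy", True), ("easy", False), ("easy", True),
           ("piledriver", True), ("piledriver", False)]
print(completion_rate_by_level(records))
```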

This next graph shows how often people request a hint. It does not take into account the "Do I need a Smart Hint?" feature, which reveals less information. Perhaps as expected, people solving easy puzzles do not use hints (they don't work very well there). Piledriver solvers also don't use hints much, probably for the same reason.

Otherwise, we expected that hint usage would be an increasing function of difficulty. This turns out not to be the case. Medium puzzles indeed require fewer hints than hard puzzles, but bodyslam puzzles also require fewer hints than hard puzzles (although most of the time when people require lots of hints, they're doing a bodyslam puzzle).

It may be that solvers who request bodyslam puzzles are already very experienced and need few hints. And perhaps solvers choosing hard puzzles originally started at medium on our site and then worked their way up to hard. However, if this is true it suggests that these solvers "get stuck" and never work their way up to bodyslam. More study of how individual solvers "learn" would be quite interesting.

The graph above tries to assess how much a hint can decrease the time it takes to solve a puzzle. (It would be nice to know the effect of hints on whether a puzzle is finished at all, but we don't have enough data to analyze that case.) Since solving time is heavily dependent on whether traditional or sumo mode is used, we did not look at easy puzzles (where traditional mode is usually used). We plot the median time, rather than the average, since some people leave a puzzle for a long time without pausing, inflating the time significantly.
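To see why the median is the safer choice here, consider this small Python sketch. The solving times are invented for illustration; the last one stands in for a puzzle left open for ten hours without pausing.

```python
from statistics import mean, median

# Made-up solving times in seconds. The final value represents a puzzle
# that was left open for ten hours without being paused.
times = [240, 260, 300, 310, 330, 36000]

print(mean(times))    # pulled far above every typical time by the one outlier
print(median(times))  # stays close to the typical solving time
```

With these numbers the mean works out to 6,240 seconds (over 100 minutes), while the median is 305 seconds, about five minutes.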

There seem to be two counteracting effects here. On the one hand, people who use hints take less time solving the puzzle than if they didn't use a hint. On the other hand, people who don't use hints are likely to be better (and thus faster) solvers.

For medium puzzles, the second effect seems to dominate: people who need lots of hints are slower. For hard and bodyslam puzzles the first effect is more important—using a hint helps you solve faster. In all cases, the people who don't use hints are the fastest, since they are likely to be the most competent solvers. Piledriver puzzles may be somewhat anomalous, since hints often fail to be of much use.

This cumulative distribution function shows how long it takes users to solve puzzles in traditional mode. Note that the scale on the X axis is logarithmic. To find the median times, locate the 0.5 mark on the Y axis and find the time where the line for the given difficulty reaches 0.5. If your average time is 3 minutes on easy puzzles, then find the height of the easy line at 3 minutes (0.2); this means you are in the top 20% of solvers.
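Reading the graph this way amounts to evaluating an empirical CDF. Here is a minimal Python sketch of both lookups, assuming a non-empty list of solving times in seconds; the function names and the sample data are made up for the example.

```python
import math
from bisect import bisect_right

def fraction_solved_within(times, t):
    """Height of the empirical CDF at t: share of solves taking t or less."""
    ordered = sorted(times)
    return bisect_right(ordered, t) / len(ordered)

def time_at_fraction(times, q):
    """Smallest observed time at which the empirical CDF reaches q."""
    ordered = sorted(times)
    # The i-th sample (0-based) has cumulative fraction (i + 1) / n.
    idx = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[idx]

# Made-up solving times in seconds.
times = [120, 150, 180, 240, 300, 360, 420, 480, 600, 900]

print(fraction_solved_within(times, 180))  # share of solves at 3 minutes or less
print(time_at_fraction(times, 0.5))        # time where the curve reaches 0.5
```

In this sample, 3 of the 10 solves take 180 seconds or less, so a 3-minute solver would sit in the top 30%, and the curve reaches 0.5 at 300 seconds.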

Mostly the data is as expected: harder puzzles take longer. However, piledriver puzzles are sometimes solved very quickly. We tracked the knee in the piledriver curve down to a single user (who we call "Rain Man") who only solves piledriver puzzles in traditional mode. This user has solved over 1,138 puzzles this way, with a median solving time of 67 seconds and a minimum time of 47 seconds. At first we suspected cheating, but the user's activity is spread out over a period of months, which makes it seem genuine.

This graph is similar to the one above, but it shows the data for puzzles solved in sumo mode. The main difference is that puzzles are solved faster in sumo mode. Easy puzzles are solved especially quickly. In many cases, the auto-fill feature of sumo mode can completely solve an easy puzzle without any user intervention (sumo mode is not designed to be used on easy puzzles—perhaps it should be disabled). Even if it doesn't completely solve the puzzle, it usually solves it almost entirely, with little need for help from the user.

As before, the quick times for piledriver puzzles are troublesome. In this case, they are more easily explained. A piledriver puzzle is any puzzle that requires some "logical leap" that is not encapsulated in any of our solver's tactics (such as colors, X-wing, etc.). Otherwise, the puzzle may be very easy to solve. Hard and bodyslam puzzles are designed to have at least some hard steps, but also to be challenging throughout. Thus, it's reasonable that they would take longer to solve.

Last modified: Wed Dec 24 20:42:13 PST 2008