My task management system is very simple. I use t, which means it’s just a text file in which every line corresponds to a task. Done. I have direct access to the plain-text “database”, so I can have fun with it.

This system has helped me get a lot more productive over the last few months. I like numbers, and this system shows me a lot of numbers. But many systems will show you a lot of numbers. The difference is that, because of the plain-text file mentioned above, I can make the numbers I want. I can at a glance see how many tasks are in the list, how many tasks are marked for today, and the total number of tasks I’ve completed.
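As a rough illustration of how little it takes to get these numbers out of a plain-text list, here is a minimal Python sketch. It assumes only what is described above, one task per non-empty line. The `@today` marker and the function names are hypothetical stand-ins for whatever convention you use to mark tasks for today; they are not something t defines.

```python
from pathlib import Path

# Hypothetical marker for "do this today"; an assumption, not a t feature.
TODAY_MARKER = "@today"


def count_tasks(text, marker=TODAY_MARKER):
    """Return (total, marked) for a task list with one task per line."""
    # Blank lines are not tasks, so skip them.
    tasks = [line for line in text.splitlines() if line.strip()]
    marked = sum(1 for line in tasks if marker in line)
    return len(tasks), marked


def count_tasks_in_file(path, marker=TODAY_MARKER):
    """Same as count_tasks, but reads the task file from disk."""
    return count_tasks(Path(path).read_text(encoding="utf-8"), marker)


sample = (
    "Send email about payment to research assistants @today\n"
    "\n"
    "Read a chapter of Machine Learning for Hackers\n"
    "Write tests @today\n"
)
total, today = count_tasks(sample)
print(f"{total} tasks, {today} for today")
```

The same counting could be pointed at wherever your completed tasks are kept to get the third number.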

I can, for example, use my .zshrc file along with powerline to show some of these numbers in my zsh prompt.

If I want to look at my longer-term trends, I have a script that automatically generates a 10-day moving mean of the number of tasks completed per day.
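This is not my actual script, just a minimal Python sketch of the calculation. It assumes you can already produce a list of completion dates, one per finished task; where those dates come from is left open. The function names are made up for the example, and the window is trailing, so the first value appears only once 10 full days are available.

```python
from collections import Counter
from datetime import date, timedelta


def daily_counts(completion_dates):
    """Count completions per calendar day, filling missing days with zero."""
    if not completion_dates:
        return []
    per_day = Counter(completion_dates)
    start, end = min(per_day), max(per_day)
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    # Counter returns 0 for days on which nothing was completed.
    return [(day, per_day[day]) for day in days]


def moving_mean(values, window=10):
    """Trailing moving mean; one value for each full window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    means = []
    running = 0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just fell out of the window.
            running -= values[i - window]
        if i >= window - 1:
            means.append(running / window)
    return means


# Made-up dates, only to show the call sequence.
done = [date(2024, 1, 1), date(2024, 1, 1), date(2024, 1, 3)]
counts = [n for _, n in daily_counts(done)]
print(moving_mean(counts, window=2))
```

Filling the empty days with zeros matters: without it, a stretch of days with nothing crossed off would silently disappear from the average.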

But the numbers are not a great proxy for my productivity. They’re okay most of the time, but sometimes they are a really bad measure. A couple of days ago was one of those days in which the amount of work I had done mapped very poorly onto the number of tasks crossed off the list.

Anyone who has used task management systems knows why: the size of a task can vary wildly. Completing one task (e.g., “Send email about payment to research assistants”) can take a minute. Another task (e.g., “Read a chapter of Machine Learning for Hackers”) will take longer. And for some tasks (e.g., “Write tests for <insert name of project here>”), it is hard to know in advance how long they will take.

The day before yesterday was one of the days that had one task that took a large portion of my day. The task was to implement a new feature in a project that I am going to put up on GitHub pretty soon. I was productive. I spent many hours in focused work. I learned a lot as I worked. Pleasantly enough, my productivity is apparent when you look at another metric: logged keystrokes.

There are some lessons to take from this:

  • Operationalizing[1] is hard.
  • Increasing the number of variables you are using to measure a phenomenon is good. It allows you to perform sanity checks and helps you address suspicions you might have about any one variable.
  • Your theory and reasoning can always trump any data.
  • A measure doesn’t have to be perfect to be usable.

Those last two points are the main takeaways from this post. I just explained how one measure (the number of tasks I completed in a day) can be really bad sometimes. But I’m not going to stop using it, because it still has utility for me. I still like seeing the number of remaining tasks go down during the day. I still like looking at automatically generated plots of personal analytics. The numbers and the plots help by giving me that extra boost to start a task that I am irrationally putting off.

As for the problems with the measure, no one measure is perfect, and productivity is a really hard thing to operationalize. If my usage is consistent, then those one-minute tasks and 6-hour tasks should balance each other out over the long term, and the long-term trends, as a result, should still be informative.

When in doubt, I can look at other corroborative data like my keystrokes, although those, too, have their problems. I only count my keystrokes. I have no way to look at a breakdown of what those keystrokes were doing, or what application they were used in. Tons of keystrokes on a day can mean a lot of coding work, or they could mean that it was a Messages- and Adium-heavy day.
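One way to make that kind of cross-check routine is to flag the days on which the two measures disagree. The following is a minimal Python sketch, not something I actually run. It assumes two equal-length lists of daily totals, scales each by its own median so they are comparable, and flags a day when one scaled value is at least three times the other. The function name, the threshold, and the sample numbers are all made up for illustration.

```python
from statistics import median


def flag_disagreements(tasks_per_day, keys_per_day, ratio=3.0):
    """Return the indices of days on which the two measures diverge."""
    if len(tasks_per_day) != len(keys_per_day):
        raise ValueError("both series must cover the same days")
    task_median = median(tasks_per_day)
    key_median = median(keys_per_day)
    if task_median <= 0 or key_median <= 0:
        raise ValueError("medians must be positive to scale the series")
    flagged = []
    for day, (tasks, keys) in enumerate(zip(tasks_per_day, keys_per_day)):
        # Express each measure relative to a typical day.
        task_rel = tasks / task_median
        key_rel = keys / key_median
        high, low = max(task_rel, key_rel), min(task_rel, key_rel)
        # A day with zero on both measures is quiet, not contradictory.
        if high > 0 and high >= ratio * low:
            flagged.append(day)
    return flagged


# Illustrative values only: day 2 has one task but a lot of typing.
tasks = [8, 7, 1, 9, 8]
keys = [20000, 18000, 30000, 22000, 19000]
print(flag_disagreements(tasks, keys))
```

A flagged day is not an answer, only a prompt to go and think about what actually happened that day.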

There is one more lesson to take away from this:

No data can substitute for your thinking, reasoning, and theory. No data speaks for itself. Take no one’s word for it, and that includes data.

Oh yeah, and induction is false!

  1. “a process of defining the measurement of a phenomenon that is not directly measurable, though its existence is indicated by other phenomena.”