Week 5: Learning and Doing

5 July 2014 - Chicago

This is the fifth in a series of posts chronicling my reflections on participating in the 2014 Data Science for Social Good Fellowship at the University of Chicago.
You can read my last post here:

Week 4: Bringing Humans To The Data

To get automatically notified about new posts, you can subscribe by clicking here. You can also subscribe via RSS to this blog to get updates.

My number one goal in participating in the 2014 DSSG Fellowship was to create something of value. By something valuable, I had in mind a piece of software that helped satisfy the needs of the non profit partners I would work with.

Now halfway through the Fellowship, I’ve increasingly noticed a tension between my goal and that of the Fellowship. As mentioned in my initial reflections about DSSG, the goals of the Fellowship were focused around:

“(a) helping Fellows learn how to use various techniques and tools and (b) developing each Fellow’s interests in working towards social good, open government and open science.”

Both of these are also priorities of mine, but I believe that they could both be obtained primarily through creating a valuable piece of software. Rather than through listening to lectures or reading, I find myself learning most productively (as measured by amount of content retained per unit of time) when I have the chance to apply them in practice.[1] Being exposed to and navigating this tension has made me more aware of how the different goals of learning versus doing translate into mindsets and behaviors.

As someone who came into the Fellowship leaning more towards the doing camp, my mindset towards problem-solving is to focus on effectiveness, and not necessarily on efficiency. I work with the implicit assumption that my code won’t be as clean or properly abstracted as I would like it to be. I take an iterative approach where I take small stabs at the problem and refine my code as I build up my understanding of the problem.

When I come across edge cases in the data (e.g., a client who has a negative age, or a field with multiple values that really mean the same thing), I put aside my curiosity to dig further, make a mental note to explore it later and exclude this data from my analyses. With less hand-wringing about how to deal with strange outliers or edge cases, I default towards simplicity and building the least complex model. In fact, the first model my team and I looked at for Health Leads was one of the simplest possible: a single-variable logistic regression.

As a result of prioritizing doing over learning, I work primarily in iPython Notebook, a web-based Python interpreter. Only after properly mapping out and charting the territory of the problem do I then try to translate the code I’ve hacked together into more cleanly abstracted modules and scripts.[2]

In contrast to my attitude when my aim is doing, when I am clearly optimizing for learning, I focus on efficiency of process rather than effectiveness. Ira Glass, the host of the spectacular radio show This American Life, once said that it’s your taste combined with the relentless effort to materialize it your work that will propel you to greatness.

When I’m optimizing for learning, I’m driven more by the curiosity to understand than by the goal of achieving. I work more slowly, pausing to try to understand the edge cases. By poking around the shells of these outliers, searching for cracks, I end up discovering potholes to fill up in my knowledge. In this state of mind, progress on my work feels slower, but the density of learning is much higher.

Putting these thoughts into the framework of cognitive theories of behavior, I suspect that prioritizing learning over doing is aligns your mindset with an attitude of deliberate practice, key to becoming great at your craft.

In this way, learning is complementary to doing. A strong burst of effort towards learning the fundamentals, the shortcuts and the heuristics are the precursors to getting a ton done.

The Learning/Doing Curve

As the graph I made above shows, I suspect the beginning of a new project will require a heavy commitment towards learning. However, each of the dips in the curve represents a cycle of work that attempts to meet a project deadline happening at the trough of the curve. The oscillations that occur afterwards represent the various stumbling blocks that you encounter during the course of a project.

As I progress I’ll have to keep in mind just how much I want to optimize towards learning versus doing, and try to feel out where the various inflection points are.

##Footnotes

[1] I’ve noticed that I’ve been in the state of ‘flow’ more often when I’m in the act of creation, such as through writing or coding, than when I’m passively absorbing information, such as through watching a talk. One intermediate point between these two different ends is that I’ve also noticed I can easily go into a state of ‘flow’ when I’m absorbing information through reading. However even when I read I find myself underlining key passages, jotting notes and counterpoints and actively thinking about the content.

[2] No matter how much I focus on doing, I can’t escape my aesthetic preference for cleanly and well-factored code. I spent last night obsessing over how to speed a function. I left the office happily at 1:30am with the code running faster by a factor of about 8.

I write weekly posts about data science applied to social causes. If you want to be notified when my next reflection is published, subscribe by clicking here.