• Gold standard data: lessons from the trenches

    This article is a draft of a talk I am giving at PyData Berlin in July 2017. It is intended for a non-technical audience, but I plan to expand it into a more technical piece soon™.

  • An exploration of scipy sparse matrices

    My colleague Matti Lyra recently faced an interesting computational problem. He wanted to see how quickly a stream of temporally ordered documents evolves, and he chose to do it by looking at how often new words appear in the stream. This post is about how to do this efficiently in Python.

  • Profiling Python

    This article explains the basics of profiling Python code. The hardest part is installing all the great tools that make it trivial to find the bottleneck in your code.
