Engineer Better Research Results From a Solid Workbench

Treating the process of your work as being just as important as the result will improve the quality of your results. All of the most successful projects that I’ve seen share a common factor: they are a delight to work on. When your workspace is organized, your tools are sharp, and the goals are clear, it’s easier to stay in a flow state and to do your best work. Projects that are mired in tedium, don’t have a good feedback loop, and don’t have a solid pattern of delivery can easily get into trouble. Without enough institutional momentum to make up for the poor engineering environment, they can fail. A lot of focus gets put on building the right thing for customers, and rightfully so, but it’s important to remember that before we can ship anything, we have to first build our workbench. Whether we do that haphazardly or intentionally can have an enormous impact on the quality of our results.

Inspirations from Human-Centered Design and The Toyota Way

Two sources I draw inspiration from are Don Norman’s school of human-centered design and The Toyota Way. Norman’s writing on product design is so seminal that his name has even been used to denote those confusing doors that you don’t know whether to push or pull: Norman doors. I believe his design principles are as applicable to our workspaces as engineers and researchers as they are to consumer product design. Concepts like feedback, mapping, and constraints can guide us in creating a workspace that works for us instead of against us. A lossy summary of those concepts might be: The things we use should work the way we expect them to, with intuitive and obvious controls. The Toyota Way is a 14-principle system that guides every aspect of Toyota’s business. A few of those principles that resonate with me are: creating processes that bring problems to the surface (and stopping to fix them), standardizing tasks and processes, and “go and see for yourself.” These principles include an organization method called 5S, which essentially boils down to “everything has a place, and keep the workspace tidy.” In the culinary world, chefs have a similar concept: mise en place. Everything should be arranged properly before you start to cook.

Of course I’m not the only software engineer to recognize the utility of these disciplines. These ideas have been applied in many ecosystems around the software community. You can find traces of The Toyota Way in Joel Spolsky’s The Joel Test, whether intentional or not. “Can you make a build in one step?” is a prime example of standardizing tasks and processes. “Do you fix bugs before writing new code?” sounds a lot like Toyota’s practice of stopping the production line to fix quality issues. We can thank the Ruby community for pioneering a doctrine of optimizing for developer happiness. Practices born out of that philosophy, like database migration scripts, dependency management, and continuous integration/continuous delivery, have become ubiquitous over the last decade. The idea goes: Tools that are nice to use will get used better, and happy developers are more likely to make good decisions.

Engineering Empowers Research

One area that’s still wide open is the intersection of research and software engineering. Researchers have different backgrounds from engineers. They are accustomed to different tools and practices, and the environment of academia doesn’t instill a quality engineering mindset. The work researchers do is amazing, but I dare you to find the repo for any SOTA machine learning paper and try to run it yourself. This is where research engineers can shine: keeping machine learning projects organized and on track, and empowering researchers to deliver their best work. Here are a few hard-earned tips that I’ve learned by working on machine learning and computer vision systems:

  • Visualizations and process automation are the two biggest hitters when it comes to tooling. Trying to repeat tedious processes manually without making mistakes is a fool’s errand, and trying to debug a visual system without visualizations is like debugging normal code without logging anything to the console.

  • Providing runnable examples can help new team members build a conceptual model more quickly by providing an entry point into the system.

  • Lastly, delivering the results isn’t enough. You should package them up and put a bow on top by including a manifest file that helps to contextualize them.

Visualizations

Visualizations are invaluable when it comes to debugging and quality control of a computer vision project. A bug that can take hours or days to solve by looking at code and logs may be completely obvious the moment you render a frame of video with annotations drawn on it. Good diagnostic visuals can also help you avoid regressions, or catch poor performance against new data. I usually add a “--diagnostics” flag to my programs that, when enabled, renders a bunch of images into the output directory, like error histograms, confusion matrices, or annotated image frames. I’ve also found it helpful to have an interactive diagnostic mode where the program will halt in the middle of what it’s doing, expose the system’s state visually, and let me explore. Think: the equivalent of “pdb”, but for visual debugging. Building tooling along with the product itself vastly improved my productivity while working on a 3D triangulation system. As I was green-fielding the code, I was able to see which 2D detections I was grabbing and visually check that my projections were sane, my geometry was correct, and, ultimately, that we ended up with 3D triangulations with strong inlier support. Afterwards, that tooling came in handy over and over to investigate bugs.
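To make the idea concrete, here is a minimal sketch of what such a “--diagnostics” flag could look like. The function names, the placeholder data at the bottom, and the choice of OpenCV and matplotlib are my own illustrations, not the actual tooling from that project.

# Minimal sketch of a --diagnostics flag; everything here is illustrative.
import argparse
import os

import cv2                     # pip install opencv-python
import numpy as np
import matplotlib
matplotlib.use("Agg")          # headless rendering, safe on build servers
import matplotlib.pyplot as plt


def save_error_histogram(errors: np.ndarray, out_dir: str) -> None:
    """Render a histogram of per-point errors into the output directory."""
    plt.figure()
    plt.hist(errors, bins=50)
    plt.xlabel("reprojection error (px)")
    plt.ylabel("count")
    plt.savefig(os.path.join(out_dir, "error_histogram.png"))
    plt.close()


def save_annotated_frame(frame: np.ndarray, boxes, out_dir: str, index: int) -> None:
    """Draw detection boxes on a copy of the frame and write it to disk."""
    annotated = frame.copy()
    for x, y, w, h in boxes:
        cv2.rectangle(annotated, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite(os.path.join(out_dir, f"frame_{index:06d}_annotated.png"), annotated)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--output-dir", required=True)
    parser.add_argument("--diagnostics", action="store_true",
                        help="also write error histograms and annotated frames")
    args = parser.parse_args()
    os.makedirs(args.output_dir, exist_ok=True)

    # ... run the main pipeline here ...

    if args.diagnostics:
        # Placeholder data, purely to show the call pattern.
        save_error_histogram(np.random.rand(1000) * 5.0, args.output_dir)
        save_annotated_frame(np.zeros((480, 640, 3), np.uint8),
                             [(100, 100, 80, 60)], args.output_dir, index=0)

The point isn’t the specific plots; it’s that the diagnostics live behind a flag in the same program, so they stay in sync with the code they are meant to explain.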

Automation

Speaking of bugs, any time you encounter a set of tedious steps that must be repeated manually, you’re looking at a potential source of bugs – bugs that may not be apparent until several steps later in the pipeline. Doing the steps manually might be expedient at first, but the second you need someone else to repeat them, you’re asking for screw-ups. Even if you document your process meticulously, there’s often too much left up to interpretation or chance, and eventually someone will make a mistake. Instead, build tooling to aid yourself early. Striking the right balance between building something you’ll never need and building something useful takes experience, but I suggest erring on the side of over-tooling until you get the hang of it. On a recent project I was tasked with building a camera calibration. One of the documented steps was to extract frames from a video at 1 FPS and then manually delete blurry frames. After a couple of failures, I thought about this process. “Blurry images” is ill-specified. Different people following the same instructions would produce different results. I took a step back and automated the process with a Python script that selected the sharpest frame in every one-second interval of the video, as measured by Laplacian variance. The outcome was higher-quality results. Furthermore, the process is now automated for the next person who comes along.
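The original script isn’t published, but a rough sketch of the approach could look like the following, with OpenCV doing the decoding and the sharpness scoring; the file names and the end-of-video handling are placeholders of my own.

# Sketch: keep the sharpest frame from each one-second interval of a video,
# scoring sharpness by the variance of the Laplacian. An approximation of the
# script described above; paths and naming are illustrative.
import os

import cv2  # pip install opencv-python


def extract_sharpest_frames(video_path: str, out_dir: str) -> None:
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0   # fall back if FPS metadata is missing
    frames_per_bin = max(int(round(fps)), 1)

    best_frame, best_score, index, saved = None, -1.0, 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        score = cv2.Laplacian(gray, cv2.CV_64F).var()   # higher variance = sharper
        if score > best_score:
            best_frame, best_score = frame, score
        index += 1
        if index % frames_per_bin == 0:                 # end of a one-second interval
            cv2.imwrite(os.path.join(out_dir, f"calib_{saved:04d}.png"), best_frame)
            best_frame, best_score, saved = None, -1.0, saved + 1
    if best_frame is not None:                          # flush a trailing partial interval
        cv2.imwrite(os.path.join(out_dir, f"calib_{saved:04d}.png"), best_frame)
    cap.release()


if __name__ == "__main__":
    extract_sharpest_frames("calibration.mp4", "calibration_frames")

The key property is that “sharpest” is now defined by a number rather than by whoever happens to be squinting at thumbnails that day.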

Examples

Now that you’re scripting everything, make sure everyone else knows how to use your tools. Add a “scripts/” directory to your repo that includes shell scripts to launch each program. Show off every CLI parameter that each program has. It’s like documentation, but better – it’s runnable documentation. Include a small set of sample data in your repo that each script can exercise. Being able to clone a repo, install dependencies, and then immediately run example scripts enables new team members to get up to speed more quickly. It saves you time explaining basic usage over and over. It lets you surface and solve problems sooner rather than later. You can use them as entry points to sanity-check your changes as you work. You can even run these scripts in your CI system for better test coverage. To use another example from the 3D triangulation project, we included a tiny dataset of 2D detections and a camera calibration in the repo. This allowed us to run the triangulator and write outputs in less than a second. Although we had more extensive QA, example scripts enabled us to iterate quickly, while making sure that there weren’t any glaring regressions. When the example script’s runtime suddenly doubled one day, we knew that the changes we had just made to our lens distortion code were suspect.
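As a sketch of what one of those entry points might look like as a small Python runner (a shell script would do just as well), here is the general shape; every name below, from the module to the sample-data paths, is hypothetical rather than the real project layout.

# scripts/run_triangulation_example.py: a hypothetical example runner.
# It spells out every flag and points at the tiny sample dataset checked
# into the repo, so it doubles as runnable documentation and a smoke test.
import pathlib
import subprocess
import sys

REPO_ROOT = pathlib.Path(__file__).resolve().parent.parent
OUTPUT_DIR = REPO_ROOT / "example_output"

cmd = [
    sys.executable, "-m", "triangulator",    # hypothetical CLI entry point
    "--detections", str(REPO_ROOT / "sample_data" / "detections.json"),
    "--calibration", str(REPO_ROOT / "sample_data" / "calibration.yaml"),
    "--output-dir", str(OUTPUT_DIR),
]
subprocess.run(cmd, check=True)              # a non-zero exit fails CI too

# Bare-bones regression check: the run should have produced some output.
outputs = list(OUTPUT_DIR.glob("*"))
assert outputs, "example run produced no output"
print(f"example run completed, wrote {len(outputs)} files")

Because the script fails loudly on a non-zero exit code, dropping it into CI costs nothing and catches glaring breakage early.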

Manifest File

But how do you compare the results of subsequent runs? Make your programs write a manifest file alongside the results and accompanying visualizations. Write into it all the arguments used, input file paths (expanded to absolute paths), the current date, the runtime, and any high-level stats that would help you evaluate a run at a glance. Don’t overthink it: there’s no need to duplicate the information your visualizations already provide, or to make the manifest machine-readable. A simple text file will do. In conjunction with your sample scripts, manifests let you quickly assess whether your changes are an improvement or, conversely, whether you broke something. They’re also useful if another team encounters issues while running your code. You can ask them for the manifest produced by the errant run, and it will help you determine whether you’re looking at user error or a legitimate bug. Manifests also act as a durable record of past performance. During the 3D triangulation project, we made several major changes that improved runtime performance and accuracy: we improved epipolar geometry, introduced per-frame parallelism, and did a bunch of Python performance tuning. At each juncture, we had manifests to compare. Having an archive of manifests from historic runs allowed us to compare current performance against the past and show that our work steadily reduced runtime while increasing accuracy.
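A bare-bones manifest writer can be a dozen lines; the field names and the example stats in this sketch are illustrations, not a required schema.

# Sketch: write a plain-text manifest alongside the results. The fields and
# the example stats are illustrative, not a fixed format.
import datetime
import os
import sys
import time


def write_manifest(out_dir, argv, input_paths, started_at, stats):
    lines = [
        f"command:  {' '.join(argv)}",
        f"date:     {datetime.datetime.now().isoformat(timespec='seconds')}",
        f"runtime:  {time.time() - started_at:.1f} s",
    ]
    lines += [f"input:    {os.path.abspath(p)}" for p in input_paths]
    lines += [f"{key}: {value}" for key, value in stats.items()]
    with open(os.path.join(out_dir, "manifest.txt"), "w") as fh:
        fh.write("\n".join(lines) + "\n")


if __name__ == "__main__":
    start = time.time()
    # ... run the pipeline, then summarize the run at a glance ...
    write_manifest(
        out_dir=".",
        argv=sys.argv,
        input_paths=["sample_data/detections.json"],          # hypothetical input
        started_at=start,
        stats={"frames processed": 120, "mean reprojection error (px)": 0.42},
    )

Since it’s just text, diffing two manifests from before and after a change is often enough to tell whether you made things better or worse.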

These are just a few ideas for reducing friction and keeping you and your team happy and productive, but the ideas of human-centered design come to mind over and over while I’m at work. If you haven’t already, I encourage you to read Don Norman’s The Design of Everyday Things, and consider how your relationship with your project workspace might be improved by thinking of yourself as a user of that workspace. Likewise, the way Toyota runs their plants is fascinating, and there is a deep well of ideas to draw upon for running software projects. By paying attention to your tooling and processes, and creating a workspace that makes you glad to be at work every day, I believe your customers will reap the benefits of higher-quality work delivered on a smoother schedule.

– Zac Stewart, ML Engineer @ Hop