Hacking LPS

Benjamin Nelson is an astronomy postdoc and data scientist at Northwestern University. He originally got involved with LPS as a letter writer in late 2015. He had fun with the experience and planned to renew until other academic responsibilities came up. Little did he know that in the next couple years he’d be working with LPS again but in a much more significant way: Ben organized the team that wrote the code to match up this year’s scientist and pre-scientist pen pals! His story picks up several months after first getting matched to a pre-scientist.

Rewind to Astro Hack Week 2016. This was a week-long astronomy “unconference”, an informal meeting that doesn’t follow the typical top-down organization of a science conference. The content is really driven by the participants: the mornings consist of lectures on popular data science topics (e.g., machine learning, Bayesian statistics, data visualization), followed by “hack” pitches from the attendees, and then hacking in the afternoon. In this context, “hacking” just means to code something up as quickly as possible without fussing over the details, mostly through exploring and experimenting. A good hack is something that can be accomplished in a day or over the course of the meeting.

One cool aspect of this conference was that a computer selected all the participants. Before the meeting, we had to fill out an online application. An algorithm looked across many different parameters (e.g., data science skills, academic seniority, gender, ethnicity) to create an optimally diverse group of participants from the pool of applicants. The code that did this is publicly available.

But in this large group of mostly strangers, I couldn’t tell who I needed to talk to about my research problems without first talking with a lot of people. If a computer could organize a conference, then maybe it could also automate the “meet-and-greet” portion of these meetings.

So I pitched a hack that eventually evolved into “collaboratr”, an academic matching service. It takes survey data of people willing to “teach” a topic and people who want to “learn” about said topic, then notifies the “learners” who they should talk to at that meeting. Several of us hacked on it throughout the week and got a working prototype.
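To give a flavor of the idea, here is a minimal Python sketch of teach/learn matching. It is not the actual “collaboratr” code: the function name, the survey field names (“name”, “teach”, “learn”), and the example people are all made up for illustration, and the real service may work quite differently.

```python
from collections import defaultdict


def match_learners_to_teachers(responses):
    """Map each learner to the people who offered to teach a topic they want to learn.

    `responses` is a hypothetical list of survey answers, one dict per person,
    with a "name" plus lists of topics under "teach" and "learn".
    """
    # Index everyone by the topics they are willing to teach.
    teachers_by_topic = defaultdict(list)
    for person in responses:
        for topic in person["teach"]:
            teachers_by_topic[topic].append(person["name"])

    # For each person, look up teachers for every topic they want to learn.
    matches = {}
    for person in responses:
        suggestions = {}
        for topic in person["learn"]:
            teachers = [
                name
                for name in teachers_by_topic.get(topic, [])
                if name != person["name"]  # don't match people with themselves
            ]
            if teachers:
                suggestions[topic] = teachers
        if suggestions:
            matches[person["name"]] = suggestions
    return matches


# Made-up survey responses, purely for illustration.
responses = [
    {"name": "Ada", "teach": ["machine learning"], "learn": ["data visualization"]},
    {"name": "Ben", "teach": ["Bayesian statistics"], "learn": ["machine learning"]},
    {"name": "Cy", "teach": ["data visualization"],
     "learn": ["machine learning", "Bayesian statistics"]},
]

matches = match_learners_to_teachers(responses)
for learner, suggestions in matches.items():
    for topic, teachers in suggestions.items():
        print(f"{learner}: talk to {', '.join(teachers)} about {topic}")
```

In this sketch the “notification” is just a printed line per learner and topic; a real service would presumably send an email or similar instead.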


Now, how does this all connect to LPS? Fast forward to March 2017. I gave some science talks at a local middle school, which rekindled my interest in doing more astronomy outreach. Specifically, I wondered if I could use “collaboratr” to run my own local pen pal program. My department does a lot of outreach, and our outreach coordinator hooked me up with a Chicago school and with many non-astronomers at the university who were likely to participate. I had a class of ~30 pre-scientists and ~20 scientists to work with. I created a Google form for both groups to fill out and ran “collaboratr” on their survey responses. In less than a second, I had pen pal matches for everyone. We did four letter exchanges over six weeks. I also got to play mailman for a couple days (a childhood dream of mine!).


It turns out that one of the NU scientists participating was Louisa Savereide, one of the LPS classroom coordinators. We ended up talking about LPS, and she set up a meeting with LPS co-founder Anna Goldstein so we could show her what we did. Anna said it normally takes weeks for the LPS team to do all their pen pal matching, and understandably so: LPS has hundreds of pre-scientists and hundreds of scientists to match up, with a somewhat rigid set of criteria. So they were eager to integrate our code into their matching pipeline.

By now, it was summer 2017 and I was at Astro Hack Week again. This one had a “community hacking” day, where we would focus on hacks that make the astronomy community better. Anna sent me the LPS pen pal form data, and I pitched getting “collaboratr” to work with these spreadsheets. A team formed, ranging from astronomy undergraduates to professors.

In the end, we had an algorithm that could match up all ~550 pre-scientists for this academic year in a couple seconds. This was a huge time saver, roughly 1,000,000 times faster than the manual matching LPS had to deal with!

All of this year’s matches were done with this code. We’ll continue to improve our matching algorithm by using more pen pal form data.

