The virtues of integrating documentation into your workflow

How do you teach a computer to think like a human? How can you replicate in code what our neural networks do? Can a system learn to recognize images and distinguish voices the way our brains do? These are the questions Amirsina Torfi, a doctoral student in computer science and electrical engineering, is working to answer.

“It’s the million dollar question,” Torfi says. “If we have a large amount of data, theoretically an infinite source of data, [then] deep learning is exactly equivalent to humans.”

A deep learning system starts out much the same way a baby does: slowly taking in lots of data and putting the pieces together to construct a picture of the world around it.

“If we feed it with abundant data, the network, ideally, can learn to create stories, to classify between different subjects, between different categories,” Torfi explains. “So, basically, deep learning is learning with a neural network. It’s just an in-depth architecture, with numerous layers… just add the data, and that’s deep learning. As simple as that.”

Of course, the execution isn’t as easy as the explanation. Torfi’s research centers on developing code to teach systems how to match, and learn from, audio and video channels.

“The project was that [if] you have a video stream and an audio stream, can you tell if both of them have the same author or not,” Torfi says. “If my video has a 0.5-second delay compared to my audio, can we recognize that or not?”

Torfi’s other projects have been similar code-wise, with different applications, including teaching systems how to hear multiple speakers and working with drone software on recognizing an image from different angles. Complex code, to be sure—enough so that Torfi realized if he wanted to share his work, he’d have to make matters visual.

“I realized if anyone wanted to see the paper, it would be much easier for them to understand if they could implement the code,” Torfi says. “People are good at visual things.”

Demos and comprehensive documentation, he found, were key.

“The first time I released code on GitHub without any demo, it got maybe one star after a month,” Torfi says. But after adding a demo? “My repository became the trending repository of the day. The only thing that changed was the demo!”

In the field of computer science, code sharing feels natural. But in many cases proper documentation has taken a backseat to showcasing complicated code, which, to Torfi, limits the true accessibility and longevity of the work, even for the programmer who writes it.

“When I do detailed documentation, then anytime in the future when I go back to my code, I can easily realize what was happening, what can I reproduce, what can I use again,” Torfi says. “Recently I created protocols for myself for documenting… A one-page list to make a project open-source. So I just tick every one of them: Did I make automatic testing or not? Did I make contributions part or not? Did I make corrections or not? Did I make a demo or not?”

A humble scientist, Torfi takes care to stress that many of the programmers whose code he sees on sites like GitHub are better than him (though he was pleased to be ranked by Git-Awards as one of the top 100 Python developers in the United States). What distinguishes his code from the rest, Torfi says, is his diligence in, and passion for, explaining his work.

“When I create the document, I'm not doing it just for myself—that’s why I become more motivated in creating it. When someone sees my documentation, if it's good, they might have a blessing for me,” he adds, laughing.

It’s no surprise, then, that when asked about the favorite part of his work, Torfi answers without hesitation: documentation.

It’s about more than “showing off,” he emphasizes. While making code open-source and easily reproducible certainly helps young scientists land jobs and grants, sharing enriches the community as a whole, adding to the industry so that new researchers can absorb and grow in turn. It forces the programmer to clarify, to change, and to learn constantly.

“Sharing is a requirement,” Torfi states, matter-of-factly. “Whatever is not shared is forgotten someday.”

Ritu Prasad

Ritu Prasad is a freelance journalist with roots in Raleigh-Durham and Chicago. She reports on science, culture, and women's issues.

Explore Torfi's work on his personal site and public code
