Random Forest Classifier - Interview with Random Forest Classifier architect Anna Quach

Anna Quach is a Data Scientist and Machine Learning researcher, specialising in Random Forests (one of the most commonly used Machine Learning algorithms), statistical computing and data visualization. Experienced in analysing large-scale, sometimes noisy datasets using machine learning, she enjoys finding hidden gems in data, telling stories through effective data visualizations, programming, and keeping up to date with new tools in the R programming language. While developing a visualization method for asymmetric data for her PhD, Anna also found ways to improve the application of Random Forests to Genome Wide Association Study data and developed a fast and effective filtering method to reduce high-dimensional genetic data.

We are so happy to invite Anna back to speak at the 3rd edition of the AI With The Best online developer conference on 29–30th April, where she will share her top tips for building a Random Forest Classifier. For now, we got to ask her a few questions.

Q. We’re pleased to welcome you back to AI With The Best! What have been the most significant advances for you, in machine learning, over the past 6 months?

Great question! I’ve been working on detecting genetic interactions. It’s quite a challenging problem because the number of possible pairs to look at grows quadratically with the number of genes, while the sample size is usually only in the thousands. I came up with a new filtering method that is fast, yet competitive with existing methods at identifying the ground truth. I’ve also made improvements to Random Forests’ ability to detect interactions.
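To give a rough sense of the scale Anna describes, here is a minimal Python sketch that counts unordered gene pairs, n(n - 1) / 2 for n genes. The helper name `n_pairs` and the gene counts are illustrative assumptions, not figures from Anna's data or her filtering method.

```python
from math import comb


def n_pairs(n_genes: int) -> int:
    """Number of distinct unordered pairs among n_genes items (hypothetical helper)."""
    return comb(n_genes, 2)


# Illustrative gene counts only; real studies vary widely.
for n in (100, 1_000, 10_000, 100_000):
    print(f"{n:>7} genes -> {n_pairs(n):>13,} candidate pairs")
```

Even at a few thousand genes the number of candidate pairs far exceeds a sample size in the thousands, which is why a fast filtering step before modelling is attractive.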

Q. Tell us about your research on Random Forests.

Random Forests is a great machine learning algorithm that a lot of people love because it’s so easy to build a Random Forests model and get great accuracy at the same time. However, Random Forests has some weaknesses, and those are what I’ve been working to improve. One thing I’m excited about is improving the interpretation of Random Forests through proximities. I think this is where Random Forests has an advantage over other machine learning algorithms. You can determine global and local variable importance, identify clusters, determine which observations are difficult to classify, and more.
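As a minimal sketch of the proximity idea, the Python snippet below uses scikit-learn, which has no built-in proximity matrix, so one is derived from `RandomForestClassifier.apply`. It assumes the simplest definition (the fraction of trees in which two samples land in the same leaf, using all samples rather than only out-of-bag ones) and the bundled iris data. The function name `forest_proximity` is hypothetical, and this is not Anna's own method.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier


def forest_proximity(forest: RandomForestClassifier, X: np.ndarray) -> np.ndarray:
    """Fraction of trees in which each pair of samples shares a leaf (hypothetical helper)."""
    # leaves[i, t] is the index of the leaf that sample i reaches in tree t.
    leaves = forest.apply(X)
    # Compare every pair of samples tree by tree, then average over trees.
    return (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)


X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

proximity = forest_proximity(forest, X)

# Global, impurity-based variable importance: one value per feature.
importance = forest.feature_importances_

# Samples whose average proximity to their own class is low are
# candidates for "difficult to classify" observations.
same_class = y[:, None] == y[None, :]
mean_within_class = (proximity * same_class).sum(axis=1) / same_class.sum(axis=1)
hardest = np.argsort(mean_within_class)[:5]
```

The pairwise comparison holds an n x n x trees boolean array in memory, so this form only suits small datasets. The matrix `1 - proximity` can be passed to a clustering or scaling routine as a dissimilarity.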

Q. At the conference, you’ll be discussing “Tips and Tricks to building a Random Forest Classifier”. How easy is it to diffuse research breakthroughs to software architects and developers?

I feel like it’s easier these days due to social media. There are so many websites that people check daily so it’s easy to get the message out when you have something to share about your work.

Q. How do you ensure you keep up to date with new research in your field?

I do a couple of things. I have a Google Scholar account, which notifies me of and recommends papers that are relevant to my research. I follow other researchers. There’s a website called ResearchGate that sends me notifications when someone I’m following has published research. I try to go to conferences, or at least check which papers related to my interests were published there.

Q. What advice would you give to budding AI developers?

I think it’s important to stay on top of current cutting-edge research and to talk to many different people, experts if possible, in AI. I was fortunate to work with a world expert in Random Forests, so I got a lot of insight and direction toward the problems that are more important to work on and would have the most impact on the community.

Q. Are you excited about speaking at AI With The Best? What made you want to be a speaker?

Yes, I am! The AI With The Best conference is such a great idea, especially for students who want to hear from AI experts but can’t afford the travel costs. The main reason I wanted to be a speaker is that I thought it would be a great opportunity to talk about Random Forests. A lot of people use Random Forests, but they don’t all make use of the great features it has to offer, and there are common mistakes and misunderstandings that I would like to help people avoid.

Thank you Anna!

You can learn some Tips and Tricks for building a Random Forest Classifier and ask Anna your own questions at our upcoming AI With The Best online developer conference on 29–30th April.