CYBRARY PODCASTS

Ep. 16 John Liu, Jimmy Whitaker, Uday Kamath | Deep Learning for NLP and Speech Recognition


In this episode of the Cybrary Podcast, we sit down with Uday Kamath, the Chief Analytics Officer at Digital Reasoning; Jimmy Whitaker, the Director of Applied Research (Audio) at Digital Reasoning; and John Liu, the CEO of Intelluron. Speaking with Leif Jackson, Cybrary's VP of Content and Community, they discuss their book, Deep Learning for NLP and Speech Recognition, which includes case studies and machine learning implementations that benefit researchers and engineers alike.

Hosted by: Leif Jackson, Uday Kamath, Jimmy Whitaker, John Liu
Length: 22 minutes
Released on: March 25th, 2020
Summary

Uday Kamath, Chief Analytics Officer at Digital Reasoning; Jimmy Whitaker, Director of Applied ML at Digital Reasoning; and John Liu, Founder and CEO of Intelluron, join host Leif Jackson in this episode of the Cybrary Podcast to discuss AI and their book, Deep Learning for NLP and Speech Recognition.

There was a wide variety of Python cookbooks, ML primers, and step-by-step tutorials, but nothing tied them together toward a single focal point. The authors saw a lack of resources that bring application and theory together in detail on the areas that matter most, and that gap led them to write the book. Its main application in cybersecurity is communication. Attacks generally start with communication, so understanding normal communication patterns across an organization's chats, emails, and logs helps mitigate risk, for instance by detecting suspicious emails. The book covers concepts such as embeddings, which capture relationship patterns in data and can help identify malware, for example in firewalls. More broadly, recognizing patterns in data can do more than one might expect to reduce the chance of a breach.
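The idea of comparing a message against normal communication patterns can be sketched in a few lines of Python. This is only an illustration and not code from the book: it uses word-count vectors as a crude stand-in for learned embeddings, and the sample messages, the `is_suspicious` helper, and the 0.2 threshold are all assumptions made for the example.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a message into a sparse word-count vector.

    This is a crude stand-in for a learned embedding (an assumption of
    this sketch, not the book's method).
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical examples of "normal" internal messages.
NORMAL = [
    "please review the quarterly report before the meeting",
    "the meeting is moved to friday at noon",
    "attached is the quarterly report for review",
]


def is_suspicious(message, normal_messages=NORMAL, threshold=0.2):
    """Flag a message whose best match among normal messages is weak.

    The threshold is an arbitrary illustrative value.
    """
    vec = vectorize(message)
    best = max(cosine(vec, vectorize(m)) for m in normal_messages)
    return best < threshold


if __name__ == "__main__":
    print(is_suspicious("please review the quarterly report"))
    print(is_suspicious("urgent verify your password using this link now"))
```

A real system would replace the word counts with trained embeddings and learn the threshold from data, but the comparison step has the same shape.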

According to Uday, the most interesting chapter is the one on reinforcement learning, which he considers closer to AI than any other technique because it mirrors how humans learn. If you follow certain actions, how do you learn from them? How do you understand the policies or rules that guided those actions, and whether an action is normal or anomalous? In the case of malware attacks, for example, a reinforcement learning agent can be trained to understand behavior, make it more explainable, and support a policy-based system that guides decisions. The book offers resources ranging from theory to practical code snippets that can be run even on real-world data. It can be applied to cybersecurity, where both security researchers and security engineers can use it.

The major skills needed for this path are understanding data at a foundational level, structuring problems, getting creative, manipulating data, and choosing the right tool for the situation. These skills are crucial for anyone who wants to work with AI, ML, DL, and particularly NLP. All of these fields rely on the data fed to the machine, and the machine makes decisions based on that data. Structuring data is domain-specific. Health care, for example, would capture patient records and doctors' notes and look for patterns across them to help automate tasks. Jimmy calls this shift software 2.0, because machine learning is becoming a key focus and he expects that nearly all software will soon be machine-learning oriented. It can also be applied to bug fixing, where fixes could be automated and run periodically.

The book covers many techniques, such as semi-supervised learning, one-shot learning, and zero-shot learning. Some of these techniques are being actively worked on at the moment and address the problem of working with small data sets and labeling them. Labeled data can be used for prediction in blockchain markets, for example. Uday, Jimmy, and John encourage people to take this path because the field is becoming increasingly important, and they have tried to include in their book almost everything people need to learn AI.
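To make the point about small labeled data sets concrete, here is a minimal sketch of self-training, one common form of semi-supervised learning. It is an illustration under stated assumptions, not the book's implementation: the nearest-centroid classifier, the `margin` confidence rule, the label names, and the 2-D points are all hypothetical.

```python
def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def self_train(labeled, unlabeled, rounds=5, margin=2.0):
    """Grow a small labeled set by pseudo-labeling confident points.

    labeled:   dict mapping label -> list of points (needs >= 2 labels)
    unlabeled: list of points
    A point is pseudo-labeled only when the runner-up centroid is at
    least `margin` times farther (in squared distance) than the nearest.
    """
    labeled = {k: list(v) for k, v in labeled.items()}
    remaining = list(unlabeled)
    for _ in range(rounds):
        centroids = {k: centroid(v) for k, v in labeled.items()}
        still_unlabeled = []
        for p in remaining:
            ranked = sorted(centroids, key=lambda k: dist2(p, centroids[k]))
            best, second = ranked[0], ranked[1]
            if dist2(p, centroids[second]) >= margin * dist2(p, centroids[best]):
                labeled[best].append(p)
            else:
                still_unlabeled.append(p)
        if len(still_unlabeled) == len(remaining):
            break  # no new confident labels, so stop early
        remaining = still_unlabeled
    return labeled, remaining


if __name__ == "__main__":
    # One hand-labeled example per class (hypothetical labels and data).
    seed = {"benign": [(0.0, 0.0)], "malicious": [(10.0, 10.0)]}
    pool = [(1.0, 1.0), (9.0, 9.0), (2.0, 0.0), (8.0, 10.0), (5.0, 5.0)]
    grown, leftover = self_train(seed, pool)
    print({k: len(v) for k, v in grown.items()}, leftover)
```

Points that sit between the classes stay unlabeled instead of being forced into one, which is the usual safeguard against reinforcing early mistakes.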