CYBRARY PODCASTS

Ep.13 Saul Costa | Data Science with Next Tech


In this episode of the Cybrary Podcast, we sit down with Saul Costa, the founder and CEO of Next Tech. Speaking with Leif Jackson, the VP of Content and Community for Cybrary, Saul talks about what Next Tech is up to and which programming language he thinks you should learn first.

Hosted by: Leif Jackson, Saul Costa
Length: 27 minutes
Released on: March 4th, 2020


Summary

Leif Jackson, Head of Content and Community at Cybrary, interviews Saul Costa, Founder and CEO of NextTech. Saul talks Leif through his experiences in education, data science, and development.

Saul talks about NextTech, a startup he founded about five years ago, right after he graduated from college. NextTech tries to make it easy for people to get hands-on experience with various tech skills, focusing on data science, machine learning, computer programming, and web development. It does this by giving students access to live coding environments that they can use directly from the browser. The company started out building a learning management system for instructors and students: instructors could log into the product, build out an assignment, and specify a series of test cases to run against the code that their students uploaded. NextTech is also building out its own content library, with about 400 hours of content on its website that you can sign up and take.

Saul tells us they are working with Cybrary on the merger between cybersecurity, data science, and programming. Saul and Leif discuss the changing market as well.

Saul explains how he got into IT and data science, which led him to found NextTech. His path went from QBasic through C++ to Ruby on Rails. Saul also shares his experiences at MixRank, where he previously worked in data analytics, data engineering, and machine learning.

We also learn which programming language Saul suggests starting with, NextTech's recent focus areas, the hot topics in its content library, how you could become a data scientist, coding bootcamps, the importance of experience, and some other interesting topics.

If you find this interesting, please listen to the podcast or read the full transcript.

Transcript

Leif: Hi everybody. Leif Jackson here, Head of Content and Community here with Saul Costa. Welcome, Saul.

Saul: Thank you for having me.

Leif: CEO of NextTech. Super excited to have you here today. Just here to learn a little bit about you and then also talk a little data science, right. And development.

Saul: Yeah.

Leif: So of super interest to a lot of people on our platform. So really excited to kind of chat with you today about that.

Saul: For sure. Yeah. So as you said, I'm Saul Costa. NextTech is a startup that I founded about five years ago, right after I graduated from college. And essentially what NextTech does is we try to make it really easy for people to get hands-on experience with various tech skills. So we focus on skills like data science and machine learning, computer programming, web development, things like that. And the way we do it is we provide them with access to these live coding environments that they can then use directly from the browser. So they don't need to download anything. They don't need to do any setup themselves. And it allows them to focus on what they're learning versus having to wrangle the tools that they're using.

Leif: Absolutely. A huge growth area in our market. Right?

Saul: Yeah.

Leif: Huge growth area in security. What are some kinds of applicable use cases that you could see in security?

Saul: I think security is really interesting because of the amount of interactivity that you see in that field. So when I was in college and I was working through my cybersecurity degree, what we did pretty much in every class was, alright, let's sit down, let's set up a cyber range. Let's work through this and do, you know, a capture the flag, red-blue type of exercise. So, because security is so much around not just the theory, but also the let's sit down and fix the issue, identify the issue, I think that interactivity in that space makes a lot of sense. One thing that I'm really excited about, and this is something that we're going to be working with Cybrary on, is kind of the merger between cybersecurity and data science and programming, and starting to focus on these kinds of new ways to look at security using these existing approaches like data science.

Leif: Absolutely. Yeah. I think that's something that we saw, Edward Amoroso did a course on our site on, basically 50 CISO Security Controls and he built a graphical market of every security control of the market, right across like network controls, governance, the enterprise, you name it right? Two areas where the growth market is happening. Right?

Saul: Yeah.

Leif: Data science and cloud actually are kind of the big two. Right. And I think it's super changing the game, right, of how we analyze log files, how we analyze vulnerabilities, those kinds of things. So I think that the content that you're going to be delivering on the platform is really gonna take them to the next step, right? That's a pun with your company, right.

Saul: We've run into a lot of cases where you do that. Like, what are we working on next? Oh, we did it again.

Leif: So I mean, just backing up a bit, like, so what got you originally interested in kind of programming and data science?

Saul: Yeah, for sure. So I'm really proud to be a lifelong programmer. I wrote my first program when I was about seven and it was in QBasic, which is really interesting for people that, you know, used QBasic and probably Windows 95 at that point. You used to be able to launch QBasic very easily from your Start menu in Windows. So it was like one click. Basically, you have a programming environment set up and you can start learning.

Leif: Right.

Saul: And fast forward, like 10 years from there to when I started to formulate some opinions that eventually led to me founding NextTech. It was much harder to get to a point where you could start writing code. So we'd actually regressed from that state where me, seven years old, could sit down and learn QBasic. And now students, 10 years later, were actually having to do all this really complex setup. But yeah, so I started with that. I moved on to C++ and started in college doing Ruby on Rails application development. And most of the work that I've done on the data science side has actually been around what's called data engineering. So rather than building out data science models, it's figuring out how do we take billions and billions of data points and process them at scale. So what are the types of aggregations you need to do, not only to be able to store that type of data, but to be able to extract it fast enough for it to be relevant to the users of your product.

Leif: Interesting.

Saul: Yeah.

Leif: And what would you call that? What would you call it? Machine learning, deep learning, AI, one of these marketing terms for statistical analysis.

Saul: Yeah, it was more on the data side. More like data analytics, data engineering. Yeah, the company that I worked for did machine learning too. It was a really interesting company actually, called MixRank. And what they would do is look at what technologies different websites had been created with, or the same thing for mobile applications.

Leif: Got you.

Saul: So they would do a predictive analysis on kind of the footprints that these application SDKs would leave inside of an app. And then they would use that to predict whether or not that was actually an SDK that was being used. And then they'd actually take similar approaches and apply that to matching people back with their companies.

Leif: Got you. Super interesting. So tell me a little bit about the company. How has it evolved since you first started it? Who's using the platform. Those kinds of things.

Saul: Yeah, for sure. So, like I said, we started about five years ago. At that time, really what we were building was a learning management system that was built for instructors and students. And so instructors could log into the product and they could build out an assignment and specify a series of test cases that they wanted to run against the code that their students uploaded. And what we found very quickly was that we really wanted to work with companies that worked with instructors. And the reason for that is we, from a business perspective, could close one deal, and then suddenly we have hundreds of thousands of students using the product. So it became much easier for us to scale as a startup team, where we're managing a relationship with that vendor for the instructors, as opposed to working with each instructor individually. So it scaled much better for us. From there we focused mostly on what we at that point started to call our platform. So the ability to not only have the lab environment for the students, but we really built out this content creation tool that allows for programming and data science and other tech field content to be created very, very quickly. And so what that has now evolved into, kind of the third phase of the company, is this content library that we've built out ourselves. And so we now have about 400 hours of content on our website that you can sign up and take. So not only are we partnering with these content companies and these companies that are selling to instructors and students, but we're also building out our own content library.
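
A minimal sketch of the test-case idea Saul describes, where an instructor specifies cases that are run against student code. The function names, the case format, and the scoring are all hypothetical illustrations, not NextTech's actual implementation.

```python
from typing import Any, Callable

# Hypothetical instructor-specified test cases: (arguments, expected result).
TEST_CASES = [
    ((2, 3), 5),
    ((-1, 1), 0),
    ((0, 0), 0),
]


def student_add(a, b):
    """Stand-in for a function a student uploaded."""
    return a + b


def grade(func: Callable[..., Any], cases) -> dict:
    """Run every test case against func and collect pass/fail results."""
    results = []
    for args, expected in cases:
        try:
            actual = func(*args)
            passed = actual == expected
        except Exception as exc:  # a crash counts as a failed case
            actual, passed = repr(exc), False
        results.append(
            {"args": args, "expected": expected, "actual": actual, "passed": passed}
        )
    score = sum(r["passed"] for r in results)
    return {"score": score, "total": len(results), "results": results}


report = grade(student_add, TEST_CASES)
print(f"{report['score']}/{report['total']} test cases passed")
```

Catching exceptions per case means one crashing input does not hide the results of the other cases, which is usually what a student needs to see.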

Leif: Great. So. Tell me what's in the content library and you know, kind of how it's building.

Saul: Yeah, for sure. So the three areas that stand out the most are definitely data science, machine learning, and introductory programming. The reason for that is we partnered with a company called Packt Publishing out of the UK. And what we have with them is really special, where they produce this really great library of eBooks for various tech skills. They focus wholly on tech skills as a publisher. And our partnership allows us to essentially access that catalog and build course material around it. And so what we did when we started this, just over a year ago now, is we just tried everything. And we built out content for, you know, everything from Bitcoin to data science, machine learning, to Rust programming and all of that. And we marketed it all equally and we found that the data science and machine learning content really hit a note with the community. And that's where we saw the most traction. The introductory programming, what we find is that's also getting a lot of usage. And I think that's because, you know, there are so many people that are starting off programming and there are a lot of opinions out there around which programming language you should start with. So we kind of just try to offer one for each language.

Leif: Got you. So which one should you start with?

Saul: Well, I would encourage you to start with C++. It's tough if you want to learn programming. It's definitely going to take more time and more Red Bull than starting with Python, for example. But it gives you a coverage of the subject area, especially moving beyond just the application and into the theory of it, that you're really not going to get if you use a higher-level language like Python. It's what I used for the majority of the time when I was really getting ramped up with programming, and looking back and taking into account the languages that I could have used, I'm definitely really glad that I learned C++. And a lot of these other programming languages like Python and Ruby and so forth, they're actually built on top of C, which is a very, very similar language to C++. So you get that foundational language and then you can build up from there.

Leif: Got you. No, that makes sense. So what are kind of the major trends that you see from like data science learners recently?

Saul: Yeah. Um, it's really exciting, I think because the main trend there is that everyone's doing it now.

Leif: Right?

Saul: So, you know, previously you were seeing data science with applications like I was talking about with MixRank, where it's kind of much more technical and the programmers on the team are working on the data science and the machine learning applications. What you're seeing now are like finance teams and customer support teams and analysts that have never written a line of code in their life before. But now they are presented with this opportunity to take whatever processes they're currently doing and do them much better if they get the understanding of introductory programming and data science. And the great thing about data science is that it's something that you can jump into very quickly and immediately start to be productive with, drawing insights from your data set. And I think in the day and age that we're living in now, with the volume of data that's being produced, it's kind of like you're either in or you're out, and you need to kind of get on the bandwagon to stay relevant and to keep up with the pace of the data that's being produced.

Leif: Yeah, absolutely. I mean, what, 60% of the world's skills for the next five years haven't been invented yet. So that's something I say all the time, and it's the reason that we have the fastest-growing catalog in the space, right? Because we want to keep up with the tech skills and the security skills that are necessary for the tech teams out there. Who performs best on NextTech, like from either a learner or a creator perspective? Like, how can I determine, like, this is something for me?

Saul: One of the things we've focused on a lot recently in terms of the student experience is people who are taking our more interactive courses. So we actually just last week ran a whole bunch of analytics on our usage. And what we found is that the courses that have any layer of interactivity, and by that I mean there are actual tasks for the student to complete and these like live challenges for them, make up about 30% of our catalog and 80% of our usage. And so what we've started to do a lot more is invest in producing those types of courses. And I think that, you know, between a student and a creator, we try to develop an experience that's good for both of them. And it's actually funny when you look at our product, because the interface for a student and the interface for a creator are very similar to each other. So between the two, you almost can't tell which is which, and so the content creator is able to come in there and very, very quickly produce content, and then see what it's going to look like for the student's experience, and then push that out for them.

Leif: Interesting. And you talked a little bit about how you select content based on work roles and the skills related to that work role. Can you talk about what those work roles are that you're looking for, and then what content you're adding based on those work roles?

Saul: Yeah, for sure. So the content, or rather the skill paths, which is what we call them, that we're focusing on right now are for Python data scientists, Python machine learning practitioners, and then Python web developers. And so what we tend to do when we're picking content for that is, well, number one, we have a bit of experience ourselves to build on, which is really great. And we're also then looking at what the industry standards are. So what's really cool is there are companies out there like Stack Overflow, for example, or HackerRank, that have gone through and surveyed people based on their roles and said, what skills are you learning? What don't you know that you wish you did? For the data science and machine learning skill paths that we have, those are very heavily focused on looking at data from multiple levels. So there might be something that's like data science that's going to use some algorithms. There might be something that's more like developing an application using data science algorithms. We're gonna throw a lot of SQL in there as well. And then when we look at the programming side of it, it's just, what programming skills do you need to be productive in those subject areas? And the nice thing about data science, like I was saying, is that you don't need as much of it as you would for, say, a Python web developer. You really just need to know kind of the basics of the language. And then there are tons of libraries that can help you get started with processing your data and drawing insights from it. So those tend to be a little bit less heavy on programming than some of the other skill paths.

Leif: Hmm. Interesting. So if I were a, let's say I'm a SOC analyst, right? Like, so I'm a blue team, right? Like what, and I don't know any data science. Right. But I understand security super well. What would you recommend to me?

Saul: Get as much data as you can. Watching the data scientist on our team work through some of our data sets, and she does some of the course development too, has been really interesting, because what I've seen her do more than anything is spend time understanding the data. So it's not so much the end application that you're going to build. It's more, what data am I bringing to the table? And what in this data is relevant to the question that I'm trying to answer? And more importantly, what isn't? And most of the time, you know, 70 plus percent of the data you're going to bring to the table is irrelevant. One of the other big steps in that is data cleaning. So early on, your Python program expects data in a particular format. So you're going to learn, how do I parse this data out? How do I reject invalid records or backfill the data with something that I do have available to me? And then as you're working through that, ideally you're kind of getting a sense of what the most important attributes of this dataset are. And once you've hit that point, what you're going to start doing is actually writing some code and starting to test those various attributes and determining whether or not they bear significance on the question you're trying to answer.
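
The cleaning steps Saul mentions (parse the data, reject invalid records, backfill missing values) might look roughly like the sketch below. The column names and sample rows are made up for illustration, and backfilling with the median is just one assumed strategy among many.

```python
import csv
import io
from statistics import median

# Made-up sample data, for illustration only.
RAW = """host,failed_logins,bytes_out
web-1,3,1200
web-2,not_a_number,900
web-3,7,
,4,1500
web-4,2,1100
"""


def clean(raw_csv: str) -> list:
    """Parse CSV text, drop invalid records, and backfill missing bytes_out."""
    rows = []
    for rec in csv.DictReader(io.StringIO(raw_csv)):
        # Reject records with no host or a non-numeric login count.
        if not rec["host"] or not rec["failed_logins"].isdigit():
            continue
        rows.append({
            "host": rec["host"],
            "failed_logins": int(rec["failed_logins"]),
            # Keep None as a marker for a missing value to fill in later.
            "bytes_out": int(rec["bytes_out"]) if rec["bytes_out"].isdigit() else None,
        })

    # Backfill missing bytes_out with the median of the values we do have.
    known = [r["bytes_out"] for r in rows if r["bytes_out"] is not None]
    fill = median(known) if known else 0
    for r in rows:
        if r["bytes_out"] is None:
            r["bytes_out"] = fill
    return rows


cleaned = clean(RAW)
for row in cleaned:
    print(row)
```

Whether to reject a record or backfill it depends on the question being asked; here a missing host makes the row unusable, while a missing byte count is treated as recoverable.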

Leif: Right. So basic regression sometimes, right? Like, I mean, the weather might not be correlated with the crime or something like that. Right. So maybe take the temperature out of the dataset, because it's correlation, not causation.

Saul: Yeah, exactly.

Leif: So that totally makes sense.
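
One simple way to test whether an attribute bears on the question, as discussed above, is to compute its correlation with the outcome. This is a minimal sketch using a hand-written Pearson coefficient; the attribute names and numbers are invented, and a weak or strong correlation on its own says nothing about causation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up daily figures, purely illustrative.
incidents = [10, 12, 15, 18, 20]
attributes = {
    "failed_logins": [100, 118, 155, 170, 205],
    "temperature": [15, 30, 12, 28, 20],
}

for name, values in attributes.items():
    print(f"{name}: r = {pearson(values, incidents):.2f}")
```

An attribute with a coefficient near zero is a candidate for dropping from the dataset; one near 1 or -1 is worth a closer look, ideally with someone who knows what the metric actually means.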

Saul: I think the SOC analyst has an advantage when they're coming to the table already, which is that they understand what those metrics are. And so I think, you know, if they're just kind of handing that off to a data scientist, then there might be something lost in that translation. So if they can actually come and say, no, this is what this metric means, this is a really important metric for us to be looking at, that context can be really useful as they start building that application.

Leif: Especially if they're analyzing log files and those kinds of things, like trying to determine, where are those outliers, right? Like, what is actually happening here? Where are our vulnerabilities, actually, you know. How about a pentester on the red team side?

Saul: Pentester on the red team side. Now you have me thinking back to when I was on the red team. I think a lot of that's going to revolve around the scans that they're doing. We actually used a tool some time back that kind of did something similar to that, where it was doing these scans en masse, and then trying to work backward once it had collected that data to identify what the weakest points in the chain are.

Leif: Sure. Yep. And then seeing if you can get in.

Saul: Yup. Yeah. That's step two.

Leif: Yep. Absolutely. Just send an adorable cat photo and you'd be surprised at how quickly you can get in. That's normally how it works. What are you seeing as, like, kind of the hardest skill to come by in the market? You mentioned a little bit about HackerRank and Stack Overflow as markers for those skills, what they're looking for, but where are you seeing, like, hey, these are really difficult skills to find?

Saul: Yeah, for sure. So I've actually written a bit about this on my blog, and I think it's not a tech skill, for sure. It's a lot of these skills that you pick up along the way, being a developer or being a data scientist. And so going back to the data scientist on our team, she came in and started working with a company and she didn't have any prior programming experience, but she had a lot of analytics experience. And so watching her progress as she's learned more about programming has been really interesting to follow along with, seeing skills like debugging, for example, where she's learned, this is how I read a stack trace. And now that I've identified this error and I've googled it and Google hasn't been a help, how do I put these breakpoints into my program and think about it logically to get down to what the root cause is? So I think it comes down to using your resources wisely. A lot of people early on in computer programming tend to suffer a lot from imposter syndrome and, you know, like, everyone else must know this skill and I don't know it and I'm a bad programmer. And that's just a bunch of malarkey. And I think it's a little bit born out of the way that we test people on their technical skills. We'll kind of say, you know, you have to complete this algorithm with no help. We're going to lock you in a room with a whiteboard. And that's just not really how people learn tech skills or how they operate as a developer. And so I think, like, using your resources, using Google, and having a really good understanding of how to debug a program. And then the third one, which is probably the most important, is knowing how to read code. And so as you become a better and better programmer, you might learn one particular language. For example, like, I learned Python really well. And then when I started working with Go, I kind of already had the general grammar of it.
And so I was able to then look through a lot of lines. Go is a really great language. And it's really great because it has what's called signatures for the function. So you always know what you're going to put in. You always know what you're going to get out. However, because of that, the documentation in that overall community can be a little bit lacking. And so what I ended up doing when I learned Go is I just went and read Go code. And eventually, like I said, you start to understand what the grammar of a programming project is, you know? Okay, I'm going to look here for this particular import and so forth. And really just getting exposed to other people's code. You know, if you're going on GitHub to read through it, how many stars does it have? You know, like, are you reading quality code? Because at the end of the day, there are a bunch of different ways you could write a particular program, but if you can see what an experienced developer has already done with something similar, then you might see ways you've never even thought of before that, you know, could be more performant or cleaner or even more secure.
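
The point about signatures, that you always know what goes in and what comes out, can be illustrated by analogy in Python using type annotations. This is only an analogy: Go checks its signatures at compile time, whereas Python's annotations are optional hints that are not enforced at runtime. The `parse_port` helper is a hypothetical example.

```python
import inspect


def parse_port(raw: str, default: int = 80) -> int:
    """Hypothetical helper: convert a string to a port number, or fall back."""
    return int(raw) if raw.strip().isdigit() else default


# The signature alone tells a reader what goes in and what comes out,
# without reading the body or any separate documentation.
sig = inspect.signature(parse_port)
print(sig)  # (raw: str, default: int = 80) -> int
```

Reading a codebase signature by signature like this is one way to pick up the "grammar" of a project before diving into individual function bodies.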

Leif: Interesting. Yep. And how do you see those, you know, kind of those skills evolving and then changing over the next two, three, maybe four or five years, but we'll go through three years because who knows beyond that. Right?

Saul: Right. Yeah. I think we've seen an interesting trend over the last few years where a lot of people have gone through these coding bootcamp programs. And unfortunately, they've focused not as heavily on that as they should have. They focused more on a kind of, like, parrot coding in a sense. And don't get me wrong, like, there are some really good bootcamps, like, we work with them. But what ends up happening is the programmers that come out of that tend to be able to replicate code, but not really be able to break outside of those boundaries of what they already know. And so talking with other people who are hiring in the technical field, they've expressed similar sentiments to what I'm sharing, which is, yeah, I get that you know the programming language, the syntax, but do you know how to think like a developer? And some of that comes from going into it with the awareness that it's not all just about the code, and some of it comes from just sitting down and writing code and being stuck and debugging your way through it. And yeah, so I think that the trend you're going to see, back to your question, is hopefully a higher emphasis on those types of skills, producing better developers, not just better programmers. I think you're also starting to see this shift in a different way inside of the higher education space, where people are less focused on, like I said, just, you know, can you solve the programming problem, but more around, let's get you hooked up with an internship, or let's do your senior capstone as this big project. And my hope is that there's going to be a general mentality shift towards that more and more, where you're not just producing people who know how to code, but people who know how to write applications as a developer.

Leif: Yeah, absolutely. I mean, you have to know the pain, right. I remember I think one time it took me like two days to find a semicolon.

Saul: We've all been there.

Leif: Yeah. Very memorable experience. Right. But you kind of need to build that. I call it grit, right? Like as you're going through the code and it's not compiling and you don't know why it ain't telling you why. Right? Like, so…

Saul: It's almost like a kind of creepy intuition after you've been programming for, like, decades. The thing that I've found is that I always start with the least intuitive answer, like the thing where there's no way it's going to be this thing, and surprisingly enough, a lot of times that's where the issue is. And I kind of sit there almost shaking, like, oh God, what if I hadn't checked that first? Like, this would have been days of looking down the wrong path. And yeah, I think, especially if it's an application that you've written, you start to get a feel for, like, oh, this bug smells like this particular thing. And some of that just comes with time. And, you know, I get asked a lot, how do I become a programmer and get a job? And my biggest piece of advice is go build something. Like, there are lots of great resources out there. Cybrary has them. NextTech has them. We're building awesome ones together. But at the end of the day, it's sitting down and coming up with any idea, or replicating an idea. It doesn't matter, but actually going through the motions and building an application for yourself, or running some aggregations, doing some data science also. Like, there are plenty of different ways you can approach that. And you know, actually sitting down and doing it yourself, like, you can't replace that in a classroom or a coding bootcamp program or anywhere else. So…

Leif: Yeah. Just stepping back a bit. I think NextTech actually does a really good job of kind of stepping through the content. So when I was watching your product, like, it's not like what we experienced with, like, hey, the semicolon, you know, and it won't tell you where it is, right? Like, when you're actually using your product, it does tell you, like, oh, you know, this is the spot where this might be missing. So it kind of, yeah, it spoon-feeds the user a little bit, right? Like, is that fair to say? So that way it's an easier entry point, probably less painful than what you and I experienced, right, for those starting to learn the language.

Saul: For sure. Yeah. And those are the courses I mentioned earlier that just get, like, a massive amount of usage for us. And what we try to do as we build those courses is we also layer in project opportunities. So we'll say, great, you've learned these skills up to this point. Here's an idea for a project and just kind of a blank space for you to get started with it, and then actually build something for yourself.

Leif: Got you. And is that where you replicate the pain, then, in the project?

Saul: Basically. Yeah, here's the spot to knock your head against the wall, and just like, you know, figure it out and then if you need to, then you can continue on to the course and pick up some new skills. So…

Leif: Awesome. Anything else that you have for us today?

Saul: I think that's pretty much it. Really excited to be here. So, yeah, really cool.

Leif: I'm so excited to work with you and thanks for coming today.

Saul: Yeah, for sure. Thanks for having me.

Leif: Thanks all.