Dev Interrupted

The Real Measure of Success in Platform Engineering | MassDriver CEO Cory O'Daniel

Season 4 Episode 42

It’s time we recognize the idea of a ‘Golden Path’ is unrealistic.

In this episode of Dev Interrupted, host Dan Lines is joined by Cory O'Daniel, CEO of MassDriver, to discuss Cory’s provocative article 'DevOps is Bullshit'. They cover the pitfalls of DevOps, the evolution of cloud operations, and whether or not platform engineering is the solution the industry needs.

Cory shares insights on why many organizations struggle with DevOps implementation, the impact of cloud technology on traditional operations, and how internal developer platforms are reshaping the industry.

Topics:

  • 01:03 Where has DevOps gone wrong?
  • 04:19 Have we changed or does DevOps mean something different?
  • 11:51 Platform engineering in 2024
  • 20:10 How can platform engineering leaders measure success?
  • 26:40 Why are new hires being put in charge of DevOps?
  • 29:53 Getting buy in for a better platform engineering experience
  • 39:13 Are internal developer platforms a fad? 


Cory O'Daniel:

If you were to talk to me like two years ago, I was like, I was sunshine, rainbows, kittens, all the good shit on platform engineering. Because I saw an opportunity there. I saw a word that was getting popularized, books are getting written, blog posts are getting written, and I was like, this is the moment where we can take this word, and we can market it correctly within our orgs to get the buy-in, and maybe actually use that to actually get the work done. But what's happening is we're just, it's rebranding, very similarly to Ops to DevOps, DevOps to SRE, SRE to Platform Engineering. So you meet some teams where it's like, hey, we're doing platform engineering. We manage all the infrastructure in AWS. It's like, oh, how do you do that? And they're like, uh, we're not software developers, so we just click around the AWS console. And they're like, oh, wow. So it's really exactly the same as it was 10 years ago. It's just a different word. Right.

Developer productivity can make or break whether companies deliver value to their customers. But are you tracking the right metrics that truly make a difference? Understanding the right productivity metrics can be the difference between hitting your goals and falling behind. To highlight how you can adjust your approach to both measure what matters and identify the right corrective strategies, LinearB just released the Engineering Leader's Guide to Accelerating Developer Productivity. Download the guide at the link in the show notes and take the steps you need to improve your organization's developer productivity today.

Dan Lines:

Hey everyone, welcome back to another episode of Dev Interrupted. I'm your host, Dan Lines, LinearB co-founder and COO. And today I'm joined by Cory O'Daniel, CEO and co-founder at MassDriver, with over 20 years of experience in DevTools, SRE, and Platform Eng. Super excited to have you on the show today, Cory. Welcome.

Cory O'Daniel:

Yeah. Thanks for having me.

Dan Lines:

You wrote this, I was reading it before coming on the pod. You wrote this killer article. I was actually late to the pod because I started to click around to the different links. And of course, you know, we've talked with a lot of different, you know, pros in DevOps and Platform Eng and all of that. But, and I know you're probably doing it on purpose, you wrote an article called DevOps is Bullshit. So, you know, I'm sure we'll include it. Where has DevOps gone wrong? You know, how are we failing on this? Or are we failing to meet business needs? Like, what is your, like, high-level take on all this?

Cory O'Daniel:

Yeah. So I think one of the key things, and depending on how far you get into the article, I think you realize this, but I'm on your side. If you've got DevOps in your title, I'm on your side. If you know that DevOps shouldn't be a title, I'm also on your side. But the way we've gone about it for the past, what is it, 15, 16 years or so, since the term was coined, we've gone about it in a pretty bullshitty way. And, you know, when you look at the trends over the past few years, like looking at the Stack Overflow survey, looking at the number of breaches that we have, the amount of operations expertise that we have as a community is going down year over year, because we're making so many developers, you know, out of bootcamps, out of colleges and whatnot now, and we're not teaching them the cloud and operations and whatnot. Like, it's just kind of, hey, you'll get there. Eventually you'll learn it in prod at some point in time. You don't come into it knowing it, like you do with, hey, I came in from a bootcamp, I know Rails. I came in from a bootcamp, I know React. I came in from a university, I know C. Well, this thing that is extremely important to how we integrate software or work with other people, we're just winging it when we get there. Right? And so that's what the article is really about. It's like, we've been a bit bullshitty about the way that we've approached it. And what you see a lot is people are like, well, you're just not culturing it right. And it's just like, well, yeah. Most people aren't culturing it right. You're right. But most organizations don't have the resources to culture it right. And so this whole idea of like, hey, everybody should do DevOps. We're doing DevOps. You're doing it wrong. You're holding it wrong. Like, I think it just kind of undermines all of us doing the right work.
And I think we need to figure out a way of getting that experience into developers' hands earlier.

Dan Lines:

Yeah, no, actually, that, That's a good, good take. You were saying, hey, you know, so first of all, there's more developers in the world. Everyone's a developer now type of thing. There's boot camps, great career, all of that. Very difficult to simulate what it's actually like to work at a company, release code to prod, what it actually takes to do so. I don't know if we'll ever be able to do that in a boot camp. Maybe. Right? So, I think you're right that most people coming in to their first few roles, like, all this stuff is new. It's not taught. And I also remember when DevOps, maybe it's even like, is it like 20 years ago now? Maybe it's like 15. When DevOps, that terminology first came out, I remember some teams just being like, Yeah, we're going to call ourselves DevOps now? I guess we were Ops before and now we're DevOps, and so we're like cutting edge.

Cory O'Daniel:

Nice little pay raise, get that cool little title, get a couple of, get a couple of thumbs up on LinkedIn for the role change. It's nice. Now you just got to do the work

Dan Lines:

you gotta do the hard part. You gotta do the hard

Cory O'Daniel:

now you got to do the hard part.

Dan Lines:

And when we're talking about like DevOps being bullshit, are you more so saying like, Hey, we haven't actually like changed or is it like maybe we have changed, but we're kind of missing it? Like what, what are your thoughts there?

Cory O'Daniel:

I think there's two or three things. So I think one is, for many organizations, much hasn't changed. So, you know, in my day job, I interact with some very large organizations. Companies that have been around for 50 years, a hundred years. And then you get in there and they're like, I'm definitely an ops person. I click a bunch of stuff in the browser for AWS to build it for my engineers. And they ask for stuff through ServiceNow. Like, they want a database. They ask me through ServiceNow, bug me via Slack for a couple of months, and I'll get there. I'm a dev op. And it's just like, you're not. I mean, your work is important, but you guys haven't figured out how to do that. And there's still some big companies that are still there. Like, we haven't moved much in 15 years. But at the same time, while we haven't moved much, the world around us has just absolutely changed. Right? So there's a follow-up article to DevOps is Bullshit called Elephant in the Cloud, where I go into this a bit more. But if you look at what we do as software developers, our job hasn't changed much in like 50 to 60 years. We write some code, there's some ifs, there's some variables, there's some loops. Sure, the syntax changes from language to language, but writing software is writing software. If I take a Node engineer and I throw them in a Ruby codebase, it'll take them a while to realize that it's sane and there's no event loop, but they'll get it, right? Now, when you talk about the cloud, everything has changed about the way that we operate software in the past 20 years. Like, 20 years ago, let's say 24 years ago, we were on metal. And then the VMs started coming out. It's like, okay, we're putting things on VMs, on metal. And then there was like the Slicehosts, and then there was the PaaSes, and then there was this brief containerization Mesos moment.
And then there was the Kubernetes moment, and then there's Lambda, and now there's serverless containers. And everything about the way that we've operated software has changed. So that's interesting. That's changing on us constantly, but also the footprint of our applications has changed significantly. People are always talking about, ah, monolith versus microservices. But I think the bigger pain point in our applications is, even if you have a monolith, there's a good portion of it that is secretly microservices, because you're using 14, 15 different cloud services to do your thing. I've got a Ruby monolith, but I'm using SQS. Okay. Well, that's interesting, because now configuring a cloud thing just became a part of my job. If I'm doing DevOps, I should be able to do that. Okay. Now I got to do parity. Okay. And now I got to do compliance. Okay. And now I got to do the security around it. Okay. Now I got to become an IAM expert. Okay. And now a whole bunch of stuff got stacked on me. And I'm just trying to build a feature. I'm just trying to fucking process a queue, man. I'm trying to get some stuff out and get some stuff done. But now all this other work came up. Right? And so when we look at DevOps 15 years ago, it was, hey, we got to figure out how to ship stuff more frequently, get things onto some of these VMs. And the big cloud wasn't a thing then. Right? And now where we are today is you have a lot of cloud around you, not just underneath your application. It's integrated into your application, and you have to figure out how to manage and deal with it. And that's where I think it starts to get real bullshitty and hand-wavy. That idea of DevOps, where you literally did everything, or you've figured out a way to make your team super harmonious, that's just hard with how things are today. And I think we need to look at it in a different scope.
And I think one of the things that we really screwed up with DevOps is we didn't market it. Everybody just took that title. They took that money. They took the thumbs up on LinkedIn. And some companies got in, they did good marketing. They got the stakeholders to understand why it was important. They got them bought in and they got the culture to happen. And a bunch of other people were just like, ah, my title changed. Right? And I think that is kind of the source of the bullshit. People started to get different definitions of what it is. The entire world's changing around us. And just a whole conglomerate of services are kind of getting thrown at us now to meet global scale. So, like, what even is DevOps? And you could argue on Reddit all day long about what it is, but no one's going to agree.

Dan Lines:

Yeah. I mean,

Cory O'Daniel:

that's my rant.

Dan Lines:

No, I think it's right. And I remember when it was like 15 years ago. I'm looking at your, I think it's actually the picture where it was like, "Worked fine in dev, ops problem now," with a house on fire in the background. I think that's what the original DevOps, at least to me, the meaning was, where it was like developers are throwing things over the wall to ops, and then there's no good communication. I don't even think that is really what it is now. What caught on when you were talking is, it's almost like you're saying, I'm a developer trying to get the feature done, but now there's all these other things on my head that I'm expected to do. Are you seeing out there that it is on top of the developer to actually coordinate all this stuff, know all the technology, everything that it takes to get it off to, like, production? Or are you seeing it more like, okay, if you're doing DevOps properly, in your opinion, it looks different than that?

Cory O'Daniel:

Yeah. I mean, I think that's part of the problem is like so many people, like we see so many different teams that like, when they're like, Hey, we're yeah. Like I'm the DevOps team or we do DevOps. I have no idea what you mean until like 15 minutes

Dan Lines:

Yeah.

Cory O'Daniel:

I have to like, I have to see your world. I'm like, Oh, I get, yeah, we can, we can call this DevOps. That's a little DevOps too. I get it. Like, you know, eh, like you're just doing it your way. Right. And like, and so you do see teams where it's like, I'm responsible for everything. We've shifted it all left. And like, that's, that's the rub is like, there's, there's too much to shift left nowadays. Right? And like, and that's kind of where, you know, in the post, like I moved towards platform engineering and it's like, Now there's caveats there too. Like there's some bullshit in platform engineering. There's a lot of bullshit in platform engineering too, but like what we need to be doing as teams that have the resources to do DevOps or platform engineering effectively, and as providers like MassDriver, AWS, et cetera, is starting to build more of this toil and tasks that devs are kind of getting caught up in, into their platforms. Whether you're building it or whether you're buying it, like we need to start doing more of that because just shifting everything left onto the developer, isn't, isn't working. It's not what you pay them to do. You pay them to build features, to build, generate revenue. I don't pay a

Dan Lines:

Make value,

Cory O'Daniel:

fiddle with Checkov. You're right. Like, go do something that makes value. Right? At the same time, we have to remain compliant and secure and all this other stuff. Like, whose job is that? Right? And you might say the security team, but now you're tapping and you're waiting on somebody. Right? And so, like, the whole thing, we got to grease the gears of the way we're doing this, right? Everybody's becoming a software company now. There's more stuff moving to the cloud and we're not building out this expertise. And we're either saying, slow it down and tap that guy, or it's all your problem. And neither of those, I think, are effective strategies for scaling in the cloud. Scaling your people in the cloud, not necessarily your compute.

Dan Lines:

I would love to pick your brain on platform engineering. I know it's another word, so it could be bullshit as well, but there's DevOps, there's platform engineering. When you think about what platform engineering is, can you describe to our listeners, like, what comes to mind for you? If you think of some elite platform engineering team, how does that look to you?

Cory O'Daniel:

Yeah, so this is where the bullshit starts to come in, because the exact same thing is happening with platform engineering. So if you were to talk to me like two years ago, I was sunshine, rainbows, kittens, all the good shit on platform engineering. Because I saw an opportunity there. I saw a word that was getting popularized, books are getting written, blog posts are getting written, and I was like, this is the moment where we can take this word, and we can market it correctly within our orgs to get the buy-in, and maybe actually use that to actually get the fucking work done. But what's happening is we're just, it's rebranding, very similarly to Ops to DevOps, DevOps to SRE, SRE to Platform Engineering. So you meet some teams where it's like, hey, we're doing platform engineering. We have this, you know, separate part of the business where we're building this thing as a product and we're giving it to our developer customers to use. Like, they're doing the platform engineering. And then you meet a lot of folks where it's like, oh, what do you do as a platform engineer? It's like, ah, yeah, we manage all the infrastructure in AWS. It's like, oh, how do you do that? And they're like, uh, we're not software developers, so we just click around the AWS console. And they're like, oh, wow. So it's really exactly the same as it was 10 years ago. It's just a different word. Right? And so when I think platform engineering, I'm thinking of, the company that you work for has decided that they've gotten to a level of maturity in the DevOps maturity model where they can actually do DevOps well, and they're ready to take that to the next level. We need to pull more of this responsibility off the devs. We need to start building something that feels a bit like maybe Heroku. Like, it feels like a PaaS, it feels like a platform. It feels like I can write code and it just goes and runs.
I can build it and run it without taking on 400 other jobs. Like, that's the goal of platform engineering. I think that's the distinct difference between this and where we were 15 years ago, because we didn't have the cloud then. And what we see a lot with platform engineering is infrastructure provisioning being a part of it, talking about the cloud, talking about getting self-service of cloud resources, which weren't the terms that we were using 15 years ago. But these terms really matter now. As a developer, I need a queue. Do I need to know how compliance, security, IAM, uh, doing parity with Terraform work? Do I need to learn HCL? No, I just need a queue. Why isn't it easy? I can do it in the AWS console. How do I do it at work without tapping you or going through hell and back to set up a bunch of CI/CD pipelines to do it right? And I think that's what platform engineering is. We're mature enough. We have the resources, we have the buy-in, and now we're building a product to solve the biggest problems of our developers: cadence and delivery.
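The "I just need a queue" idea above could be sketched roughly as follows. This is a minimal illustration, not MassDriver's or any real platform's API: the names QueueRequest, QueueSpec, PLATFORM_DEFAULTS, and expand_request are all hypothetical, and nothing here talks to a cloud provider. The developer supplies only a queue name and a team; the platform fills in the security settings, access rules, and per-environment parity.

```python
from dataclasses import dataclass, field

# Hypothetical platform-owned defaults; the developer never sets these.
PLATFORM_DEFAULTS = {
    "encrypted": True,
    "retention_days": 4,
    "dead_letter_queue": True,
}


@dataclass(frozen=True)
class QueueRequest:
    """Everything the developer has to supply (hypothetical shape)."""
    name: str
    team: str


@dataclass(frozen=True)
class QueueSpec:
    """What the platform would hand to its own provisioning tooling."""
    name: str
    environment: str
    allowed_principals: tuple
    settings: dict = field(default_factory=dict)


def expand_request(request, environments=("staging", "production")):
    """Expand one small request into a full spec per environment.

    Using the same defaults for every environment is what keeps
    staging and production in parity.
    """
    specs = []
    for env in environments:
        specs.append(
            QueueSpec(
                name=f"{request.team}-{request.name}-{env}",
                environment=env,
                # Assumed convention: only the owning team's role gets access.
                allowed_principals=(f"role/{request.team}-{env}",),
                settings=dict(PLATFORM_DEFAULTS),  # copy, not shared state
            )
        )
    return specs


if __name__ == "__main__":
    for spec in expand_request(QueueRequest(name="orders", team="checkout")):
        print(spec)
```

The design choice being illustrated is that compliance, IAM, and parity live in one place the platform team owns, so the developer-facing surface stays as small as the request itself.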

Dan Lines:

That really clicked for me, the way that you said it. Because I do think, if you have a platform engineering team or you have a DevOps team, I'm the type of person that has to go back to what my mission is. Otherwise I can't figure out what my purpose is on earth. So I got to know my mission. And what you said that I thought was cool is, remember how it feels to use a PaaS as a developer. Now, of course you're boxed in more, and you might not be able to use everything that you want to use. But the cool thing about it was you could just code and then the other stuff just works. Like, the app just runs, the code makes it to where it needs to go, the customers get it. It's super simple. That's the coolest thing about it. And I think you said something like, if you're a great platform engineering team, the vibe that you're giving to the developers that you're servicing, that's kind of how they feel at your company. They can just do their job. Like, that totally resonated with me.

Cory O'Daniel:

I feel like you should be able to take that team and be like, man, this platform engineering team has done such a good job, we're spinning them out as their own business. Like, that to me, I'm like, that team is killing it right there, if that's the way you feel about them. But there's also many of these teams where it's just like, hey, we manage Kubernetes for everybody. And this is a place where I see, like, I'm a Kubernetes fanboy. I develop on Kubernetes locally. I like the thing. But there's a lot of people that are just like, oh, we're platform engineers. Kubernetes is our platform. It's like, no, no, no, no, no. That's not it. If you think about what Kubernetes is and how big it's getting and how its control plane is expanding into controlling the cloud as well, Kubernetes is just kind of another cloud API, right? It's just another cloud. That's like saying, oh, we have a platform engineering team. They manage AWS. It kind of starts to feel the same. Right? And you can't just throw a Kubernetes at a developer and be like, we platform engineered it for you. There is DevEx that needs to happen around it. There is some smoothing out. If your developers are thinking about Kubernetes all day, I'd say that you haven't done platform engineering. You're still in that world of, like, mixed, devy, opsy bits, right? But that is interesting, right? So you see these companies where they have a team that just manages Kubernetes, and that's their job, and that's a hard job. There's a lot of work to do there, right? But what is that team? Are they a DevOps team? Are they an ops team? Are they the, you know, the compute management team, right?
Like, but you know, calling it platform engineering where you're not actually doing engineering work to build out this stuff that's going to serve these engineers well, like, I feel like that's a, just another misuse of the term, but that's where we are and that's fine. Um, so I'll be writing a blog post here in about two years called platform engineering is bullshit. And

Dan Lines:

And I'll have you back on the pod.

Cory O'Daniel:

yeah, hopefully we'll find another really good word to use that like actually sticks this time, but.

Dan Lines:

Probably never. You know, I think whatever, whatever, whatever word it is, it will be weird.

Cory O'Daniel:

That's the thing, you know. It's so easy to write a blog post or hop on a podcast and talk about the way things should be. But the reality is the real world's messy. Production is messy. Organizations are messy, and your teams are probably messy, and your code's probably messy. There's a ton of mess, right? And so having this golden idea or a golden path where everything's going to fit into this perfectly, it's so unrealistic, right? And so when you start looking at orgs, you'll see the Googles of the world, where they went and did their platform engineering and it came out as something called GCP, right? And then you'll see other companies, like, we do platform engineering, and it's very much like, hey, we write Ansible scripts, right? The way this is surfacing in many companies looks very different, very similar to DevOps, but it's because our companies are different, right? Think about the average Series A, Series B company. You can't give them a platform, right? I can't go to a Series B company and be like, everyone's using Heroku now. From now on, that's what you're doing. Sure, you could, there's some architecture decisions you're going to have to deal with, but then, you know, your team that's working on Next.js, JAMstack stuff, they're not going to have the greatest experience on Heroku. They probably want something more like Vercel. Your AI and ML team's probably not going to have a great experience on Heroku, right? You're not going to find this one thing that's perfect that works for everybody. And that's the other thing that's kind of problematic about this: that happens at the team level. It happens at org levels. And I think that's why you're seeing these very different definitions of what platform engineering is.
It's people sitting around thinking, I got to make something happen, and this is the way I'm going to do it. It's just Kubernetes, or it's this complicated software thing that we built. And sometimes they're doing it all in CI. You can get pretty far doing platform engineering and infrastructure provisioning and self-service with just GitHub Actions. Like, you can do it. It's cumbersome to navigate, but you can do it. You can deliver self-service and DevEx that way. But

Dan Lines:

So many questions now. I mean, certainly the take away there is like. Platform engineering is specific to your organization or like the mission of what your engineering team needs to accomplish in order to sell the product that you're probably creating for your business. Like that totally makes sense to me. And then I think the, I would say like the level of eliteness that you go, you know, I think it's about the experience. Like I keep going back to what you first said, like, if I'm like the head of platform engineering, even if I'm like in a startup or a mid sized company, like, how am I measuring my success? Like, how would I rate myself? Or like, what is a metric that I would use in order to say like, I'm in the right way or am I just doing bullshit stuff?

Cory O'Daniel:

Yeah. I think this one's an interesting one. And my answer is going to sound a little bullshitty, unfortunately, but bear with me. You can go and define golden signals and SRE stuff. You need to. Those are things that you have to measure when you're running a platform. Those are things you should probably be measuring when you're running a product for users. If you got users, you should care about the thing being up. I'm going to kick all that shit to the side. I'm sure people are stoked about measuring that. Two things matter: dev happiness, and how unblocked people are. In the grand scale of things, I think that's what really matters. Are developers happy with the way that the thing works? Is it solving a real problem for them? And is it improving the cadence and delivery of our software? And that second one's a little harder to measure sometimes, because sometimes you'll have teams where the stuff's real stable, right? Then maybe they're not releasing 14 times a day. Maybe they're releasing once a week and they're making billions of dollars. Right? So what does that stability look like? What does that release process look like for those developers? Is it breaking often? We need to make that a bit smoother. But that's really, I think, what it comes down to: the qualitative happiness of the developers using the thing, and then that delivery experience. Am I shipping software faster, and am I shipping software with fewer build failures getting the thing out? Now, you can come in and come up with plenty of other ones. Again, this comes back to the org. It really depends on what the mission of this platform engineering team is. And since it can pan out so many different ways in orgs, it's hard to pinpoint it.
Like those are the two that I find most important, but you'll see other people where it's like, Hey, one of our goals is like minimizing cloud costs. It's a weird one, but like, we've seen it, but it's a real problem. Developers have just been clicking stuff forever. Or you have an ops team has been doing it for devs, but since they're kind of a world apart, like there's a bunch of stuff in prod that they don't need anymore. And it's like, Hey, by moving to platform engineering where everything's self service and like the devs kind of own it, but aren't burdened with the, you know, the actual like monstrosity of it all. Like, can I get my cloud costs under control? Yeah, you probably can. And that might be an important measure for you is to like get that under control and keep it stable. Or maybe you only see some growth that's relative to like your traffic or whatnot. But, um, I think what it really comes down to, like how you know that you're doing a good job is people are stoked and people are, are doing their jobs better, faster with less breakage.

Dan Lines:

Yeah, I think it makes sense. I think our listeners already know this about this pod and LinearB and all that. But yeah, on your second point of the measurement, they're coming to us, platform engineering teams: I need to measure wait time for my developers. I need to measure cycle time. I need to see how easy it is for their code to get out into production. What phase is it stuck in? Where's the bottleneck? That's the type of metrics that I'm familiar with that they're looking at. So that's the quantitative side. And of course on the qualitative side, like you said, you know, how stoked are they? Do they want to keep working here? Would they recommend? Are they saying, yeah, other people should come work here? That's the qualitative, like a survey side. And you put both of those together and I think you have a pretty good picture. But I did ask you because I'm always trying to get in my mind, if you're a listener here, to say, okay, I have a platform engineering team or I'm on a platform engineering team. What does good look like? What does good actually look like? And if I know the criteria to measure, like, for example, for us, you know, at LinearB, we want to make sure, I think our benchmark is 24 hours or less to go from coding to production. If I'm coding something and I need to get it out to prod, can I do that in 24 hours or less? That's what elite looks like in the industry. What does not good look like? Five days, six days, seven days. That's the type of thing that we're highly familiar with.
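As a rough illustration of the coding-to-production measurement Dan describes, the sketch below computes a typical cycle time from pairs of timestamps and compares it with the two reference points mentioned in the conversation: 24 hours or less as "elite" and five or more days as "not good". The function names and the "in between" label are assumptions made for the example, not LinearB's definitions, and the conversation does not say how the range between one and five days should be rated.

```python
from datetime import datetime, timedelta
from statistics import median

# Reference points taken from the conversation above.
ELITE_THRESHOLD = timedelta(hours=24)
NOT_GOOD_THRESHOLD = timedelta(days=5)


def cycle_time(first_commit, deployed):
    """Elapsed time from the first commit on a change to its production deploy."""
    return deployed - first_commit


def classify(duration):
    """Label a duration against the two thresholds mentioned in the episode."""
    if duration <= ELITE_THRESHOLD:
        return "elite"
    if duration >= NOT_GOOD_THRESHOLD:
        return "not good"
    return "in between"  # assumed label; the episode does not name this range


def summarize(changes):
    """Return the median cycle time and its label for (first_commit, deployed) pairs."""
    seconds = [cycle_time(start, end).total_seconds() for start, end in changes]
    typical = timedelta(seconds=median(seconds))
    return typical, classify(typical)


if __name__ == "__main__":
    start = datetime(2024, 3, 4, 9, 0)
    sample = [
        (start, start + timedelta(hours=6)),
        (start, start + timedelta(hours=20)),
        (start, start + timedelta(days=6)),
    ]
    typical, label = summarize(sample)
    print(f"median cycle time: {typical} ({label})")
```

The median is used here rather than the mean so that a single change stuck for weeks does not hide how most changes actually flow.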

Cory O'Daniel:

yeah. And that's another fun one. Um, like, I don't, I don't know what the, what the audience is like, like what stage of, of, of business

Dan Lines:

All over, yeah. I think it's like all around. Like you got some listeners in startups, you got other ones that are like, Hey, I'm at a big enterprise, but I've been tasked with, uh, I need to start a platform engineering team or a developer experience team.

Cory O'Daniel:

So what's really interesting is, you know, I've been in this DevOps world for 15 years. I've been on the ops and the development side. I don't think I've worked at a company in the last 15 years where people couldn't get a feature out on their first day. That to me is a good dev team. I should be able to run a command or two and my environment's up locally, and I can ship something to prod on day one. Doesn't have to blow the world away, just get something in there. So you feel like you've contributed to this thing on your first day, right? That's a great place to be. Now, when you say, hey, sometimes it takes five days, six days, some people might be hearing that and be like, five days, that's forever. Some people are probably listening to this thinking, fuck, I wish we could get it down to five days. Like, we're sitting at 30. Those people that are like, five days is a long time, hear 30 and they're like, 30 is a real long

Dan Lines:

Yeah.

Cory O'Daniel:

How often is this happening? Well, here's one of the things that's really interesting. Like, going back to "DevOps is Bullshit" and how we're still kind of where we were 15 years ago. If you look at the DORA report, it's got DevOps in the title. 50 percent of the respondents that are tuned into responding to this report say that an outage can last them up to five days, and the average time of deploying software sits around once every 30 days. 50 percent of the people that are tuned into the DORA report are saying, we are not doing this well, right? And that's one of those numbers. Like, again, we can't sample the entire world. We can't survey everybody. But when you start to see these consistent patterns around, like, just ops skill vanishing, that's, I think, the power of platform engineering. Again, whether it's in your org or whether it's, you know, somebody like MassDriver, Humanitec, Qovery, like all the people in this space, we're trying to make these operations teams more efficient. We need to scale those people. That's our scaling problem in 2024. It's not scaling compute. It might be scaling your GPUs, but it's scaling these operations folk. Everybody needs them. A lot of organizations struggle to hire them. A lot of people are just hiring people straight out of college and saying, you're the DevOps person now. Like, that's such a common trend on Reddit.

Dan Lines:

That's the trend? Saying you're the DevOps person now out of the school? That's, that's scary.

Cory O'Daniel:

Go on Reddit slash DevOps. And the amount of people where they're just like, hey, I just got my first job and I'm the DevOps engineer. And it's just like, oh, like that's...

Dan Lines:

Like, you've never released code before in a professional environment. You're the DevOps...

Cory O'Daniel:

Yeah. And it's like, hey, we got like 8 million GPUs that we're managing. And it's just like, oh, you just got tossed into a world of scale there. I feel like I see that almost every day or two on Reddit DevOps. Somebody's just like, I'm brand new to the career and they put me in this role. Like...

Dan Lines:

Okay, let's, uh, take a detour for a second. Why do you think that is? Is it that no one wants to do this? Like, why is that happening? No one wants to do this job? That is the scariest and oddest thing. I know I'm more old school, because when I was developing, it's a while back now, we definitely wouldn't have been like, oh, hey, new hire, you're responsible for operations. That's like, no way.

Cory O'Daniel:

So here's one of the things that we see that's kind of scary in the hiring. One of our early go-to-market strategies was looking at people that were hiring their first ops role. And so, uh, MassDriver, like, we help people with cloud operations, platform engineering, gotta sell that stuff. But we were looking at people that are hiring this first role. I'm like, okay, great, let's go and mine a bunch of job descriptions. It's a great way to figure it out. Job descriptions are actually really funny. I don't know how many hackers use this, but you can get a pretty good idea of people's infrastructure just by looking at their job descriptions. Right. And so when you see these people that are hiring their first ops person, nobody's hiring an ops person proactively. Nobody's saying, hey, you know what? We're almost done with our MVP. Let's go ahead and poach a DevOps person from Google. No, they think about that first DevOps person when the world is falling apart or people are mad, and they're like, we should probably get a DevOps person. So now you're coming into a job. Series A, there's 20 developers, and they're like, we got 20 developers and you're the ops person now. We haven't had one. There's no way in hell I'm applying for that job. Like, I don't care how many commas. Well, maybe there's commas that you could convince me with, not zero. There's gotta be multiple commas in there. But coming into that job description as a seasoned ops person, you're like, I'm going to have 20 customers day one. This company has been in business five years, so there's five years of debt racked up. Like, we're looking at some of these job descriptions and what they expect out of you in this first operations role. It's like, that's a lot of work. I know there's a lot of debt. There's going to be 20 frustrated people day one. 
And you have to be a certain type of person to be excited about that. And also, what you see with a lot of these orgs is they're not paying Google salaries. Right. And so I was like, okay, they got this DevOps role and it's 85,000 a year.

Dan Lines:

Yeah, so it sounds like it's coming from desperation without the proper investment. That's the way that I would put it. Desperate, but not properly invested. Okay, here's another thing that was coming to mind. So let's say, you know, we know we have a problem. Developers are complaining. It's tough to get code out to prod, like, you know, some of the basic stuff. And I'm trying to advocate for platform engineering, and I'm trying to go to the business and say, hey, we have a real issue here. How do you get buy-in or investment into this? Do you have any advice around that?

Cory O'Daniel:

I do. So first off, I feel like people are reaching for the term platform engineering when they might still want to be reaching for that DevOps mindset first. Right. But the unfortunate thing is, this is the hot term. It's the term where that mid-tier stakeholder might be like, ooh, I heard that. I saw it on Hacker News. I want to have that on my resume and be responsible for it. That sucks. Like, that sucks as, like, an in for it. Right. But that's unfortunately the term that people are hooking on to. There's a lot of times where you see these orgs and they're like, hey, we need to do platform engineering. And it's like, you gotta get just good dev and ops and CI/CD practices down in general, right? And so if we got to use the word platform engineering to get the job done, I'm fine with that. But what I hope people don't do is go out and think that they have to build a platform to do that, because many times it's just investing in that CI/CD process. The amount of times we've seen people like, oh, we're moving to platform engineering. Then we get in and talk to them. We're like, what's your biggest problem? And they're like, oh, our Docker builds take 45 minutes to an hour. And I'm like, that's not a platform engineering problem. That's a you-don't-know-how-to-write-a-Dockerfile problem. Right. Or a you-don't-know-how-to-use-build-caches problem. You can do a lot of that work just fiddling around in Docker and GitHub Actions. Right. And, you know, you'll see other folks that are like, hey, we're getting into platform engineering. It's like, oh, what's your biggest problem? It's like, we got a lot of cloud resources and we don't know what's used and what's not. And it's like, okay, well, that's tagging conventions and naming conventions and, like, generating some reports from the AWS tools. 
Like, you can call it platform engineering if that's what you got to do to get there. But until you've gotten to that point where you're like, we've got ops in a good place, we're fast, but now we need to be faster. We need to automate the automation, right? We need to stop thinking of automation and start thinking of, like, functional software, right? I think that's when you need to move to true platform engineering, where you're building a product. But I don't think many organizations have that need. They just need their developers to be happier and move faster. Same goals as platform engineering, but much different effort involved. And so I think the first thing you got to do is take a step back and be like, okay, am I using platform engineering just for the name? Or do I actually need to do platform engineering? Or do we just need to be better about operations and software development in general? Know what you actually need. How you're not going to do that is by thinking you know what you actually need. Like, step one is take that initiative. And I think that's one of the best things you can do. If you're sitting in this role and you're like, everything's kind of fucked, take the initiative. Don't go and ask a mid-level engineering manager, like, hey, do you think we could get a budget for a platform? Make a survey. Survey some people. The survey part is hard. That's one of the skills that we're missing as operations folk moving into platform engineering. But figure out what your team is stressed with. If you've got five developers, grab them all and be like, what is the thing that slows us down the most? And just see if you can solve that. Right? If you want to call it platform engineering to do it, that's fine. But solve that problem first. Make people happier. Show that there is success in your initiative. Find a KPI that gets the business excited. 
Hey, we just increased delivery time by 20%. Oh, faster features means more money. Yeah. Yeah. Um, imagine if we could get the rest of this stuff unblocked, if we can move a little bit more of that debt and get these people moving faster and you still might not be doing platform engineering. Like you might just be doing another small task that you need to focus in on. Like, I feel like so much
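
The tagging-convention point above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any AWS tool: the required tag names, the resource IDs, and the shape of the inventory are all hypothetical, and in practice the inventory would come from an export of your cloud provider's own reporting.

```python
# Hypothetical tagging convention: every resource must carry these tags.
REQUIRED_TAGS = {"owner", "environment", "service"}

# Hypothetical inventory, e.g. loaded from an exported resource report.
resources = [
    {"id": "i-0a1", "type": "ec2",
     "tags": {"owner": "payments", "environment": "prod", "service": "api"}},
    {"id": "vol-7f2", "type": "ebs", "tags": {"owner": "payments"}},
    {"id": "db-3c9", "type": "rds", "tags": {}},
]


def missing_tags(resource, required=REQUIRED_TAGS):
    """Return the required tag names this resource lacks, sorted."""
    return sorted(required - set(resource.get("tags", {})))


def audit(inventory, required=REQUIRED_TAGS):
    """Map resource id -> missing tags, only for non-compliant resources."""
    report = {}
    for resource in inventory:
        gaps = missing_tags(resource, required)
        if gaps:
            report[resource["id"]] = gaps
    return report


if __name__ == "__main__":
    for resource_id, gaps in audit(resources).items():
        print(f"{resource_id}: missing {', '.join(gaps)}")
```

A report like this answers "what's used and what's not, and who owns it" without building a platform first, which is the smaller-effort route being argued for here.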

Dan Lines:

You can build up to, the...

Cory O'Daniel:

You can build up to it. But I think, you know, as far as getting in and getting people excited, it's really understanding what your business needs to move to the next level of delivering software. And sometimes that's cadence, sometimes that's security, sometimes that's breaches, sometimes it's outages, right? And that's what I think is most important: figuring out what that is and how to solve it. Like, build up your, uh, you know, your cred. Did a good job fixing that one. Like, okay, now our biggest problem is we're moving to microservices. We're breaking this monolith up into five or six pieces. How much of a pain in the ass is it going to be to get all of this stuff moving in the cloud? Okay, now maybe it's time to start thinking about, let's bring in Kubernetes, create a nice little abstraction plane for developers where they're just thinking about their Dockerfiles and not all this YAML. Like, baby steps to get yourself there. Don't feel like you have to, you know, get a new repo and build something from scratch that looks like, you know, MassDriver or Humanitec or Qovery. You can get started on fumes.

Dan Lines:

You know, I mean, that makes sense. What I really liked about what you're saying is, show some improvement before asking. Like, hey, this is what I was already able to do. And at the end of the day, you know, from my experience with the customers that we work with, here's what the business cares about when you go to advocate for yourself. Well, first of all, one piece of good news: businesses are very hot right now on developer experience and making developers more productive. Salaries are high, so you can probably get some buy-in there. But they care about delivering projects on time. If you can ever come and say, hey, you know, I talked to these five engineers, found what the bottleneck was, fixed the bottleneck, and they said one of the biggest reasons the project was successful is you unblocked us. Now you can go to the business, and usually, you know, you could be hiring right now. Hey, you want to bring on 20 more devs? This is, like, mandatory. We can't bring any more on without doing this, if you'd like to deliver projects on time. That kind of stuff, building up that argument, works. But I do agree that starting with, hey, I did something that helped the developers and I could do more if you'd like it, I think that works pretty well.

Cory O'Daniel:

Yeah. I think there is a challenge in that statement though. Right. And I think there's probably a number of people that listen to this and think, like, well, how the hell do you just do it? And this is one of the things I think that sucks about a lot of engineering orgs, right? Like, if we're building bridges, right? Let's say we're real engineers. You know, we're building things and measuring stuff and making sure people don't die under it. Right. If I was working on a bridge, if I'm an architect and I'm like, you know what? I work on some load-bearing beam calculations, and while I'm looking at these blueprints, I notice that the bridge is made of sand. I'd probably mention something. I'd probably be like, hey, that's not right. It might not be my job per se, but my job is to be a professional. My job is to do engineering work, right? And while we call ourselves engineers in software development, many teams, and this is not a dig at you, I know it's your organization, many teams are task doers. They're people that write software to do a task that was assigned to them. Right. And our job as professionals is to do work for business people that don't know how to do the work, right? We're a tool at the end of the day. Like, we make software. They have an idea, we print out some software. And if we're not going about it in a professional manner, like addressing debt and refactoring, right? Are you really doing your job? Right. And so for the folks that are hearing this and thinking, oh, how do we just go and do it? It's your job to just go and do it. And I know that might be hard in your org, but debt and refactoring and solving problems and making your development team faster is a part of your job. And so I have people ask me all the time, like, how do you get time to address technical debt? How do you get time to refactor code? 
And it's like, I just do it. You want me to build this feature. There's something in my way of doing it faster. I'm going to refactor and fix that so that I can deliver this feature. Now, I think the catch there is there's a lot of people that get caught up in the debt and they're like, well, I fiddled with that for two or three months instead of doing what you...

Dan Lines:

And I never delivered the feature.

Cory O'Daniel:

Yeah. It's like, if you can do what they ask and in the meantime you've delivered some success, that's good. And talk about it. Be like, hey, I refactored this. I went in and added this caching thing that makes our builds faster. Like, advocate for the work that you're doing and be like, all of this is happening because I did some refactoring or addressed some technical debt that was in my way. We should be addressing technical debt and refactoring that's in our way. And what you're going to see is, you're going to start moving up that DevOps maturity model. You're actually probably going to start doing a little bit of DevOps, right? Like, oh, I went and did some stuff to make this whole thing operate a little better, that felt nice. And advocate from there. If somebody wants to slap my hand for making a build pipeline faster, slap away, 'cause I know everybody else is going to be stoked about it. I would rather just do it and be like, eh, sorry. Sorry I did something you didn't ask me to do, but it made things better. So chill. Features still delivered. Back off.

Dan Lines:

You'll get a lot of, uh, backing. So it's kind of low risk. So we are coming up on time on the pod, but I was reading your article, and it's a good article. Everyone, go and check it out: "DevOps is Bullshit." I gotta ask you for a minute about IDPs,

Cory O'Daniel:

Mm hmm.

Dan Lines:

internal developer platforms. I mean, we could do a whole pod on this, so keep in mind we don't have that much time, but I think they're pretty hot right now. Who I know is using them, it's usually our larger customers. I'm not saying that's the only one, but the people that I talk to that are using them are larger customers, more so on the enterprise side for us, with lots of services, who also have large DevOps or platform engineering, whatever you want to say, large teams. But I wanted to just get your take on IDPs. Who do you see using them? Who should use them? Is it all the rage or is it just a fad? Like, can you give us a little something there?

Cory O'Daniel:

I mean, I think IDPs are important. I think they're valuable, right? I mean, at the end of the day, what they are is an internal developer platform. What is that? That's a PaaS that you own, right? And there's a couple of them that are open source, and you can install one in Kubernetes and get it up and running quickly. And I think that if that thing solves a need for your team, if it solves a problem, they're great. I feel like there's a lot of people that are just like, oh, if I get an IDP, that's going to solve my problems. And it's like, no, it's not, right? It might even create more. And so I would just say, you know, is that IDP the thing that you're willing to tie your business to? And the thing that was a little goofy about it, again, going back to that: let's say you're a, you know, Series D company and you're like, hey, we're migrating the entire company to Heroku. You're probably going to have a fair number of engineers that are mad about that. Right. And so one of the ideas we've always kind of had when we think about IDPs is, you need a developer platform potentially per team. Your teams can be very, very different. Right. And so, you know, the MassDriver approach is IDPs as cattle. Like, every project in MassDriver is its own IDP and it is very fine-tuned to your team. So your team kind of comes in and says, hey, this is our makeup. We're doing model builds and we're serving our application off of a Lambda. That's great. That is your internal platform for how it's going to work. Your services are going to bubble up. You can see them, but we're not jamming that same platform down somebody else's throat. Right? And so when you start looking at some of these IDPs, they're Kubernetes-specific. They can run pods, StatefulSets, workloads, et cetera. That's great. 
Until you have somebody on your team who's like, oh, well, we do a bunch of stuff on Lambda. How do I use this IDP? It's like, oh, well, you don't. Okay. Well, that's interesting, because we just moved to platform engineering. We just did this IDP thing. It's going to make everybody more efficient, but now this whole group of people still has to solve the problem on their own.

Dan Lines:

Yeah, interesting.

Cory O'Daniel:

That's not great, right? So, I mean, I think that if you've got a pretty homogenous team and way of delivering software, you might find an IDP that's perfect for you. I think in reality, as you start looking at your teams, they're going to have very different ways of doing things, and you're likely going to invest in one of these IDPs and then find yourself looking for something else. And I say this as a platform that, like, people have come to us and they're like, hey, we're using this IDP, but our developers need to manage infrastructure too. And so then they're looking at MassDriver, and I'm like, why don't you just get rid of that IDP thing? 'Cause we do that part too, and we do the infrastructure management stuff as well. But it's just funny, 'cause people come to us and they're like, hey, we bought this thing and we thought it was what we needed, and it turns out it only solves like 30 percent of our problem. And it's like, yeah...

Dan Lines:

Yeah, so maybe this is then the next pod that we talk about, because, you know, I'm hearing a lot of buzz about IDPs, and it'd be great, you know, to get into the details of what they really solve and don't solve. We know it's not a cure-all. But, uh, Cory, it's been awesome having you on the pod. I think it was a really fun, uh, conversation. Thanks for coming on the show, man.

Cory O'Daniel:

Yeah, I appreciate it. Thanks for listening to my rambles.

Dan Lines:

Listeners, please subscribe to the Dev Interrupted Substack channel if you haven't already. You'll get access to our weekly newsletter and exclusive articles. So, hope to see you all there. And then, Cory, one more time, thanks man, it's been awesome.

Cory O'Daniel:

Yeah. Thanks so much.
