00:00:00I'll put it this way: open up a Claude Code project and ask it to design you a bridge
00:00:05Or a skyscraper
00:00:08And then I want you to take that design and I want you to go and build it
00:00:12And then I want you to go and sit in it. Are you going to be comfortable doing that?
00:00:16Are you going to be comfortable driving over that bridge that it designed?
00:00:19Right? Until you stop shaking your head, we need those engineers in place
00:00:25Welcome to the Better Stack podcast, where we have conversations about software development, AI, and all kinds of new technology
00:00:32I am one of your hosts, Andris, and I'm joined today by Richard. Hello. And
00:00:38Amin Astani. Hi, Amin. How's it going? Happy to have you here
00:00:44Yeah, uh, thank you for having me. This is a pretty great pleasure. So from what I've known about you,
00:00:51I kind of think that you are
00:00:53kind of like an SRE wizard. You're very deep into that realm, and you host your own podcast on that topic
00:01:01So we're just curious: how did you start in that area? And what's been your journey through
00:01:08your software development career so far?
00:01:11Yeah, good question. Again, Andris, Richard, thank you for having me. Yeah, how did I get started?
00:01:17Well, when I was in high school, my girlfriend's mom introduced me to Red Hat Linux
00:01:23She was taking a computer science course in community college and she
00:01:29just showed me her desktop, which wasn't Windows. It wasn't Mac. I was very confused. What it was was a KDE desktop running on
00:01:37Red Hat Linux, and I was enthralled, and I started to get into it all the way back in high school, junior year
00:01:44of high school, and then got my, um,
00:01:47CS degree, and I've been very interested in infrastructure, very interested in operating systems and teaching a computer to do things
00:01:55that previously we didn't have control over. So that was the beginning
00:02:00So you've always been, like, a proponent of open source software, I imagine, if you're using Linux
00:02:06Oh, 100%
00:02:08Linux has been my daily driver
00:02:10for over two decades now, like since
00:02:122005
00:02:15Okay. Yeah, and when was the point where you got very serious about SRE?
00:02:20Yeah, that was around 10 years ago. So prior to that I was a classic operations person at a rapidly growing
00:02:29PaaS/SaaS company called Acquia. If you ever heard of Drupal, the founder of the Drupal open source project,
00:02:34he founded a company. They did the professional services and the hosting and the support for Drupal websites
00:02:38So I was on their operations team, but as the company grew, and it was rocket ship growth,
00:02:45lots of big clients whose names you'd recognize, with that scale came a lot of manual toil. There we go
00:02:51Now we're talking about SRE. There was a lot of manual toil. There were a lot of incidents
00:02:55We were starting to come up against the reality of operating at scale versus the architecture that we originally
00:03:01started with and the processes that we started with. So I started reading The Phoenix Project. I got, um, into
00:03:09The Practice of Cloud System Administration, which was the first book on SRE
00:03:12It wasn't the Google SRE book. SRE was like a few pages, and it had 12 practices in it
00:03:17And again, I was very, very intrigued and started to practice it
00:03:21And I don't actually know what The Phoenix Project is. Oh my goodness
00:03:26Yeah, yeah. Yeah, so The Phoenix Project was like the original DevOps book
00:03:31Um, and it's a really good book
00:03:34Yeah, so it's a story, and it's about
00:03:39like the VP of IT operations dealing with some significant organizational problems, and
00:03:45he, through the guidance of this mysterious mentor, who you later find out is like a board member of the company,
00:03:53he's teaching him about
00:03:55like
00:03:56old-school plant manufacturing concepts and how they directly apply to, um, software engineering and IT ops, and
00:04:05it introduced a lot of concepts that I still use today and talk to clients about today, because it's extremely relevant
00:04:11So yeah, that was the original book, but then
00:04:13beyond there, I read a whole bunch of stuff, a lot of it in the world of management
00:04:18So in addition to all the technical stuff that I'd be reading just to skill up as an SRE and as a software engineer,
00:04:24there's also the management stuff, because DevOps and SRE is a sociotechnical practice
00:04:29You're bridging the gap between what technology can do and how humans use it to make good business outcomes
00:04:36And I imagine there's a lot of human error involved in that, and dealing with it, right?
00:04:42I mean, yeah. Yes, absolutely. I mean, shoot, there's a book about human error that you should probably read too
00:04:50Yeah, uh, there's a gentleman, his name unfortunately escapes me right now, but he is a
00:04:57like, professor in Sydney, Australia, and all he talks about is, like, airline accidents and human error, right?
00:05:05So, huge. Like, it's very useful in post-mortem culture. Like, human error isn't where we end
00:05:10Like if you say, oh yeah, the reason why we had this production incident was human error
00:05:16No, that's the beginning of the discussion. That's not the end of it. Like, what factors contributed to,
00:05:21you know, that person in that place fat-fingering that command and blowing out the production database?
00:05:27Like, should he have had access to the production database? Were the tools ergonomic? Did he get sleep last night?
00:05:34How much did he get paged? You know, there's all kinds of things
00:05:38But yeah, when people say human error, I immediately identify with that book. I'll have to give you a link for show notes
00:05:45Yeah, I think I remember my first
00:05:49introduction to end-to-end testing. I was listening to this course where someone talked
00:05:54about this incident where there was some kind of software developed for
00:05:58surgeons, and you had to, like, click the buttons in a certain order
00:06:03But the surgeons got so used to clicking them that they clicked them
00:06:07so fast that the software started glitching, and nobody had expected that that might happen
00:06:13So that's kind of interesting
00:06:15Yeah, definitely. The intersection between humans and automation is something that we always have to pay attention to in the products that we operate
00:06:25Yeah
00:06:26And then I also heard that you started
00:06:28with DevOps and SRE during the beeper era, where you had to use pagers. Is that correct?
00:06:35Yeah, actually, yes. The first time I was on call, I was issued a physical
00:06:42pager. And now, granted... Oh, wow. What year was that?
00:06:46It was 2005
00:06:48Oh, wow. Okay. So, my first tech job. Oh, this is great
00:06:55Uh, I was working in a garage, as every successful startup does, in Plant City, Florida
00:07:02A good friend of mine,
00:07:04her father had a side hustle where he was running a web hosting and email hosting and dial-up ISP out of his garage
00:07:12His name is Curtis Fellaini. I have mad respect for him, and he, um,
00:07:18he taught me everything I knew. But yeah, we were running this on-call tool called WhatsUp Gold
00:07:24And all it did... the name, I know. Like, what it did, it didn't do HTTP checks
00:07:33It was doing ICMP pings to the servers. So just old-school ping, and if it didn't get a result back,
00:07:40it would send
00:07:42a page, and I had a pager. I carried a pager for a little while when I was, uh, working at that place
00:07:48Immediately after, when I was, uh, working at my alma mater doing supercomputing, HPC, it was Nagios
00:07:54and, you know, text messages to your cell phone
00:07:57Was that like a deliberate choice, or by that point had newer technologies already emerged, like you said, SMS messages and stuff like that?
00:08:07I mean, yeah, back in those days there were just more tools
00:08:11Granted, limited, but there were more tools available to monitor
00:08:15infrastructure at the time, and when I was at, um, the HPC department at the University of South Florida,
00:08:22we were already dealing with hundreds-of-servers scale, because it's HPC
00:08:27It's like, you have racks of servers that are doing compute jobs. Um, so we needed
00:08:31something a little bit more than
00:08:34uh, than a pager, and the content in the page also mattered, because if you just got,
00:08:40you know, a message, it's like, oh, okay, I need to go check to see what the thing is from the beeper
00:08:45But, um, with Nagios you got at least some context, like this server
00:08:50is down, or this service attached to this server is down. So even then,
00:08:56back in those days, you had some sophistication in terms of being able to configure,
00:09:01you know,
00:09:03what you wanted to monitor on a per-host or per-service basis
00:09:06Like, definitely pre-SLO. We're not even thinking about that
00:09:13We're just thinking about: is this port open and giving me a response that I expect?
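The "is this port open and giving me a response that I expect" style of check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how WhatsUp Gold or Nagios implement their checks: the function names `port_is_open` and `responds_as_expected` are hypothetical, and a real ICMP ping is left out because it usually needs raw-socket privileges.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def responds_as_expected(host: str, port: int, request: bytes,
                         expected: bytes, timeout: float = 2.0) -> bool:
    """Send a request and check that the first chunk of the reply contains `expected`."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(request)
            reply = conn.recv(1024)  # only the first chunk is inspected
            return expected in reply
    except OSError:
        return False
```

A monitor built this way can only say "up" or "down" per host and port, which is the per-host, per-service granularity the pre-SLO tooling offered.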
00:09:17Yeah, wow, 2005. I think I was in secondary school at that time. I wasn't even thinking about, like,
00:09:26computer
00:09:28science back then
00:09:30Yeah, I don't look it, but I'm older than I look. Yeah, I realize that. Well, you look very good. Thank you very much
00:09:37So yeah, doing a bit of research, it seems like you worked at Meta. What did you do?
00:09:43And what was it like working at Meta?
00:09:45Oh my goodness. Uh, so I was a production engineer
00:09:48and a production engineering manager, so
00:09:52what that basically means: it's their flavor of SRE. And I worked on two projects. The first project, which was just a few months,
00:10:00I worked on a team called Conveyor. There's actually public papers, I think for IEEE, about
00:10:06Conveyor. Conveyor is probably one of the largest CD systems on the planet
00:10:11So all of the build artifacts that come out for the backend services at Meta
00:10:18go through Conveyor, and there's progressive rollout of all the changes to, like, thousands upon thousands upon thousands of backend services
00:10:25Are they strictly internal to Meta?
00:10:29Internal only, yeah
00:10:31But yeah, there's a paper by Boris Grubic that you should go read about
00:10:37what they built, and it's very, very interesting
00:10:39So I was there for a few months when they were scaling it
00:10:43and helping them with their alerting, because they were early game
00:10:46It's like anything: you get, like, hundreds of alerts a week, and I'm like, wait,
00:10:49let's clean this up and,
00:10:51you know, get their observability into a place where at least they knew when failures were happening. And then I also
00:10:58put together some,
00:11:00what would you say, like, some emergency procedures to shut off key features using their feature flagging tool,
00:11:06um, just to make sure that we were able to recover quickly from incidents
00:11:10So I did that for a few months, but most of my tenure was spent
00:11:13on another team that
00:11:16was an internal Heroku of sorts. So you have very simple stateless services,
00:11:23um, and, you know, teams are turning these out all the time and you need to run them somewhere, and that
00:11:29team and service existed to make that process simple, because in the old days standing up a service was very difficult. Um, so I was the first
00:11:37production engineer
00:11:40on that team, and it was pretty wild stuff. But that's what I did there
00:11:44It was definitely a lot of fun. There are people that I worked with that
00:11:48just had brains five times
00:11:51mine in size, you know. Like, the people you work with at companies like that are just
00:11:55super brilliant, and I learned a lot there. I was gonna say, I think there's a company called Honeycomb, and
00:12:02the person who started it, Charity Majors, was at Meta
00:12:06Do you know her? Did you meet her?
00:12:09Um, I have. That's very interesting
00:12:12you asked that. I met her at an observability days event a year and a half ago in Boston. I met her in person
00:12:20Yes, she did work at Meta, um, and
00:12:22the inspiration for Honeycomb came from a
00:12:26planetary-scale observability tool called Scuba
00:12:30Right, so I used Scuba and really loved it, and I'm delighted that she, um,
00:12:37founded a company to do that. She and I have been talking on LinkedIn on and off a little bit lately
00:12:44because of our common interest in
00:12:47how agentic coding is going to influence reliability, so
00:12:51we've been talking on and off about that subject
00:12:54But yeah, she's a really cool person, and it's a privilege to chat with her from time to time
00:12:59Yeah, yeah, so actually that's a nice segue into what we also wanted to ask you. Where do you think AI
00:13:06will take SRE and DevOps and all these practices?
00:13:13Oh goodness, okay. So, amazing plug. Thank you very much. I'll get right into the thesis here. So
00:13:19we have software engineers that have been given
00:13:24an amazing tool, which is agentic development. Everybody's, like, getting code magically generated from, uh, tools like Claude and whatnot
00:13:35So what this means is that there is going to be an order of magnitude more code
00:13:39that's going to be flowing through our
00:13:42value stream: our CI/CD pipelines, our testing, our review, our code deployment, our incident response, our change management procedures. Like,
00:13:50all of this code is going to go through that system that each of us has built for our companies
00:13:56And that means we're going to stress test
00:13:58all of those capabilities. So, for example,
00:14:01there is a linear
00:14:04relationship
00:14:06between the number of changes you make to a piece of software and the number of pages or the number of incidents that you receive
00:14:12That is established fact
00:14:14So
00:14:15it's reasonable to assume that if you increase the flow of code through your pipeline by 10x,
00:14:20you're going to have 10x alerts
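The arithmetic behind that 10x claim can be written out as a small Python sketch. It takes the speaker's linear model at face value and assumes a constant change failure rate; the 5% rate and the change counts are made-up illustrative numbers, not data from any real organization.

```python
def expected_incidents(changes: int, change_failure_rate: float) -> float:
    """Linear model: every change carries the same chance of causing an incident."""
    return changes * change_failure_rate


# Hypothetical numbers: 100 changes a week before agentic coding, 1,000 after.
baseline = expected_incidents(changes=100, change_failure_rate=0.05)
after_10x = expected_incidents(changes=1000, change_failure_rate=0.05)

print(f"baseline incidents per week: {baseline:.1f}")
print(f"incidents per week at 10x change volume: {after_10x:.1f}")
```

Under this model the only ways to keep the incident count flat while shipping more changes are to lower the failure rate per change or to reduce the blast radius of each failure.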
00:14:23So how
00:14:26would your organization respond to that, right, is really the question. So from the SRE side, I anticipate
00:14:32that in the future we operations people, we're going to be very, um,
00:14:39very popular, because we're going to be where the constraint is now
00:14:42It's no longer a matter of writing code, um, and shipping it. It's now: how do we operate it?
00:14:49Is it working? Is it meeting customer expectations?
00:14:51Um,
00:14:53and
00:14:55and so I anticipate that being the major shift
00:14:58It's a huge challenge because a lot can go wrong during this transition
00:15:04Like I mentioned, you can have a lot more incidents. How are you learning from failure? And,
00:15:09you know, are you continuing to do the postmortems? Are you building a CI/CD pipeline that actually works, you know, and
00:15:15basically makes it so that human effort is always high leverage?
00:15:20So the funny bit is that we've been preparing for this event for decades. It's just that, you know,
00:15:27we tend to think about it in terms of big tech, where you have 20,000 engineers
00:15:31But
00:15:33now we have smaller organizations that could do 10 times the code, and therefore the flow is the same
00:15:38So all of those fundamentals
00:15:40that those larger companies have been writing about and talking about for the past decade now,
00:15:47we're all going to need to actually do
00:15:49Like, all the fundamentals, you actually have to do them. And so there's that
00:15:53There's also the matter that
00:15:56organizations, like software development organizations, have organized themselves around the idea that, oh, what takes the most amount of time,
00:16:03like, the biggest time suck, is coding
00:16:05So we need to make sure the engineers are spending as much time coding as possible. If that's no longer true,
00:16:09we have to reorganize ourselves around those activities that are now taking the most amount of time
00:16:15So there is going to be a bit of a shift in terms of how we run these organizations. And then finally,
00:16:20and I think you've seen this in terms of the AI SRE trend,
00:16:23which I've talked often about on my podcast, we are going to need to skillfully introduce
00:16:31that technology into appropriate places where we can get a lot of leverage out of it to do those operations, right?
00:16:37Because, for example, there are companies out there, and I think a good example is incident.io: you get paged,
00:16:44and before you even get out of bed,
00:16:47you know, there's already an agentic workflow that's like, okay, what changes were made to the code base recently?
00:16:53What do the alerts actually say? What do the metrics actually say? And it tries to
00:16:59do a first-pass diagnosis, again, before you've even gotten out of bed and got the sleepy out of your eyes
00:17:04So there are some places in the operational set of responsibilities that we have that can
00:17:09get a lot of benefit from AI, but you don't just, like, adopt everything
00:17:12Like, you need to think about what the actual problems are, what the ROI is
00:17:16But anyway, that's my thesis. I'm actually doing a webinar in a couple weeks about this very subject
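The first question in that first-pass diagnosis, "what changes were made recently?", does not need AI at all and can be sketched as a plain lookup. The `Deploy` record, the `recent_changes` helper, and the one-hour window below are all hypothetical names and values chosen for illustration; they are not incident.io's API or data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class Deploy:
    service: str
    version: str
    deployed_at: datetime


def recent_changes(deploys: List[Deploy], alert_time: datetime,
                   window: timedelta = timedelta(hours=1)) -> List[Deploy]:
    """Return deploys that landed within `window` before the alert, newest first."""
    candidates = [
        d for d in deploys
        if alert_time - window <= d.deployed_at <= alert_time
    ]
    return sorted(candidates, key=lambda d: d.deployed_at, reverse=True)
```

Handing the responder a short, ordered list of suspect changes is a read-only step, which keeps it on the safe side of the automation question raised later in the conversation.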
00:17:23That's so interesting, because, um, the use case that you mentioned before,
00:17:29getting out of bed and it's already paged you about something,
00:17:32has that already been implemented in practice? Because
00:17:36one thing that I see can go wrong is that, like, let's say you have these triage levels, where at the first triage level, like,
00:17:43maybe the AI can solve it on its own. But then at the second we need a human in the loop. And sometimes what I,
00:17:51just to make a comparison with coding, sometimes I see Claude Code, when it's reasoning, it's like,
00:17:56I should do this, and then it's like, hold on a second
00:17:59No
00:17:59I should probably delete all of this and start over and do this
00:18:02And I feel like the same thing could happen with these incident management bots. They're like, oh, this is an easy fix
00:18:10Hold on, maybe I can fix it on my own. You know, a situation like that
00:18:13No, you're 100% correct
00:18:15I think this quote from IBM from, like, the 1970s has been making its rounds lately: never have a computer perform a
00:18:22management decision. So what I mean by that is there are two ways that you can use
00:18:27this
00:18:29AI tooling, or automation in general. There's two philosophies around automation
00:18:32that people have written about. The first is called the leftover principle, where you basically say, oh, okay, we're gonna have software,
00:18:38it doesn't have to be AI, it can just be tools, that does all of these things for us
00:18:43autonomously on our behalf. We don't have to think about it
00:18:46And then we, the humans, are given what's left over, which is typically the stuff you need a PhD to solve
00:18:51Um,
00:18:53A, that's
00:18:54not super wieldy, because then you're spending all of your budget on PhDs
00:18:57And then you're also placing too much trust in a system that's running autonomously, and if it makes the wrong call, you're in trouble
00:19:04The more
00:19:07wieldy philosophy is what's called the compensatory principle, where you are force-multiplying human effort
00:19:15You're allowing computers and automation to do the things that they do well, and then you're allowing the humans to do the things
00:19:22that humans do well,
00:19:25which is:
00:19:26we understand the context of our systems. We understand the problem that we're trying to solve. We understand, you know, the customer experience
00:19:33There's a whole lot of context in our brain that makes it a lot
00:19:38more advantageous for us to use that rather than have AI do all of it. So to answer your question more directly,
00:19:44I think this AI SRE tooling really shouldn't be
00:19:50making changes to production unless it is on a well-constrained, well-defined path
00:19:58I'll give you an example: automatic reverts of code. We do that automatically without AI. If you're thinking about Kubernetes,
00:20:06right, if you have a deployment and you're changing, you know, the version of your new container image and your
00:20:13readiness checks don't succeed, guess what?
00:20:16It's not going to roll out the change
00:20:19Right. So those mechanisms are very simple, easy to reason about, and they allow us to perform changes safely
00:20:26You know, those types of well-defined problems, I think you can leave automation to do, and automation is doing it already
00:20:32But having
00:20:35AI right now just say, yeah, we should make this change to our infrastructure, without human input,
00:20:41I think is irresponsible. I think we as humans, back to the point about constraints, I think
00:20:46our time is going to be shifted to:
00:20:49is this really the right decision to make?
00:20:52You know, making those, uh, approvals. Code review, and review of any infrastructure change,
00:20:59I think is going to be where
00:21:00we're going to be spending a lot of our effort
00:21:03And yes, we're going to want to automate that where we can, but it requires discipline and a track record of safety
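The readiness-gated rollout mentioned in the Kubernetes example is the kind of well-constrained automation that is easy to reason about. The toy Python model below is a simplified sketch of the idea only, not Kubernetes code: `rolling_update` and `is_ready` are hypothetical names, and a real Deployment's behavior also depends on settings such as its update strategy, where a failed readiness check stalls the rollout rather than completing it.

```python
from typing import Callable, List


def rolling_update(pods: List[str], new_version: str,
                   is_ready: Callable[[str], bool]) -> List[str]:
    """Replace pods one at a time, halting at the first new pod that is not ready.

    `pods` holds the version each replica is running. An old pod is only
    replaced after its new-version replacement passes the readiness check.
    """
    result = list(pods)
    for i in range(len(result)):
        if not is_ready(new_version):
            break  # halt: the remaining replicas keep serving the old version
        result[i] = new_version
    return result


# A broken image never becomes ready, so no replica is replaced.
print(rolling_update(["v1", "v1", "v1"], "v2", lambda version: False))
```

The decision rule is a single, inspectable condition, which is what makes it reasonable to run without a human in the loop.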
00:21:09Yeah, and then another question from what you said earlier, about this linear relationship, that the more code you have,
00:21:19the more incidents you're going to get. So you've been working as a consultant
00:21:22in this field. Have you heard that companies are experiencing it right now? Like,
00:21:28since they're
00:21:30outputting,
00:21:31or offloading more code to AI, they're experiencing more incidents as a result?
00:21:36Certainly. Definitely so
00:21:39Definitely so, and I would also add to that by saying there are other failure modes that are starting to emerge due
00:21:47to
00:21:49this change. I think a really easy one to think about and reason about
00:21:53is
00:21:54your CD pipelines backing up. The latency between having a build artifact built and then having it go out to prod is getting longer and longer
00:22:00because there's just more changes going through it. So yes, organizations are starting to
00:22:05feel these pain points already. I mean, just as a quick example case study, I have a
00:22:12client that I recently did an assessment for, because one of the things I do is I take a look at
00:22:17the entire company's, like, operational posture: how do they ship code, how do they run it, and I give them guidance on what to do next
00:22:24So one of the teams was like, yeah, this Claude thing's real good. Let's start churning out some features
00:22:31And they did, but
00:22:33they didn't change how they release. They didn't have a good, solid code review process
00:22:38And so the number of incidents immediately shot up. So instead of them
00:22:43focusing on the code and just, like, shipping all the things that they thought they were going to, they were getting buried by customer escalations
00:22:49and incidents
00:22:51So, like, yeah, this isn't theoretical. It's happening right now
00:22:55Um, this order-of-magnitude increase in changes to production is going to
00:23:01reveal
00:23:03all of the weak links in your operational posture
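The CD pipeline backing up, mentioned earlier in this answer, is a plain queueing effect and can be illustrated with a toy simulation. The `pipeline_backlog` function and the hourly rates below are hypothetical; the sketch assumes a fixed deploy capacity per hour and a steady arrival rate of changes.

```python
from typing import List


def pipeline_backlog(arrivals_per_hour: int, capacity_per_hour: int,
                     hours: int) -> List[int]:
    """Return the number of changes waiting in the pipeline at the end of each hour."""
    backlog = 0
    history = []
    for _ in range(hours):
        # Changes that arrive beyond what the pipeline can ship carry over.
        backlog = max(0, backlog + arrivals_per_hour - capacity_per_hour)
        history.append(backlog)
    return history


print(pipeline_backlog(arrivals_per_hour=5, capacity_per_hour=10, hours=3))
print(pipeline_backlog(arrivals_per_hour=50, capacity_per_hour=10, hours=3))
```

Once arrivals exceed capacity, the backlog grows every hour, so the wait between building an artifact and seeing it in production keeps increasing until capacity changes.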
00:23:06And when you are, like, an
00:23:09SRE doctor that has this client who has this problem, what is your suggestion for what to do in this case?
00:23:16Like you just mentioned
00:23:18Right. I alluded to it earlier: all of those fundamentals that we've talked about
00:23:24over the past decade or two, we have to focus on them
00:23:27Like, teams
00:23:29need to, for example, teams need to have a valid testing strategy
00:23:35They need to know that the code that they're getting ready to ship into production is of high quality
00:23:39All the things that they've learned about in the past or are thinking about. I'm a big proponent of service level objectives
00:23:43I'm a big proponent of SLOs
00:23:47because that allows us to understand the health of our production system from the customer's point of view
00:23:52So that
00:23:54is a big piece of the puzzle, because then it allows you to, in a data-driven way, regulate the amount of changes
00:23:59you're making to prod in terms of, like, feature requests and things like that. So that way
00:24:03you are not risking your existing source of revenue
00:24:07while chasing new sources of it
00:24:10That's the business reason that you want SLOs. One thing that I've noticed with SLOs: the previous company I worked for,
00:24:16they were like,
00:24:19each team should, you know, maintain their SLO goals, and for some teams, since we were moving so fast,
00:24:25we were not meeting those goals
00:24:28And then a lot of teams were behind
00:24:30And the management decision was: whatever, it's more important to ship features than to maintain SLO goals
00:24:36Yep
00:24:37And that is probably the biggest reason why SLOs haven't been as effective as Google promised a decade ago
00:24:44Because here's the thing, and I've talked about this in previous content: you can have an SLO program
00:24:49You can go through the process of having engineers go in a corner and be like, oh, what's the user journey?
00:24:55And let's quantify it, and let's get some observability out there
00:24:57and set up Prometheus or OpenTelemetry and get really pretty dashboards, but
00:25:01if the product manager,
00:25:05if the product organization isn't willing to play ball,
00:25:07if the executive team isn't willing to play ball,
00:25:09then
00:25:12it's a waste
00:25:14Then you just have SREs and maybe the engineers working in a corner saying, hey, things are not healthy,
00:25:19and, you know,
00:25:21you have the leadership wanting to continue to ship. Like, in order for SLOs to be successful,
00:25:26it needs to be a game that everyone plays. And when I assess an
00:25:32organization's SLO posture, which is a big thing that I do, I tend to ask this question: if an error budget gets spent,
00:25:38what does the product manager do?
00:25:41If the product manager is like, okay, we're changing the scope of next sprint,
00:25:46good
00:25:48You're at medium maturity. If they don't,
00:25:50you're at low maturity. I actually developed an SLO maturity model, from one to five, one being immature, five being optimizing, to help
00:25:58companies reason about where they're at in the journey. But you're right, a lot of organizations, they don't,
00:26:03they don't do SLOs. They have a dashboard. They have metrics on it. Engineers get paged for it,
00:26:08but
00:26:10business decisions are not being made using SLOs
00:26:12It's just alerting
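The error-budget question raised above ("if an error budget gets spent, what does the product manager do?") can be made concrete with a short sketch. The function names, the 99.9% target, and the request counts are illustrative assumptions, and the "reliability versus features" rule is one possible policy, not the speaker's maturity model.

```python
def error_budget_remaining(slo_target: float, total_requests: int,
                           failed_requests: int) -> float:
    """Fraction of the error budget left for the period (negative if overspent)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, e.g. 0.999")
    if total_requests == 0:
        return 1.0  # no traffic, so nothing has been spent
    allowed_failures = total_requests * (1.0 - slo_target)
    return 1.0 - failed_requests / allowed_failures


def next_sprint_focus(budget_remaining: float) -> str:
    """Hypothetical policy: a spent budget changes the scope of the next sprint."""
    return "reliability" if budget_remaining <= 0.0 else "features"


# 99.9% SLO over 1,000,000 requests allows roughly 1,000 failures.
remaining = error_budget_remaining(0.999, 1_000_000, 400)
print(f"{remaining:.0%} of the error budget is left -> {next_sprint_focus(remaining)}")
```

The calculation is the easy part; the point made in the conversation is that the second function only matters if product and leadership actually act on its answer.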
00:26:16And so the type of clients that you have that come to you, what
00:26:21situation are they in that they
00:26:24require your assistance? Are they in a good situation, bad situation, in between?
00:26:28It depends. So there's two inflection points that I tend to work with clients during. The first inflection point is, like,
00:26:37the early-stage company that just got a round of funding. Their sales organization is crushing it. They've got product-market fit
00:26:44But they're a babe in the woods, and they've never run enterprise-level infrastructure before
00:26:49They've never sold to enterprise clients before, and they're starting to buckle
00:26:53under the strain of that increased scale, very much the experience that I had earlier in my career
00:26:59Like, that's something I'm very, very knowledgeable about, because I lived it. That's the first inflection point
00:27:04And that's a fun inflection point. That's a good problem to have, though it can be stressful for the people working there
00:27:09The second inflection point
00:27:12is usually when you're a larger organization and now you need to start standardizing
00:27:17how you operate across multiple teams
00:27:21Maybe you have an SRE program and you need to assess how effective the SRE program is. Are the processes working?
00:27:28Is the engagement model working? Are the engineering teams actually participating in the process? Is product participating in the process?
00:27:34So companies that, you know, do a lot of acquisitions,
00:27:37they will typically come to me because they're trying to tame this wild animal, because every time they do an acquisition
00:27:44there's an entirely new stack they need to bring in. So there's a lot of, like, top-level governance that you need to start thinking about
00:27:50in terms of making sure everyone's playing by the same rules
00:27:55Sure
00:27:57And I think with the smaller companies, like you mentioned, the ones who have raised some money,
00:28:02um, with AI being around, people using AI to build features,
00:28:06the kind of lack of structure will be increased, because people are just building features, shipping them with AI,
00:28:13not thinking about logs, metrics, traces, anything like that
00:28:16Um, what are the biggest things you've noticed people in the smaller kind of phase of their company come to you with, problems that
00:28:23are glaringly obvious to you, but they didn't see as a problem?
00:28:27Yeah, I mean, I think I've alluded to it, uh, in a previous answer, but I'll frame it top down
00:28:34So
00:28:36the glaring thing that I see is a broken feedback loop between what is happening in production
00:28:41and what is happening in the customer experience, and then what is happening in terms of product prioritization
00:28:46And I've seen that for a long time across many, many teams. It is probably the common problem
00:28:55So as I work as a consultant,
00:28:58one of the things I'm thinking about behind the scenes is: how do I make sure that feedback loop is established?
00:29:03So it is amazing
00:29:05how many companies will experience an incident
00:29:08Customers will get upset
00:29:10They'll file tickets with support. Support gets flooded
00:29:13with tickets. Support has this amazing gift to give to the product engineering organization, which is called feedback. They know
00:29:23what is angering the customer, and what is preventing them from enjoying the service, and what is motivating them potentially to leave
00:29:30And I've seen time and time again
00:29:33that feedback being completely ignored or being placed lower in priority than the feature work
00:29:41And so the end result is
00:29:44full steam ahead on features even though the product is on fire. And that, to me,
00:29:51is something that I see immediately when I'm in there interviewing teams and assessing teams
00:29:55But usually when you work inside the company, you don't see it, because it's like, oh, I'm a software engineer
00:29:59I'm incentivized on shipping features. I shipped 50 widgets last quarter, great. I'm getting that bonus
00:30:05Product manager's thinking the same thing
00:30:07Yeah, we got these projects out. This customer is signing with us because we did it. Meanwhile,
00:30:11existing customers are completely apoplectic. They're not happy with what's going on
00:30:16So: broken feedback loop
00:30:19And is that because of, like, the culture in SF, or just because
00:30:22people just want to look like they're being better than their competitors? I don't know. Why do you think that's the case?
00:30:28I don't think it's native to sf. I I I think I think it's just native to to business in general incentive structures like like
00:30:36business cultures because
00:30:39when
00:30:41you
00:30:42Have different departments
00:30:44Um as a company grows and that naturally happens because you have to organize effort
00:30:49You start to silo
00:30:51I have software engineers over here. I got product managers over here. Maybe they like they might be embedded
00:30:55With the engineering teams, but they got their own set of incentives. It's the support and the devops sre folks
00:31:01They got their own incentives, but they're all they're all separate and they're conflicting
00:31:06Right and
00:31:09typically
00:31:11at this stage of the game, no one sat down with them and said
00:31:13How do we how do we change our incentive structure so that?
00:31:18We're all moving in the same direction. One of the things that Meta did
00:31:21That I really liked, and I think it works really well
00:31:25Is that when they are evaluating the performance of an engineer at least in teams that I was working with?
00:31:31They weren't just looking at features
00:31:34They were looking at operational excellence, production excellence, and they were looking at: okay, what incidents did they participate in?
00:31:41What reliability work did they work on? What scalability work did they work on?
00:31:45How did they make the system more reliable, more operable? And that applied to: are they going to get the bonus?
00:31:51Are they going to get promoted? It was it was an important part
00:31:54of the criteria
00:31:57so
00:31:58When they really started to lean into that that meant that the incentives for the software engineers changed and on the team that I was on
00:32:05where we were really
00:32:07Pushing that forward with the with the engineering leads that I was working with
00:32:12Those engineers magically started coming to me and saying, hey, Amin, I'd like to participate on some reliability work with you
00:32:18Can I take over this load testing process?
00:32:20Sure. Absolutely. Here's the runbooks. Here's some scripts. Like, you know
00:32:25Go nuts and it's amazing what happens when incentives?
00:32:29shift so
00:32:31That is I think it's just a natural progression for any organization if you're like a tiny startup where everybody owns everything
00:32:38And you're trying to get funding and you're trying to you know, win those initial customers and keep them happy
00:32:43I think I think incentives are already aligned
00:32:45Otherwise the company wouldn't survive but as you start getting bigger and you have that separation of concerns
00:32:51It gets harder and harder to have an incentive structure that is aligned across the org
00:32:55So yeah, this is this is where the socio-technical stuff starts coming out when we're talking about SRE and DevOps
00:33:00Nice and and you also mentioned that with this full steam ahead with AI
00:33:06People will need more people like you working in this field and you know at the other side
00:33:12We're hearing that software engineering is dying. There's no more jobs, right?
00:33:17So do you think that like for junior developers listening to this podcast?
00:33:23Is SRE and DevOps a good field to get into at this specific moment? Yes, and yes, so
00:33:30I will push back, and I will not be the person that says that software engineering is dead
00:33:36I actually think that the fundamentals
00:33:38Are even more
00:33:41important
00:33:43Understanding programming languages and how they work algorithms and how they work is even more important
00:33:50because
00:33:53How are we supposed to review the code that that an AI generates?
00:33:56And how are we supposed to design systems that work? Because if you just give, you know, an LLM: hey, architect this thing
00:34:04Are you sure it's going to work? Like, I'll put it this way: open up a Claude Code project
00:34:08And ask it to design you a bridge
00:34:11Or a skyscraper
00:34:15And then I want you to take that design and I want you to go and build it
00:34:18And then I want you to go and sit in it. Are you going to be comfortable doing that?
00:34:22Are you gonna be comfortable driving over that bridge that it designed?
00:34:25Right until until you stop shaking your head. We need those engineers in place
00:34:31Right, and as for SRE and DevOps people, yeah, obviously
00:34:34There needs to be more of us because that's where the constraint is going. If we're making so many changes to the system
00:34:39We definitely have to make sure that it's healthy, that we understand the implications of the changes on production
00:34:43That we have a CI/CD pipeline, a value stream if I extrapolate that to the top level, that is
00:34:51Healthy, monitored, and treated as a production service, because in my view it is. All of those skills are going to become even more important
00:34:58The way that I tend to look at it, it's kind of like, you know, F1 racing. You have a pit crew
00:35:04And that pit crew knows everything about that vehicle, can swap out every part, and gives that driver the ability to dominate that race track
00:35:12You you need those people
00:35:15Um because that gives that gives the product managers and the architects, you know, the the environment to
00:35:22Iterate quickly and to be able to get those changes out without those constraints happening
00:35:27so in my view
00:35:29like
00:35:30It's a great time to be an SRE. I feel like a couple years ago was different because everybody was getting laid off
00:35:35But I think that
00:35:37If you really focus on the fundamentals, if you understand
00:35:39You know, Linux and systems, as well as the socio-technical stuff, like basically what it means to actually be an SRE
00:35:46Um, I think you can go
00:35:49very far
00:35:51In this career, but we do have to be informed by this advent of agentic development
00:35:57We can't keep our heads in the sand about it
00:35:59So you mentioned
00:36:02Was it The Phoenix Project? Yes, the book
00:36:05Yeah, so what other resources do you think are valuable for someone who wants to, like, dive very deep into this subject?
00:36:12Oh, man, I mean, there's tons. Let me get you the heavy hitter books
00:36:18I mean, The DevOps Handbook is good. I felt that when that book came out
00:36:21It was a great summary of everything that I learned prior to the book coming out. It's a great
00:36:27It's a great one-stop shop. Leading Change by John Kotter
00:36:30Is about transformational change leadership in organizations
00:36:34It gives you a framework, and when you are an SRE on a team or if you're trying to drive a reliability transformation
00:36:41It's a good thing to know. The book Toyota Kata
00:36:45Which is about how Toyota, you know, does continuous improvement in the manufacturing of vehicles
00:36:52Very useful, very useful. One of the first ones who came up with the system. Yes. I think I've heard of that. Yeah
00:36:59Kaizen
00:37:02Was the Toyota concept
00:37:04The Japanese economic miracle is what created Agile
00:37:07And DevOps. Like, it was the precursor to that. Other books: The Practice of Cloud System Administration
00:37:13I mentioned. I think they might have, like, a new edition. Very, very useful
00:37:17Yeah, like, the SRE books that Google produces are good, but I will put a caveat
00:37:22Those are books that were written by people that work for companies that are absolutely gigantic
00:37:27And have infinite budget
00:37:30so the practices and the ways that they go about things are going to be different than the way you go about things with your
00:37:35four-person startup, but I think
00:37:37Putting all of those pieces together will give you an understanding from an engineer
00:37:44from
00:37:44a leader
00:37:46And from a strategist on how to drive DevOps and SRE initiatives
00:37:49So I think I have a unique perspective because I think about it holistically. I'm not just going deep on
00:37:55All I do is Kubernetes and I make Kubernetes work. I'm trying to make teams work
00:37:59And then after that we figure out the technical problems and solve those
00:38:03And for all the listeners, we will link the books mentioned in the show notes
00:38:09Um, I was going to ask, so going back to AI, I agree with you that people should learn the fundamentals and that
00:38:16Software engineering or development isn't going anywhere, but there is a kind of trend
00:38:21That I'm seeing more and more, of people pushing code to production without reviewing it
00:38:26And I'm sure that these are, like, people on Twitter or X, or just, like
00:38:30People showing off. It's not really true
00:38:32But there are kind of thought leaders, like people who run the AI companies, like Dario, uh, Musk, saying, hey
00:38:39Like, next year, 2027, whatever
00:38:41You can write code and you can ship it without looking at it, or you can write code
00:38:47Like, compiled code, without writing the programming language, like, to compile that code, and so languages will be dead. What are your thoughts on that?
00:38:54I seriously doubt that these thought leaders are on call
00:38:58Because if they were they would be saying something completely different
00:39:02Good point
00:39:04They're not in the infra. They're not the information security group
00:39:08crying about
00:39:09Vulnerabilities being introduced, and they're not dealing with what happens when customers are upset. Like
00:39:13Sure. LLMs can produce
00:39:16Syntactically correct somewhat logical
00:39:21code
00:39:23Are they right like 20, 30-something?
00:39:25Maybe
00:39:29But even then
00:39:31You're still going to need proper monitoring, observability
00:39:37You know, test in production. Okay, if you want to test in production, as Charity espouses
00:39:42There is some
00:39:45Prerequisite investment that you have to make, and in her view
00:39:48It's going to be observability, understanding the behavior of your software. But even still, if I
00:39:54you know
00:39:56Were a a government and I needed to buy software
00:39:59It has to meet my security controls
00:40:03If I was a medical company and you're writing software that is literally responsible for keeping patients alive
00:40:09Do you really think you want an LLM to generate that without review? That would be unconscionable
00:40:15It goes back to the example about the bridge or the skyscraper. Do you really want an LLM to design a bridge or a skyscraper
00:40:21And just ship it, pour the mortar in the rebar, let's go? No
00:40:27No, absolutely not
00:40:31Yeah, that bridge and skyscraper analogy is a very good way of looking at it and never never thought of that
00:40:37You know if you were like an engineer
00:40:39From zero to hero, like would you want to walk on your own bridge that you just built?
00:40:43Yes, and here's the thing that we have to think about. So I was mentioning Curtis
00:40:49I love that guy, in the beginning of this episode
00:40:52He was a PE
00:40:56A professional engineer in electrical engineering. That meant that he went to school
00:41:02He took the PE exam. He interned at a company for several years, and then eventually, after taking a big test
00:41:09Then and only then was he able to do engineering projects. There is a high level of discipline
00:41:18in how
00:41:21You do things. I mean, it's the same thing with doctors
00:41:23Like you learn the academic but then you're like in the field for years under supervision before you can actually do medicine
00:41:30And and there's a reason for that. It's because
00:41:33The decisions that we make have massive ramifications on the lives of human beings, in the environment, and
00:41:40In the world. Um, and software hasn't quite
00:41:44caught up to that
00:41:47yet, and i'm not necessarily saying that we need to emulate those models, but
00:41:52We do have to at least
00:41:54Acknowledge the risks that we are introducing. We do need to have the discipline, when we're building things that affect the lives of human beings
00:42:01We need to have that discipline. We need to think about risk. We need to think about the adverse outcomes
00:42:05we need to take responsibility and I think
00:42:08There's a lot of organizations that don't want to hear that because it pushes back on on their ability to make progress
00:42:14but
00:42:16Why are we writing software? We're solving problems for people we shouldn't be introducing new ones
00:42:23Yeah, and it's a very interesting point that you said about your friend, sorry, forgot his name, the PE, professional engineer, Curtis, because, Curtis
00:42:31Yeah, because here in Canada
00:42:33There was a discussion about here in order to be called an engineer. You actually need a trade certification
00:42:40And then there was the debate that software engineers shouldn't have the title engineer because they haven't earned that certification
00:42:48Yes
00:42:51Yeah, that resonates that resonates
00:42:53Yeah, the level of discipline is just completely different between our industry and theirs
00:42:57Yeah, exactly. Um, I wanted to address something that you've written on your
00:43:03LinkedIn bio. You mentioned from burnout to bottlenecks, no, from burnout and bottlenecks to reliable, fast-moving teams and systems
00:43:12So what's your what's your experience with burnout?
00:43:16Oh my goodness. I have a lot of experience in burnout
00:43:21You know as a as an operations person at fast growing organizations
00:43:26There
00:43:28Is definitely people that have like hero complexes people that want to prove themselves people that have
00:43:35those little things inside of the psychology where they feel they need to
00:43:39To jump in and just own things. It's a vulnerability, right? Impostor syndrome
00:43:45All of those tendencies that we all have as people, people pleasing, all of those things can contribute to burning out, where you
00:43:52Are so motivated to prove to others that you are doing good work that you're not listening to your body
00:43:59You're not listening
00:44:02To your emotions, your emotional state. You're not being introspective. You're not giving yourself time and space to rest
00:44:08Your relationship with rest is dysfunctional. Like, there was a period of time where I thought that rest was bad
00:44:14That the idleness
00:44:16Was was wasteful when you know in
00:44:20In in this season of my life now where i'm thinking a lot about this stuff. No rest
00:44:25Gives you the ability to do the big lifts later
00:44:29You know, people that, like, you know, bodybuild, they're not in the gym every single
00:44:35Day doing every single body part. They give themselves time
00:44:40To rebuild the fibers in their muscles
00:44:45So why why are we?
00:44:47You know
00:44:49Saying we can't do that when it comes to our minds
00:44:51So that's the theme, but I remember
00:44:55After leaving Meta, I was like patient zero in the gigantic wave of layoffs
00:45:01So I was affected by the layoffs. That was what... Sorry about that
00:45:04Oh, no, I mean, Certomodo would not exist if it were not for that. Okay, cool
00:45:08Right, from the ashes, a phoenix, right? A hundred percent. Like, that was my response to it
00:45:13My response to it. But that period where I'm starting my own business
00:45:18And i'm taking a step back from the industry and thinking about how I want to work. It really made me
00:45:23Become very aware of the burnout that I was carrying with me and not, you know, acknowledging and um,
00:45:32You know, I remember earlier in my career being the grumpy sysadmin, and if there are any folks
00:45:38From Acquia or former Acquia that are listening to this, you know what I'm talking about
00:45:42I used to be real grumpy
00:45:44And, like, I'm very jovial today because I've done a bit of work on myself, but
00:45:49That was like burnout and I didn't recognize it
00:45:51I was just thinking that too many people were coming up to me asking me for requests
00:45:56When they should be doing it themselves. And, like, read the man page, you know what I'm saying?
00:45:59I dealt with a lot of burnout, and I think my business and the way that I run it gives me the
00:46:04space
00:46:07to
00:46:08Tackle those challenges head on and to interact with work in a way that is more healthy
00:46:13and maps more to how my brain works what I want in my life and making sure that
00:46:20You know, i'm not
00:46:23Living to work. I'm i'm working
00:46:25So that I can have experiences that fill me up. You know what I mean?
00:46:28What do you think about that? I know I kind of bounced around
00:46:31I was gonna say that the grumpy
00:46:34Sysadmin, I can relate to that. Not for myself, but, like, at companies I've worked at, there's always a guy who, like
00:46:40Knows how to spin up Kubernetes or knows everything about AWS, and if you want something from him
00:46:44He's like, really? Okay. And then he goes and does it but doesn't want to do it. So I can relate with that
00:46:49but yeah, I can see why that happens and where it comes from purely because of the fact that it happens a lot and
00:46:56So I can see the frustration
00:46:59And so yeah, I think it's good to to get a break from that and um, you have to work on yourself and try other things
00:47:05For sure. There's personal responsibility involved in that but it's also being very cognizant of the culture
00:47:11Of the environments in which you work the teams in which you work the people you interact with
00:47:15And making sure that you're choosing places that are best for you. I know I know the job market is still a little funky
00:47:21It might be hard to make
00:47:22You know those types of choices and being being able to say i'm just going to go to another place
00:47:26That might be really difficult to say I recognize that too, but at least being mindful
00:47:30Of where you are and listening to your emotions listening to your body and and doing things about them
00:47:36I think is a big piece of the puzzle. There is a
00:47:39Doctor, a Dr. Maslach. She made something called a burnout inventory
00:47:44She originally designed it for medical personnel because she was in the medical field
00:47:49But the burnout inventory and her research are extremely applicable to tech
00:47:56And you can even, like
00:47:58Check out some of her talks. I think she did
00:48:01I think she was doing talks for the annual DevOps event that the author of The Phoenix Project
00:48:06like ran
00:48:08so
00:48:09Yeah, I've read a little bit about the clinical aspect of burnout as well as experiencing it personally
00:48:15Yeah, and i'm super happy that this layoff turned into something good for you it led the way
00:48:24To you establishing your own company and I find it so interesting that from what I can see from your posts
00:48:31Like you're running this company
00:48:33basically on the road, right you're
00:48:35Moving all the time. You're basically living kind of like a nomadic lifestyle. Is that correct? That's right
00:48:42I do live nomadically. I am literally on this call using Starlink in the middle of the high desert in Nevada right now
00:48:50Um, I converted my Tacoma pickup truck
00:48:54I called her Molly. I converted my, uh, pickup truck into a
00:48:59Mobile battle station. I am standing
00:49:02in the truck bed
00:49:04um
00:49:05And I have comfortable sleeping quarters 300 watts of solar
00:49:09Um a refrigerator, uh, you know a power system I built myself. So i've made this
00:49:16Micro home out of this truck, and I've worked from everywhere, like my property out here in Nevada to
00:49:23The parking lot of an AMC movie theater somewhere in Boston. Like, I work from anywhere, which is a lot of fun
00:49:29I've been doing this type of thing
00:49:31For two and a half years. I started Certomodo three years ago
00:49:34I've been living nomadically for the past two and a half, and it's been the best experience, adventure of my life. Like
00:49:41You know
00:49:43Doing work in the mountains. I had a bear climb in my truck once
00:49:46Oh, wow
00:49:48Yeah, what was that like?
00:49:51It was a little scary. Uh, back in those days, I was in a rooftop tent
00:49:54Not my cool camper thing now. Was it a black bear or a brown bear?
00:49:58Well, I am alive, therefore it was a black bear. Okay, cool
00:50:02It was the biggest black bear I've ever seen. It was gigantic
00:50:06It was looking for picnic baskets or something
00:50:08but
00:50:09It crawled into the truck bed where I was sleeping above on the bed rack with the rooftop tent and like it it shifted the weight
00:50:17And I had to I had to get the the keys out of my pocket and press the alarm button and go
00:50:21Do that and then the bear ran off when the alarm went off
00:50:24Yeah after after that I understood the necessity of building something that is enclosed like this thing is enclosed. At least the bear can't see me
00:50:30or get in there, so
00:50:33So you mentioned Nevada and Boston so you've done like a cross-country trip with your truck every year. Yeah every year
00:50:42I'm, actually planning to return to the northeast in a month right now. I have a six by ten cargo trailer that I am converting
00:50:48Like after this call i'm drilling holes and installing windows and vent fans
00:50:52And installing a thousand watts of solar and we're doing the whole business and it's going to be my mobile
00:50:58My mobile war room and office that I can work in during inclement weather
00:51:02That's so cool. And for some engineers or people who are also interested in this
00:51:08What's the price tag for building such a mobile home as you've done?
00:51:13I mean, the cargo trailer thing is actually rather economical. You can pick it up for like four to five thousand, and then
00:51:20Whatever you need to do, that price can go up, but you need to insulate it
00:51:24Obviously you need some ventilation some electrification
00:51:27There's a whole universe of people out there that do this kind of thing
00:51:31Um, I'm just one of the few that, like, combined nomadic living and tech and made it work
00:51:36But the, you know, the truck is, it's a Tacoma with a custom camper on top from a company called Go Fast
00:51:41I got this one used but the um
00:51:44Yeah, it really depends on what your needs are. I mean, I see people rocking rocking it out in a prius
00:51:51you know and then I also see people in more sophisticated rigs, but
00:51:55yeah cost of entry can be can be low if you know a bit about cars and you're willing to
00:52:01You know roll up your sleeves and do work
00:52:04So what inspired you to start this endeavor of?
00:52:07camping and traveling and
00:52:10pimping up your car
00:52:11Yeah, um, well i've only lived in two places prior to
00:52:14This journey and I was spending my whole life working and and basically doing what what people told me to do
00:52:21What is expected of a young man, you know, you go to college you get your degree you get a good job
00:52:26You work your way up. I did that
00:52:28and
00:52:29And at the end of that corporate road, I realized, hey man, I'm not as happy as
00:52:34I should be about this. I've made some massive accomplishments, but I'm not feeling it. And since I started my own business and realized, hey
00:52:41Do I really want to pay this much rent in the boston metro area?
00:52:44If I didn't stay in boston, what would I do and I found a youtube video of this guy who took a u-haul truck
00:52:51You know like a moving truck
00:52:53And he turned it into an apartment on wheels and i'm like
00:52:55Huh?
00:52:58That's kind of cool
00:53:00And I started going down this youtube rabbit hole and learning that there's like a whole
00:53:05you know community of people that live like this and I just
00:53:08Researched and watched a bunch of youtube and then decided I I sold everything
00:53:13I did not renew the lease at that apartment
00:53:16And drove out west in my Civic, and a few months later I got Molly and set it up and
00:53:22And started. But yeah, it was an understanding that I had a lot of
00:53:28Life experience that I needed to catch up on
00:53:31Um, you know, I spent too much time, ironically on a tech podcast, I spent too much time
00:53:37On the computer and not enough time out in the world and I realized I needed to have that experience to balance me out because
00:53:44I'm more than just a DevOps/SRE dude. Like, I'm a human being with my own
00:53:49my own flavor
00:53:52I've got experience needs to go and touch some grass
00:53:55I gotta touch some grass, and I've touched more than grass out here in the world. Rocks
00:53:59You know, mountains, water, all kinds of stuff. There's all kinds of things to touch out here. Yeah, 100%
00:54:04So you mentioned your bear incident has there been any other?
00:54:08Interesting adventures while you've been on the road
00:54:12I mean, there's so many. I mean, I did Cinnamon Pass in Colorado with friends, which is an off-road, you know
00:54:18A little technical, a little challenging. I've been to Wyoming, I've been to Yellowstone, Grand Teton
00:54:24I camped in some public land near there where you can see the bears. I sometimes, uh, hear
00:54:30Uh, wild burros out where I'm at. I saw wild horses a few days ago
00:54:34You can you can hear them walking and you look out the tent window and there it is. There's some horses
00:54:39um
00:54:41But yeah, i've i've crisscrossed the country a good number of times
00:54:45um
00:54:47And uh, yeah, it's been it's been nice. I I have a sweetheart that's willing to
00:54:51Indulge my craziness and and we go on adventures together
00:54:54um, and uh, yeah, it's it's it's been fun. I mean that could be a whole episode about all the crazy places i've been
00:55:01Oh, yeah
00:55:03That sounds so cool. Probably there is an episode on your own podcast, right?
00:55:08About this, you know, there should be, because usually, well, because I bring on a guest and we're talking about, you know
00:55:15The subject, but maybe I should do just one talking about this
00:55:20yeah, even the um the takes and your lessons from burnout, I think that would be a very
00:55:25useful episode for a lot of engineers especially nowadays because I feel like
00:55:29we've been promised that AI will
00:55:33Make us more productive and we'll have to work less hours when in reality. I feel like it's the opposite basically
00:55:41Sometimes you're just being burned out by the amount of work that you're that you're expected to do with these tools now available
00:55:50Yes, my views on that would probably be considered very subversive
00:55:53And i'll just i'll just leave it at that where where we need we need we need more workers. Let's be honest
00:55:58That's what we are. We are workers. We we need conditions that are more human
00:56:03Totally
00:56:06Yeah, agree
00:56:08We always like to ask our guests: do you have any hot takes about SRE, DevOps,
00:56:14AI, anything related to tech? Yeah. Yeah. All right, hot take, and I say this with the greatest amount of
00:56:22Respect. I don't mean to gatekeep the practice of SRE
00:56:26But I'll say this: if you have an SRE role, if you have a job title
00:56:29That's SRE
00:56:32And right now you're in the corner writing YAML
00:56:34And getting paged
00:56:37I want you to really question whether or not that is in fact an SRE role
00:56:40SRE is a practice where you are taking a not so reliable system
00:56:45And transforming it into a reliable system
00:56:47And making customers happy through SLOs
00:56:50Toil management, capacity planning, right, all of those higher order operational responsibilities, and using software engineering
00:56:59That's very, very important. That is SRE practice
00:57:01So YAML is not a programming language. If that's what you're doing most of the time
00:57:06I encourage you, you know, get out into deeper waters. It's great out here
00:57:11there's a great community a lot of people that will be more than happy to teach you but
00:57:15Make sure you're finding uh roles that challenge you and help you grow
00:57:18The Phoenix Project book, it made the word DevOps popular, and since then there are people who are DevOps engineers who do what
00:57:26You said, like, write YAML and, um
00:57:29Spin up Kubernetes and all that stuff. And it's like, kind of, the DevOps
00:57:33That the book was talking about is not somebody who does that. It's more like what you explained, bridging two different
00:57:39Was it, what's the word, like, not systems but disciplines together
00:57:43And I think it's quite common, like I said, in the world to see: I'm a DevOps engineer
00:57:48I'm learning DevOps. And it's like, it's not that, it's
00:57:51What you explained. It's difficult now to bridge that because it's now so common that people are DevOps engineers
00:57:56It's not seen as what you explained. But yes, it's difficult to go back now, isn't it?
00:58:02Yeah, I mean, we're talking about semantics and names, so I'll try to qualify what I'm saying when I mean DevOps
00:58:09Yeah, indeed
00:58:10We're not talking about a set of tools. We're not talking about a specific team. You know, title, tool, or team, people talk about this
00:58:15All the time. No, it's not that. It is the practice of joining
00:58:19Technology, people, leadership, process so that we can deliver software to the customer in a way that is as fast as possible
00:58:27Corruptive and is as good as as good for the business as possible like that to me is what devops is and the means to get
00:58:34There is diverse
00:58:35It isn't just Kubernetes. Kubernetes is a piece of the puzzle. It's not just, you know, CI/CD
00:58:40It's a piece of the puzzle. Sometimes it's also sitting down and listening to people. Sometimes it's talking about building vision and strategy. Sometimes
00:58:47You know, it's about sending your teams on a day off after they got paged one too many times at 3 a.m
00:58:52DevOps is about that whole
00:58:55Holistic view rather than just the tools. The tools are sexy. I get it, people want to sell the tools
00:59:01But that's only one facet of the whole experience. Yeah, speaking of tools. Do you have any of your favorite tools that you use?
00:59:08Oh gosh, okay. Uh, let me think about that. Yeah, I'll volunteer one. So
00:59:14there
00:59:15Have been a lot of incidents in GitHub recently
00:59:20And we've kind of gotten into the habit of, hey, I would like my build processes and my testing pipelines to be
00:59:26In a vendor. There is a tool that you can run instead if you want to self-host your stuff
00:59:32called Concourse
00:59:35And I like Concourse. It allows you to build very sophisticated
00:59:38Pipelines for testing or build or delivery or what have you using YAML. Each little section runs in a container
00:59:47But you self-host it and the reason why I really like it
00:59:50Is that the open source community,
00:59:53Like, the governance is its own governance model
00:59:56It's not owned by a company that can, you know, swap out the license and turn it into a SaaS
01:00:01Which we've seen many times for other projects. So if you're getting frustrated with, uh,
01:00:06you know
01:00:09Using the cloud service du jour for your CI/CD, check out Concourse. Use it, run it on-prem
01:00:16You know, maybe you're going backwards five or ten years in terms of philosophy, but it might be more stable. Who knows?
01:00:20Cool. I've never heard of it. I'll have to look into it
01:00:23Yeah, me too
01:00:26Yeah, it's good stuff. Oh, there are some companies out there that definitely use it.
01:00:29Um, and yeah, friends don't let friends run Jenkins. Like, that's done. Don't do that. Don't do that.
01:00:35So you're saying Jenkins is dead?
01:00:38Did I say that?
01:00:41No, no, no, it wasn't a gotcha question.
01:00:44But what I am saying is that there are some more options,
01:00:47and I don't think people want to write Groovy scripts anymore. So yeah.
01:00:51Other things out there have been a pain for me as well
01:00:54Yeah when i've used it
01:00:57Yeah
01:00:59Very good
01:01:01Is there anything you want to plug? Like, uh, you do a podcast. Is there anything else you want to talk about before we wrap up?
01:01:06Sure, let's do that. I'm always grateful for that. Yes,
01:01:09I, uh, am a consultant at my company, Certomodo. That's c-e-r-t-o-m-o-d-o.io. I specialize in assessing
01:01:18companies' reliability, DevOps, and SRE posture. If you're getting paged a lot,
01:01:22if you're not shipping often, if customers are angry,
01:01:26you should definitely put some time on my calendar. I also run a podcast called Reliability Rebels,
01:01:31where I interview people talking about
01:01:33how SRE isn't just the tools. It's also challenging the status quo,
01:01:38talking about the sociotechnical. And then on the 24th
01:01:41of February, I am doing a, uh, webinar about
01:01:45the
01:01:47AI flood of code.
01:01:48I think I call it the AI code tsunami. Um, and I do webinars monthly talking about all kinds of interesting subjects.
01:01:54So if you're interested, check out my website and you can learn all about that stuff. Thanks so much for the opportunity to plug.
01:01:59No worries. Thank you, Amin, for
01:02:03speaking about all these SRE adventures and lessons learned. It's been very fun to have you.
01:02:09So thank you, everyone, for listening to this episode of the Better Stack Podcast.
01:02:13Subscribe to our show wherever you get your podcasts: Apple, Spotify, YouTube, pick your poison. But for now,
01:02:20It's a goodbye from me
01:02:22It's a goodbye from me
01:02:25And a goodbye from me
01:02:33(upbeat music)