Women in Cloud Native was presented by Nancy Chauhan, Developer Advocate at LocalStack and founder of Women in Cloud Native. In this session, we heard about Nancy’s journey in cloud native and why she founded the group Women in Cloud Native.
Airflow & K8s was presented by Jed Cunningham, Staff Software Engineer at Astronomer. In this session, Jed gave us an overview of running Airflow on Kubernetes. He covered topics including:
- Operators vs Executors
- KubernetesExecutor vs other executors
- KubernetesPodOperator
- KubernetesJobOperator & friends
- SparkKubernetesOperator
- Advanced features/customizations
Speakers:
- Nancy Chauhan – Developer Advocate @ LocalStack and founder of Women in Cloud Native
- Jed Cunningham – Staff Software Engineer @ Astronomer
Watch the Replay
Read the Transcript
Speaker 1: 00:00:00 Okay, looks like we are live. So, alright, thanks everybody for joining us. I'm Paul, the head of community. We have a really awesome couple of speakers today. We're featuring Nancy Chauhan, the founder of Women in Cloud Native, and she's going to be talking about her experience founding that organization. And then we also have Jed Cunningham, who's going to be talking about Airflow and Kubernetes. He's a staff software engineer at Astronomer. Before we get into our talks, of course, we always have some community things to discuss, so let me get into that. First off, I want to say a special thank you to our gold community sponsors, Google Cloud and Percona, as well as our silver sponsors and our community collaborators. Thanks for your sponsorship, which allows us to do things like this. Next up, here are a couple of community spotlights. Patrick McFadin from our community was featured in a series of articles, Kubernetes at 10, and you can see his article about when Kubernetes won
00:01:05 and life now as a teenager. You can scan the QR code there and read that article. Also, Edith Puclla, one of our DoK ambassadors, got a blog post published on the CNCF blog, the Evolution of Database Operators article. It has a really awesome infographic where you can see the timeline of different operators that were released for Kubernetes. So yeah, check out those articles if you're interested. I also just want to give a shout out to Women in Cloud Native. If you are interested in being involved, we're going to hear a lot more from Nancy about this, but scan the QR code, and that will take you to the page for Women in Cloud Native.
00:01:54 All right, this is really exciting. We announced our ambassadors last month, so we have 20 DoK ambassadors right now; about seven or eight are returning, and a bunch are new. We just had orientation with our new ambassadors. There are a lot of really accomplished people with a lot of knowledge and information to share and help the community. So if you see them on LinkedIn, some people have been announcing it, and if you know some of these people, congratulate them and welcome them as ambassadors; you'll be seeing them throughout the community. We're working on some projects, and we're probably putting together an ambassador day. So yeah, that's what's going on there. As for DoK Day, we will be at KubeCon North America. I've told you a bunch of times, that's going to be on November 12th. You can see the details by scanning the QR code there. The CFP is closed. We had a bunch of submissions, so we're really excited. The notifications for that will happen on Monday, August 26th. So good luck, and you'll know by August 26th. Again, we're excited; this will be our third co-located event at KubeCon.
00:03:10 Here are a couple of upcoming events. We have a DoK Talk, the Anatomy of DBaaS: Bringing Self-Serve Databases to Kubernetes with Open Source, so you can check that out. That's going to be next week on July 23rd. And then of course we have a town hall on August 15th. That'll be Spark Batch Processing Workloads on K8s, and that's going to be with David Fox from Dish Network. You can scan the QR code there and check that out. And of course, if you're interested in being involved in the community, you're already here, so that probably means you have some interest; you can scan the QR code and it'll take you to our community page. If you're interested in sponsorship, you can scan the QR code there as well. And then lastly, stick around until the end of the presentations today and we'll have a little quiz where you can win some DoK swag. With our two speakers today, if you have questions, go ahead and add them to the chat and we'll relay those questions to our speakers, and at the end of their presentations they can answer them. So with that being said, I want to introduce Nancy, the founder of Women in Cloud Native. We're really excited to have you here today. Nancy, thanks for joining us, and I'll let you take it away.
Speaker 2: 00:04:43 Thanks so much, Paul. So yeah, cool. Let me share my screen. Just let me know if that’s visible.
Speaker 1: 00:04:54 Cool. Yeah, it’s loading. Looks good.
Speaker 2: 00:04:57 Awesome, awesome. So yeah, thank you so much, Paul, for inviting me today. I'm super excited for today's session. I'm going to talk more about the Women in Cloud Native community, but before we move ahead, I just want to tell you a little bit about me. I've worked as a DevOps engineer and developer advocate with various startups. I worked at Blinkit, which is the largest grocery e-commerce in India. I also worked with Xap, and then with some developer tools companies like Gitpod and LocalStack. I'm a CNCF ambassador and an AWS community builder. I really love communities, and yeah, that's why I'm here today. I'm also part of the CNCF TAG Environmental Sustainability, where I'm leading a project this time. We're going to have Sustainability Week 2024 in the month of October, where we're going to host meetups in various regions across the globe to discuss sustainability in tech.
00:05:58 So if you're interested in it, let me know, just ping me. I founded the Women in Cloud Native community, and I'm going to talk more about it in today's session. I'm also open for contract roles, and I love cats and I love traveling a lot, meeting different people, exploring different regions. So yeah, let's get started. Today's talk is about the Women in Cloud Native community, which I founded in December 2022. But before that, let's discuss why diversity is important. I feel like diversity is crucial in a successful company because it brings different perspectives to solve a problem, and it helps you understand all of your diverse customers. Diversity and inclusivity basically help you make better, more inclusive decisions in your organization. So yeah, that's about diversity. This is a graph from a survey conducted by CNCF, and it was basically around diversity.
And it stated that men are slightly more likely to feel a strong sense of belonging in the OSS community than women and non-binary individuals, with 77% of men saying that they agree or strongly agree with the sentiment, versus 71% of women and 64% of non-binary individuals. This statistic basically tells us that there's still a lot of work that needs to be done in our organizations and in tech, where we can have more diversity and more inclusivity. And when it comes to inclusivity and diversity, it is not only about the ratio; it's also about how comfortable the space we have in the organization is, so that it's comfortable for everyone. So yeah, this is a screenshot. I was just going through my mail yesterday, and I came across this email.
00:08:07 This was basically an email I wrote on my last working day; it was like a goodbye mail, which was the norm there. Someone replied back to my mail: best of luck for your future, I never got to work with you, but I remember seeing you as the only female identity on a floor full of men. This was the first time I noticed that, after working for two years in that organization; this was the first time I noticed, oh my God, it was just me. I was so absorbed in the work that I didn't even take notice of it. But yeah, this is the thing. And that was the moment when I realized that I need to connect with more women in tech, because I generally knew men in tech and not women. And that is when I started participating in a lot of communities and started connecting with women in the communities.
00:09:05 So yeah, this is the problem. And with this problem, I decided, since I'd already been working in the DevOps and cloud native space, let's have a community. And we launched it in December 2022. This was my post on Twitter. I didn't think much of it, and I got a lot of support, which was super nice. A lot of people pinged me. They were happy to have a space where women can connect with each other; even men can connect. I mean, this was a space where everyone can connect with each other. And I'm really glad that we got a lot of support. I remember when I posted this, on the very next day a lot of people pinged me: let's have a coffee chat, let's discuss what to do. So many people were excited about it. And then it was a thing.
00:10:00 Yeah, these are all wonderful faces. To be very honest, I did not know many women in tech before creating this community. So I created this community, and I got to be in touch with amazing people here. I mean, Liz, everyone knows about her, and Nikita, Emily, Adriana, so many faces are there. These are amazing women, and all of them are leaders in their industry. And this was one point: I wanted to have more leaders, I wanted to know more about women leaders in tech. Creating the Women in Cloud Native community really helped me connect with those leaders. Yeah, this is what I talked about: how can we bridge the gap? We can do it through connection, collaborative learning, and shared knowledge. As the diagram shows, there's a woman, and if we can just connect her with the whole pool of people in tech, whether it's men or women.
So people can learn a lot from each other. It goes two ways: experienced folks, people already in the cloud native industry, can also learn from newbies. So there's a lot of learning and sharing that goes on there. And this eventually inspires more women, more diversity in tech. It inspires more people to enter this area; it makes it more approachable. So the goal of the Women in Cloud Native community is to bridge the gap and connect people. These are the core values: people in Women in Cloud Native build and ship together, celebrate each other's successes and accomplishments, and support each other. There could be support in various ways, which I'm going to talk about later. And how is this community run? Right now we have 2,000-plus members across different platforms. We have a group on CNCF Community Groups, and we also have a Discord group.
00:12:08 So if I combine them, there are 2,000-plus people. That's huge. And that is managed by community builders, and they're, I think, the backbone of the whole community, because they foster a supportive environment for women in the tech industry by organizing events and workshops that encourage learning and networking, and they're strong believers in advocacy for diversity and inclusion. So yeah, big thanks to them. These are the current community builders. We have Bhavani; she's also the lead organizer. She is really helpful, such an amazing and inspiring person. We have Sanita; she has a lot of security knowledge and she brings a security perspective. We have Amoga, and we have Rosaline. These are the folks who really help run this community. So yeah, we are a part of CNCF Community Groups. If you want to know more about what's happening in the Women in Cloud Native community, you can join this group.
00:13:13 So let's talk more about the community's work, what's happening in the Women in Cloud Native community, and how you can join or help us. We regularly host podcasts where we share inspiring stories and work. I do a lot of podcasts with amazing women, and I try to cover most of the leaders in the industry so that people get role models, or maybe someone to look up to. It's a way to make things more approachable, because to be honest, I feel the problem right now is a lack of knowing; people really don't know who is there at the top. So this podcast is really about sharing those inspiring stories, so that people can reach out to them if they really need help. We also host and participate in workshops. There are regular workshops on different topics, and we do it on Bevy, which is the platform provided by CNCF.
00:14:19 It's pretty much like Zoom; we use Zoom sometimes as well. The whole idea is to provide a space where people can be comfortable to discuss anything when we're doing the workshops and giving the demos. So this is it. And the coffee chats, this is my favorite, because I learned a lot through coffee chats. It's just a personalized space where you can mentor each other or share things with each other. You can help each other with any kind of bugs, or there could be different topics as well, like salary negotiations, developing leadership and soft skills, or how to climb the career ladder. There are different topics that you can discuss in coffee chats. So these are the things that are happening now in the Women in Cloud Native community. You can join us on the CNCF Slack.
00:15:16 There's also a Discord group. I think I forgot to put a link here, but you should definitely find it on social media, or maybe I'll share it later in the comments section. And you can join the CNCF community group; you can just join it and you'll get a lot of updates. If a workshop is happening, you'll get the updates and you can join the workshops, or you can stay connected for the latest updates on Twitter or maybe LinkedIn. Okay. So now come the future plans and initiatives. I'm super excited for this, because we are planning to launch a mentorship program. The whole idea is this one-on-one mentor-mentee thing, where we can discuss different things like leadership, or it could be any tech-related work, or maybe how to contribute to OSS or how to get started with a project.
00:16:17 Our initial ideas include launching with batches in spring and fall. I'm seeking individuals and volunteers to collaborate in developing the program. It would be really appreciated if someone wants to volunteer with me in designing and launching the program; that would be really, really helpful. So that is about the mentorship program. I think it's going to be really helpful and useful for people out there, so let me know if you are interested. Here's the QR code where you can participate. I'm just giving a pause; you can scan this QR code, and I'll also paste the link after my talk. If you want to contribute to the community, maybe as a volunteer, or maybe you want to be a guest on the podcast, or want to host a workshop, or maybe help me design the mentorship program, you can just let me know here. And if you have any feedback for the community, it would be really appreciated if you can just give that feedback. Awesome. So I think with this we come to an end. If you want to get in touch with me, you can scan this QR code, or reach out to me on Twitter, LinkedIn, anywhere. I'm super excited and looking for people to help me run this community. Yeah.
Speaker 1: 00:17:37 Awesome. Thank you so much, Nancy. Again, if anybody has questions, feel free to put them in the chat. Just to give a little background, I reached out to Nancy because with DoK, we're also very interested in having representation from all backgrounds and all different people. So I reached out to Nancy to collaborate with her and get more women involved in DoK, and that's why we invited her today to talk about it. There are lots of ways to get involved, as Nancy showed, so if you're interested, I recommend reaching out to Nancy to get involved with Women in Cloud Native. Like you said, the more diverse the set of opinions out there, the better off we all are. Cool. And then, in terms of the mentorship program, are you looking for all people in the cloud native space, or is it mostly developers, or different areas of the cloud native space?
Speaker 2: 00:18:43 Yeah, that's a good question. Thanks, Paul, for asking. I did mention this in the form as well, which I'm going to paste the link to again later. It could be in different areas. It could be tech, someone who can give technical guidance on any of the projects. It could be leadership as well, someone who can really help with leadership and soft skills. It could also be related to salary negotiations or anything; there are certain things which are required, so it's pretty diverse, as mentioned in the form. Or someone who can give guidance about the career ladder, or maybe conference speaking. So those are the different things which are included in it.
Speaker 1: 00:19:32 Okay, cool. Alright, well, spread the word; you heard it from Nancy and us. Encourage people to get involved in that as well. Alright, well, thank you so much, Nancy. Really appreciate your time. I think what you're doing is great, and we're excited to be collaborating with you; we'll keep trying to send people your way. Cool. Well, next up we have Jed Cunningham, who will be talking about Airflow and Kubernetes. So yeah, I'll hand things over to Jed. Thanks for joining us today as well.
Speaker 3: 00:20:12 Awesome. Yeah, thanks for having me here. Let me get screen share going. Cool. Are you all able to see that? I’ll take silence as a yes. Yeah.
Speaker 1: 00:20:29 Sorry, I muted myself. Yep, looks good.
Speaker 3: 00:20:31 Cool. Yeah, so hey, thanks for having me. I'm very excited to be here. I've been dealing with Airflow and Kubernetes for, I guess, a little more than five years now, so both these topics are very near and dear to my heart. A little bit about me: I'm a staff software engineer at Astronomer. I'm actually kind of the tech lead on our open source engineering team. We basically have a team of a dozen or so of us that work on Airflow full-time, for open source reasons. I'm a committer and PMC member in the Apache Airflow community as well. And yeah, as I mentioned, I work at Astronomer. We offer hosted versions of Airflow, and then we also have enterprise offerings for Airflow on-prem. So yeah. Cool. Today I really want to talk a little bit about Airflow in general, and give a quick overview of what Airflow is.
00:21:32 We'll talk about the specific integrations that we have with Kubernetes and running Airflow on Kubernetes, and, little spoiler alert, that leads into the Helm chart a bit. So yeah, let's get started. So what is Apache Airflow? If you go look at the website, one of the first little bits you'll see is that it's a platform to programmatically author, schedule, and monitor workflows. And that's a reasonable description of what it does. Airflow is really, really flexible, ultimately, and so the programmatic piece is really important. You can schedule and monitor anything, really. It doesn't have to be data workflows, although it is normally used for data purposes. And that flexibility means that we, being Airflow, kind of sit as the central nervous system of a lot of companies. The amount of integrations that we end up having makes it a really powerful tool, and so it ends up getting used for a lot of things.
00:22:37 We see a lot where one team starts using it for a small purpose, and then it starts to permeate across the organization as more teams say, that's cool, let's consolidate and use similar tools. So if we look at the official integrations that Airflow has, we call them providers, and those are packaged separately on PyPI. There are the big ones, the major cloud providers, there's Databricks, Kafka, and Spark, and then we have a Kubernetes provider, which is good for this audience, and we'll dig into that one specifically more here. But before we do, a little more general Airflow stuff. Airflow on PyPI is roughly in the 25 million downloads per month range. This has really ramped up over the last few years, which is great to see. We are really close, I don't know, as of a few days ago we hadn't hit it yet, but we're a couple dozen away from hitting 3,000 contributors to Airflow, which is amazing.
00:23:46 I am pretty sure we're in the lead by a pretty good margin in the Apache Software Foundation for individual contributors. We have 35,000 stars on GitHub and 45,000 members on our Slack, which is pretty active. Cool. So, programmatic: this is an example of what an Airflow DAG looks like, and this is kind of a weird mashup. We actually see some of the old style, where we're using operators, and then we also have the task decorator there, which is the new TaskFlow style of authoring DAGs. But yeah, basically you write your workflows, or whatever you want, in Python, and you can use the flexibility of Python to be really creative with the way that you build your workflows. There are good and bad aspects of that, but I think that flexibility is really powerful. And then, yeah, this is one quick screenshot of that DAG in the Airflow UI.
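The slide itself isn't reproduced here, but a minimal sketch of the two authoring styles Jed describes, a classic operator next to a TaskFlow-decorated task, could look something like this (the DAG id, schedule, and task bodies are illustrative, not the code from the slide):

```python
# Illustrative DAG mixing the classic operator style with the newer
# TaskFlow (decorator) style. Names and commands are made up.
from datetime import datetime

from airflow import DAG
from airflow.decorators import task
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="example_mixed_styles",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
):
    # Classic style: instantiate an operator
    extract = BashOperator(task_id="extract", bash_command="echo extracting")

    # TaskFlow style: a decorated Python function becomes a task
    @task
    def transform():
        print("transforming")

    # Set the dependency: extract runs before transform
    extract >> transform()
```

A file like this goes in the DAGs folder, and Airflow's scheduler picks it up and parses it; it isn't a standalone script.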
00:24:50 There are a ton of really cool features in the UI. We've evolved the UI a lot over the last couple of years, but this is kind of the high level: you can see what runs you have, you can dig into logs, you can see timing, and yeah, this is going to get better too, spoiler for later. Cool. So let's dig a little bit more into the Kubernetes provider specifically. This is the package name on PyPI if you wanted to install it. So what's in there? There are seven operators, one sensor, one hook, and a couple of executors, maybe. What the heck are these things? We'll talk about some of these a little more as we go, but this is actually one of the smaller providers. If you go look at some of the cloud vendor providers, for example, there are dozens of each of these. So the surface area here is small.
00:25:51 Sorry, went too far. All right. So if you go out and you Google, or rather you type in, airflow Kubernetes, you're going to end up seeing results for KubernetesPodOperator and KubernetesExecutor. And one of the very first questions people have is: which do I want to use? Do I need to use both? Do I have to use both? Let's dig into this question a little bit; as I said, it comes up constantly. So let's back up and ask, okay, what is an operator and an executor to begin with? And really, they're very different things. The way to think about operators is: what does this task need to do? This is an I'm-authoring-a-workflow question: I need to go out and do X, I need to go send an email, I need to go push a file to S3.
00:26:45 The executor is more about: how do I run any task? Nothing specific about that email I need to send, but how do I do any work in the system? This is more of an infrastructure and environment question. And especially in larger deployments of Airflow, the people worrying about which operator to use and which executor to use are different: one would be the DAG author and one would be the deployment manager. So yeah, let's dig into some operators a little bit here. There are some generic operators; PythonOperator and BashOperator are kind of the two canonical examples there, where you can really do anything you want with them. You don't get a lot out of the box: hey, I'm going to go run this Python callable, or hey, I'm going to go run this command line in Bash. Then you end up having that middle column, which is specific integrations with things.
00:27:48 So I mentioned S3 operations earlier; there's a ton of S3 operators that you can use. There are SQL things, email, there are Slack operators, you name it. There are 92, I think, different services that we have integrations with. It's a very valuable piece of Airflow, and it lets you be quick and have a standardized way to integrate with these things. Now, the KubernetesPodOperator, which I mentioned earlier, is more of a generic operator. Ultimately, the best way to think of it is that it's kind of like your docker run. Basically, I just need to go run a container. It generally isn't Airflow-specific things; you end up seeing a lot of things that are written in other languages, Go, for example. But yeah, KubernetesPodOperator is definitely more on the generic side.
00:28:47 Yeah. So let's talk about executors a little bit. Your executor choice kind of has a cult following in some ways, but if you sit down and look at it from a technical perspective, you kind of get to choose speed or isolation. So I put this little table together, and there are definitely some blurry lines here, but by and large it's fair to say that local executor, which will run a task in the scheduler itself as a separate process, is really fast, right? Airflow is already loaded in memory; it's not spinning up new Python environments or anything. It's very, very quick to go run a single task in that environment. Celery executor uses Celery, which communicates over, traditionally, Redis, but any message queue, I guess there are a few that are supported. And so that has a little more overhead, in that you have to put the message into the queue and wait for a worker to pick it up and run it. But in the grand scheme of things, the latency you're adding there is pretty low, so that's still pretty quick. Kubernetes executor, here, this is probably its biggest downside: the startup latency.
00:30:07 The scheduler has to build a pod and send it to Kubernetes. Kubernetes has to put it on a node. That node may have to pull the image. You have to wait for the container to start and for Python to start. Airflow is fairly large dependency-wise, so it isn't exactly the quickest to start up, and you pay that cost for every task with Kubernetes executor. So yeah, just from that perspective, a lot of it will depend on the type of tasks that you're running. If you have a ton of tasks, if you're turning over tasks very, very quickly, that extra latency of Kubernetes executor can have a meaningful impact on your overall workflow duration. Whereas if you have things that run longer, like an hour or something, the cost of five to ten seconds is really not material at that point. Excuse me.
00:31:03 There's also a bit of deployment complexity. Local executor is super easy; you don't have to do anything special at all. Celery executor, you do have to set up the queue and start up separate workers, but it's still pretty easy. Kubernetes executor, there's a little caveat here: the deployment manager takes on a lot of the complexity for you, and it does kind of limit, architecturally, where you can place different components like the Airflow database, because all these workers have to be able to talk to the Airflow database as well. And so yeah, as I said, there's a little bit of a caveat, because normally DAG authors don't have to deal with any of that. And especially if you're using a Helm chart, for example, all that complexity is really hidden from you anyway, and we'll see that in an example later.
00:31:57 So why would you ever want to use Kubernetes executor? Well, here's where it really shines: the isolation that you get and the ability to define custom resources. With both local executor and Celery executor, you are running more than just your task in that component. So say I'm running 16 tasks in a worker and the worker OOMs; how do I know which task was the culprit? I don't. With Kubernetes executor, you can be very specific and give resources dedicated to that task and that task alone. Therefore, if it does OOM, that task was the culprit, and not maybe the victim of a noisy neighbor. And then, yeah, custom resources: say you need a custom secret or volume mounted for specific tasks and you don't want it to be available to other tasks; you can very easily do that with Kubernetes executor.
00:33:00 And this is where it gets weird. There's what we've traditionally called hybrid executors, which are kind of a mashup of these. So we see CeleryKubernetesExecutor and LocalKubernetesExecutor here. With both of those, tasks will go to Celery by default and then opt into Kubernetes, or to local and then opt into Kubernetes. And those work pretty well. We don't see a huge adoption of those executors, but they do exist, and they let you be a little more flexible around which executor specific tasks use. So if you have a more mixed usage of Airflow in your environment, this can let you be a little smarter about how things are run. And then, kind of confusingly named, we have the hybrid executor concept coming in Airflow 2.10, which should be here in a month or so. This basically takes the concept of these two hybrid executors above, and it lets you run any combination and route tasks to any executor that you want. So you could conceivably have CeleryKubernetesExecutor, local executor, and a couple of AWS-specific ones, and use all of them in the same Airflow deployment if you wanted to. So that's a cool future that we're looking at in Airflow. If we zoom back out to the original question, though, of KPO versus Kubernetes executor: any combination works, really any combination of any executor and any operator, because they're solving different pieces of the puzzle. So yeah, don't force any associations there.
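As a concrete illustration of the opt-in behavior described above: with CeleryKubernetesExecutor in Airflow 2.x, routing is driven by the task's `queue` attribute; tasks default to Celery, and setting `queue` to the configured `kubernetes_queue` (by default `"kubernetes"`) sends that task to the Kubernetes executor instead. A hedged sketch, with made-up task ids and commands:

```python
# Illustrative routing with CeleryKubernetesExecutor (Airflow 2.x).
# Task ids and commands are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(dag_id="hybrid_routing_demo", start_date=datetime(2024, 1, 1), schedule=None):
    # No queue set: this task runs on a Celery worker (the default path).
    quick_task = BashOperator(task_id="quick_task", bash_command="echo quick")

    # queue matches the configured kubernetes_queue, so this task opts
    # into the Kubernetes executor and gets its own ephemeral pod.
    isolated_task = BashOperator(
        task_id="isolated_task",
        bash_command="echo heavy",
        queue="kubernetes",
    )
```

This lets fast, chatty tasks avoid the pod startup latency while heavyweight tasks still get full isolation.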
00:34:53 So let's look a little bit at KPO, and we're going to actually walk through some examples of how different executors will schedule this out; maybe that'll help cement what these things mean. So if we look at this code here, we have a very simple DAG with one task, and we have a task ID, and we have a name, which is the name of the pod that we want. We have the image and we have the commands we want to run. This is really complex: we're saying hi. This is the very bare bones of KubernetesPodOperator, and as I mentioned earlier, it's kind of like the docker run equivalent in Airflow. So let's walk through and see what this looks like with different executor choices. So let's imagine that we have all of Airflow running in Kubernetes already, and we go run that DAG, by default.
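A rough sketch of the kind of bare-bones KubernetesPodOperator DAG being described (the image, ids, and import path are illustrative; the provider's import path has moved between versions, so check the version you have installed):

```python
# Illustrative "docker run equivalent": a DAG whose single task runs
# one container that just says hi. All names are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

with DAG(dag_id="kpo_hello", start_date=datetime(2024, 1, 1), schedule=None):
    say_hi = KubernetesPodOperator(
        task_id="say_hi",     # the Airflow task id
        name="say-hi",        # the name of the ephemeral pod
        image="alpine:3",     # any container image
        cmds=["echo", "hi"],  # the command to run, docker-run style
    )
```

Note that nothing here ties the pod to a particular executor; the executor decides how the task itself runs, and KPO then launches the extra pod.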
00:35:52 What it'll do is go create a new ephemeral pod to run this container, and it'll run it in that same cluster, right alongside the other components of Airflow. Now, you're not restricted to that, though. You can actually run Airflow completely outside of Kubernetes, and the same thing can happen: you can run Airflow outside of Kubernetes and have this pod spin up in a Kubernetes cluster. You can even go to a different Kubernetes cluster if you want. So there's a lot of flexibility here. Now, Kubernetes executor is where it gets a little more interesting. If you remember back when we were talking about the trade-offs, basically every Airflow task gets a separate worker pod created for it. And so when we combine Kubernetes executor and KubernetesPodOperator together, we end up with actually two different ephemeral pods for that specific task.
00:37:01 So the scheduler will go out and create the worker, which then actually runs that KubernetesPodOperator task, which then goes out and creates another pod in the cluster. It may sound a little weird, why do we need two, but there are some features in Airflow that kind of require a babysitter, I guess, for the actual task pod. It can't run its own callbacks in Airflow, for example. So that worker exists to make sure that KPO is integrated well with the ecosystem that Airflow has. And what does that look like for a non-KPO task? Let's use the PythonOperator as an example. In this case, you do just get the one pod: one worker pod for that PythonOperator. Hopefully that's straightforward. Cool. So when we're deploying Airflow, how does the KubernetesExecutor know how to build these pods?
00:38:12 Well, the way it's done is with a pod template file. If you go way back into the Airflow 1 days, we actually tried to enumerate all of the features of pods in our config, which was painful. So now we have a pod template file: we consume it and use it as a base for the pods that the scheduler creates. As you can see here, it has very basic things. It has a container named base; that's the main container. Airflow will actually inject the right image to use, and normally you'll end up having a service account that's specific to workers in Airflow. You can mount config, you can mount whatever you want. One interesting thing that I left in here intentionally, so we could talk about it, is that it feels a little weird: if you look here, the executor is overridden via an environment variable. We're saying, hey, use the LocalExecutor when you start this worker.
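A pod template file like the one being described might look like this minimal sketch; the image tag and service account name are placeholders, and the environment variable at the end is the executor override just mentioned:

```yaml
# pod_template_file.yaml (sketch): the base pod the scheduler builds workers from.
apiVersion: v1
kind: Pod
metadata:
  name: placeholder-name        # Airflow replaces this per task
spec:
  serviceAccountName: airflow-worker
  containers:
    - name: base                # the main container must be named "base"
      image: apache/airflow:2.9.2   # Airflow can inject the right image
      env:
        # Forces the worker to run its task locally instead of trying to
        # spawn yet another pod (the "historical baggage" discussed below).
        - name: AIRFLOW__CORE__EXECUTOR
          value: LocalExecutor
```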
00:39:25 This is a little bit of historical baggage that Airflow has today. I'm hoping that with Airflow 3 that's coming, spoiler alert, we'll be able to clean this up. But this is a really important piece to make sure that the task doesn't end up in a weird nested loop of trying to create a pod again and again and again. It basically forces it to run locally in that worker. And that's not the only way. That's how a deployment manager would come in and say, hey, I want every worker pod to have X. Well, sometimes you need to be a little more specific. Remember I mentioned, hey, custom resources, custom volumes, what have you, per task? Well, how do you do that? This is how: it's executor_config, or executor_config pod_override more specifically.
00:40:24 And it allows you to build a pod object in Python and pass that into Airflow, and Airflow will take the base pod that it would create, layer this on top, and merge those. So this gives DAG authors the ability to be more specific for a specific task: hey, I need this. That's how you, as a DAG author, can customize your experience. Cool. Let's chat about the KubernetesPodOperator a little bit more. There's a lot of native support, and by native I mean keyword arguments, for things like resources, volumes, security contexts; there's a ton, whole multiple pages of options here. So more often than not, there's a keyword argument you can specify that feels natural for whatever you're trying to override.
00:41:34 But there's one feature that I want to call out specifically. If we come back to this example that I showed earlier, where we have those two pods, one the Airflow worker and one the actual task pod, it feels weird. Maybe there's a better way. And there is. Airflow has the concept of deferrable mode for operators, and not all of them support it, but if the job or task that's being run can be monitored with an async process, then we can use what's called the triggerer in Airflow, which is kind of a specialized worker. What it does is it lets the worker start up, and then you can do what we call defer: you can shut down the worker part of this task and kick the responsibility over to the triggerer, which really just runs an asyncio process, and it runs a bunch of asyncio code to wait and detect when the main job, or whatever it is that you're trying to watch, is done.
00:42:51 The KubernetesPodOperator is a really classic example of where this is beneficial: that worker otherwise would sit there essentially idle, doing nothing for the whole duration of the KPO pod. If we're able to start it up and kick it over to the triggerer, the worker can shut down, and the resources for that worker can go back into the Kubernetes cluster and be used for other things, even other tasks in Airflow. There's a really lightweight piece on the triggerer that is able to keep tabs on the pod it created. And then once that pod finishes, hey, we can kick back up, start another worker, and do whatever cleanup we may need to do. So in practice, what this looks like is: in that first box we start, we go create the pod, and then we shut down; then some time later we come back, and we create another ephemeral pod to do that cleanup.
00:43:51 So this is what you'd see in your cluster. Especially if you have long-running tasks, it's a really, really nice option to reduce resource usage. And honestly, it's a little more resilient too. Triggerers in Airflow are way more resilient to failure than distinct worker pods. So yeah, that's another benefit. Cool. KPO can also be customized even further. I mentioned there's a lot that's built in, but we definitely haven't covered every nook and cranny, so we do offer these options as well. And you may want to use these for easier standardization across tasks, too. Again, Airflow DAGs are Python, so you can do a lot of creative things to reduce boilerplate. And then I just want to mention the pod mutation hook; if you research Airflow and Kubernetes, the pod_mutation_hook is likely to come up as well. This is kind of the original way to customize pods that Airflow creates.
00:45:04 What's a little weird is that both KubernetesExecutor pods and KubernetesPodOperator pods get passed through this callback. So if there are things that you want to do for one and not the other, you're responsible for figuring out what type of pod it is and only doing it on the right pods. But this is kind of the big hammer: if none of the other configuration options work for you, this is how you can pull it off. Cool. We also have a KubernetesJobOperator, and you may be like, well, why would you have both a job operator and a pod operator? Feels a little weird. Well, it comes back to who handles retries and who's responsible for breaking up the work. The nice thing about the KubernetesPodOperator is that it is literally a pod.
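A minimal sketch of the hook: Airflow picks up a `pod_mutation_hook` function from an `airflow_local_settings.py` module on the PYTHONPATH and calls it with every pod it builds. The label used to tell the two pod types apart below is a hypothetical discriminator, since, as noted above, distinguishing them is up to you:

```python
# airflow_local_settings.py (sketch). Called for every pod Airflow creates,
# both KubernetesExecutor worker pods and KubernetesPodOperator task pods.
def pod_mutation_hook(pod):
    # Apply something to every pod Airflow launches.
    pod.metadata.labels = pod.metadata.labels or {}
    pod.metadata.labels["team"] = "data-platform"

    # Hypothetical discriminator: only mutate pods the deployment has
    # marked with a "kubernetes_pod_operator" label.
    if pod.metadata.labels.get("kubernetes_pod_operator") == "True":
        pod.spec.node_selector = {"pool": "batch"}
```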
00:46:09 And so if it fails, cool, that kicks us back into Airflow's ecosystem to deal with retries: how many do you have left, that sort of thing. And then also, Airflow has different tooling, I guess, for things like parallelism and completions. So if you're trying to plug Airflow into an existing Job-based process, the KubernetesJobOperator is a good answer. It's kind of pick-your-journey here: do you like the Job approach in Kubernetes, or do you like Airflow's answer to some of these problems? And then there are a couple of sibling operators here. The KubernetesJobOperator will go out and create the Job, but you also have a delete job operator and a patch job operator that are kind of in the same bubble. Cool. And then, last but not least, I guess we'll see, is the SparkKubernetesOperator.
00:47:17 I'll admit I don't have as much experience with this one, but it's basically an integration to make using the Kubeflow Spark Operator more native in Airflow; it'll actually go out and create the SparkApplication CRD for you. As I said, it's one that I don't have a whole lot of experience with, but it does get some traction in the community. So if you are using the Kubeflow Spark Operator, this is a good option. Cool. So let's say you're sold. You're like, okay, I'm down, let's go. How do I run Airflow? Well, you could always go out and write your own Kubernetes manifests to start these things on Kubernetes. We're talking a handful of components; it's not that complex. And hey, you could also take that a step further and write your own Helm chart. If you haven't done that exercise, it's a good one.
00:48:22 I guarantee you'll learn some stuff. Or you could use some existing charts in the ecosystem. So say you want the easy path and go down the charts route. There are really three big options today: there's the Bitnami chart; there's the user community chart, which is the chart that used to be in Helm stable before Helm stable died; and then we have an official community chart. Full disclosure, I'm the release manager for the official chart, so I'm a bit biased as to which one you should choose, but ultimately the official chart is released by the same group of people, the same release managers, as the core Airflow releases. So yeah, that's a positive; we do a pretty good job of making sure that things are compatible. We have over 170 contributors just to the chart, which is amazing.
00:49:30 It's been really cool reviewing PRs and seeing new faces swing through, and it's been a really rewarding experience to see this grow from a dozen of us poking at it to, yeah, now 170. Our chart is definitely production focused. We don't support some features that other charts might, the more development-environment-centric ones, so some of the design choices are viewed through the lens of trying to be resilient in production. We support Airflow 2.0 plus, and I'll mention it, I didn't want to write it in the slides, but it does still technically support Airflow 1.10; if you're still using Airflow 1.10, though, you should really upgrade. It supports basically all of the executors, certainly all of the ones that you'd want to use in production. We also have a JSON schema for our values, which is a nice way to make sure that what you're trying to configure will actually work, and that you're not off in the weeds configuring things that don't exist, or what have you.
00:50:44 The Helm chart also works with Argo CD, Flux, Rancher, Terraform. Yeah, we try to be open to alternate deployment methods if you aren't actually using Helm itself. And there's a ton of features; I mean, we have git-sync built in and all kinds of cool things. But we haven't covered everything, so if there are things that we haven't covered, we're definitely open to PRs, or even feature requests, to make it a better experience. And it is really easy to get started. Ultimately it's, hey, add our repo and off to the races: run your helm install, set the KubernetesExecutor, and you're off. That'll spin up all the things that you need. We have our database, although you wouldn't want to use that piece for production; you'd want to swap that out. We have a whole production guide on the things you'd want to change from this bare get-started install to make it more production ready.
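Getting started with the official chart can be sketched as a couple of commands; the namespace and release name here are arbitrary, and as noted above, the bundled database is not meant for production:

```shell
# Add the official Apache Airflow chart repo and install with the
# KubernetesExecutor (see the chart's production guide before going live).
helm repo add apache-airflow https://airflow.apache.org
helm repo update
helm install airflow apache-airflow/airflow \
  --namespace airflow --create-namespace \
  --set executor=KubernetesExecutor
```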
00:51:54 But yeah, this is a working Airflow. Cool. And then, back to the KubernetesExecutor side of things: a lot of the things that you'd want to configure on a worker by default are exposed through the values file in the chart. So you can, for example, set the default resources for workers in the values file; you can do tolerations, all that stuff. And if you connect the dots, what this ends up doing when you set these is it goes and sets the appropriate stuff in the pod template file; the chart builds the pod template file for you. Or, of course, you can always override it in the values file if you want to. Cool. Let's talk about Airflow 3 a little bit. These are really high-level things. We are ramping up on Airflow 3 development; there are active conversations going on all over the place.
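In the official chart's values file, the worker defaults being described live under `workers`; a sketch with purely illustrative values:

```yaml
# values.yaml (sketch): defaults the chart feeds into the generated
# pod template file for KubernetesExecutor workers.
workers:
  resources:
    requests:
      cpu: "1"
      memory: 2Gi
  tolerations:
    - key: dedicated
      operator: Equal
      value: airflow
      effect: NoSchedule
```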
00:53:01 We're targeting next March or April for release, so it's a shorter window; we're trying to really get it out quickly. But at a high level, we have a new task execution interface coming. What that really means is we're removing the need for Airflow workers to have direct database access, so it provides much better isolation for tasks in an Airflow instance. It also opens the door for things like non-Python tasks, like, say, having a task that's written in Go. And it opens the door for remote workers: running workers that are very, very separate, infrastructure-wise, from the rest of your Airflow components. We're going to have further data-awareness features like partitioning and polling and versioning on datasets and things, so a lot more of continuing on that journey of making Airflow more data aware.
00:54:06 I'm really excited about that one. And then, yeah, we're solving a longstanding shortcoming of Airflow by adding DAG versioning. Gone are the days of losing all visibility into what your DAG looked like in the past. So DAG versioning is coming, and of these three, it's the one that I'm specifically working on, so I'm really looking forward to getting that into Airflow 3. Cool. Last thing here, I just want to call out that we're having Airflow Summit in San Francisco in September. We have all kinds of tracks, from a beginner track all the way up to dealing with very complex use cases, and we're celebrating the 10-year anniversary of Airflow being an open source project while we're there. So if you're in San Francisco or are able to come, I'd love to meet you all there. Me and my team will be out there in force, so if you do make it out, find me, come say hi. And with that, I think that's really all I had. I realized as I was driving over here today that I probably should join the Data on Kubernetes Slack and be available there for questions. But in the Apache Airflow Slack, these are the channels that I hang out in, and you're welcome to DM me or whatever. So yeah, I think that's all I got.
Speaker 1: 00:55:42 Alright, thanks Jed. I'll send you an invite to the DoK Slack channel. I was going to ask you where people can go, but you answered my question: if you have questions about Airflow and Kubernetes, you can join those channels on the Airflow Slack. Yeah, I'll send you an invite so we can also make you accessible on our channels. If people have questions, feel free to add them, but I had a question that struck me when you were talking: you have almost 3,000 contributors. I know there's a high barrier to entry in contributing to these projects. What do you attribute the high number of contributors to?
Speaker 3: 00:56:32 Yeah, that's a really good question. I think part of it is just that the wider usage of Airflow is really massive, right? Airflow is used for all kinds of different things, so I think we just have a relatively bigger pool of users. And to really use Airflow, you have to have at least some level of Python skills, generally speaking, to write these workflows. So I think there's maybe a little more overlap with people who are comfortable with at least basic programming and are able to come in and actually author PRs. I'll also say that we try very hard to focus on being very welcoming to new contributors.
00:57:24 I try to be very, very helpful with people who are new. For us, it's very easy to see when it's someone's first PR, and I try to help them with little things that make their journey a little smoother, like getting static checks, or sorry, pre-commit, running locally, so that they can fix little static-check issues without having to wait on CI. So I think the maintainers' communication with new folks helps, and we even have a Slack channel dedicated to helping people who are starting down their contributing journey. That helps too. We also try really hard to be welcoming to any contribution. A lot of people got their start with a minor docs PR, and that's totally cool. We are more than happy to merge a typo fix or something that's rewording just a little bit of the docs. We are absolutely open to that level of contribution as well.
Speaker 1: 00:58:24 Yeah, that's awesome. I mean, from the very few pull requests I've submitted to some projects, it can definitely be intimidating; you don't necessarily know if you're doing things right. So having people to guide you through it seems like a really useful thing to do. Well, we're running up against time, I guess. Thank you both, Nancy and Jed, for your presentations; they were super useful. Of course, we'll post this on YouTube, so anybody can watch it later. But we'll try to get through our quiz. If you scan the QR code there with your phone, it's a pretty simple quiz, just some questions based off these presentations. If you want to join, scan the QR code and we'll run through it real quick. The winner can contact me on Slack or whatever, because we don't necessarily see everybody's names, and then we can send the winner a DOK shirt. So we'll let people scan that. Hygenists is in control of the quiz. Alright, here we go. This one, we'll see who was paying attention. Oh, we have one player; we need another player. Otherwise player one wins the shirt.
01:00:04 Well, I think if Nancy plays, she's going to do very well. Well, I don't know if we have... okay, we have one player back. Maybe... okay, Dagens, do we have any other players?
Speaker 4: 01:00:32 That’s all we’ve got for now.
Speaker 1: 01:00:34 Alright, I say we go with the one player; you're going to win the shirt. Okay, regardless, I say we go for it. Oh, two players? Yes. Okay. Oh, what happened to question one? Question one: in what year did Nancy launch Women in Cloud Native? Is it 2020, 2021, or 2022? Okay, everyone's voted. The quicker you vote, the more points you get. Yes. Okay, everybody got it right; people are paying attention. Looks like Nemo answered just slightly before. Okay, here's the next question. What does Nancy like to do outside of work? We've got skydiving and rock climbing; extreme cooking and eating; cats and traveling; or stamp collecting and cartography. Alright, cats and traveling. I thought it was going to be extreme sports. Oh, whoa, we got some new people. Awesome. You can still come from behind; you just have to answer quickly, latecomers. Okay, what services do Women in Cloud Native provide to the cloud native community? We've got workshops, podcasts, coffee chats, or all of the above. Alright, all of the above for the win, all those services. So that's another reason why you should follow Women in Cloud Native. And then we have one Airflow question here.
01:02:59 Okay. What is the primary purpose of Apache Airflow? We wrote this question before you spoke, Jed, so we hope it aligns with what you said.
01:03:17 Alright, it seems like everybody got it right. Nice. Okay, we'll find out who the winner is. I don't see the name, but it looks like... is that the official Bell? Yep, Bell was the winner. Okay, if you are Bell, reach out to me; you can do it on Slack, you can email me, find me on LinkedIn, whatever, and we'll send you a DOK shirt. Yeah, thanks for participating, and we'll call this the end of the town hall. Again, thank you Nancy, thank you Jed. We really appreciate you giving us your time. This is super useful for the community, and we will talk to you later. Alright, thanks everybody for joining. Have a good day.
Data on Kubernetes Community resources
- Check out our Meetup page to catch an upcoming event
- Engage with us on Slack
- Find DoK resources
- Read DoK reports
- Become a community sponsor