Attend Data on Kubernetes Day at KubeCon North America on Nov 12th

Register Now!

Empowering developers with easy, scalable stream processing on K8s

In this Town Hall, Intuit shared insights on Numaflow, an open-source Kubernetes-native stream processing framework developed to address the challenges of real-time data analytics. The presentation highlighted how Numaflow enables easy, cost-efficient, and resilient stream processing without the steep learning curve or operational intensity of existing solutions. It showed Intuit’s successful implementation of Numaflow, which processes billions of messages daily across hundreds of Kubernetes clusters, supporting various applications from analytics to machine learning predictions and anomaly detection.

Speakers:

  • Sri Yayi – Senior Product Manager, Intuit
  • Vigith Maurice – Principal Engineer, Intuit

Watch the Replay

Read the Transcript

Speaker 1: 00:03 All right, thanks everybody for joining us for another DOKC Town Hall. I’m Paul, Head of Community for the Data on Kubernetes Community. Today we have a really awesome talk with Sri Yayi and Vigith Maurice from Intuit. Today’s talk is Empowering Developers with Easy, Scalable Stream Processing on Kubernetes. Before I hand things over to them to give their presentation, I will of course go over all the community announcements that we have. So first of all, I wanted to thank our community sponsors: our gold sponsors, Google Cloud and Percona, and then we also have our silver sponsors and our community collaborators. We actually have a new community collaborator, which I’ll talk about shortly, but thank you to all our sponsors; their support allows us to put on these live streams, run other sorts of events, and help run the community.

00:52 So thanks a lot. As I mentioned, we actually have a new community collaborator. If you remember, last week we had Deepthi Sigireddi, who’s the maintainer and project lead for Vitess. Vitess has joined Rook and Apache YuniKorn as our newest Data on Kubernetes Community collaborator. The point of that program is to bring open source projects in the DOK space into the community to help foster collaboration. So welcome, Vitess, to the Community Collaborator program. Also, just a quick reminder: if you’re heading to KubeCon, the schedule is out for DOK Day, which is on November 12th. Make sure you get the all-access pass if you want to attend.

01:41 The all-access pass prices won’t go up until November 9th, so you have a bit of time to buy a pass if you’re going. I also want to thank Percona again for being the gold sponsor of the event. That sponsorship is really important, and it allows us to keep being brought back as a co-located event at KubeCon. KubeCon North America hasn’t happened yet, but KubeCon EU 2025 is in London on April 1st through the 4th. The CFP for KubeCon is open until Sunday, November 24th at 11:59 pm GMT, so get those proposals in. This is for KubeCon, the greater event. If you scan the QR code, that will get you a link to the CFP page where you can learn how to submit your proposals. Some upcoming events for Data on Kubernetes: next week we’ll have Viktor Gamov, who will be talking about scaling real-time analytics with Apache Pinot on Kubernetes.

02:44 And then at our town hall next month, we’re going to have Michael Levan, who will be talking about AI and ML on Kubernetes for the absolute beginner, so don’t miss out on that. We’ll also have Neylson Crepalde, who will be talking about his new book, Big Data on Kubernetes. If you scan the QR code on the right, that’ll take you to our meetup page. I guess if you’re here today, you probably already know about our meetups, but you can register for our meetup group and for the individual events, so check that out. And of course, if you want to get involved in the community, scan the QR code on the left; that’ll take you to the community page, which tells you how to get involved and gives you links to all the channels you can work in. If you’re interested in becoming a sponsor of the community, scan the QR code on the right and that will give you some information. So that is it for me. Before I hand things over: if you have questions for Vigith and Sri, just put them in the chat and we will pass them along; most likely we’ll get those questions answered at the end. I will now hand things over to Sri and Vigith.

Speaker 2: 03:56 Awesome, thanks Paul, and thanks to this great community for giving us the opportunity to share our open source project and how we are doing scalable stream processing at Intuit. Let me share my screen. I think you can see my screen, right?

Speaker 1: 04:20 Yeah, yeah, I can see it.

Speaker 2: 04:22 Perfect. Awesome. Hi everyone, I’m Sri. I’m a product manager on the Intuit developer platform team, and Vigith is here along with me. We are going to share how we are empowering developers to do scalable stream processing on Kubernetes using an open source project called Numaflow. At a very high level, here is the agenda for our conversation: we’ll quickly introduce what we do at Intuit as a platform team, discuss stream processing and the challenges that developers are facing today, explain how we solved them using Numaflow, and then do a demo and show how the community is also using the open source project, covering the latest and greatest use cases. As the majority of you might already be aware, Intuit is a global FinTech platform company with product offerings like TurboTax, QuickBooks, Mailchimp and Credit Karma.

05:24 And if you look at some of the numbers down there, we are a completely modern SaaS platform that runs on Kubernetes; all of our services are on Kubernetes. You can see the scale at which we operate, whether in terms of machine learning, the amount of money that we process on a yearly basis, or the number of connectors that we have with various financial institutions across the world. And we are big believers in open source. If you are in the Kubernetes ecosystem, the majority of you might already be aware of the Argo project, which was incubated at Intuit and then open sourced to the broader ecosystem. Some of our learnings from Argo Workflows helped us start this new journey with another project called Numaflow. The use cases we have learned from, and how we went about solving them, are what we are going to share during this presentation.

06:15 With that, let’s dive deeper into today’s topic, which is stream processing. What do I mean by stream processing? Essentially, as a developer, a data engineer, or an ML engineer, what you try to do is act upon events as they come in. It could be for different kinds of use cases: real-time analytics, event-driven applications, or predictions and inference on data that is arriving in real time. So you’re trying to act on these events as they come in, and today we are going to focus on stream processing that is truly real time: as and when the events come in, you act upon them.

06:56 So that’s a quick introduction of stream processing where you have a stream of inputs that you’re receiving, you are doing processing on that data, and then you’re sending this data to process data into downstream. It could be like databases for analytics reasons or for dashboards or it could be a number of use cases, and that’s a very high level introduction of stream processing, but we can always spend a lot of time in unpacking what entails into stream processing, what are the challenges and all of that. So we are going to cover some of those. Now let’s take some of these examples that we see on a day-to-day basis. For example, if you are in the e-commerce ecosystem, you would want to process and understand how many orders were done on real time. Let’s say in the past one minute, how many orders were placed or probably you want to send out recommendations to your users based upon the various actions that they’re taking within the product.

07:52 Another use case is in the industrial ecosystem, where you receive data from different kinds of IoT sensors, process that data, and then take certain actions, say preventive maintenance or some kind of shutdown protocols that you would want to execute. And the third one, which I’ve already mentioned, is real-time analytics. This can be simple pricing-related real-time insights: for example, understanding how many orders have been placed through Uber Eats or DoorDash so that you can do dynamic pricing and demand forecasting, making sure there are Dashers available to deliver the orders, and things like that.

08:40 And lastly, you see a lot of fraud detection use cases, especially in the FinTech world that we operate in. When a transaction is made, we want to make sure that it’s a valid transaction and not a fraudulent one. All of this needs to happen in real time so that there is no impact on the customers who are actually using our products. So in short, you can see a diverse set of use cases, from e-commerce to FinTech to IoT to general real-time analytics. Real-time stream processing use cases come up across the board, and they are the backbone of our existing ecosystem and of the modern products we are trying to develop. So let’s take a look at what the architecture looks like when you’re trying to build these stream processing systems.

09:36 We have discussed different use cases, whether e-commerce, financial, or IoT. All of the events you receive, such as purchase orders, payment events, or IoT sensor data, can come through different kinds of sources. Kafka, more often than not, but also Pulsar, SQS, SNS and other kinds of message or event streaming buses are what people generally tend to use as sources to receive events. And once you receive these events, you want to act on them. The processing could mean acting on each event individually, or doing some kind of aggregation; it can be any number of things, from simple transformations to predictions and aggregations. And once you process this data, of course you would want to send it downstream.

10:28 The downstream could again be databases or event streaming systems or any of those, as we discussed. So in short, you need sources where you receive the data, then you process the data, and then you send whatever you have processed to your downstream systems so that other systems can consume and act upon those events as well. But let’s try to understand the challenges we started seeing, specifically at Intuit but also generally across the broader ecosystem, with respect to stream processing. The first and foremost is that real-time event processing is challenging, especially for folks who are not in the world of data engineering. If you are a data engineer, you might be writing tons of event processing or real-time stream processing jobs on a day-to-day basis.

11:28 But if you try to unpack the existing ecosystem, it’s heavy on Java frameworks, and you also need to learn the concepts of real-time stream processing. That is one of the biggest pain points. Say you are an ML engineer or an application developer, but you have a use case that requires real-time event or stream processing: you would have to move out of your existing ecosystem. If you’re a Python or Go developer, you have to move out of your comfort zone and learn a new platform altogether in order to build these jobs. That is one of the most common pain points we have heard from our internal developers. The second is complex integration: in the architecture diagram you saw earlier, there are a lot of event sources, from Kafka to Pulsar to SQS and more.

12:24 It’s not just about connecting to multiple event sources. You would also have to think about the operations and maintenance anytime any changes that happen. So connecting to multiple sources and syncs is not something that every developer or ML engineer or a data engineer are super excited about and they’ll have to write a lot of boilerplate code to start with. So that is another pain point. And then lastly, there are scaling complexities. When we think about scaling complexities, the first thing that I want to iterate upon is that at Intuit we are completely a Kubernetes ecosystem, but when you think of Kubernetes, it’s more in the context of a microservices, a single entity that will serve like HTP request and all of that, but they may not. Kubernetes as such is not kind of more focused on the data movement and the data processing kind of like the ecosystem.

13:13 So there are different kinds of scaling challenges that you end up with from a platform team perspective. In simple terms, if you think about autoscaling in the Kubernetes ecosystem, ideally you would use the HPA, which relies on things like memory utilization, CPU and so on. But for event processing systems, you want to consider factors like: how many events am I receiving from my upstream event sources? At what rate am I processing them? And once I write these events downstream, probably to a DB or whatever it may be, I cannot scale beyond a particular point. So you have to consider multiple factors before you scale your event consumers. In short, combining all of these pain points, developers like ML engineers and data engineers struggling with complex data engineering frameworks, being unable to keep using the languages they are comfortable with, plus the operational burden of scaling, made us think about and build this new project called Numaflow.
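To make that scaling decision concrete: it has to weigh the upstream backlog, the per-pod processing rate, and the downstream capacity, rather than CPU or memory alone. Below is a deliberately simplified, hypothetical sketch of that kind of calculation; it is illustrative only, not Numaflow’s actual autoscaling algorithm.

```python
import math

def desired_replicas(pending_events: int, rate_per_pod: float,
                     downstream_capacity: float, max_replicas: int) -> int:
    """Illustrative only: enough pods to drain the backlog, but never more
    than the downstream system (e.g. a DB connection pool) can absorb."""
    if pending_events == 0:
        return 0  # scale to zero when there is no traffic
    # Pods needed to keep up with the current backlog.
    needed = math.ceil(pending_events / rate_per_pod)
    # Pods the downstream sink can actually keep busy.
    downstream_cap = max(1, math.floor(downstream_capacity / rate_per_pod))
    return min(needed, downstream_cap, max_replicas)

# 10,000 pending events, 500 events/s per pod, sink absorbs 2,000 events/s:
print(desired_replicas(10_000, 500, 2_000, 50))  # -> 4, capped by the sink
```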

14:32 With that, I will hand it over to Vigith, who is right next to me. He’ll share a lot more about the Numaflow platform, what it entails, and how you can go about writing stream processing jobs. We’ll also share a demo after Vigith introduces the Numaflow platform. With that, over to you.

Speaker 3: 14:53 Thank you, Sri. So let me get into the weeds of what Numaflow is about. When we designed Numaflow as an open source solution, we wanted to design it as a serverless platform. This is because developers tend to do better when the infrastructure is managed for them; it improves their productivity so they can focus on what they love doing best, which is writing the business logic or the code that really matters. For ML engineers, that is machine learning, and so on. So first and foremost, we had to abstract the infrastructure. That means users do not even have to talk about the autoscaling needs they have. We wrote controllers to extend Kubernetes to understand streaming philosophies: it is no longer a simple health check on an endpoint, but rather understanding back pressure and scaling accordingly.

15:55 Second, as things evolve, we use multiple unbounded data streams. For example, we use Apache Pulsar, and Kafka is one of the major backbones of our eventing; we also use things like SQS, Amazon’s queue, and so forth. We wanted users not to have to worry about where the data is coming from, because they really don’t care where it comes from; they’re interested in the data, the payload. So we had to decouple it, so they can just focus. This means you write the code and deploy it, somewhat like AWS Lambda: you just write the code and say, invoke this piece of code, whether through an API gateway call or an SNS event. And lastly, we wanted to make it scalable and reliable, because these are containers that depend on other systems. We wanted to make sure we could do Kubernetes deployments with zero downtime, and that zero-day vulnerabilities are patched as soon as possible; for this, Numaflow actually comes with its own control plane called Numaplane, and that’s how we do it. The key thing is that from a user’s perspective, all of this is given to them. They don’t have to worry about the infrastructure or the sources, nor about the security and scalability aspects.

17:21 With that, let me walk you through what the event processing graph looks like. It all starts with an unbounded source. This could be Kafka, Apache Pulsar, you name it; you can even write your own source. Underneath, Numaflow does much more than just read from it: it tracks event times, watermarks and so forth for completeness properties. It starts with the stream, and once the data is read in, it’s passed over to the user container, the user-defined function. All the user has to do is read the message handed to them, and they can emit zero, one, or many messages: zero meaning they basically filter the data out, say if it is not valid or not formatted correctly; one meaning a one-to-one translation, where for every input you write one output, to send to analytics for example; or it could be one to many.

18:16 Those are enrichment or write-amplification use cases. Once you have processed the message, you can make a runtime decision about where to send it: should it go down path A, or path B, or both? There could be many paths to choose from. This is very powerful for A/B testing in ML, and also for choosing which ML model to apply: for example, where you have low-density data you may want a sparsity model, versus cases where you want to run a model placed on a GPU. All the circles you see in the graph can be independently placed on different node types; one UDF could be running on a GPU, for instance. These features let users choose at runtime which path a message should take. Lastly, it all ends up in a sink, the stage where you either write back to a stream or persist into your database. Numaflow guarantees that any data that comes in at the source is reliably forwarded, with exactly-once forwarding semantics, all the way to the sink. No data will ever be lost or corrupted; that’s the guarantee we provide to developers. Now that you’ve seen the graph, let me walk you through what it takes to write a serverless eventing function. This is a Python example, as noted by the symbol at the bottom right.

19:51 If you look at the signature, we give them a datum; its type is an interface, and all the function needs to do is respond by giving back messages. It does not talk about Kafka or anything; you’re just given a message, and this is the function we execute for every message we get on the stream. First, per the usual design principles, you create the envelope to return the results in, because there could be zero, one, or many. Then you call datum.value to get the data itself. There are other things around it, meaning you can get headers and metadata like the timestamp, but datum.value is where the real payload comes in. Then you call your processing function. This could run for a while.

20:37 In an ML system it could run for minutes; in some cases it’ll be simple millisecond or microsecond operations. Then you push the result to the list via messages.append so the message can be forwarded, and you return the messages. No part of this code talks about where the message is coming from or what to do if there is a transient network error; the platform takes care of everything. It guarantees that for every message you receive, we’ll execute this function without fail, and no message is lost. Even if, after we return the messages, there is a pod migration going on and the next vertex is not fully running because of a cluster upgrade, we guarantee that this message goes through. Now let me quickly show what I mean by dynamic routing. This is the highlighted idea: when you append a message, you can annotate it, tagging it as left, meaning take the left path; tagging it as right, meaning take the right path; or marking it as drop, meaning drop the message. By default it is simply forwarded. Those are the simple semantics of a UDF function.
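As a rough sketch of what such a handler looks like, here is a minimal map UDF using the open-source pynumaflow SDK; the module and class names follow recent SDK versions and may differ in yours, and the business logic is a stand-in.

```python
from pynumaflow.mapper import Datum, Message, Messages, MapServer

def handler(keys: list[str], datum: Datum) -> Messages:
    messages = Messages()          # the envelope: zero, one, or many results
    payload = datum.value          # raw payload bytes; event time etc. are metadata
    if not payload:
        messages.append(Message.to_drop())  # zero: filter invalid events out
        return messages
    result = payload.decode().upper().encode()  # stand-in for real processing
    # Tags drive conditional forwarding: only edges matching "left" get this.
    messages.append(Message(result, tags=["left"]))
    return messages

if __name__ == "__main__":
    # The platform invokes this gRPC server for every message on the stream.
    MapServer(handler).start()
```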

21:45 With these simple UDF semantics you can build complex trees. For example, in this case you have multiple sources writing to a single sink after being processed by different UDFs. You could do joins in some cases. You could have cycles if you need them, for iterative programming; this is where you enrich, and then keep enriching until the enrichment is good enough. And lastly, you can even broadcast changes to the system using something called a side input. If you think about it, the source never dries up; it’s an unbounded source, so these are fire-and-forget pipelines, and we need to give users some way to feed information in so the pipeline can adapt itself to the changing needs of the environment. Now, I talked about UDFs; let me show how easy it is to write your own source. As platform developers, we cannot write all the sources out there.

22:37 There are many sources and many needs. If you remember the earlier slide, it showed a lot of sources, and each company has its own way of reading, even from Kafka: there may be application-level encryption and policies applied to a Kafka topic. It’s not a free-for-all read, because there is PII data and things like that. So we needed a way to abstract the source, and this is an example in Rust, which shows that we are multi-language to an extent. Whoever implements this interface can be a source: you should be able to read, you should be able to ack, you should tell us how much is pending (this is for autoscaling purposes), and lastly how many partitions you have (this is for advanced semantics, like watermark completeness and so forth). Whoever implements this interface can be a source vertex.
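The talk shows this interface in Rust; as a rough Python rendering of the same contract (the four capabilities named above, not the literal SDK signatures), it amounts to something like:

```python
from abc import ABC, abstractmethod

class UserDefinedSource(ABC):
    """Rough shape of the contract a custom source fulfils, per the talk;
    method names and signatures here are illustrative, not the exact SDK."""

    @abstractmethod
    def read(self, count: int, timeout_ms: int) -> list[bytes]:
        """Fetch up to `count` messages from the upstream system."""

    @abstractmethod
    def ack(self, offsets: list[bytes]) -> None:
        """Confirm delivered messages so they are never redelivered."""

    @abstractmethod
    def pending(self) -> int:
        """Report the backlog size; the platform autoscales on this."""

    @abstractmethod
    def partitions(self) -> list[int]:
        """Report partition ids, used for watermark/completeness tracking."""
```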

23:22 The same goes for the sink. It should be very easy to write sinks, because we cannot implement everything out there; every morning when I wake up, I see a new database. The platform should be adaptable, letting users write sinks the way they need, and in fact most of our users use custom sinks rather than the built-in ones, because they have very custom requirements on how they write and how data validations happen. This is a Java example: you just need to extend a single interface, and you’re given an iterator instead of one message at a time. We give an iterator because in most sinks it is better to write batches for better throughput. All you need to do is iterate with next, which gives you the next element, and return a response saying either all are successful or a few of them have failed. Anything that has failed, we will retry; that is the guarantee the platform provides.
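The slide’s example is Java; the same batch-iterator shape in Python, via the pynumaflow SDK (names per recent SDK versions; write_to_store is a hypothetical helper), looks roughly like:

```python
from typing import Iterator
from pynumaflow.sinker import Datum, Response, Responses, SinkServer

def write_to_store(payload: bytes) -> None:
    ...  # hypothetical: write to your database or stream of choice

def sink_handler(datums: Iterator[Datum]) -> Responses:
    responses = Responses()
    for msg in datums:             # iterate the batch for better throughput
        try:
            write_to_store(msg.value)
            responses.append(Response.as_success(msg.id))
        except Exception as err:
            # Anything marked failed is retried by the platform.
            responses.append(Response.as_failure(msg.id, str(err)))
    return responses

if __name__ == "__main__":
    SinkServer(sink_handler).start()
```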

24:12 This is a very high-level architecture. I wouldn’t do it justice by talking about it for just a minute, but I’ll give you a very rough idea. The theory behind it is that it’s very difficult to get persistent, stateful systems in Kubernetes alongside streaming semantics. The way to read this high-level architecture: first, there’s a green line at the very top showing data movement. Data moves from left to right 99 percent of the time, except when there are cycles, but even then it’s mostly pushed left to right. There’s also a pressure that builds against the data movement: that’s back pressure, and it helps us understand how to autoscale the system. Inside the architecture diagram itself there’s a tight loop of read, write, and ack: we read from the source, do some bookkeeping work, and write to NATS through a Raft protocol so that the message is never lost.

25:17 It goes through a three-way Raft, and the next vertex picks it up the moment the write happens. Once the write is successful, we acknowledge back, saying this message has been processed. This tight read-write-ack loop is what moves data forward naturally. And lastly, I put in a blue line to show the watermark. The watermark is a completeness property, which we provide to guarantee there is no in-flight data older than this point. It is very powerful when you’re doing reduce semantics; it helps users know when to close a window when they’re doing, say, a one-minute aggregation. This is a distributed system, and there will be some slow vertices and slow pods, so we need to track the completeness property. Now that I’ve covered this, let me summarize what Numaflow is.

26:08 Numaflow is a Kubernetes-native serverless platform for running scalable, reliable, event-driven applications. It is made possible through three pillars. The first is a Kubernetes-native processing system: very lightweight and very versatile; the same code can run on edge or on cloud. Sri will talk about some companies using Numaflow in production, and you’ll get a flavor of what we mean by versatile. Second, it is language agnostic, meaning you can write in Java, Python, Go, and Rust, and it’s easy to add new SDKs. We also have a couple of built-in sources and sinks, which are quite popular, but as you saw earlier, you can easily write your own source and sink. Lastly, scalability: we can autoscale from zero to many and from many back to zero. This is quite important because these are fire-and-forget systems; we need to make sure that when there is no traffic, we can autoscale down to reduce cost. And it is very capable of running on a small resource footprint, including on edge systems where there is no internet connection. With this, I’ll give it back to Sri for a demo.

Speaker 2: 27:18 Awesome, thanks Vigith for the quick introduction of what Numaflow is. Now let’s take a quick example and try to understand what it actually takes to write a real-time event processing pipeline with Numaflow. For this simple demo I’m picking sentiment analysis, because that is something anyone and everyone can relate to. At a very high level: you receive a bunch of events, you do sentiment analysis on each event and say whether it carries a positive or a negative sentiment, and then finally you write the result to a sink. Now let me quickly go to the next step and see what it actually takes to write that in Numaflow.

28:02 Within Numaflow, everything is what we call a pipeline. Each pipeline is defined by its vertices; it’s nothing but a graph that you build by declaring your vertices and how the entire graph is wired together through edges. We’ll show you the spec for the sentiment analysis pipeline here. First you need a source where you receive the events: this is your source vertex. Next you do a prediction, saying whether this is a positive or a negative sentiment. And finally you write the results to a sink; in this pipeline I’m just going to use a log sink, which is available out of the box, purely for demo purposes.

28:53 As Vigith mentioned earlier, we have a lot of built-in sources and sinks that are most often used by the community, and some have been contributed back by the community as well. So as a developer, data engineer, or ML engineer, your sources and sinks are taken care of for you; all you focus on is the prediction, or, depending on your use case, the processing logic for an aggregation or some other kind of event processing. You just focus on this middle step, what we generally call a user-defined function. Here, the example is a prediction that we run on the event streams we’re receiving.

29:38 So you just focus on this piece of code, and in this example I’m going to use a Hugging Face model that is already available and does the sentiment analysis for you. Let’s see what it actually takes to write this small piece of code, and then how you build the pipeline spec. Moving on to the next slide, where you can see the code: all you as a developer focus on is implementing this function. This is a Python example that I have written. It’s very simple: you receive the data, but you don’t know where you are receiving the data from. You are not worried about your source, because it’s already taken care of by the platform. It’s not HTTP, nor Kafka.

30:23 There is no mention of any of your event sources. You’re just receiving some data provided by the platform, and then you act on that data: you take the value, do sentiment analysis on it, and pass the result on to the next step. And when I say the next step: we said we are writing the data to some sink, but you don’t know what sink it is. Even in this piece of code, you never mention the sink; you just say, I receive the data, I process the data, and I send it to the next step. When it comes to the processing, here it is just simple sentiment analysis, roughly along the lines of the sketch below.
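A minimal sketch of that handler, assuming the Hugging Face transformers sentiment-analysis pipeline the demo mentions (the pynumaflow names follow recent SDK versions, and the output format is illustrative):

```python
from pynumaflow.mapper import Datum, Message, Messages, MapServer
from transformers import pipeline

# Loads a default sentiment model once, at container start.
classifier = pipeline("sentiment-analysis")

def handler(keys: list[str], datum: Datum) -> Messages:
    text = datum.value.decode("utf-8")
    verdict = classifier(text)[0]["label"]   # e.g. "POSITIVE" or "NEGATIVE"
    messages = Messages()
    # No source- or sink-specific code: just hand the result to the next vertex.
    messages.append(Message(f"{verdict}: {text}".encode("utf-8")))
    return messages

if __name__ == "__main__":
    MapServer(handler).start()
```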

31:04 There are also much more complex things you can do with the data: map functions, reduce aggregations, anything you’d expect from a real-time stream processing system like Flink. All of the streaming semantics are available in the platform, in any language you like; I’m just showing you a Python example here. And then, what does the pipeline spec look like? If you take a look at it, it’s a Pipeline, a custom CRD that we have implemented for Numaflow. You have Pipeline as the kind, and then you define your spec: here are my vertices, where I have a source, I do the inference (the prediction) in the middle step, and I finally write the results to a log sink, which is already baked into the platform.

31:49 It’s an out of the box saying that is available for you to debug and all of that. And then you’re defining how are the edges defined. So first time from into sentiment inference to log is how the pipeline spec the edges. That is what I’ve shown in the earlier diagram is what it’s going to look like. So in short, as a developer or as anyone who is interested in doing processing, you just focus on this piece of code and the sources and syncs majority of the time are taken care for you. But if you have custom use cases, you can implement your own source and sync. And then the inference part of this processing logic is just nothing but a container here. So it’s as simple as that for you to create any kind of event processing pipeline in pneuma flow, not worrying about the operations and auto scaling and all of that.

32:38 Let’s try to see, we mentioned about auto scaling and all of that, right? I’ll show you multiple pipelines that we are running in production at Intuit and then see the scale at which we operate and also the different languages that we are actually using and all of that. So let’s start with a very simple example and then I’ll try to take you to very complex pipelines as well. So the first one I mentioned about is the sentiment analysis pipeline, right? Let’s say here is the ui. All of this is available as part of the open source project. It’s not a custom UI that we have internally only for Intuit or anything. It’s part of the open source platform. You get all of the bells and whistles right from that help you to debug very close to where you are trying to build your pipelines and all of that.

33:21 So let’s take this example right where I have my HTP source just I’m sending a bunch of events to this HTP source doing a bit of inference on the data and writing the results to the logs. So I’m not sending any of these events right now as I’m going through the demo, but I’ll show you the other pipelines where we are processing thousands of events in real time. So if you take a quick look at it, each of these are what I said as vertices and these are your edges and you can, if you want to understand what is happening with the health of the different vertices, you can take a look at all of those metrics here within the platform itself. Take a look at your logs and all of that, how much memory are you consuming and all of those. And because if you’re writing it to a log, you have a log sync here, you can see the results and all of that within the platform here itself.

34:12 Let’s say for example, I’ve sent some events earlier, you would see some of those results in the logs here, but I think it’s scrolled through. But let’s take a different example and then try to see how complex can your pipelines get. So let’s me take another one. This is a real production pipeline that we are running at scale here. I’ll also touch upon the multi-language that support that we mentioned auto-scaling and all of that. So if you take a look at it, we are reading the data and this particular pipeline is used for our observability needs where we do real time aggregations of availability and various other metrics that are important for to meet our SLAs and monitoring needs and all of that. So our input is basically a Kafka where we receive all of our events. And if you look at it within our call itself, you’ll see the number of events that we are processing can range from 5,000 events per second to 10,000 events per second to very small.

35:09 These events come in bursts, and you process them and take it from there. So this is your input source; we are using Kafka here, and there is a built-in Kafka source you can use. After we receive those events, we do a bit of pre-aggregation; you can think of this pre-aggregation as the map side of the map-reduce style aggregations you would want to do. Similarly, after pre-aggregation we do post-aggregation, and then we send the processed events to multiple sinks. One of them is Kafka, where we then run some anomaly detection on that data; you’ll see that in the next pipeline.

35:51 We also send the same data to one of our observability tools, called Wavefront, where we store all of these metrics and use them for dashboarding, alerting and so on. All of these pipelines process and aggregate the data in real time. Another thing I really want to touch upon here is the autoscaling piece. Among the challenges we mentioned earlier were the complexity of event processing and the autoscaling challenges on the platform side. If you look at this entire pipeline, each step has its own dedicated set of pods: three pods running for this particular input vertex, five pods for pre-aggregation, three for post-aggregation, five for publishing to a sink, and just one for publishing to Kafka.

36:51 And during this conversation itself, we saw it at 500 events per second, then a hundred events per second, and now we are seeing a burst of 2,000 events per second. We scale each of these individual vertices based on multiple factors. One is understanding the backlog upstream in your event sources and the rate at which you can process. Another is your downstream dependency: say you’re writing to a DB and there are only so many connections you can open to the DB; you are always throttled, so you cannot scale the steps before it beyond a particular point. We understand all of these semantics from the back pressure, scale each component only as much as required, and make sure we process events at the scale that is needed.

37:47 Again, you can see the number of events dropped from 2,000 to 500; we keep seeing that in cycles, and based on the requirements, these pods scale so that you don’t have any latency issues. The third thing I want to highlight is that this is a polyglot pipeline. We said you can use any language you want: some of our pre-aggregation steps are written in Java, and some of the post-aggregation is written in Go. The reason is that within your teams, some people could be real experts in Java, some in Python, some in Go. You can use any language you want, come together, and build out these pipelines with your own processing logic, because each of these vertices is an individual container and you just package your logic into it.

38:44 And we take care of the autoscaling and the data movement, end to end. So that’s one pipeline. Right now we are at about 4,300 events per second; this is a real-time pipeline that you’re seeing in action, processing the data for all of our observability needs at Intuit. Now let’s move on to another example, which shows the even more complex structures and pipelines you can build with the platform, and it has a bit of an ML flavor to it. This is our real-time anomaly detection pipeline: once we process the data in the previous pipeline, we run anomaly detection to understand whether the metrics we are seeing show anomalous behavior. If they do, we probably need to send out alerts to the team so they can start debugging.

39:35 Once we receive all of that data, we do a bit of pre-processing, and then we send it to inference. Inference is nothing but a prediction, where we do anomaly detection saying: is this anomalous or not? And then we write it out to multiple sinks. In short, if you look at this pipeline, it has all the features we have mentioned: reading from one source, doing conditional forwarding to multiple places, and using multiple languages. The inference here is written in Python, while the pre-processing is actually in Java, and we use the same pipeline to trigger some of our training workflows for the anomaly detection models. As we said earlier, it’s one platform that solves for connecting to multiple sources and sinks, using any language you’re interested in, doing any kind of event processing.

40:32 And the platform takes care of the autoscaling. You can see it here: each of these individual vertices is scaled differently; you have one pod here, two pods there, four pods at the source, three at inference. We understand, end to end, what each step of the pipeline is processing, and scale each step individually so you don’t have any consumer lag. And lastly, as a platform engineer or developer, you are also interested in debugging when you have issues. As I showed earlier, you can quickly look at all of your logs right here and look at the metrics that matter most to you. We provide all of that close to the platform, so it’s one end-to-end platform for processing, operations, maintenance, and debugging, close to wherever you are. That’s the power of the Numaflow platform as a whole. These are some of the examples we run at scale at Intuit, and there are a lot more within the community, which I’m going to share in the next few slides. So let’s go there.

41:49 Here are some of the other use cases. I’ve quickly sampled a few to show the breadth of use cases and the various places where Numaflow is being used for real-time event processing. As I said, it spans FinTech to aerospace, and you can see teams using it across the board. First, as I showed with one of the pipelines at Intuit, we use it for anomaly detection and hundreds of other use cases. Second, Lockheed Martin uses it for IoT data processing. And BCubed does event processing of radio frequency signals on edge devices. When you think of edge devices, you are constrained on resources, but Numaflow pipelines are so lightweight that they were able to write the processing logic they were interested in using Python and deploy it on very small devices to do radio frequency signal processing.

42:48 And Atlan, a very good startup, is focusing on real-time notification systems and data movement for their internal use cases. Seekr uses it for some of their ML pipelines. And Chain uses it for a kind of fraud detection on blockchain data. If you look at this breadth of use cases, from FinTech to aerospace to industrial teams, they are all using Numaflow for event processing, from very few events up to any scale they’d like. That’s one part; the other is that they use the languages they’re interested in and connect to any sources and sinks they like. So in short, the pain points I’ve mentioned are not just Intuit’s; they really resonate with the community across the board.

43:41 These are pain points you will come across if you’re trying to build any kind of event processing or real-time stream processing system, so you should absolutely take a look at Numaflow. Moving on to the last slide: if you liked what we shared today, you can learn more about Numaflow by scanning the QR code, or just search for Numaflow on Google and you should find the GitHub repository. Do star us if you appreciate what the team is trying to do as a platform and open source to the community. You can also get updates there; we have a link to the Slack channel on the GitHub repo. I can show it to you quickly, but if anyone is interested, we are a completely open source project.

44:33 Absolutely, you can look at the issues we have, good first issues or anything you’re interested in; feel free to pick them up and reach out to us. We’d be more than happy to help you. You’ve now seen how multiple teams and companies are using Numaflow for different sorts of use cases. If that resonates with you, absolutely try it out, reach out to one of us, or join our Slack channel and share any comments, thoughts, or use cases you might have that we could support or that you’d want to contribute. We’d be more than happy to have you as part of our community. Let me quickly show you the GitHub repo. You can just go to the Numaflow repo and see a quick introduction and the various use cases.

45:21 You can look at our upcoming roadmap, and you can join the Slack channel right from here; and do star us if you really liked the conversation today. Also, we’ll be presenting at KubeCon. As Paul mentioned earlier, we are doing multiple talks at KubeCon on different use cases, from ML to real-time event processing. We are also going to share how our payroll platform team is reimagining their entire architecture using an event-driven approach, and how they’re using Numaflow to solve some of their pain points. We have multiple talks covering all of these, and one of them is on DOK Day itself; Vigith and another colleague of ours are giving that talk. That’s pretty much it for today: we wanted to share some of the pain points of real-time stream processing and event processing, and how we went about tackling them with Numaflow. We are open to any questions or thoughts.

Speaker 1: 46:32 Alright, thanks so much Sri and Vigith. We do have one question right now, from Jakob, who asks: how would you compare Numaflow to something like Flink, for example? What would be the advantages and disadvantages?

Speaker 3: 46:50 We use Flink heavily at Intuit. It is mostly a data processing framework for very high throughput data processing. The way to look at Numaflow is that it is more on the eventing side: users can write code in the paradigm they already know, rather than having to learn new semantics. So if you’re familiar with Flink and you’re doing data processing, Flink is the way to go. But what we found is that most of our developers are not data engineers; they’re application developers, and they want a single image. For example, think about the Intuit ecosystem: we use Argo CD to deploy applications (we wrote Argo), and they want the eventing system to live alongside the same Argo application. The eventing should use the same container image that is already being built, because a lot of code is shared between the HTTP endpoints and the eventing framework. So, just to reiterate: if you’re doing data processing and you are very familiar with Flink, Flink is quite good to use. I wouldn’t compare A versus B; it’s more of a choice based on the constraints you have at hand.

Speaker 2: 48:01 Just adding to what Vigith mentioned: one of the things that has really resonated with the community is that with the likes of Flink, it’s primarily a huge Java shop you have to buy into. That’s one. And the second is that you have to learn a lot of real-time stream processing concepts. If you take examples like BCubed or similar teams, they want to do real-time stream processing, but they’re not really well versed in Java, and moving to a new, bulky platform like Flink is a lot to manage. So the most important piece that resonates with users, internal or external, is: I can just use my own language, which I’m really comfortable with, but I still get all of the streaming semantics, aggregations, reduce, map, whatever is needed, through the platform. We meet people where they are, rather than making them move away from their existing ecosystem to a new platform altogether. That’s the biggest value proposition Numaflow provides to anyone interested in stream processing.

Speaker 1: 49:11 Okay, cool. Thanks. Actually, I have a question as well, from me: what monitoring and observability features does Numaflow offer?

Speaker 2: 49:21 Oh, that’s a very good question, because it’s something pretty much all of our developers keep asking us. As I showed in some of these pipelines, first, we emit metrics that help you monitor on multiple levels, starting with understanding the health of the pipeline. We publish health metrics for the components we use internally, whether they’re healthy, whether they’re running, all of that, out of the box. That’s one. And second, we publish a lot of metrics so you can understand things like the end-to-end latency of a pipeline. I’ll also point to some of the debugging features within the platform itself. These are the various metrics we publish out of the box for you.

50:13 You can create any kind of alerts on top of these, so you can monitor whatever you’re interested in. You can find all of this in the documentation, anything you’d want from an operator manual perspective. The other thing is logs: say you want to debug quickly; you might normally go to Splunk and dig through there, but if you just want to see what is happening, we have the logs available right next to your pipeline, so you can take a quick glance and debug from there if need be. All of this is part of the UI, close to you wherever you are, along with the metrics that matter most. Yeah.

Speaker 1: 50:54 Okay, thanks. And then we have another question, from Robbie, who asks: does Numaflow require Kubernetes?

Speaker 2: 51:01 Yes, we are a Kubernetes-native platform; it is built on Kubernetes for event processing.

Speaker 1: 51:09 Okay, great. Any other questions? We’ll wait just a second because there’s a little bit of a delay between us and the stream, so we’ll wait a few more seconds here. If there are no more questions, we’ll move into the quiz, where you can win a free Data on Kubernetes shirt. Actually, you both can participate as well, because the questions are not based on your talk. Well, it seems like we don’t have any other questions yet, but thank you so much for the presentation; it was really informative, and I think a lot of people could get a lot of use out of Numaflow. So thanks for spreading the word. I think we will start the quiz now. That’ll take a second: get out your smartphone, and there’ll be a QR code that you can scan.

52:01 Let me get that popped up here. There we go. Okay, so if you scan this QR code, we’ll do a little quiz. It’s just some pretty simple questions, mostly based on the introduction. The faster you answer, the more points you get; if people give the same correct answer, whoever did it fastest wins. So if you participate and you win, you’ll get a free Data on Kubernetes shirt.

52:44 Don’t be shy. Scan the cure. Okay, we’ve got at least one. I can’t participate because I know all the answers. I created the quiz. Alright, we’re getting started here. All right, we’ve got some players here, six players. All right, question number one, in which city will the next Kon EU take place? Okay, we’ve got Madrid, London, Urich, or Brussels. Fast answers. We’ll get you more points. Alright, London, it is April 1st through fourth. CFP is, oh, maybe I shouldn’t, maybe I gave you the answers to some future questions, but we’ll see. Alright, Android in the lead.

53:45 Question two of four, okay, here we go. When does the CFP for KubeCon EU close? October 31st, November 24th, December 25th, or January 1st? Alright, looks like most of you got that one right. Nice, November 24th, that’s the date. So if you’re thinking about submitting for the EU conference, you’d better get working on your submissions now. Just to be clear, that’s for KubeCon, not for DOK Day; that will be a separate CFP. Okay, two more questions; will you be the winner? Which CNCF project and recent DOK talk guest joined the DOK Community Collaborator program? Is it Vitess, Colo, Longhorn, or Zy?

55:11 Alright, well, a couple of you got it: it is Vitess. They joined us; we had a talk from Deepthi Sigireddi last week, which was a great talk, so we welcome them to the Community Collaborator program. Okay, Android is still in the lead; see if you can hold it together for this last question. Okay: what are the two strategies mentioned for running fault-tolerant and cost-optimized Spark clusters on EKS? Data replication and load balancing? Oh, I can’t, I’m just going to let you read all those. Okay, well, that one was a tough question. Okay, our leader is Android. Android, I don’t know who you are because the quiz is anonymized, but you can reach out to me on LinkedIn or on the DOK Slack; just let me know. I’m Paul Au on LinkedIn, Au is my last name, or find me on Slack. Reach out, give me your address and your shirt size, and we’ll send you a Data on Kubernetes shirt. Again, I want to thank Sri and Vigith for presenting today. Hopefully we’ll see you at DOK Day or at KubeCon.

Speaker 2: 56:41 Thanks a lot.

Speaker 1: 56:42 Yeah, thanks. All right. Thanks for coming.
