We hold live meetups every week where different guests share their stories, wisdom and practical advice on how to overcome common issues. All meetups are recorded and put on our YouTube and Podcast channels.


5:00pm GMT/9:00am PST
1-Is Kubernetes even ready for data?
Kubernetes has been a great solution for deploying application infrastructure. Trying to manage your data with the same control plane has been less than ideal. This has been even more true when using distributed databases like Apache Cassandra. Once you get past the storage and StatefulSets, you still have a lot to do. Let’s have a frank talk about the new opportunities to make Kubernetes ready for data.
Patrick McFadin
5:00pm GMT/9:00am PST
2-Data on k8s maturity check
Let’s talk about storage. Optoro has moved to running stateful stores on Kubernetes. It’s a challenge, but it has a lot of value. Let’s talk about how we chose to do it, and what we figured out along the way.
Zach Dunn
5:00pm GMT/9:00am PST
3-Design considerations for operationalizing Distributed SQL on Kubernetes
This talk is aimed at cloud-native developers and architects looking to deploy an operational database on Kubernetes. We will walk you through the design decisions the YugabyteDB team made when architecting the database as a service on Kubernetes. We will cover concepts related to Kubernetes volume provisioning, pod placement strategies for data resilience and high availability, and how cluster events are used to reconcile the k8s workloads during day-2 operations such as upgrades and scaling up or down.
Nikhil Chandrappa
5:00pm GMT/9:00am PST
4-The problem of stateful workloads - balance of keeping data HA vs. costs
In an engineer’s ideal world, we would love all the resources and redundancies we can possibly get for our services and the infrastructure that supports them, for sanity and, of course, HA. However, how do you balance between “enough” redundancy and the actual operational costs of supporting such engineering choices, and what are some of the tough engineering decisions that need to be made? This talk focuses primarily on services being run on Kubernetes (or a public cloud offering of Kubernetes), but the principles can be extended to any infrastructure environment. Key topics: capacity planning, cost management, distributed services.
Ren Lee
5:00pm GMT/9:00am PST
5-The full cycle of doing data on k8s: a case study
Scaling ACID-compliant databases in the cloud is challenging. We’ll look at a specific use case where we’re trying to scale a SaaS Odoo ERP offering on Kubernetes and build a scalable Postgres cluster as a backend service.
Dave Cook
5:00pm GMT/9:00am PST
6-Operators, operators, operators…operators
Operators represent a great opportunity for the data community to solve for the complexities of managing data products for their customers in a way that standardizes UX and integration points -- historically the most powerful solutions had to be niche and highly customized.
Amit Gupta
1:00am GMT/5:00pm PST
7-Conway’s Law & Kubernetes: Centralization vs. small team autonomy
Big clusters or small clusters? Where to draw the line, and how to know what's best for your use case? We will be talking to Joseph and Mike from Adobe about the inevitable questions that arise when running k8s at scale. If it is run by the platform team, is it inevitably a pet? Or more of a pet? Is that the idea, that we give stuff that “must not fail” to platform teams so they are common services with SLAs? Or how is it decided what is owned by the platform vs. the individual teams? While talking with Joseph and Mike we will also dive into what their stack looks like, the must-have tools they use on a daily basis, VMs vs. K8s, differences in stateful apps on k8s, and war stories!
Joseph Sandoval, Mike Tougeron
5:00pm GMT/9:00am PST
8-Appropriate workloads for databases in K8s
As more companies move to Kubernetes and cloud native as a standard for developing net new functionality, something has to happen to the legacy workloads. Oftentimes we see a lift-and-shift mentality toward Kubernetes. We will talk about how that mentality can be dangerous or cause more work than expected.
Rick Vasquez
5:00pm GMT/9:00am PST
9-Geospatial Sensor Networks and Partitioning Data
We use resources like weather reports or air quality measurements to navigate the world. These resources become especially important when faced with extreme events like the current wildfires in the Western USA. The data for the reports, predictions, and maps all start as realtime sensor networks. In this talk, I’ll present some of my research into scientific data representation on the Web and how the key mechanism is the partitioning, annotation, and naming of data representations. We’ll take a look at a few examples, including some recent work on air quality data relating to the current wildfires in the western USA. We’ll explore the central question of how geospatial sensor network data can be collected and consumed within K8s deployments.
Alex Miłowski
5:00pm GMT/9:00am PST
10-Data on Kubernetes and container attached storage - an update
Back in 2018 the CNCF published a blog we wrote called Container Attached Storage. Today - September 22nd 2020 - a new blog is appearing on their site updating Container Attached Storage. This talk borrows very heavily from that blog. What is CAS? Why would anyone use Kubernetes itself for storage? How does a microservices architecture help? Why is shared storage at the end of the road - though still used underneath CAS sometimes?
Evan Powell
5:00pm GMT/9:00am PST
11-Doing Data Wrong
In this talk, we'll look at great ways to lose data (like running databases on Kubernetes and bare metal), pain points for developers, and lessons we've learned, and we'll hold a Festivus-in-September airing-of-grievances session for those who have felt this pain.
Jeremy Tanner, David McKay
5:00pm GMT/9:00am PST
12-PostgreSQL-as-a-Service on K8s at Zalando
PostgreSQL is a powerful, open-source object-relational database system with over 30 years of active development that has earned it a strong reputation for reliability, feature robustness, and performance. However, a production-grade deployment requires many technologies complementary to the database core: high availability and automated failover, backup and recovery, monitoring and alerting, centralized access control and logging, connection pooling, and so on. Although not initially designed for running stateful workloads, Kubernetes, with its infrastructure-as-code paradigm, CustomResourceDefinition, and Operator pattern, turned out to be extremely convenient for deploying and running PostgreSQL at scale. I will talk about a few open-source projects developed and maintained by the database team at Zalando which anybody can use to build their own PgaaS: 1. https://github.com/zalando/patroni - Tool for PostgreSQL high availability and cluster management. Integrates with the K8s API and makes PostgreSQL cloud-native. 2. https://github.com/zalando/spilo - The Docker image that packages Patroni, multiple versions of PostgreSQL, and tools for backup and recovery. 3. https://github.com/zalando/postgres-operator - Implements the Kubernetes operator pattern and orchestrates hundreds and thousands of deployments of Patroni/Spilo clusters. The aforementioned projects would never have reached their current state without the efforts of dozens of external contributors.
Alexander Kukushkin
5:00pm GMT/9:00am PST
13-Distributed Workloads on Kubernetes: Operators to the Rescue
How easily can you run distributed workloads on Kubernetes? The initial deployment of your 10-node database might be easy to set up, but day-2 operations (changing the configuration, adding and removing nodes, version upgrades, etc.) are much more complicated. We'll discuss how operators can help you manage distributed workloads, and a few operator tricks we learned while working on ECK (Elastic Cloud on Kubernetes), an operator for the Elastic stack.
Sebastien Guilloux
5:00pm GMT/9:00am PST
14-Kubernetes Cost Control
The importance of cost control while working with the cloud: K8s, data, and cost control. Hints and tips for controlling your K8s costs.
Arie van den Bos
4:00pm GMT/9:00am PST
15-Reaching limits in K8s: A case study with Ingress Controller
When talking about data, we usually think about big data and scale, and what do we do next. Such limits are sometimes a good problem to have. In this talk, we'll discuss our approach to this situation using the Ingress Controller.
Laurent Rouquette
4:00pm GMT/8:00am PST
16-HyperStore-C: S3 object storage managed by Kubernetes
Cloudian’s HyperStore is S3-compatible object storage software focused on the enterprise market. In this talk, I'll discuss how and why we are working on Kubernetes-managed versions of HyperStore, including where we are now and what we're looking at.
Gary Ogasawara
4:00pm GMT/8:00am PST
17-Is k8s Even Ready For Data? Round II
Data on Kubernetes Community: Is K8s even ready for data, Round II - Cassandra on OpenEBS - aka CaSS on CAS. In our inaugural DOKC meetup, Patrick McFadin, Developer Advocate at DataStax, emphasized the challenges of running Cassandra on Kubernetes, concluding at one point that “Kubernetes might not be ready for Cassandra.” Since that meeting, the use of the open-source Container Attached Storage project OpenEBS as simple, high-performance, per-workload storage for Cassandra has proliferated. The Cassandra Operator from DataStax, aka “CaSS”, has progressed as well. So, where are we now? Is CaSS on CAS working well? What is the future of collaboration between DataStax / Cassandra and MayaData / OpenEBS? Is Kubernetes now ready for Cassandra? What are the emerging technologies that might shape storage and Kubernetes in the near future?
Jeffry Molanus, Patrick McFadin
9:00am GMT/1:00am PST
18-DoK Panel: The State of State
Stateful vs stateless? We will stately be stating our statutes regarding the status of the state of statefulness and statelessness on k8s. Oh yeah! In the DoK Community, one of the main issues that folks have is how in the world they can flatten the learning curve when it comes to running stateful applications in k8s. That's why we've brought on 3 experts from 3 different countries to tell us what state state (intentionally doubled) is in! A fireside chat on state and Kubernetes. Event page: https://www.meetup.com/Data-on-Kubernetes-community/events/274551382/
Rosemary Wang, Lili Cosic, Tomasz Cholewa, Jacquie Grindrod
5:00pm GMT/9:00am PST
19-Towards a K8s Native Streaming Application
Starting from a simple application which can be deployed on any machine running Docker, we will go through all the steps required to transform the simple app into a Kubernetes native streaming application. We will explain the theory and then apply the learnt concepts to define a recipe for running streaming applications on Kubernetes. We will focus on both cultural and technical tricks to help you successfully adopt streaming applications at scale. At the end of the talk, you will have a comprehensive view of all the platform building blocks and application requirements needed to successfully run a streaming application on Kubernetes. Spoiler: you will hear the words Apache Kafka, Kafka Streams and Strimzi several times.
Francesco Nobilia, Jeremy Frenay
5:00pm GMT/9:00am PST
20-Tips and tricks to get Kubernetes certifications
CKA (Certified Kubernetes Administrator) has a bad reputation as the hardest certification many people have faced. In this talk, we will go through the process of successfully passing the exam, with tips on the exam itself, the environment, and any other questions that might arise. How to fly into a Kubernetes certification.
Eneko Perez, Carlos Gomez Carrero
5:00pm GMT/9:00am PST
21-Data on Kubernetes: my insights
Data handling is one of the hardest things in Kubernetes. This talk will be an informal conversation about things (related to data management) Eduard found while helping customers to embrace Kubernetes. I hope you find them useful!
Eduard Tomàs
5:00pm GMT/9:00am PST
22-Vitess Operator for Kubernetes
In this talk, I would like to uncover our newly announced Vitess Operator for Kubernetes. The talk demonstrates a sample implementation of Vitess in a Kubernetes topology. I also explore common DBA tasks by demonstrating how they are handled in the Vitess ecosystem. Vitess, out of the box, comes with a lot of the tools and utilities that one otherwise has to either incorporate or develop to manage a MySQL topology. Let’s take a look at the capabilities of Vitess in these areas and demonstrate how they are performed under the operator realm.
5:00pm GMT/9:00am PST
23- 2021 DoK Community Kickoff! Trends, friends, and more!
For our 23rd installment of the Data on K8s community meetup, we will be talking with Ariel Munafo, who is a CNCF ambassador and the founder of EuropeClouds (among many other things); Arie van den Bos, Senior Systems Engineer on Cloud Systems at Kurago; and Jake Page, who is a DevOps and Cloud Native enthusiast.
Ariel Munafo, Jake Page, Arie van den Bos
5:00pm GMT/9:00am PST
24-The architecture of a distributed database
Cockroach Labs has built a database architected from the ground up to be distributed. It is a perfect fit for the cloud and Kubernetes as it naturally scales and survives without manual interaction. The unique architecture of CockroachDB delivers some key innovations that may not only provide value for your applications but might also give you insight into the challenges/solutions in distributed systems. In this session, we will deliver a deep-dive exploration into the internals of the database, exploring the following, and more: * How the database uses KV at the storage layer to effectively distribute data * How Raft and MVCC are used to guarantee serializable isolation for transactions * How Cockroach automates scale and guarantees an always-on resilient database * How to tie data to a location to help with performance and data privacy
Jim Walker
5:00pm GMT/9:00am PST
25-Deconstructing Postgres into a Cloud Native Platform
Is deploying Postgres in Kubernetes just repackaging it into a container? Can’t Postgres leverage the wide range of Cloud-Native software and integrate well with K8s? Join this journey that will cover and demonstrate, with demos running on StackGres: * How to structure Postgres into an init-less container, plus several sidecar containers for connection pooling, backups, agents, etc. * Defining high level CRDs as the single API to interact with the Postgres operator. * Using K8s RBAC for user authentication of a web UI management interface. * Using Prometheus for monitoring; bundling a node, Postgres and PgBouncer exporters together. * Proxying Postgres traffic through Envoy. Terminate Postgres SSL with an Envoy plugin, that also exports wire protocol metrics to Prometheus. * Using Fluentbit to capture Postgres logs and forward them to Fluentd, which stores them on a centralized Postgres database.
Alvaro Hernandez
5:00pm GMT/9:00am PST
26- How to unblock your release pipelines with data
Even though microservices are becoming a pattern, we still see a lot of "monolithical" deploys and manual reactive actions. This blocks the ability to achieve maximum velocity in your release. We can leverage data and smart use of traffic-shaping to achieve a higher release velocity AND quality.
Olaf Molenveld
3:00pm GMT/7:00am PST
Nederkube Edition #1 - Is Kubernetes ready for Data Management?
Kubernetes became the standard for microservices architectures. But what about handling massive and scalable data management on top of it? Is it possible, and what does it mean for operations? Cassandra has been adopted widely and accepted globally as the most scalable and reliable database. Now it adds ease of use by offering a Kubernetes native plug-and-play solution for enterprise use!
Michel de Ru, Arie van den Bos, Jeffry Molanus
10:00pm GMT/2:00pm PST
DoK Brazil #1 - DevOps, Kubernetes and Data
My experience on this technology journey over the last 4 years: fears, mistakes, IT paradigms, and how agile methodologies have impacted my goals.
Rogeria Portilho Rodrigues
5:00am GMT/9:00pm PST
27- Cost management for OpenShift, a new SaaS service to understand your Kubernetes costs
For IT decision-makers, this goes above and beyond just keeping infrastructure running and efficient; it is about understanding how your IT budget affects your business, and how well your resources maximize the use of your budget. This makes it critically important that IT teams can more quickly and easily see the totality of their IT costs across the hybrid cloud. We’re pleased to introduce a new software-as-a-service (SaaS) offering intended to help our customers better understand the costs of their OpenShift environments: OpenShift cost management. Available free of charge as part of a Red Hat OpenShift Container Platform subscription, OpenShift cost management provides a simplified, more intuitive view into the costs, from the macro to the granular, of an OpenShift deployment.
Sergio Ocón Cárdenas
5:00am GMT/9:00pm PST
28- Getting Started Contributing to Kubernetes
This talk will walk through how to get started contributing to Kubernetes, combatting imposter syndrome, the many ways you can get started contributing to K8s other than by writing code, and the benefits of joining a community such as K8s.
Rin Oliver, Savitha Raghunathan
5:00pm GMT/9:00am PST
#30 Kyverno for Kubernetes!
Kubernetes is powerful but can be complex to manage! In this talk, Jim Bugwadia from Nirmata will show how policy managers can help address the complexity via admission controls and dynamic configurations. Jim will introduce Kyverno, a Kubernetes native policy engine and CNCF sandbox project. Jim will then demonstrate how you can use Kyverno to ensure security and best practice compliance for your clusters.
Jim Bugwadia
7:00pm GMT/11:00am PST
DoK Brazil #2: Let's understand databases in the cloud with the help of Wagner Bianchi! (Talk in Portuguese)
A relaxed conversation about the future of databases as a service, data on Kubernetes from a DBA's point of view, and various other related topics.
Wagner Bianchi
5:00pm GMT/9:00am PST
#31 The Data Lifecycle - Where Do We Go From Here
Going from raw data to machine learning models successfully in companies of all sizes requires more than just an understanding of programming. Teams need to manage their data products' lifecycle, their software, and the data itself. Data products like machine learning models aren’t created out of thin air. They are built on layers of best practices that ensure the models are using accurate data, are outputting reliable numbers, and have some method to interact with the outside world. So how do we get there? The purpose of this talk is to discuss the current state of the data lifecycle as it pertains to creating data products. These could be machine learning models, dashboards, or data APIs. We will outline the general architecture that helps take data from raw to some form of machine learning model. In addition, we will discuss some of the concepts that are being applied from DevOps as well as being created in MLOps to help better facilitate your data lifecycle.
Benjamin Rogojan
5:00pm GMT/9:00am PST
#32 How to choose a Kubernetes distribution for on-prem environments?
Buy a ready off-the-shelf product, customize an existing open source project, or build your own distribution? When you can't go to the cloud and leverage its powerful features you have to make a choice. On-prem environments need more attention, but they also often can be more cost-effective and are highly coveted by the development and operations teams. In this talk, I will cover some of the most important topics related to building an on-prem Kubernetes platform and I will describe the most popular distributions.
Tomasz Cholewa
5:00pm GMT/9:00am PST
#33 Making observability accessible is the fourth pillar
Observability systems are typically a collection of tools that cover the three pillars of logs, metrics and tracing. These enable skilled engineers to correlate telemetry insights to perform data-driven diagnostics and rectify degraded services. In this talk, I discuss how, over the course of three years, I have worked towards removing the built-in gatekeeping that comes with creating monitoring solutions and enabling them to work for an entire organisation. We shine a light on the overlooked developer community that interacts with observability but does not necessarily hail from SRE disciplines. I will share anecdotes from my past and illustrate the inherent bar to success that comes with connecting multiple tools together and the context required to achieve results. With years of experience working to improve adoption and create consumer-friendly facades for tools such as Grafana, Prometheus and Jaeger, I draw upon my background within large financial institutions to show how building an engaging and simplified DX can compel and excite engineers to work with observability.
Alex Jones
5:00pm GMT/9:00am PST
#34 Opstrace, An open source alternative to services like Datadog, SignalFx, and others...
Open source observability should not be hard. What companies package as their enterprise offering should be available to anyone who wants to monitor their systems. Opstrace is a complete monitoring platform designed for the end user instead of the expert. Its goal is to be as easy to use and operate as a hosted SaaS provider but within one's own cloud account. This is not only up to 10x more cost-efficient but also allows full control over one's data.
Sébastien Pahl
4:00pm GMT/9:00am PST
#35 Make Kubernetes your development environment
Developers spend a lot of time making their local machine look like a cluster. But why do we do that? Our local machine is not where our code is supposed to run! We built okteto (github.com/okteto/okteto) so we can make our Kubernetes clusters look like our local machine. In this talk, we'll show you how okteto helps you take advantage of all the goodness of Kubernetes and the cloud without having to sacrifice a really fast development and feedback loop.
Ramiro Berrelleza
4:00pm GMT/9:00am PST
St. Patrick's Day Special - A diplomatic answer to the meaning of data, Kubernetes, and everything
I will talk about my experiences entering the world of databases and data management after a very different life as a diplomat. I will introduce TerminusDB and its world-history origins. Finally, I will situate the project and the roadmap from a k8s perspective.
Luke Feeney
4:00pm GMT/9:00am PST
#36 A Snapshot of DevOps
DevOps is like a camera. We focus on what's important, we capture the good times, we develop from the negatives, and if things don't work out, we take another shot. Many teams establishing working best practices for their tools improve their time to deliver and ability to scale. However, the real challenges exist outside of tools and technology and many teams today still have questions about DevOps. So, join this session to learn the fundamentals of shaping a DevOps culture. We'll discuss key attributes around people, process, and technology, likening you and DevOps to pro photographers and cameras.
Tiffany Jachja
2:00pm GMT/7:00am PST
My questions about Data on K8s
Kunal Kushwaha
4:00am GMT/9:00pm PST
#37 Running Data Replication Pipelines on Kubernetes with Argo
Hundreds of data teams have migrated to the ELT pattern in recent years, leveraging SaaS tools like Stitch or FiveTran to reliably load data into their infrastructure. These SaaS offerings are outstanding and can accelerate your time to production significantly. However, many teams prefer to roll their own tools. One solution in these cases is to deploy singer.io taps and targets — Python scripts that can perform data replication between arbitrary sources and destinations. The Singer specification is the foundation for the popular Stitch SaaS, and it is also leveraged by a number of independent consultants and data projects. Singer pipelines are highly modular. You can pipe any tap to any target to build a data pipeline that fits your needs, making them a good fit for containerized workflows. This article walks through the workflow at a high level and provides some example code to get up and running with some shared templates. I also drill into reasons for choosing the Argo approach over other orchestration tools like Airflow or Dagster, and the implications from a team perspective.
Stephen Bailey
5:00pm GMT/10:00am PST
DoK en español #1 - Our learnings with Kubernetes
Our learnings from Kubernetes.
Aitor Artola, Isidro Nistal, Miriam González, Raquel López Ruiz
4:00pm GMT/9:00am PST
#29 How Absa Developed Cloud Native Global Load Balancer for Kubernetes
Global load balancing solutions, commonly referred to as GSLB (Global Server Load Balancing), have typically been the domain of proprietary network software and hardware vendors, installed and managed by siloed network teams. k8gb is a completely open source, cloud native, global load balancing solution for Kubernetes. k8gb focuses on load balancing traffic across geographically dispersed Kubernetes clusters using multiple load balancing strategies to meet requirements such as region failover for high availability. Global load balancing for any Kubernetes Service can now be enabled and managed by any operations or development team in the same Kubernetes native way as any other custom resource. The talk will cover both technical and business aspects of k8gb's creation, including ongoing adoption within a large-scale organization.
Yury Tsarev
5:00pm GMT/10:00am PST
DoK en español #2: Release the Krake! Bringing Energy into the Compute Loop
Cloud&Heat has always focused on providing energy-efficient data centers. In the last 8 years, we have developed an innovative water cooling technology for servers, converting waste heat into a valuable asset. By doing so, we have already greatly improved the energy efficiency of individual data centers. However, this isn’t enough. To globally maximize the efficiency of distributed data center infrastructures, this talk presents Krake. Krake is an orchestration software for compute-intensive jobs. It improves the global cost and energy efficiency of infrastructures by balancing the load between data centers. Krake evaluates and selects the most efficient site to run jobs based on certain metrics, such as energy availability, heat demand, and latency. It also reacts to changes in the system by migrating jobs. In other words, it ensures a job is run in the most energy- and/or cost-efficient way at any given time.
Juan A. Fraire
5:00pm GMT/9:00am PST
#38 Patterns to create stateful applications on Kubernetes
In this talk we will discuss the best patterns for creating stateful applications on top of Kubernetes. This will include application-layer caching and embeddable databases, as well as leveraging Kubernetes objects to store and sync state across multiple replicas.
Prashant Ghildiyal
5:00pm GMT/9:00am PST
#39 A fireside chat with Jérôme Petazzoni
A fireside chat with Jérôme Petazzoni in which we will get to know him up close and personal, ask him about how his personal music projects influence his professional work, and answer questions from the audience.
Jérôme Petazzoni
5:00pm GMT/9:00am PST
#40 Cloud-Native Chaos Engineering in Databases
Chaos Engineering is revolutionizing how we test, and doing it the cloud-native way is the best approach in today's rapidly changing world, with a huge shift in the paradigm of Kubernetes resiliency. Karthik S, one of the maintainers of LitmusChaos, will introduce how to carry out Chaos Engineering the cloud-native way. He will then touch upon how Chaos Engineering is carried out in cloud-native databases with LitmusChaos, as well as observability considerations for chaos engineering and the hooks Litmus provides for them.
Karthik Satchitanand
5:00pm GMT/9:00am PST
#41 Designing Stateful Apps for the Cloud and Kubernetes
Almost all applications have some kind of state. Some data processing apps and databases have huge amounts of state. How do we navigate a cloud-based world of containers where stateless and functions-as-a-service are all the rage? As a long-time architect, designer, and developer of very stateful apps (databases and data processing apps), I’d like to take you on a journey through the modern cloud world and Kubernetes, offering helpful design patterns, considerations, tips, and where things are going. How is Kubernetes shaking up stateful app design? - What kind of state is there, and what are some important characteristics? - Kubernetes, containers, and the stateless paradigm (pushing state into DBs) - Where state lives and the persistence characteristics - Stateless vs serverless - why stateless is not really stateless, but serverless really is - Improving on the stateless paradigm using the local state pattern - Logs and event streaming for reasoning about state and failure recovery - The case for local disks: ML, databases, etc. - Kubernetes and the Persistent Volume/StatefulSets - Leveraging Kubernetes PVs as a basis for building distributed data systems - Mapping the solution space
Evan Chan
10:00pm GMT/2:00pm PST
DoK Brazil #3: How CNCF Brasil can help us in our careers as SREs, DevOps engineers, or developers
Talk in Portuguese
Paulo Alberto Simoes
5:00pm GMT/9:00am PST
DoK en español #3: Storing Big Data on k8s: The challenge of getting the best performance
Lessons and experience from the process of building a cloud-native startup, where one of the main battlegrounds is, and will continue to be, storage on Kubernetes.
Aitor Artola
5:00am GMT/9:00pm PST
#42 Spark on Kubernetes is Now Generally Available: Why & How to Migrate to It
Apache Spark has run natively on top of Kubernetes (instead of Hadoop YARN) since 2018, but it's only since Spark 3.1 (released in March 2021) that the integration has been officially generally available and production-ready. What is the high-level architecture of Spark on Kubernetes, how does it compare to alternatives, and what does the migration look like? These are some of the questions we will answer together. We will first introduce the core concepts, then go through the stories of customers who migrated, and then give you concrete technical tips to help you be successful with Spark (on Kubernetes). If time permits, I may do a risky live demo. This will be a technical talk with very fresh content, and I hope you will like it. I plan to keep it short enough to make room for Q&A and improvisations based on your requests. So let me know if there's something specific you're interested in.
Jean-Yves Stephan
5:00pm GMT/9:00am PST
#43 Kubecost: open source cost monitoring for Kubernetes
Measuring costs in Kubernetes environments is complex. Applications and their resource needs are often dynamic. Teams share resources without transparent prices attached to workloads, while organizations increasingly run resources on a range of machine types and even across cloud providers. Kubecost provides an approach built on open source for ensuring consistent and accurate cost visibility across all your workloads. This discussion will cover practical examples of implementing cost monitoring and optimization, and of managing the data generated by these efforts.
Webb Brown
5:00pm GMT/9:00am PST
#44 DataOps
The talk will cover the various aspects of DataOps and why DataOps is important. It will also share some client experiences and show how a DataOps strategy is helping address some of their challenges. Finally, it will cover DataOps implementations, tools, and technologies.
Vijay AB Kumar
5:00pm GMT/9:00am PST
#45 K8s DX Chronicles: Evolution From CLI to GitOps & Cloud Native IDEs
Within its 7 years of existence, Kubernetes has been the gravitational center of the Cloud Native landscape, elevating a pluggable system that contributed to the diversification of the entire ecosystem. Wider adaptability of the tool prompted the diversification of the end-user base, and a consistent DX for cluster interaction became essential for Kubernetes. The community channeled herculean efforts towards the enhancement of the developer experience by extending the cluster CLI, building portals, and highly-responsive UIs.
Katie Gamanji
5:00pm GMT/9:00am PST
#46 Recovering and Porting Applications in the Fast-Paced DevOps World
Are you a Cloud Architect, DevOps Engineer, or SRE who is developing cloud-native applications, managing complex app migration projects, or in need of infrastructure resiliency? Cloud-native applications present extraordinary performance, scale, and compliance challenges in hybrid- and multi-cloud environments that legacy tools simply cannot support. In this session and demo, we’ll take you through a case study of a large aerospace and defense company that is managing and migrating Kubernetes applications and databases in a multi-cloud environment. You’ll also learn how to handle common cloud-native development challenges, like recovering from accidental namespace deletions during test/dev or migrating your application to another cloud for scale and performance testing.
Prashanto Kochavara
1:00pm GMT/5:00am PST
DoK #47 FullStack OpenSource Observability using SigNoz
In this talk, we shall dive deep into the latest open-source tools like Prometheus and Jaeger, our journey in using them, and ultimately building our own open-source observability tool, SigNoz. We shall discuss:
- What is observability? The three pillars of observability: metrics, traces, and logs
- How is monitoring different from observability?
- What are the hard things about Prometheus?
- Why did distributed tracing become so important?
- Running both Prometheus and Jaeger to get metrics and traces: how complex can it get?
- Pros and cons of SaaS vs. OSS solutions. Why self-host in the 21st century?
- Why did we build SigNoz?
- What is OpenTelemetry? How do you instrument a sample app using OpenTelemetry?
- Demo of SigNoz to get detailed insights into your applications
Ankit Nayan
2:30pm GMT/6:30am PST
DoK in Hindi #1: First Steps in the Data on Kubernetes Community!
What is Kubernetes? Where do you start? How do you become part of the community? Do these questions come to your mind too? Join us at this meetup, where we will talk about everything Data on K8s (in Hindi)! On May 3rd we will discuss how you can become part of the community, what CNCF is, what an SRE's job involves, and much more! But that's not all! Take part in the quiz at the end of the meetup, where you can win some special swag from DoK!
Kunal Kushwaha
5:00pm GMT/9:00am PST
DoK #48 Airflow vs Argo - Battle Royale
We are going to compare Airflow (the established tool) with Argo Workflows (the new kid on the block) and see how they measure up: what you would use each for, why you would choose one over the other, and who would win in a battle for data workflow management supremacy.
Tim van de Keer
5:00pm GMT/9:00am PST
DoK #49 Deployments vs StatefulSets vs DaemonSets
Kubernetes provides different resources for deploying applications. We will look at them, the differences between them, and how we can persist data using each of them.
Ali Kahoot
5:00pm GMT/9:00am PST
DoK #50 Going Full Circle with Kafka
Tecton is building a data platform for machine learning. This talk shares some of the adventures and lessons learned while introducing Kafka into our data pipelines.
Ravi Trivedi