DoK Library
Learn from data on Kubernetes (DoK) practitioners sharing best practices.
Harnessing Apache Kafka on Kubernetes with Strimzi
Aug 30, 2024
By Vinod Kumar Nair
Dive into the world of Apache Kafka on Kubernetes and discover how Strimzi can simplify your Kafka deployments. Whether you’re scaling your data infrastructure or optimizing Kafka clusters, this guide will help you unlock the full potential of your system.
Read Article
Managing Data on Kubernetes
Jul 18, 2024
By Seif Rajhi
Kubernetes provides great solutions for managing stateful workloads, including databases and big data applications. Its features, such as scalability, fault tolerance, and efficient resource management, make it an ideal platform for handling data-intensive
Read Article
CloudNativePG Recipe 8: Participating in PostgreSQL 17 Testing Program in Kubernetes
Jun 11, 2024
By Gabriele Bartolini
The PostgreSQL Global Development Group (PGDG) recently released PostgreSQL 17 Beta 1, offering a preview of all the features that will be available when PostgreSQL 17 is officially launched later this year
Read Article
How the CSI (Container Storage Interface) Works
May 20, 2024
By Steven Sklar
The Container Storage Interface (CSI) simplifies persistent storage management in Kubernetes by providing an API for custom drivers to handle volume provisioning. CSI replaces in-tree volumes, allowing instant addition of new storage types via independent drivers. This article elucidates CSI's role and implementation in Kubernetes, aiding debugging and migration tasks.
Read Article
CloudNativePG Recipe 6: Postgres Vertical Scaling with Storage – part 1
Apr 10, 2024
By Gabriele Bartolini
Explore the potential of optimizing CPU, RAM, and storage resources through meticulous measurement and benchmarking, challenging conventional scaling wisdom. Delve into the solid strategies within the CloudNativePG stack, such as separate volumes for data and transaction logs, temporary tablespaces, and I/O segregation for tables and indexes. Stay tuned for insights into aligning storage solutions with PostgreSQL’s resilience needs in the upcoming sequel.
Read Article
CloudNativePG Recipe 5 – How to migrate your PostgreSQL database in Kubernetes with ~0 downtime from anywhere
Mar 28, 2024
By Gabriele Bartolini
Are you considering migrating your PostgreSQL database from a service provider into Kubernetes, but you cannot afford downtime? Recipe #5 details step-by-step instructions, leveraging CloudNativePG and logical replication, to seamlessly transition from PostgreSQL 10+ to 16 using an imperative method.
Read Article
CloudNativePG Recipe 4 – Connecting to Your PostgreSQL Cluster with pgAdmin4
Mar 28, 2024
By Gabriele Bartolini
The article explores the deployment of pgAdmin4, a popular graphical user interface for PostgreSQL, within a CloudNativePG environment, primarily for evaluation and educational purposes.
Read Article
CloudNativePG Recipe 3 – What!?! No superuser access?
Mar 12, 2024
By Gabriele Bartolini
Explore the secure defaults of a PostgreSQL cluster in this CloudNativePG recipe, aligning with the principle of least authority (PoLA). Our commitment to security and operational simplicity shines through default configurations, balancing robust protection with user-friendly settings.
Read Article
CloudNativePG Recipe 2 – Inspecting Default Resources in a CloudNativePG Cluster
Mar 08, 2024
By Gabriele Bartolini
Dive into the nitty-gritty of how CloudNativePG works its magic with PostgreSQL cluster stuff, zooming in on configmaps and secrets.
Read Article
CloudNativePG Recipe 1 – Setting up your local playground in minutes
Mar 04, 2024
By Gabriele Bartolini
How to set up your local playground in kind, install CloudNativePG, and deploy your first PostgreSQL cluster.
Read Article
Adding Zonal Resiliency to Etsy’s Kafka Cluster | DoKC Town Hall
Feb 21, 2024
In this talk, Kamya Shethia from Etsy discusses the changes made to add zonal resiliency to the Kafka cluster, which enabled them to speed up their update process.
Watch Video
An Introduction to Custom Resource Definitions and Custom Resources (Operators 101: Part 2)
Feb 06, 2024
By Steven Sklar
In part 2 of his series "Operators 101", Steve Sklar goes into detail about Custom Resource Definitions, including some best practices when designing them and golang-based tooling used to generate them.
Read Article
DoK Database Patterns Whitepaper
Jan 11, 2024
By Data on Kubernetes Community & CNCF Storage TAG
This whitepaper describes the patterns of running data on Kubernetes with a focus on database applications. It describes the attributes of a storage system and how they affect the database applications, how different storage stacks affect these attributes, what are the differences of running data inside and outside of Kubernetes, what are the characteristics of Kubernetes that are beneficial for running data on Kubernetes, and what are the best practices and lessons we have learned from running data on Kubernetes.
See More
What Are Kubernetes Operators? (Operators 101: Part 1)
Jan 05, 2024
By Steven Sklar
This is part 1 of a series by Steven Sklar, Operators 101, where he teaches readers how to design, build, and deploy Kubernetes operators that can automate the management of your own unique applications.
Read Article
1000 node Cassandra cluster on Amazon’s EKS? – Matt Overstreet (DoK Day EU 2022)
Dec 21, 2023
Come hear about our experience scaling Cassandra on EKS to over 1000 nodes and 20 million transactions per second. This session will cover the lessons learned, successes, failures, and tools used to get there.
Watch Video
The Data on Kubernetes Landscape – Melissa Logan (DoK Day EU 2022)
Dec 21, 2023
We know from the first Data on Kubernetes Report that 90% of respondents believe Kubernetes is ready for stateful workloads, but significant challenges remain. The DoK Community continues to grow and build a unique space where people share knowledge and have conversations that are shaping the next decade of data on Kubernetes.
Watch Video
Bringing Apache Cassandra closer to Kubernetes – Jake Luciani (DoK Day EU 2022)
Dec 21, 2023
What does Kubernetes provide that allows us to reduce the complexity of Apache Cassandra while making it better suited for cloud native deployments? That was the question we started with as we began a mission to bring Cassandra closer to Kubernetes and eliminate the redundancy.
Watch Video
Operator Lifecycle Management – Julian Fischer (DoK Day EU 2022)
Dec 21, 2023
"The ability to extend Kubernetes with Custom Resource Definitions and respective controllers has led to the OperatorSDK, which became
the de facto standard for data service automation on Kubernetes. There are countless operator implementations available, and new operators are
being released on a daily basis."
Watch Video
What we’ve learned from running a PostgreSQL managed service on Kubernetes – Oleksii Kliukin
Dec 21, 2023
Kubernetes is an emerging platform of choice for deploying and running PostgreSQL. Deploying 100 Postgres clusters is as easy as deploying one, and there is no need to tinker with tools like Ansible or Puppet. Resource sharing can be applied when it makes sense, making it possible to run multiple Postgres databases in isolation on a single instance, each storing its data on a dedicated persistent volume.
Watch Video
Why run Postgres in Kubernetes? – Gabriele Bartolini (DoK Day EU 2022)
Dec 21, 2023
"Postgres should run inside your Kubernetes cluster. Yes, inside, not outside Kubernetes.
After all, a database should be seen as an application, a special type of application - for which it is legitimate to require an additional level of care and attention."
Watch Video
The future of data on Kubernetes with Adobe and CNCF – Joseph Sandoval, Xing Yang & Sylvain Kalache
Dec 21, 2023
Some data-intensive workloads are easier to run in Kubernetes than others. Why? What needs to improve? Join us as we deep dive with Adobe and the CNCF about how easy (or not) it is to run different types of data workloads on Kubernetes – and what is being done both inside and outside of Kubernetes to make data workloads easier.
Watch Video
From Laptop to Cloud: Developing Cloud-Native Applications with Containerized Databases – N.Vermandé
Dec 21, 2023
With the advent of microservices in Kubernetes, individual developer teams now manage their own data, middleware, and databases. Automated tests and CI/CD pipelines have to be revisited to include these new requirements.
Watch Video
Testing the Mettle: Evaluating data solutions for large-scale production to check who stacks up
Dec 21, 2023
The state of the CNCF Storage options has exploded in the past few years, but if you had to choose a project to use today, how would you go about comparing each offering and choosing who to partner with for your future growth?
Watch Video
Autoscaling Stateful Workloads in Kubernetes – Mohammad Fahim Abrar & Md Kamol Hasan (DoK Day EU 22)
Dec 21, 2023
Managing stateful workloads in a containerized environment has always been a concern. However, as Kubernetes developed, the whole community worked hard to support stateful workloads and meet the needs of enterprise users.
Watch Video
PV TrashCan – Protection against accidental deletion of PVs or Namespaces (DoK Day EU 2022)
Dec 21, 2023
An accidental PVC or namespace deletion can cause the Persistent Volume to be deleted. Such volumes lose their data, and the stateful applications lose their state. With Persistent Volume TrashCan, users get a grace period to undo such unintended delete operations.
Watch Video
Build your own social media analytics with Apache Kafka – Jakub Scholz (DoK Day EU 2022)
Dec 21, 2023
Apache Kafka is more than just a messaging broker. It has a rich ecosystem of different components. There are connectors for importing and exporting data, different stream processing libraries, schema registries and a lot more. This talk will show how to use it to read data from social networks such as Twitter, process them and use machine learning to analyze them. And all of it will be of course running on top of Kubernetes.
Watch Video
One Click to Run Apache Spark as a Service on Kubernetes – Bo Yang (DoK Day EU 2022)
Dec 21, 2023
It is still challenging to run Apache Spark and other big data processing workloads on Kubernetes, especially at large scale. People need to address various issues like resource isolation, queuing, and cost efficiency. This session will share details about those challenges and how to address them. We will also present a convenient (one-click) way to deploy Apache Spark on Kubernetes, and dramatically lower the barrier to using Spark.
Watch Video
Datashim – a framework for declarative management of datasets on Kubernetes – Srikumar Venugopal
Dec 21, 2023
Many ML pipelines depend on shared filesystems for input, output and intermediate data storage. Standards such as CSI have made it possible for applications in Kubernetes to access a variety of data storage systems. Yet, data scientists still have to deal with low-level details of data access in order to execute their pipelines in Kubernetes. Datashim is a framework that manages the lifecycle of a Dataset object, a CustomResourceDefinition that represents a source of data. Datashim takes care of the details of data access while Kubernetes pods can declaratively access the data by referencing a Dataset in their specifications. This talk will describe Datashim and the Dataset object, discuss its use in ML pipelines, and demonstrate how its pluggable architecture is designed for the development of caching, scheduling and governance plugins. Datashim is an incubating project of the Linux Foundation Data and AI Foundation
Watch Video
Kanister & Kopia: An Open-Source Data Protection Match Made in Heaven – Pavan Navarathna
Dec 21, 2023
Cloud-native applications comprise various components, including data services, storage systems, and related Kubernetes objects. Each component requires its own data protection tools, strategy, and domain expertise. A robust solution aligned with business requirements often involves complex workflows. What if there was a way to coordinate the implementation of these workflows while optimizing how backups are moved into storage?
Watch Video
Weathering The Cloud Storm: Modern Data Management Patterns for Reliability and Availability
Dec 21, 2023
“Zero downtime” and “always-on” are illusions. All systems fail sooner or later, whether it’s a regional e-commerce website or a major cloud region hosting thousands of applications. That’s why, instead of chasing these illusions, it’s worth focusing on the nines of availability.
Watch Video
Graph in Kubernetes Panel – Wey Gu, Cheukting Ho & Feynman Zhou
Dec 21, 2023
Graph databases are the fastest growing data store in the world. According to Gartner, the application of graph processing and graph DBMSs will grow at 100 percent annually through 2022 to continuously accelerate data preparation and enable more complex and adaptive data science. However, it is often difficult for data and analytics professionals to distinguish between different implementation models, and to fit them to their use case. This panel will speak directly to Kubernetes users and provide them with the context they need to run stateful workloads.
Watch Video
Growing up fast: Kubernetes and Real-Time Analytic Applications – Robert Hodges (DoK Day EU 2022)
Dec 21, 2023
Kubernetes is turning into a preferred platform for real-time analytic apps that crunch billions of events per day and return insights in seconds. In this talk we'll introduce the standard analytic app design pattern of fast event streams coupled with low-latency data warehouses, using open source projects. We'll then walk through deploying the pipeline on Kubernetes from ingest to end user access. We'll touch on use of operators, scaling, monitoring, upgrade, security, and approaches to adding custom components. Attendees can expect to leave with concrete lessons about how to stand up low-latency analytics quickly on Kubernetes.
Watch Video
How to protect your data – Sarah Julia Kriesch (DoK Day EU 2022)
Dec 21, 2023
How can you keep your data secure, and how can you transfer it in a secure way? You will learn to encrypt your data so that you can use it with Kubernetes in a multi-cloud environment.
Watch Video
Leveraging Running Stateful Workloads on Kubernetes for the Benefit of Developers (DoK Day EU 2022)
Dec 21, 2023
Kubernetes comes with a lot of useful features like Volumes and StatefulSets, which make running stateful workloads simple. Interestingly, when combined with the right tools, these features can make Kubernetes very valuable for developers wanting to run massive production databases in development! This is exactly what was seen at "Extendi".
Watch Video
Is your database in Kubernetes production ready? – Mykola Marzhan (DoK Day EU 2022)
Dec 21, 2023
"It only looks simple to run databases in Kubernetes. In fact, it is too many things needed to be considered before running any database in Kubernetes.
Failover and traffic switching, replication and data consistency/loss after failover, upgrades, DB and node-level configuration, CNI, backups, monitoring, etc."
Watch Video
Disaggregated Container Attached Storage – Yet Another Topology with What Purpose? – Nick Connolly
Dec 21, 2023
The storage topology in vogue seems to cycle every few years. Internal storage is followed by centralized Storage Area Networks only to be superseded by one-size-fits-all Hyperconverged models - until scalability constraints led to distributed storage. Then comes NVMe, offering blistering speeds that all of these storage stacks struggle with. Kubernetes inspires Container Attached Storage aspiring to be the perfect model, so why is disaggregated storage now making an appearance?
Watch Video
Microservices and Kubernetes for your Full Data Lifecycle – Steve Pousty (DoK Day EU 2022)
Dec 21, 2023
Data doesn’t magically appear in our data centers. There are usually several phases and several storage locations along its journey throughout your organization. New architectural patterns, such as microservices, and new technology, such as Kubernetes are changing how we can think about and manage the large volumes of data coming at us.
Watch Video
What’s New in Kubernetes Storage – Xing Yang (DoK Day EU 2022)
Dec 21, 2023
Kubernetes SIG Storage is responsible for ensuring storage is available for containers in a pod when the pod is scheduled on a node. There is the Container Storage Interface (CSI) for block and file storage that allows storage providers to write CSI drivers. There is also a COSI sub-project that is trying to add object storage support in Kubernetes. In this session, Xing will give an update on some of the features that SIG Storage is working on and discuss what might be coming in the future.
Watch Video
Operating FoundationDB on Kubernetes – Johannes M. Scheuermann (DoK Day EU 2022)
Dec 21, 2023
FoundationDB is an open-source distributed transactional key-value store that is used by multiple companies like Apple, Snowflake and VMware Tanzu (previously Wavefront).
Watch Video
Using Kubernetes to deliver a “serverless” service – Jim Walker (DoK Day EU 2022)
Dec 21, 2023
Serverless promises to change the way we consume software. It allows us to potentially pay for only that which we use and can help drive down operational costs to the minimal amount of resources necessary.
Watch Video
Protecting data with CSI Volume Snapshots on Kubernetes – Grant Griffiths (DoK Day EU 2022)
Dec 21, 2023
The container storage interface (CSI) is a contract between different container orchestrators (Kubernetes, Nomad, etc) and storage plugins. This contract is a set of gRPC services for provisioning, utilizing, and snapshotting storage volumes. In this talk, we will focus on one aspect of the CSI spec: Volume Snapshots.
Watch Video
The many uses of Kubernetes cross cluster migration of persistent data – Ryan Kaw (DoK Day EU 2022)
Dec 21, 2023
Multiple clusters exist in most Kubernetes environments today, and the number of clusters will increase over time. There are many reasons for having multiple Kubernetes clusters: for example, overcoming scale limits, reducing complexity, geo separation, redundancy, and having separate production, staging, and development environments. Once you have multiple K8S clusters, it can be useful to have the ability to easily move or duplicate workloads across these different clusters. Kubernetes does not have a native method to allow migration or duplication of workloads across clusters.
Watch Video
Resilient Redis – Hrittik Roy & Ryan Gray (DoK Day EU 2022)
Dec 21, 2023
Redis is a widely used open-source in-memory data store and cache that has become a key component in the development of scalable microservice systems. While all of the main cloud providers offer fully managed Redis services (Amazon ElastiCache, Azure Cache for Redis, and GCP Memorystore), it can also be deployed in Kubernetes with little effort if you require additional control over the Redis configurations.
Watch Video
Running Kafka on Kubernetes, across three clouds at Adobe – Adi Muraru (DoK Day EU 2022)
Dec 21, 2023
Adobe runs dozens of Kafka clusters spread across both public (AWS and Azure) and private clouds to power the Adobe Experience Platform message bus.
Watch Video
Why are Operators paramount to running stateful workloads on Kubernetes?
Dec 21, 2023
In this panel, Sylvain Kalache, Head of Content at the DoK Community, drives a conversation featuring Nic Vermandé, Principal Developer Advocate at Ondat; Julian Fischer, CEO at anynines; and Sergey Pronin, Group Product Manager at Percona.
Watch Video
Data on K8s – Where are we now? // Álvaro Hernández, CEO of OnGres
Dec 21, 2023
Álvaro Hernández has been with our community since the beginning. He's seen where we've been, and also has a vision about where we're going.
Watch Video
What are customers’ concerns when running data on Kubernetes? // Álvaro Hernández- CEO of Ongres
Dec 21, 2023
We've crossed the chasm, but what are customers' current concerns when it comes to running data on Kubernetes?
Watch Video
What is data observability on Kubernetes? // Álvaro Hernández, CEO of Ongres
Dec 21, 2023
Observability is a hot topic for SREs in the Kubernetes ecosystem. But what does it mean in the context of data?
Watch Video
Why run stateful workloads on Kubernetes? Sathya Sankaran
Dec 21, 2023
Sathya Sankaran, COO of Catalogic and GM of CloudCasa by Catalogic, lets us know what value end users are getting by running stateful workloads on Kubernetes.
Watch Video
Day 2 Kubernetes- what are the challenges? Sathya Sankaran
Dec 21, 2023
Sathya Sankaran, COO at Catalogic and GM at CloudCasa by Catalogic, lets us know the difficulties that folks are facing when it comes to Day 2 Kubernetes.
Watch Video
Is data on k8s becoming boring (in a good way)? Jerome Petazzoni
Dec 21, 2023
We all want technologies that are "wild" to become tame and under control. Can the same thing happen for data on Kubernetes? Let's see what Jerome Petazzoni has to say about it.
Watch Video
What does the new operator for Postgres do? – Gabriele Bartolini EDB
Dec 21, 2023
Gabriele shares with us how they built the Postgres Operator for Kubernetes.
Watch Video
Building a Digital Factory for the Sheet Metal Industry – Elie Assi
Dec 21, 2023
We develop systems to digitize the sheet metal industry with the belief that they should cooperate with each other in an open way. We are convinced that the future lies in creating a software ecosystem that interconnects all levels of the company and even manages to communicate with supplier and customer systems, making for more agile management throughout the entire value chain.
Watch Video
Why we decided to migrate our Jaeger storage to ClickHouse on Kubernetes – Arul Jegadish Francis
Dec 21, 2023
We at OpsVerse provide a DevOps tools platform with fully-managed open source-based tools. One of our key offerings is a holistic observability platform. Metrics and logs are straightforward to aggregate, however traces – which are collected using CNCF Jaeger – were left with some holes in advanced insights.
Watch Video
Scaling our SaaS offering to thousands of clusters – Dax McDonald
Dec 21, 2023
Sourcegraph is a code intelligence platform that helps our customers to understand their code better. As we have scaled up, we are starting to run hundreds of instances for our customers in separate Kubernetes clusters.
Watch Video
The Challenges of Data Processing On Kubernetes: A look at Spark, Flink, Dask, and Ray – Holden Karau
Dec 21, 2023
This talk will go through both the improvements that have been made in Kubernetes for batch analytic workloads as well as some of the current pain experienced by users and developers moving their workloads to Kube. In this talk you will learn about how we “cheated” back in the YARN and Mesos days to make things go fast, why Kubernetes doesn’t like those cheats, and what some alternatives are.
Watch Video
Architecting Your First Event Driven Serverless Streaming Applications on K8 – Timothy Spann
Dec 21, 2023
Once you have built a topic in Apache Pulsar, you will quickly see the need to build event-driven applications. This can require a lot of decisions on what framework to use, where to run it, how to deploy it, and how to manage these applications on Kubernetes cloud natively.
Watch Video
Data streaming on Kubernetes – Yaniv Ben Hemo
Dec 21, 2023
I will cover the current landscape of data streaming on k8s, why it is important, its use cases, and the challenges that still need to be solved.
Watch Video
Databases on Kubernetes: Why are they important?
Dec 21, 2023
Watch Video
The Kubernetes Native Database – Jeffrey Carpenter
Dec 21, 2023
In the software industry we’re fond of terms that define major trends, like “cloud native”, “Kubernetes native” and “serverless”. As more and more organizations move stateful workloads to Kubernetes, we’ve started to see these terms applied to data infrastructure, where they can get overtaken by marketing hype unless we work to define them.
Watch Video
Open Source Databases on Kubernetes: Best Practices – Peter Zaitsev
Dec 21, 2023
So you’re looking to run your Open Source Database on Kubernetes. What best practices should you follow and what pitfalls should you avoid? In this presentation we will look at how to run stateful applications on Kubernetes overall as well as what is particularly important for databases - we will cover high availability, security, backups and disaster recovery.
Watch Video
Mastering MongoDB on Kubernetes, the power of operators – Arek Borucki
Dec 21, 2023
"NoSQL Databases on Kubernetes - considerations and best practices with a live demo.
MongoDB's natural capabilities like replication, sharding (partitioning data and holding different pieces in separate instances/pods) or failover (failing over from the master, read-write node to other read-only nodes, and promoting the read-only node as the master) can more easily deal with the uncertainty of heterogeneous cloud environments, which makes this database a good candidate to launch on a Kubernetes cluster."
Watch Video
Inter Cluster PostgreSQL on Kubernetes – Julian Fischer
Dec 21, 2023
In this talk you’ll explore how to run a PostgreSQL cluster across multiple Kubernetes clusters. Learn what challenges arise when using asynchronous streaming replication in a set of Kubernetes clusters spanning across several geographical regions.
Watch Video
Highly Available Postgres Clusters In Kubernetes – John Long & Jonathan Gonzalez
Dec 21, 2023
A practical session about running Highly Available PostgreSQL in Kubernetes. The primary objective will be to demonstrate how to set up a reliable architecture in a Kubernetes cluster to achieve low RTO and RPO.
Watch Video
Medical / Healthcare Data on Kubernetes – Olyvia Rakshit & Prasad Dorbala
Dec 21, 2023
Healthcare organizations are transforming their applications and embracing digital platforms for efficient patient care. Today, compute at the edge plays a critical role in deploying innovative healthcare applications that promise new approaches to patient care.
Watch Video
Shifting Left Stateful Applications In Kubernetes – Viktor Farcic
Dec 21, 2023
Stateless apps are easy to manage. More often than not, a Kubernetes Deployment, with a Service, Ingress, and Horizontal Pod Autoscaler (HPA) is enough. Almost everyone can do it. But, when it comes to stateful applications, things become a bit more complicated. We might need a database and storage. We might need to manage database users and schema. We might need to consider quite a few other things. Stateful apps are harder for everyone, especially if we want to shift left and enable developers to do it themselves.
Watch Video
Kubernetes 360º – Data driven observability – from Secrets to logs – Ben Hirschberg
Dec 21, 2023
If there’s one thing that everyone can agree on - it’s that the sheer scale and complexity of Kubernetes operations is growing constantly. What’s more, cloud native environments are becoming more and more expensive to operate and manage, as well as increasingly difficult to secure. On the bright side, there is a growing ecosystem of exceptional open source tools to help overcome this complexity, and provide greater situational awareness to what’s happening in your many and multiple Kubernetes clusters.
Watch Video
Choosing Kubernetes for Stateful Applications – Akshay Ram & Peter Schuurman
Dec 21, 2023
Learn how customers are increasingly deploying stateful applications on Kubernetes to benefit from portability, economies of scale, and built-in orchestration capabilities. This talk will cover how customers choose between using Kubernetes and a data Software as a Service (SaaS) offering, as well as the stateful capabilities of Kubernetes across two dimensions: application orchestration and the storage layer. Also learn about MariaDB SkySQL, a database software as a service that runs thousands of StatefulSet Pods across multiple zones and regions on Kubernetes.
Watch Video
Formula 1 telemetry processing using Apache Kafka on Kubernetes – Paolo Patierno
Dec 21, 2023
Apache Kafka is the de facto data streaming platform used for ingesting vast amounts of data and processing them in real time. Low-latency analytics are vital if users are to react to events as fast as possible and to effectively shape future decision making. Together with Kubernetes, it makes it possible to develop cloud-oriented analytics solutions that are highly scalable.
Watch Video
Are StatefulSets broken? Michael Guarino, CTO of Plural.sh
Dec 21, 2023
Are StatefulSets broken? Michael Guarino is no stranger to Kubernetes, and he's seen how it has developed over time to be "friendlier" when it comes to running stateful workloads with features like StatefulSets.
Watch Video
Stateful Apps in a Multicloud Era- Yves Weisser
Dec 21, 2023
"More & more companies use several environments to host their applications.
Sometimes an application will be developed in a datacenter & moved to production in the cloud, or vice versa."
Watch Video
What are customers’ challenges when running data on k8s? Joe Gardiner DoK Talks #156
Dec 21, 2023
Joe Gardiner (Director of Cloud Native Architecture - EMEA at Pure Storage) has been working with customers and helping them solve their data challenges for years. So what are their concerns when it comes to running stateful workloads on Kubernetes?
Watch Video
DoK Report 2022- DoKC Director Melissa Logan and Stephanie Fairchild of ClearPath Strategies
Dec 20, 2023
Our 2022 report features insights from over 500 executives and technology leaders on how data on Kubernetes has a transformative impact on organizations, regardless of size or tech maturity.
Watch Video
DoK Community Talks: Intro to Why Data Matters
Dec 20, 2023
"Chapter 1: Intro to Why Data Matters
Lisa-Marie Namphy, Head of Developer Relations at Cockroach Labs, and Sam Ramji, Chief Strategy Officer at DataStax, sit down with DoKC to discuss why data matters and what the future of data looks like."
Watch Video
How did South America’s biggest ecommerce store tackle data on K8s?
Dec 20, 2023
"When developers don't have access to data, how can they make informed decisions?
Ramiro Berrelleza is the CTO of Okteto, and he shared a case study of what life was like for the largest ecommerce store in South America before they leveraged the benefits of running data on Kubernetes."
Watch Video
What is Kafka? The rise of one of the world’s most used streaming data technologies w/Abbey Russell
Dec 20, 2023
"Abbey Russell, PM at Cockroach Labs, shared the backstory on how and why Kafka was created.
Along the way, you'll learn about
- Who Franz Kafka was
- Kafka's earliest use at LinkedIn in 2010
- Why organizations like Uber/Coursera/Mailchimp use it today
- Future of Data Streaming"
Watch Video
Operators 101 – Uma Dhatri
Dec 20, 2023
What are operators? Are they like the old timey ladies at the telephone exchange? Let's find out together
Watch Video
Exploring The Power of Autoscaling – Aditya Tomar
Dec 20, 2023
In this talk you are going to gain insight into machine learning, Kubernetes, and how autoscaling helps organizations.
Watch Video
MongoDB Goes to K8s: A Wild Adventure with Operators – Ritesh Karankal
Dec 20, 2023
Hold onto your hats, folks, because we're about to explore why we need an operator to run MongoDB on K8s, how operators work, and what they can do for you, all while enjoying the ride with our trusty sidekick, the Kubernetes Operator.
Watch Video
Kubernetes: The Ultimate Platform for Streamlining Data Streaming- Yash Pimple
Dec 20, 2023
Watch Video
12 chapters of Data on Kubernetes – Atharv Karajgi
Dec 20, 2023
Watch Video
Rook – Helping the Kubernetes Storage Community Thrive
Dec 20, 2023
Rook is an open source cloud-native storage operator, providing support for Ceph to natively integrate with Kubernetes. An introduction to Rook will show how Rook configures Ceph to provide stable block, shared file system, and object storage for your production data. Rook recently joined DoK as a community sponsor. Let’s have a discussion about how we can help the K8s storage community thrive. Rook was accepted as a graduated project by the CNCF in October 2020.
Watch Video
DoK @ Comcast – Deliver Business Outcomes & Improved DevX with Data Services on K8s
Dec 20, 2023
DoK @ Comcast: Delivering Business Outcomes & Improved DevX with Data Services Running on Kubernetes
Presented by Greg Otto, Executive Director, DevX Platforms & Charles Ju, Principal Engineer. Transforming how to deliver measurable value using data on Kubernetes, while providing psychological safety. In this talk, we will share our transformation journey, the “Months to Minutes” outcomes we achieved, the architecture approach, and the human journey from one of our engineers.
Watch Video
DoK + Apache Spark
Dec 20, 2023
Presented by Holden Karau, Spark Committer and Open Source Engineer at Netflix. In this brief talk, Holden covers some of the best practices learned from deploying both small- and large-scale Spark on Kubernetes.
Watch Video
Unsticking Ourselves from Glue: Migrating PayIt’s Data Pipelines to Argo and Hera
Dec 20, 2023
At PayIt, we’ve been deploying applications to Kubernetes almost since the beginning of the company. Our data workloads, however, have run instead in AWS Glue. This has worked well enough for the reporting use cases that have been the main focus of this team historically. However, at the beginning of 2022, the PayIt data team began building out a new data platform, and in the process, ran into a number of challenges with Glue. In this talk, I will share the difficulties that we encountered with building, deploying, and orchestrating ETL pipelines in AWS Glue, our decision process for moving those workloads into Kubernetes, and the ELT architecture that we’ve arrived at today.
Watch Video
Repel Boarders! How to find a Kubernetes operator that really protects your data
Dec 20, 2023
Operators are a godsend for managing data in Kubernetes. But how about protecting it? We'll explore security threats to cloud native databases and show what protection you should look for in operators. Finally, we'll introduce a new Data on Kubernetes Community project to develop security standards for database operators in Kubernetes.
Watch Video
Implementing Data & Databases on K8s within the Dutch Government
Dec 20, 2023
A short walkthrough of projects within the Dutch government running databases on OpenShift. This talk shares success stories, provides a proven recipe to “get it done”, and debunks some of the FUD.
Watch Video
Persistence at the Edge for Thousands of Chick-fil-A Restaurants
Dec 20, 2023
Kubernetes is being deployed outside of cloud and datacenter environments, at the Edge. In this session, you will learn how Chick-fil-A has been running Kubernetes in ~2,800 restaurants for the past 4.5 years. We'll discuss why this is necessary and useful, what types of data are being used, what our approach to persistence is, and what tradeoffs we have made between persistence guarantees and complexity of solution.
Watch Video
Get started with AI on AWS with MLFlow and Notebooks on K8s
Dec 15, 2023
In this hands-on workshop, we’ll run an end-to-end project for beginners using open-source machine learning tools on the public cloud. Anyone can follow along by consulting the existing documentation and following the steps we provide.
Watch Video
Batch Workloads in Multi-tenant Environment with Apache YuniKorn
Nov 10, 2023
By Sunil Govindan, Wilfred Spiegelenburg
You will get an introduction to Apache YuniKorn, an open-source resource scheduler that aims to redefine resource scheduling in the cloud, and ultimately learn how to schedule large-scale Apache Spark jobs efficiently on Kubernetes in the cloud.
Watch Video
DoKC Town Hall #1 – Comcast and Netflix
Nov 10, 2023
By Greg Otto, Charles Ju, Holden Karau
This video features talks from both Comcast and Netflix. Learn how both gained value in running data on Kubernetes.
Watch Video
DoKC Town Hall #2 – PayItGov & Altinity
Nov 10, 2023
By Robert Hodges, Altinity // Matt Menzenski, Payitgov
This month's town hall featured two speakers. Hear from Robert Hodges of Altinity about how to find a Kubernetes operator that really protects your data. Matt Menzenski of PayItGov shares the difficulties his team encountered with building, deploying, and orchestrating ETL pipelines in AWS Glue, their decision process for moving those workloads into Kubernetes, and the ELT architecture they have arrived at.
Watch Video