Schedule now LIVE for DoK Day at KubeCon Paris | March 19, 2024

DoK Library

Learn from data on Kubernetes (DoK) practitioners sharing best practices.

Adding Zonal Resiliency to Etsy’s Kafka Cluster | DoKC Town Hall

Feb 21, 2024
video icon
In this talk, Kamya Shethia from Etsy discusses the changes made to add zonal resiliency to the Kafka cluster, which enabled the team to speed up its update process.

An Introduction to Custom Resource Definitions and Custom Resources (Operators 101: Part 2)

Feb 06, 2024
article icon
In part 2 of his series "Operators 101", Steve Sklar goes into detail about Custom Resource Definitions, including some best practices when designing them and golang-based tooling used to generate them.

DoK Database Patterns Whitepaper

Jan 11, 2024
This whitepaper describes the patterns of running data on Kubernetes with a focus on database applications. It covers the attributes of a storage system and how they affect database applications, how different storage stacks affect these attributes, the differences between running data inside and outside of Kubernetes, the characteristics of Kubernetes that benefit data workloads, and the best practices and lessons learned from running data on Kubernetes.

What Are Kubernetes Operators? (Operators 101: Part 1)

Jan 05, 2024
article icon
This is part 1 of a series by Steven Sklar, Operators 101, where he teaches readers how to design, build, and deploy Kubernetes operators that can automate the management of your own unique applications.

1000 node Cassandra cluster on Amazon’s EKS? – Matt Overstreet (DoK Day EU 2022)

Dec 21, 2023
video icon
Come hear about our experience scaling Cassandra on EKS to over 1000 nodes and 20 million transactions per second. This session will cover the lessons learned, successes, failures, and tools used to get there.

The Data on Kubernetes Landscape – Melissa Logan (DoK Day EU 2022)

Dec 21, 2023
video icon
We know from the first Data on Kubernetes Report that 90% of respondents believe Kubernetes is ready for stateful workloads, but significant challenges remain. The DoK Community continues to grow and build a unique space where people share knowledge and have conversations that are shaping the next decade of data on Kubernetes.

Bringing Apache Cassandra closer to Kubernetes – Jake Luciani (DoK Day EU 2022)

Dec 21, 2023
video icon
What does Kubernetes provide that allows us to reduce the complexity of Apache Cassandra while making it better suited for cloud native deployments? That was the question we started with as we began a mission to bring Cassandra closer to Kubernetes and eliminate the redundancy.

Operator Lifecycle Management – Julian Fischer (DoK Day EU 2022)

Dec 21, 2023
video icon
"The ability to extend Kubernetes with Custom Resource Definitions and respective controllers has led to the OperatorSDK, which became the de facto standard for data service automation on Kubernetes. There are countless operator implementations available, and new operators are being released on a daily basis."

What we’ve learned from running a PostgreSQL managed service on Kubernetes – Oleksii Kliukin

Dec 21, 2023
video icon
Kubernetes is an emerging platform of choice for deploying and running PostgreSQL. Deploying 100 Postgres clusters is as easy as deploying one, and there is no need to tinker with tools like Ansible or Puppet. Resource sharing can be applied when it makes sense, allowing multiple Postgres databases to run in isolation on a single instance, each storing its data on a dedicated persistent volume.

Why run Postgres in Kubernetes? – Gabriele Bartolini (DoK Day EU 2022)

Dec 21, 2023
video icon
"Postgres should run inside your Kubernetes cluster. Yes, inside, not outside Kubernetes. After all, a database should be seen as an application, a special type of application - for which it is legitimate to require an additional level of care and attention."

The future of data on Kubernetes with Adobe and CNCF – Joseph Sandoval, Xing Yang & Sylvain Kalache

Dec 21, 2023
video icon
Some data-intensive workloads are easier to run in Kubernetes than others. Why? What needs to improve? Join us as we deep dive with Adobe and the CNCF about how easy (or not) it is to run different types of data workloads on Kubernetes – and what is being done both inside and outside of Kubernetes to make data workloads easier.

From Laptop to Cloud: Developing Cloud-Native Applications with Containerized Databases – N.Vermandé

Dec 21, 2023
video icon
With the advent of microservices in Kubernetes, individual developer teams now manage their own data, middleware, and databases. Automated tests and CI/CD pipelines have to be revisited to include these new requirements.

Testing the Mettle: Evaluating data solutions for large-scale production to check who stacks up

Dec 21, 2023
video icon
The number of CNCF storage options has exploded in the past few years, but if you had to choose a project to use today, how would you go about comparing each offering and choosing who to partner with for your future growth?

Autoscaling Stateful Workloads in Kubernetes – Mohammad Fahim Abrar & Md Kamol Hasan (DoK Day EU 22)

Dec 21, 2023
video icon
Managing stateful workloads in a containerized environment has always been a concern. However, as Kubernetes developed, the whole community worked hard to bring stateful workloads to meet the needs of their enterprise users.

PV TrashCan – Protection against accidental deletion of PVs or Namespaces (DoK Day EU 2022)

Dec 21, 2023
video icon
Accidentally deleting a PVC or namespace can cause the Persistent Volume to be deleted. Such volumes lose their data, and the stateful applications lose their state. With Persistent Volume TrashCan, users get a grace period to undo such an unintended delete operation.

Build your own social media analytics with Apache Kafka – Jakub Scholz (DoK Day EU 2022)

Dec 21, 2023
video icon
Apache Kafka is more than just a messaging broker. It has a rich ecosystem of different components. There are connectors for importing and exporting data, different stream processing libraries, schema registries and a lot more. This talk will show how to use it to read data from social networks such as Twitter, process them and use machine learning to analyze them. And all of it will be of course running on top of Kubernetes.

One Click to Run Apache Spark as a Service on Kubernetes – Bo Yang (DoK Day EU 2022)

Dec 21, 2023
video icon
It is still challenging to run Apache Spark and other big data processing workloads on Kubernetes, especially at large scale. People need to address various issues like resource isolation, queuing, and cost efficiency. This session will share details about those challenges and how to address them. We will also present a convenient (one-click) way to deploy Apache Spark on Kubernetes and dramatically lower the barrier to using Spark.

Datashim – a framework for declarative management of datasets on Kubernetes – Srikumar Venugopal

Dec 21, 2023
video icon
Many ML pipelines depend on shared filesystems for input, output and intermediate data storage. Standards such as CSI have made it possible for applications in Kubernetes to access a variety of data storage systems. Yet, data scientists still have to deal with low-level details of data access in order to execute their pipelines in Kubernetes. Datashim is a framework that manages the lifecycle of a Dataset object, a CustomResourceDefinition that represents a source of data. Datashim takes care of the details of data access while Kubernetes pods can declaratively access the data by referencing a Dataset in their specifications. This talk will describe Datashim and the Dataset object, discuss its use in ML pipelines, and demonstrate how its pluggable architecture is designed for the development of caching, scheduling and governance plugins. Datashim is an incubating project of the Linux Foundation Data and AI Foundation

Kanister & Kopia: An Open-Source Data Protection Match Made in Heaven – Pavan Navarathna

Dec 21, 2023
video icon
Cloud-native applications comprise various components, including data services, storage systems, and related Kubernetes objects. Each component requires its own data protection tools, strategy, and domain expertise. A robust solution aligned with business requirements often involves complex workflows. What if there was a way to coordinate the implementation of these workflows while optimizing how backups are moved into storage?

Weathering The Cloud Storm: Modern Data Management Patterns for Reliability and Availability

Dec 21, 2023
video icon
“Zero downtime” and “always-on” are illusions. All systems fail sooner or later, whether it’s a regional e-commerce website or a major cloud region hosting thousands of applications. That’s why, instead of chasing these illusions, it’s worth focusing on the nines of availability.
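
As a rough illustration of what "the nines of availability" means in practice, the sketch below converts an availability target into a yearly downtime budget. It is not taken from the talk; the function name `downtime_budget_seconds` is hypothetical, and a 365-day year is assumed.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a non-leap year


def downtime_budget_seconds(nines: int) -> float:
    """Yearly downtime allowed by an availability target of `nines` nines.

    Two nines means 99%, three nines 99.9%, and so on, so the
    unavailable fraction of the year is 10 ** -nines.
    """
    if nines < 1:
        raise ValueError("nines must be a positive integer")
    return SECONDS_PER_YEAR / 10 ** nines


if __name__ == "__main__":
    # Print the budget for two through five nines.
    for n in range(2, 6):
        hours = downtime_budget_seconds(n) / 3600
        print(f"{n} nines: about {hours:.2f} hours of downtime per year")
```

Under these assumptions, three nines leaves 31,536 seconds (8.76 hours) of downtime per year, and each additional nine divides that budget by ten.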

Graph in Kubernetes Panel – Wey Gu, Cheukting Ho & Feynman Zhou

Dec 21, 2023
video icon
Graph databases are the fastest growing data store in the world. According to Gartner, the application of graph processing and graph DBMSs will grow at 100 percent annually through 2022 to continuously accelerate data preparation and enable more complex and adaptive data science. However, it is often difficult for data and analytics professionals to distinguish between different implementation models, and to fit them to their use case. This panel will speak directly to Kubernetes users and provide them with the context they need to run stateful workloads.

Growing up fast: Kubernetes and Real-Time Analytic Applications – Robert Hodges (DoK Day EU 2022)

Dec 21, 2023
video icon
Kubernetes is turning into a preferred platform for real-time analytic apps that crunch billions of events per day and return insights in seconds. In this talk we'll introduce the standard analytic app design pattern of fast event streams coupled with low-latency data warehouses, using open source projects. We'll then walk through deploying the pipeline on Kubernetes from ingest to end user access. We'll touch on use of operators, scaling, monitoring, upgrade, security, and approaches to adding custom components. Attendees can expect to leave with concrete lessons about how to stand up low-latency analytics quickly on Kubernetes.

How to protect your data – Sarah Julia Kriesch (DoK Day EU 2022)

Dec 21, 2023
video icon
How can you keep your data secure, and how can you transfer it in a secure way? You will learn to encrypt your data so that you can use it on Kubernetes in a multi-cloud environment.

Leveraging Running Stateful Workloads on Kubernetes for the Benefit of Developers (DoK Day EU 2022)

Dec 21, 2023
video icon
Kubernetes comes with a lot of useful features like Volumes and StatefulSets, which make running stateful workloads simple. Interestingly, when combined with the right tools, these features can make Kubernetes very valuable for developers wanting to run massive production databases in development! This is exactly what was seen at "Extendi".

Is your database in Kubernetes production ready? – Mykola Marzhan (DoK Day EU 2022)

Dec 21, 2023
video icon
"It only looks simple to run databases in Kubernetes. In fact, it is too many things needed to be considered before running any database in Kubernetes. Failover and traffic switching, replication and data consistency/loss after failover, upgrades, DB and node-level configuration, CNI, backups, monitoring, etc."

Disaggregated Container Attached Storage – Yet Another Topology with What Purpose? – Nick Connolly

Dec 21, 2023
video icon
The storage topology in vogue seems to cycle every few years. Internal storage is followed by centralized Storage Area Networks, only to be superseded by one-size-fits-all Hyperconverged models, until scalability constraints lead to distributed storage. Then comes NVMe, offering blistering speeds that all of these storage stacks struggle with. Kubernetes inspires Container Attached Storage aspiring to be the perfect model, so why is disaggregated storage now making an appearance?

Microservices and Kubernetes for your Full Data Lifecycle – Steve Pousty (DoK Day EU 2022)

Dec 21, 2023
video icon
Data doesn’t magically appear in our data centers. There are usually several phases and several storage locations along its journey throughout your organization. New architectural patterns, such as microservices, and new technology, such as Kubernetes, are changing how we can think about and manage the large volumes of data coming at us.

What’s New in Kubernetes Storage – Xing Yang (DoK Day EU 2022)

Dec 21, 2023
video icon
Kubernetes SIG Storage is responsible for ensuring storage is available for containers in a pod when the pod is scheduled on a node. There is the Container Storage Interface (CSI) for block and file storage that allows storage providers to write CSI drivers. There is also a COSI sub-project that is trying to add object storage support in Kubernetes. In this session, Xing will give an update on some of the features that SIG Storage is working on and discuss what might be coming in the future.

Operating FoundationDB on Kubernetes – Johannes M. Scheuermann (DoK Day EU 2022)

Dec 21, 2023
video icon
FoundationDB is an open-source distributed transactional key-value store that is used by multiple companies like Apple, Snowflake and VMware Tanzu (previously Wavefront).

Using Kubernetes to deliver a “serverless” service – Jim Walker (DoK Day EU 2022)

Dec 21, 2023
video icon
Serverless promises to change the way we consume software. It allows us to potentially pay for only that which we use and can help drive down operational costs to the minimal amount of resources necessary.

Protecting data with CSI Volume Snapshots on Kubernetes – Grant Griffiths (DoK Day EU 2022)

Dec 21, 2023
video icon
The container storage interface (CSI) is a contract between different container orchestrators (Kubernetes, Nomad, etc) and storage plugins. This contract is a set of gRPC services for provisioning, utilizing, and snapshotting storage volumes. In this talk, we will focus on one aspect of the CSI spec: Volume Snapshots.
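
To make the Volume Snapshots idea concrete, here is a minimal sketch (not from the talk) that builds a `VolumeSnapshot` manifest for the `snapshot.storage.k8s.io/v1` API and prints it as JSON, which `kubectl apply -f` also accepts. The helper name and the object names (`pg-data-snap`, `pg-data`, `csi-snapclass`) are hypothetical, and the sketch assumes the cluster has the snapshot CRDs, the snapshot controller, and a CSI driver that supports snapshots.

```python
import json


def volume_snapshot_manifest(name: str, pvc_name: str,
                             snapshot_class: str,
                             namespace: str = "default") -> dict:
    """Build a VolumeSnapshot object that snapshots an existing PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The class selects the CSI driver and its snapshot parameters.
            "volumeSnapshotClassName": snapshot_class,
            # The PVC must live in the same namespace as the snapshot.
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


if __name__ == "__main__":
    # All names below are placeholders for illustration only.
    manifest = volume_snapshot_manifest("pg-data-snap", "pg-data", "csi-snapclass")
    print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and applying it with `kubectl apply -f` would ask the snapshot controller to snapshot the named PVC through the CSI driver behind the chosen class.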

The many uses of Kubernetes cross cluster migration of persistent data – Ryan Kaw (DoK Day EU 2022)

Dec 21, 2023
video icon
Multiple clusters exist in most Kubernetes environments today, and the number of clusters will increase over time. There are many reasons for having multiple Kubernetes clusters, for example, overcoming scale limits, reducing complexity, geo separation, redundancy and having separate production, staging, and development environments. Once you have multiple K8s clusters, it can be useful to have the ability to easily move or duplicate workloads across these different clusters. Kubernetes does not have a native method to allow migration or duplication of workloads across clusters.

Resilient Redis – Hrittik Roy & Ryan Gray (DoK Day EU 2022)

Dec 21, 2023
video icon
Redis is a widely used open-source in-memory data store and cache that has become a key component in the development of scalable microservice systems. While all of the main cloud providers offer fully managed Redis services (Amazon ElastiCache, Azure Cache for Redis, and GCP Memorystore), it can also be deployed simply in Kubernetes if you require additional control over the Redis configuration.

Running Kafka on Kubernetes, across three clouds at Adobe – Adi Muraru (DoK Day EU 2022)

Dec 21, 2023
video icon
Adobe runs dozens of Kafka clusters spread across both public (AWS and Azure) and private clouds to power the Adobe Experience Platform message bus.

Why are Operators paramount to running stateful workloads on Kubernetes?

Dec 21, 2023
video icon
In this panel, Sylvain Kalache, Head of Content at the DoK Community, drives a conversation featuring Nic Vermandé, Principal Developer Advocate at Ondat; Julian Fischer, CEO at anynines; and Sergey Pronin, Group Product Manager at Percona.

Data on K8s – Where are we now? // Álvaro Hernández, CEO of OnGres

Dec 21, 2023
video icon
Álvaro Hernández has been with our community since the beginning. He's seen where we've been, and also has a vision about where we're going.

What are customers’ concerns when running data on Kubernetes? // Álvaro Hernández, CEO of OnGres

Dec 21, 2023
video icon
We've crossed the chasm, but what are customers' current concerns when it comes to running data on Kubernetes?

What is data observability on Kubernetes? // Álvaro Hernández, CEO of OnGres

Dec 21, 2023
video icon
Observability is a hot topic for SREs in the Kubernetes ecosystem. But what does it mean in the context of data?

Why run stateful workloads on Kubernetes? Sathya Sankaran

Dec 21, 2023
video icon
Sathya Sankaran, COO of Catalogic and GM of CloudCasa by Catalogic, lets us know what value end users are getting by running stateful workloads on Kubernetes.

Day 2 Kubernetes- what are the challenges? Sathya Sankaran

Dec 21, 2023
video icon
Sathya Sankaran, COO at Catalogic and GM at CloudCasa by Catalogic, lets us know the difficulties that folks are facing when it comes to Day 2 Kubernetes.

Is data on k8s becoming boring (in a good way)? Jerome Petazzoni

Dec 21, 2023
video icon
We all want technologies that are "wild" to become tame and under control. Can the same thing happen for data on Kubernetes? Let's see what Jerome Petazzoni has to say about it.

What does the new operator for Postgres do? – Gabriele Bartolini EDB

Dec 21, 2023
video icon
Gabriele shares with us how they built the Postgres Operator for Kubernetes.

Building a Digital Factory for the Sheet Metal Industry – Elie Assi

Dec 21, 2023
video icon
We develop systems to digitize the sheet metal industry with the belief that they should cooperate with each other in an open way. We are convinced that the future lies in creating a software ecosystem that interconnects all levels of the company and even manages to communicate with supplier and customer systems, making for more agile management throughout the entire value chain.

Why we decided to migrate our Jaeger storage to ClickHouse on Kubernetes – Arul Jegadish Francis

Dec 21, 2023
video icon
We at OpsVerse provide a DevOps tools platform with fully-managed open source-based tools. One of our key offerings is a holistic observability platform. Metrics and logs are straightforward to aggregate; however, traces, which are collected using CNCF Jaeger, were left with some holes in advanced insights.

Scaling our SaaS offering to thousands of clusters – Dax McDonald

Dec 21, 2023
video icon
Sourcegraph is a code intelligence platform that helps our customers to understand their code better. As we have scaled up, we are starting to run hundreds of instances for our customers in separate Kubernetes clusters.

The Challenges of Data Processing On Kubernetes: A look at Spark, Flink, Dask, and Ray – Holden Karau

Dec 21, 2023
video icon
This talk will go through both the improvements that have been made in Kubernetes for batch analytic workloads as well as some of the current pain experienced by users and developers moving their workloads to Kube. In this talk you will learn about how we “cheated” back in the YARN and Mesos days to make things go fast, why Kubernetes doesn’t like those cheats, and what some alternatives are.

Architecting Your First Event Driven Serverless Streaming Applications on K8 – Timothy Spann

Dec 21, 2023
video icon
Once you have built a topic in Apache Pulsar, you will quickly see the need to build event-driven applications. This can require a lot of decisions on what framework to use, where to run it, how to deploy it, and how to manage these applications on Kubernetes cloud natively.

Data streaming on Kubernetes – Yaniv Ben Hemo

Dec 21, 2023
video icon
I will cover the current landscape of data streaming on K8s, why it is important, its use cases, and the challenges that remain to be solved.

Databases on Kubernetes: Why are they important?

Dec 21, 2023
video icon

The Kubernetes Native Database – Jeffrey Carpenter

Dec 21, 2023
video icon
In the software industry we’re fond of terms that define major trends, like “cloud native”, “Kubernetes native” and “serverless”. As more and more organizations move stateful workloads to Kubernetes, we’ve started to see these terms applied to data infrastructure, where they can get overtaken by marketing hype unless we work to define them.

Open Source Databases on Kubernetes: Best Practices – Peter Zaitsev

Dec 21, 2023
video icon
So you’re looking to run your Open Source Database on Kubernetes. What best practices should you follow and what pitfalls should you avoid? In this presentation we will look at how to run stateful applications on Kubernetes overall as well as what is particularly important for databases: high availability, security, backups and disaster recovery.

Mastering MongoDB on Kubernetes, the power of operators – Arek Borucki

Dec 21, 2023
video icon
"NoSQL Databases on Kubernetes - considerations and best practices with a live demo. MongoDB's natural capabilities like replication, sharding (partitioning data and holding different pieces in separate instances/pods) or failover (failing over from the master, read-write node to other read-only nodes, and promoting the read-only node as the master) can more easily deal with the uncertainty of heterogeneous cloud environments, which makes this database good candidate to launch on Kubernetes cluster."

Inter Cluster PostgreSQL on Kubernetes – Julian Fischer

Dec 21, 2023
video icon
In this talk you’ll explore how to run a PostgreSQL cluster across multiple Kubernetes clusters. Learn what challenges arise when using asynchronous streaming replication in a set of Kubernetes clusters spanning across several geographical regions.

Highly Available Postgres Clusters In Kubernetes – John Long & Jonathan Gonzalez

Dec 21, 2023
video icon
A practical session about running Highly Available PostgreSQL in Kubernetes. The primary objective will be to demonstrate how to set up a reliable architecture in a Kubernetes cluster to achieve low RTO and RPO.

Medical / Healthcare Data on Kubernetes – Olyvia Rakshit & Prasad Dorbala

Dec 21, 2023
video icon
Healthcare organizations are transforming their applications and embracing digital platforms for efficient patient care. Today, compute at the edge plays a critical role in deploying innovative healthcare applications that promise new approaches to patient care.

Shifting Left Stateful Applications In Kubernetes – Viktor Farcic

Dec 21, 2023
video icon
Stateless apps are easy to manage. More often than not, a Kubernetes Deployment, with a Service, Ingress, and Horizontal Pod Autoscaler (HPA) is enough. Almost everyone can do it. But, when it comes to stateful applications, things become a bit more complicated. We might need a database and storage. We might need to manage database users and schema. We might need to consider quite a few other things. Stateful apps are harder for everyone, especially if we want to shift left and enable developers to do it themselves.

Kubernetes 360º – Data driven observability – from Secrets to logs – Ben Hirschberg

Dec 21, 2023
video icon
If there’s one thing that everyone can agree on - it’s that the sheer scale and complexity of Kubernetes operations is growing constantly. What’s more, cloud native environments are becoming more and more expensive to operate and manage, as well as increasingly difficult to secure. On the bright side, there is a growing ecosystem of exceptional open source tools to help overcome this complexity, and provide greater situational awareness to what’s happening in your many and multiple Kubernetes clusters.

Choosing Kubernetes for Stateful Applications – Akshay Ram & Peter Schuurman

Dec 21, 2023
video icon
Learn how customers are increasingly deploying stateful applications on Kubernetes to benefit from portability, economies of scale, and built-in orchestration capabilities. This talk covers how customers choose between Kubernetes and a data Software as a Service (SaaS) offering, and the stateful capabilities of Kubernetes across two dimensions: application orchestration and the storage layer. Also learn about MariaDB SkySQL, a database software as a service that runs thousands of StatefulSet Pods across multiple zones and regions on Kubernetes.

Formula 1 telemetry processing using Apache Kafka on Kubernetes – Paolo Patierno

Dec 21, 2023
video icon
Apache Kafka is the de facto data streaming platform used for ingesting vast amounts of data and processing them in real time. Low-latency analytics are vital if users are to react to events as fast as possible and to effectively shape future decision making. Together with Kubernetes, it makes it possible to develop highly scalable, cloud-oriented analytics solutions.

Are StatefulSets broken? Michael Guarino, CTO of Plural.sh

Dec 21, 2023
video icon
Are StatefulSets broken? Michael Guarino is no stranger to Kubernetes, and he's seen how it has developed over time to be "friendlier" when it comes to running stateful workloads with features like StatefulSets.

Stateful Apps in a Multicloud Era- Yves Weisser

Dec 21, 2023
video icon
"More & more companies use several environments to host their applications. Sometimes an application will be developed in a datacenter & moved to production in the cloud, or vice versa."

What are customers’ challenges when running data on k8s? Joe Gardiner DoK Talks #156

Dec 21, 2023
video icon
Joe Gardiner (Director of Cloud Native Architecture - EMEA at Pure Storage) has been working with customers and helping them solve their data challenges for years. So what are their concerns when it comes to running stateful workloads on Kubernetes?

DoK Report 2022- DoKC Director Melissa Logan and Stephanie Fairchild of ClearPath Strategies

Dec 20, 2023
video icon
Our 2022 report features insights from over 500 executives and technology leaders on how data on Kubernetes has a transformative impact on organizations, regardless of size or tech maturity.

DoK Community Talks: Intro to Why Data Matters

Dec 20, 2023
video icon
"Chapter 1: Intro to Why Data Matters Lisa Marie-Namphy, Head of Developer Relations at Cockroach Labs and Sam Ramji, Chief Strategy Officers at DataStax sit down with DoKC to discuss why data matters and what the future of data looks like."

How did South America’s biggest ecommerce store tackle data on K8s?

Dec 20, 2023
video icon
"When developers don't have access to data, how can they make informed decisions? Ramiro Berrelleza is the CTO of Okteto, and he shared a case study of what life was like for the largest ecommerce store in South America before they leveraged the benefits of running data on Kubernetes. "

What is Kafka? The rise of one of the world’s most used streaming data technologies w/Abbey Russell

Dec 20, 2023
video icon
"Abbey Russell, PM at Cockroach Labs, shared the backstory on how and why Kafka was created. Along the way, you'll learn about - Who Franz Kafka was - Kafka's earliest use at Linkedin in 2010 - Why organizations like Uber/Coursera/Mailchimp use it today - Future of Data Streaming"

Operators 101 – Uma Dhatri

Dec 20, 2023
video icon
What are operators? Are they like the old timey ladies at the telephone exchange? Let's find out together.

Exploring The Power of Autoscaling – Aditya Tomar

Dec 20, 2023
video icon
In this talk you will gain insight into Machine Learning, Kubernetes, and how autoscaling helps organizations.

MongoDB Goes to K8s: A Wild Adventure with Operators – Ritesh Karankal

Dec 20, 2023
video icon
Hold onto your hats, folks, because we're about to explore why we need an operator to run MongoDB on K8s, how operators work, and what they can do for you, all while enjoying the ride with our trusty sidekick, the Kubernetes Operator.

Kubernetes: The Ultimate Platform for Streamlining Data Streaming- Yash Pimple

Dec 20, 2023
video icon

12 chapters of Data on Kubernetes – Atharv Karajgi

Dec 20, 2023
video icon

Rook – Helping the Kubernetes Storage Community Thrive

Dec 20, 2023
video icon
Rook is an open source cloud-native storage operator, providing support for Ceph to natively integrate with Kubernetes. An introduction to Rook will show how Rook configures Ceph to provide stable block, shared file system, and object storage for your production data. Rook recently joined DoK as a community sponsor. Let’s have a discussion about how we can help the K8s storage community thrive. Rook was accepted as a graduated project by the CNCF in October 2020.

DoK @ Comcast – Deliver Business Outcomes & Improved DevX with Data Services on K8s

Dec 20, 2023
video icon
"DoK @ Comcast: Delivering Business Outcomes & Improved DevX with Data Services Running on Kubernetes Presented by Greg Otto, Executor Director, DevX Platforms & Charles Ju, Principal Engineer. Transforming how to deliver measurable value using data on Kubernetes, while providing psychological safety. In this talk, we will share our transformation journey, the “Months to Minutes” outcomes we achieved, the architecture approach, and the human journey from one of our engineers."

DoK + Apache Spark

Dec 20, 2023
video icon
Presented by Holden Karau, Spark Committer and Open Source Engineer at Netflix. In this brief talk Holden will cover some of the best practices from trying to deploy both small and large scale Spark on Kube.

Unsticking Ourselves from Glue: Migrating PayIt’s Data Pipelines to Argo and Hera

Dec 20, 2023
video icon
At PayIt, we’ve been deploying applications to Kubernetes almost since the beginning of the company. Our data workloads, however, have run instead in AWS Glue. This has worked well enough for the reporting use cases that have been the main focus of this team historically. However, at the beginning of 2022, the PayIt data team began building out a new data platform, and in the process, ran into a number of challenges with Glue. In this talk, I will share the difficulties that we encountered with building, deploying, and orchestrating ETL pipelines in AWS Glue, our decision process for moving those workloads into Kubernetes, and the ELT architecture that we’ve arrived at today.

Repel Boarders! How to find a Kubernetes operator that really protects your data

Dec 20, 2023
video icon
Operators are a godsend for managing data in Kubernetes. But how about protecting it? We'll explore security threats to cloud native databases and show what protection you should look for in operators. Finally we'll introduce a new Data on Kubernetes Community project to develop security standards for database operators in Kubernetes.

Implementing Data & Databases on K8s within the Dutch Government

Dec 20, 2023
video icon
A small walkthrough of projects within the Dutch government running databases on OpenShift. This talk shares success stories, provides a proven recipe to `get it done` and debunks some of the FUD.

Persistence at the Edge for Thousands of Chick-fil-A Restaurants

Dec 20, 2023
video icon
"Kubernetes is being deployed outside of cloud and datacenter environments, at the Edge. In this sessions you will learn about how Chick-fil-A has been running Kubernetes in ~2,800 restaurants for the past 4.5 years. We'll discuss why this is necessary and useful, what types of data are being used, what is our approach to persistence, and what tradeoffs have we made between persistence guarantees and complexity of solution. "

Get started with AI on AWS with MLFlow and Notebooks on K8s

Dec 15, 2023
video icon
In this hands-on workshop, we’ll run an end-to-end project for beginners using open-source machine learning tools on the public cloud. Anyone can follow along easily by accessing the existing documentation and following the steps that we are going to provide.

Batch Workloads in Multi-tenant Environment with Apache YuniKorn

Nov 10, 2023
video icon
You will get an introduction to Apache YuniKorn, an open-source resource scheduler that aims to redefine resource scheduling in the cloud, and ultimately learn how to schedule large-scale Apache Spark jobs efficiently on Kubernetes in the cloud.

DoKC Town Hall #1 – Comcast and Netflix

Nov 10, 2023
video icon
This video features talks from both Comcast and Netflix. Learn how both gained value in running data on Kubernetes.

DoKC Town Hall #2 – PayItGov & Altinity

Nov 10, 2023
video icon
This month's town hall featured two speakers. Hear from Robert Hodges of Altinity about how to find a Kubernetes operator that really protects your data. Matt Menzenski of PayItGov shares the difficulties encountered with building, deploying, and orchestrating ETL pipelines in AWS Glue, the decision process for moving those workloads into Kubernetes, and the ELT architecture