DoK Library

CloudNativePG Recipe 6: Postgres Vertical Scaling with Storage – part 1

Apr 10, 2024

By Gabriele Bartolini

Explore the potential of optimizing CPU, RAM, and storage resources through meticulous measurement and benchmarking, challenging conventional scaling wisdom. Delve into the solid strategies within the CloudNativePG stack, such as separate volumes for data and transaction logs, temporary tablespaces, and I/O segregation for tables and indexes. Stay tuned for insights into aligning storage solutions with PostgreSQL’s resilience needs in the upcoming sequel.

Read Article

CloudNativePG Recipe 5 – How to migrate your PostgreSQL database in Kubernetes with ~0 downtime from anywhere

Mar 28, 2024

By Gabriele Bartolini

Are you considering migrating your PostgreSQL database from a service provider into Kubernetes, but you cannot afford downtime? Recipe #5 details step-by-step instructions, leveraging CloudNativePG and logical replication, to seamlessly transition from PostgreSQL 10+ to 16 using an imperative method.

Read Article

CloudNativePG Recipe 4 – Connecting to Your PostgreSQL Cluster with pgAdmin4

Mar 28, 2024

By Gabriele Bartolini

The article explores the deployment of pgAdmin4, a popular graphical user interface for PostgreSQL, within a CloudNativePG environment, primarily for evaluation and educational purposes.

Read Article

CloudNativePG Recipe 3 – What!?! No superuser access?

Mar 12, 2024

By Gabriele Bartolini

Explore the secure defaults of a PostgreSQL cluster in this CloudNativePG recipe, aligning with the principle of least authority (PoLA). Our commitment to security and operational simplicity shines through default configurations, balancing robust protection with user-friendly settings.

Read Article

CloudNativePG Recipe 2 – Inspecting Default Resources in a CloudNativePG Cluster

Mar 08, 2024

By Gabriele Bartolini

Dive into the nitty-gritty of how CloudNativePG manages PostgreSQL cluster resources, zooming in on ConfigMaps and Secrets.

Read Article

CloudNativePG Recipe 1 – Setting up your local playground in minutes

Mar 04, 2024

By Gabriele Bartolini

How to set up your local playground in kind, install CloudNativePG, and deploy your first PostgreSQL cluster.

Read Article

Adding Zonal Resiliency to Etsy’s Kafka Cluster | DoKC Town Hall

Feb 21, 2024

In this talk, Kamya Shethia from Etsy discusses the changes made to add zonal resiliency to the Kafka cluster, which enabled them to speed up their update process.

Watch Video

An Introduction to Custom Resource Definitions and Custom Resources (Operators 101: Part 2)

Feb 06, 2024

By Steven Sklar

In part 2 of his series "Operators 101", Steven Sklar goes into detail about Custom Resource Definitions, including some best practices when designing them and the Go-based tooling used to generate them.

Read Article

DoK Database Patterns Whitepaper

Jan 11, 2024

By Data on Kubernetes Community & CNCF Storage TAG

This whitepaper describes the patterns of running data on Kubernetes with a focus on database applications. It covers the attributes of a storage system and how they affect database applications, how different storage stacks affect these attributes, the differences between running data inside and outside of Kubernetes, the characteristics of Kubernetes that are beneficial for running data, and the best practices and lessons we have learned from running data on Kubernetes.

What Are Kubernetes Operators? (Operators 101: Part 1)

Jan 05, 2024

By Steven Sklar

This is part 1 of a series by Steven Sklar, Operators 101, where he teaches readers how to design, build, and deploy Kubernetes operators that can automate the management of your own unique applications.

Read Article

1000 node Cassandra cluster on Amazon’s EKS? – Matt Overstreet (DoK Day EU 2022)

Dec 21, 2023

Come hear about our experience scaling Cassandra on EKS to over 1000 nodes and 20 million transactions per second. This session will cover the lessons learned, successes, failures, and tools used to get there.

Watch Video

The Data on Kubernetes Landscape – Melissa Logan (DoK Day EU 2022)

Dec 21, 2023

We know from the first Data on Kubernetes Report that 90% of respondents believe Kubernetes is ready for stateful workloads, but significant challenges remain. The DoK Community continues to grow and build a unique space where people share knowledge and have conversations that are shaping the next decade of data on Kubernetes.

Watch Video

Bringing Apache Cassandra closer to Kubernetes – Jake Luciani (DoK Day EU 2022)

Dec 21, 2023

What does Kubernetes provide that allows us to reduce the complexity of Apache Cassandra while making it better suited for cloud native deployments? That was the question we started with as we began a mission to bring Cassandra closer to Kubernetes and eliminate the redundancy.

Watch Video

Operator Lifecycle Management – Julian Fischer (DoK Day EU 2022)

Dec 21, 2023

"The ability to extend Kubernetes with Custom Resource Definitions and respective controllers has led to the OperatorSDK, which became the de facto standard for data service automation on Kubernetes. There are countless operator implementations available, and new operators are being released on a daily basis."

Watch Video

What we’ve learned from running a PostgreSQL managed service on Kubernetes – Oleksii Kliukin

Dec 21, 2023

Kubernetes is an emerging platform of choice for deploying and running PostgreSQL. Deploying 100 Postgres clusters is as easy as deploying one, and there is no need to tinker with tools like Ansible or Puppet. Resource sharing can be applied when it makes sense, allowing you to run multiple Postgres databases in isolation on a single instance, each storing its data on a dedicated persistent volume.

Watch Video

Why run Postgres in Kubernetes? – Gabriele Bartolini (DoK Day EU 2022)

Dec 21, 2023

"Postgres should run inside your Kubernetes cluster. Yes, inside, not outside Kubernetes. After all, a database should be seen as an application, a special type of application - for which it is legitimate to require an additional level of care and attention."

Watch Video

The future of data on Kubernetes with Adobe and CNCF – Joseph Sandoval, Xing Yang & Sylvain Kalache

Dec 21, 2023

Some data-intensive workloads are easier to run in Kubernetes than others. Why? What needs to improve? Join us as we deep dive with Adobe and the CNCF about how easy (or not) it is to run different types of data workloads on Kubernetes – and what is being done both inside and outside of Kubernetes to make data workloads easier.

Watch Video

From Laptop to Cloud: Developing Cloud-Native Applications with Containerized Databases – N.Vermandé

Dec 21, 2023

With the advent of microservices in Kubernetes, individual developer teams now manage their own data, middleware, and databases. Automated tests and CI/CD pipelines have to be revisited to include these new requirements.

Watch Video

Testing the Mettle: Evaluating data solutions for large-scale production to check who stacks up

Dec 21, 2023

The range of CNCF storage options has exploded in the past few years, but if you had to choose a project to use today, how would you go about comparing each offering and choosing who to partner with for your future growth?

Watch Video

Autoscaling Stateful Workloads in Kubernetes – Mohammad Fahim Abrar & Md Kamol Hasan (DoK Day EU 22)

Dec 21, 2023

Managing stateful workloads in a containerized environment has always been a concern. However, as Kubernetes developed, the whole community worked hard to make stateful workloads meet the needs of their enterprise users.

Watch Video

PV TrashCan – Protection against accidental deletion of PVs or Namespaces (DoK Day EU 2022)

Dec 21, 2023

Accidentally deleting a PVC or a namespace can cause the Persistent Volume to be deleted. Such volumes lose their data and the stateful applications lose their state. With Persistent Volume TrashCan, users get a grace period to undo such unintended delete operations.

Watch Video

Build your own social media analytics with Apache Kafka – Jakub Scholz (DoK Day EU 2022)

Dec 21, 2023

Apache Kafka is more than just a messaging broker. It has a rich ecosystem of different components. There are connectors for importing and exporting data, different stream processing libraries, schema registries and a lot more. This talk will show how to use it to read data from social networks such as Twitter, process them and use machine learning to analyze them. And all of it will be of course running on top of Kubernetes.

Watch Video

One Click to Run Apache Spark as a Service on Kubernetes – Bo Yang (DoK Day EU 2022)

Dec 21, 2023

It is still challenging to run Apache Spark and other big data processing workloads on Kubernetes, especially at large scale. People need to address various issues like resource isolation, queuing, and cost efficiency. This session will share details about those challenges and how to address them. We will also present a convenient (one-click) way to deploy Apache Spark on Kubernetes and dramatically lower the barrier to using Spark.

Watch Video

Datashim – a framework for declarative management of datasets on Kubernetes – Srikumar Venugopal

Dec 21, 2023

Many ML pipelines depend on shared filesystems for input, output and intermediate data storage. Standards such as CSI have made it possible for applications in Kubernetes to access a variety of data storage systems. Yet, data scientists still have to deal with low-level details of data access in order to execute their pipelines in Kubernetes. Datashim is a framework that manages the lifecycle of a Dataset object, a CustomResourceDefinition that represents a source of data. Datashim takes care of the details of data access while Kubernetes pods can declaratively access the data by referencing a Dataset in their specifications. This talk will describe Datashim and the Dataset object, discuss its use in ML pipelines, and demonstrate how its pluggable architecture is designed for the development of caching, scheduling and governance plugins. Datashim is an incubating project of the Linux Foundation Data and AI Foundation.

Watch Video

Kanister & Kopia: An Open-Source Data Protection Match Made in Heaven – Pavan Navarathna

Dec 21, 2023

Cloud-native applications comprise various components, including data services, storage systems, and related Kubernetes objects. Each component requires its own data protection tools, strategy, and domain expertise. A robust solution aligned with business requirements often involves complex workflows. What if there was a way to coordinate the implementation of these workflows while optimizing how backups are moved into storage?

Watch Video

Weathering The Cloud Storm: Modern Data Management Patterns for Reliability and Availability

Dec 21, 2023

“Zero downtime” and “always-on” are illusions. All systems fail sooner or later, whether it’s a regional e-commerce website or a major cloud region hosting thousands of applications. That’s why, instead of chasing these illusions, it’s worth focusing on the nines of availability.

Watch Video

Graph in Kubernetes Panel – Wey Gu, Cheukting Ho & Feynman Zhou

Dec 21, 2023

Graph databases are the fastest growing data store in the world. According to Gartner, the application of graph processing and graph DBMSs will grow at 100 percent annually through 2022 to continuously accelerate data preparation and enable more complex and adaptive data science. However, it is often difficult for data and analytics professionals to distinguish between different implementation models, and to fit them to their use case. This panel will speak directly to Kubernetes users and provide them with the context they need to run stateful workloads.

Watch Video

Growing up fast: Kubernetes and Real-Time Analytic Applications – Robert Hodges (DoK Day EU 2022)

Dec 21, 2023

Kubernetes is turning into a preferred platform for real-time analytic apps that crunch billions of events per day and return insights in seconds. In this talk we'll introduce the standard analytic app design pattern of fast event streams coupled with low-latency data warehouses, using open source projects. We'll then walk through deploying the pipeline on Kubernetes from ingest to end user access. We'll touch on use of operators, scaling, monitoring, upgrade, security, and approaches to adding custom components. Attendees can expect to leave with concrete lessons about how to stand up low-latency analytics quickly on Kubernetes.

Watch Video

How to protect your data – Sarah Julia Kriesch (DoK Day EU 2022)

Dec 21, 2023

How can you keep your data secure, and how can you transfer it in a secure way? You will learn to encrypt your data so that you can use it on Kubernetes in a multi-cloud environment.

Watch Video

Leveraging Running Stateful Workloads on Kubernetes for the Benefit of Developers (DoK Day EU 2022)

Dec 21, 2023

Kubernetes comes with a lot of useful features like Volumes and StatefulSets, which make running stateful workloads simple. Interestingly, when combined with the right tools, these features can make Kubernetes very valuable for developers wanting to run massive production databases in development! This is exactly what was seen at "Extendi".

Watch Video

Is your database in Kubernetes production ready? – Mykola Marzhan (DoK Day EU 2022)

Dec 21, 2023

It only looks simple to run databases in Kubernetes. In fact, many things need to be considered before running any database in Kubernetes: failover and traffic switching, replication and data consistency/loss after failover, upgrades, DB and node-level configuration, CNI, backups, monitoring, etc.

Watch Video

Disaggregated Container Attached Storage – Yet Another Topology with What Purpose? – Nick Connolly

Dec 21, 2023

The storage topology in vogue seems to cycle every few years. Internal storage is followed by centralized Storage Area Networks only to be superseded by one-size-fits-all Hyperconverged models - until scalability constraints led to distributed storage. Then comes NVMe, offering blistering speeds that all of these storage stacks struggle with. Kubernetes inspires Container Attached Storage aspiring to be the perfect model, so why is disaggregated storage now making an appearance?

Watch Video

Microservices and Kubernetes for your Full Data Lifecycle – Steve Pousty (DoK Day EU 2022)

Dec 21, 2023

Data doesn’t magically appear in our data centers. There are usually several phases and several storage locations along its journey throughout your organization. New architectural patterns, such as microservices, and new technology, such as Kubernetes are changing how we can think about and manage the large volumes of data coming at us.

Watch Video

What’s New in Kubernetes Storage – Xing Yang (DoK Day EU 2022)

Dec 21, 2023

Kubernetes SIG Storage is responsible for ensuring storage is available for containers in a pod when the pod is scheduled on a node. There is the Container Storage Interface (CSI) for block and file storage that allows storage providers to write CSI drivers. There is also a COSI sub-project that is trying to add object storage support in Kubernetes. In this session, Xing will give an update on some of the features that SIG Storage is working on and discuss what might be coming in the future.

Watch Video

Operating FoundationDB on Kubernetes – Johannes M. Scheuermann (DoK Day EU 2022)

Dec 21, 2023

FoundationDB is an open-source distributed transactional key-value store that is used by multiple companies like Apple, Snowflake and VMware Tanzu (previously Wavefront).

Watch Video

Using Kubernetes to deliver a “serverless” service – Jim Walker (DoK Day EU 2022)

Dec 21, 2023

Serverless promises to change the way we consume software. It allows us to potentially pay for only that which we use and can help drive down operational costs to the minimal amount of resources necessary.

Watch Video

Protecting data with CSI Volume Snapshots on Kubernetes – Grant Griffiths (DoK Day EU 2022)

Dec 21, 2023

The container storage interface (CSI) is a contract between different container orchestrators (Kubernetes, Nomad, etc) and storage plugins. This contract is a set of gRPC services for provisioning, utilizing, and snapshotting storage volumes. In this talk, we will focus on one aspect of the CSI spec: Volume Snapshots.

Watch Video

The many uses of Kubernetes cross cluster migration of persistent data – Ryan Kaw (DoK Day EU 2022)

Dec 21, 2023

Multiple clusters exist in most Kubernetes environments today, and the number of clusters will increase over time. The reasons for having multiple Kubernetes clusters are many: for example, overcoming scale limits, reducing complexity, geo separation, redundancy, and having separate production, staging, and development environments. Once you have multiple K8s clusters, it can be useful to have the ability to easily move or duplicate workloads across these different clusters. Kubernetes does not have a native method to allow migration or duplication of workloads across clusters.

Watch Video

Resilient Redis – Hrittik Roy & Ryan Gray (DoK Day EU 2022)

Dec 21, 2023

Redis is a widely used open-source in-memory data store and cache that has become a key component in the development of scalable microservice systems. While all of the main cloud providers offer fully managed Redis services (Amazon ElastiCache, Azure Cache for Redis, and GCP Memorystore), Redis can also be deployed in Kubernetes if you require additional control over its configuration.

Watch Video

Running Kafka on Kubernetes, across three clouds at Adobe – Adi Muraru (DoK Day EU 2022)

Dec 21, 2023

Adobe runs dozens of Kafka clusters spread across both public (AWS and Azure) and private clouds to power the Adobe Experience Platform message bus.

Watch Video

Why are Operators paramount to running stateful workloads on Kubernetes?

Dec 21, 2023

In this panel, Sylvain Kalache, Head of Content at the DoK Community, drives a conversation featuring Nic Vermandé, Principal Developer Advocate at Ondat; Julian Fischer, CEO at anynines; and Sergey Pronin, Group Product Manager at Percona.

Watch Video

Data on K8s – Where are we now? // Álvaro Hernández, CEO of OnGres

Dec 21, 2023

Álvaro Hernández has been with our community since the beginning. He's seen where we've been, and also has a vision about where we're going.

Watch Video

What are customers’ concerns when running data on Kubernetes? // Álvaro Hernández, CEO of OnGres

Dec 21, 2023

We've crossed the chasm, but what are customers' current concerns when it comes to running data on Kubernetes?

Watch Video

What is data observability on Kubernetes? // Álvaro Hernández, CEO of OnGres

Dec 21, 2023

Observability is a hot topic for SREs in the Kubernetes ecosystem. But what does it mean in the context of data?

Watch Video

Why run stateful workloads on Kubernetes? Sathya Sankaran

Dec 21, 2023

Sathya Sankaran, COO of Catalogic and GM of CloudCasa by Catalogic, lets us know what value end users are getting by running stateful workloads on Kubernetes.

Watch Video

Day 2 Kubernetes- what are the challenges? Sathya Sankaran

Dec 21, 2023

Sathya Sankaran, COO at Catalogic and GM at CloudCasa by Catalogic, lets us know the difficulties that folks are facing when it comes to Day 2 Kubernetes.

Watch Video

Is data on k8s becoming boring (in a good way)? Jerome Petazzoni

Dec 21, 2023

We all want technologies that are "wild" to become tame and under control. Can the same thing happen for data on Kubernetes? Let's see what Jerome Petazzoni has to say about it.

Watch Video

What does the new operator for Postgres do? – Gabriele Bartolini EDB

Dec 21, 2023

Gabriele shares with us how they built the Postgres Operator for Kubernetes.

Watch Video

Building a Digital Factory for the Sheet Metal Industry – Elie Assi

Dec 21, 2023

We develop systems to digitize the sheet metal industry with the belief that they should cooperate with each other in an open way. We are convinced that the future lies in creating a software ecosystem that interconnects all levels of the company and even manages to communicate with supplier and customer systems, making for more agile management throughout the entire value chain.

Watch Video

Why we decided to migrate our Jaeger storage to ClickHouse on Kubernetes – Arul Jegadish Francis

Dec 21, 2023

We at OpsVerse provide a DevOps tools platform with fully-managed open source-based tools. One of our key offerings is a holistic observability platform. Metrics and logs are straightforward to aggregate; however, traces, which are collected using CNCF Jaeger, were left with some holes in advanced insights.

Watch Video

Scaling our SaaS offering to thousands of clusters – Dax McDonald

Dec 21, 2023

Sourcegraph is a code intelligence platform that helps our customers understand their code better. As we have scaled up, we are starting to run hundreds of instances for our customers in separate Kubernetes clusters.

Watch Video

The Challenges of Data Processing On Kubernetes: A look at Spark, Flink, Dask, and Ray – Holden Karau

Dec 21, 2023

This talk will go through both the improvements that have been made in Kubernetes for batch analytic workloads as well as some of the current pain experienced by users and developers moving their workloads to Kube. In this talk you will learn about how we “cheated” back in the YARN and Mesos days to make things go fast, why Kubernetes doesn’t like those cheats, and what some alternatives are.

Watch Video

Architecting Your First Event Driven Serverless Streaming Applications on K8 – Timothy Spann

Dec 21, 2023

Once you have built a topic in Apache Pulsar, you will quickly see the need to build event-driven applications. This can require a lot of decisions on what framework to use, where to run it, how to deploy it, and how to manage these applications on Kubernetes cloud natively.

Watch Video

Data streaming on Kubernetes – Yaniv Ben Hemo

Dec 21, 2023

I will cover the current landscape of data streaming on K8s, why it is important, its use cases, and the challenges that still need to be solved.

Watch Video

Databases on Kubernetes: Why are they important?

Dec 21, 2023

Watch Video

The Kubernetes Native Database – Jeffrey Carpenter

Dec 21, 2023

In the software industry we’re fond of terms that define major trends, like “cloud native”, “Kubernetes native” and “serverless”. As more and more organizations move stateful workloads to Kubernetes, we’ve started to see these terms applied to data infrastructure, where they can get overtaken by marketing hype unless we work to define them.

Watch Video

Open Source Databases on Kubernetes: Best Practices – Peter Zaitsev

Dec 21, 2023

So you’re looking to run your Open Source Database on Kubernetes. What best practices should you follow and what pitfalls should you avoid? In this presentation we will look at how to run stateful applications on Kubernetes overall, as well as what is particularly important for databases: high availability, security, backups and disaster recovery.

Watch Video

Mastering MongoDB on Kubernetes, the power of operators – Arek Borucki

Dec 21, 2023

NoSQL databases on Kubernetes: considerations and best practices with a live demo. MongoDB's natural capabilities like replication, sharding (partitioning data and holding different pieces in separate instances/pods) or failover (failing over from the master, read-write node to other read-only nodes, and promoting the read-only node as the master) can more easily deal with the uncertainty of heterogeneous cloud environments, which makes this database a good candidate to launch on a Kubernetes cluster.

Watch Video

Inter Cluster PostgreSQL on Kubernetes – Julian Fischer

Dec 21, 2023

In this talk you’ll explore how to run a PostgreSQL cluster across multiple Kubernetes clusters. Learn what challenges arise when using asynchronous streaming replication in a set of Kubernetes clusters spanning across several geographical regions.

Watch Video

Highly Available Postgres Clusters In Kubernetes – John Long & Jonathan Gonzalez

Dec 21, 2023

A practical session about running Highly Available PostgreSQL in Kubernetes. The primary objective will be to demonstrate how to set up a reliable architecture in a Kubernetes cluster to achieve low RTO and RPO.

Watch Video

Medical / Healthcare Data on Kubernetes – Olyvia Rakshit & Prasad Dorbala

Dec 21, 2023

Healthcare organizations are transforming their applications and embracing digital platforms for efficient patient care. Today, compute at the edge plays a critical role in deploying innovative healthcare applications that promise new approaches to patient care.

Watch Video

Shifting Left Stateful Applications In Kubernetes – Viktor Farcic

Dec 21, 2023

Stateless apps are easy to manage. More often than not, a Kubernetes Deployment, with a Service, Ingress, and Horizontal Pod Autoscaler (HPA) is enough. Almost everyone can do it. But, when it comes to stateful applications, things become a bit more complicated. We might need a database and storage. We might need to manage database users and schema. We might need to consider quite a few other things. Stateful apps are harder for everyone, especially if we want to shift left and enable developers to do it themselves.

Watch Video

Kubernetes 360º – Data driven observability – from Secrets to logs – Ben Hirschberg

Dec 21, 2023

If there’s one thing that everyone can agree on - it’s that the sheer scale and complexity of Kubernetes operations is growing constantly. What’s more, cloud native environments are becoming more and more expensive to operate and manage, as well as increasingly difficult to secure. On the bright side, there is a growing ecosystem of exceptional open source tools to help overcome this complexity, and provide greater situational awareness to what’s happening in your many and multiple Kubernetes clusters.

Watch Video

Choosing Kubernetes for Stateful Applications – Akshay Ram & Peter Schuurman

Dec 21, 2023

Learn how customers are increasingly deploying stateful applications on Kubernetes to benefit from portability, economies of scale, and built-in orchestration capabilities. This talk will cover how customers choose between Kubernetes and a data Software as a Service (SaaS), and the stateful capabilities of Kubernetes across two dimensions: application orchestration and the storage layer. Also learn about MariaDB SkySQL, a database software as a service that runs thousands of StatefulSet Pods across multiple zones and regions on Kubernetes.

Watch Video

Formula 1 telemetry processing using Apache Kafka on Kubernetes – Paolo Patierno

Dec 21, 2023

Apache Kafka is the de facto data streaming platform used for ingesting vast amounts of data and processing them in real time. Low-latency analytics are vital if users are to react to events as fast as possible and to effectively shape future decision making. Together with Kubernetes, it allows you to develop cloud-oriented analytics solutions which are highly scalable.

Watch Video

Are StatefulSets broken? Michael Guarino, CTO of Plural.sh

Dec 21, 2023

Are StatefulSets broken? Michael Guarino is no stranger to Kubernetes, and he's seen how it has developed over time to be "friendlier" when it comes to running stateful workloads with features like StatefulSets.

Watch Video

Stateful Apps in a Multicloud Era- Yves Weisser

Dec 21, 2023

"More & more companies use several environments to host their applications. Sometimes an application will be developed in a datacenter & moved to production in the cloud, or vice versa."

Watch Video

What are customers’ challenges when running data on k8s? Joe Gardiner DoK Talks #156

Dec 21, 2023

Joe Gardiner (Director of Cloud Native Architecture, EMEA, at Pure Storage) has been working with customers and helping them solve their data challenges for years. So what are their concerns when it comes to running stateful workloads on Kubernetes?

Watch Video

DoK Report 2022- DoKC Director Melissa Logan and Stephanie Fairchild of ClearPath Strategies

Dec 20, 2023

Our 2022 report features insights from over 500 executives and technology leaders on how data on Kubernetes has a transformative impact on organizations, regardless of size or tech maturity.

Watch Video

DoK Community Talks: Intro to Why Data Matters

Dec 20, 2023

Chapter 1: Intro to Why Data Matters. Lisa-Marie Namphy, Head of Developer Relations at Cockroach Labs, and Sam Ramji, Chief Strategy Officer at DataStax, sit down with DoKC to discuss why data matters and what the future of data looks like.

Watch Video

How did South America’s biggest ecommerce store tackle data on K8s?

Dec 20, 2023

When developers don't have access to data, how can they make informed decisions? Ramiro Berrelleza is the CTO of Okteto, and he shared a case study of what life was like for the largest ecommerce store in South America before they leveraged the benefits of running data on Kubernetes.

Watch Video

What is Kafka? The rise of one of the world’s most used streaming data technologies w/Abbey Russell

Dec 20, 2023

Abbey Russell, PM at Cockroach Labs, shared the backstory on how and why Kafka was created. Along the way, you'll learn about who Franz Kafka was, Kafka's earliest use at LinkedIn in 2010, why organizations like Uber, Coursera, and Mailchimp use it today, and the future of data streaming.

Watch Video

Operators 101 – Uma Dhatri

Dec 20, 2023

What are operators? Are they like the old-timey ladies at the telephone exchange? Let's find out together.

Watch Video

Exploring The Power of Autoscaling – Aditya Tomar

Dec 20, 2023

In this talk you will gain insight into Machine Learning, Kubernetes, and how autoscaling helps organizations.

Watch Video

MongoDB Goes to K8s: A Wild Adventure with Operators – Ritesh Karankal

Dec 20, 2023

Hold onto your hats, folks, because we're about to explore why we need an operator to run MongoDB on K8s, how operators work, and what they can do for you, all while enjoying the ride with our trusty sidekick, the Kubernetes Operator.

Watch Video

Kubernetes: The Ultimate Platform for Streamlining Data Streaming- Yash Pimple

Dec 20, 2023

Watch Video

12 chapters of Data on Kubernetes – Atharv Karajgi

Dec 20, 2023

Watch Video

Rook – Helping the Kubernetes Storage Community Thrive

Dec 20, 2023

Rook is an open source cloud-native storage operator, providing support for Ceph to natively integrate with Kubernetes. An introduction to Rook will show how Rook configures Ceph to provide stable block, shared file system, and object storage for your production data. Rook recently joined DoK as a community sponsor. Let’s have a discussion about how we can help the K8s storage community thrive. Rook was accepted as a graduated project by the CNCF in October 2020.

Watch Video

DoK @ Comcast – Deliver Business Outcomes & Improved DevX with Data Services on K8s

Dec 20, 2023

DoK @ Comcast: Delivering Business Outcomes & Improved DevX with Data Services Running on Kubernetes. Presented by Greg Otto, Executive Director, DevX Platforms, and Charles Ju, Principal Engineer. Transforming how to deliver measurable value using data on Kubernetes, while providing psychological safety. In this talk, we will share our transformation journey, the “Months to Minutes” outcomes we achieved, the architecture approach, and the human journey from one of our engineers.

Watch Video

DoK + Apache Spark

Dec 20, 2023

Presented by Holden Karau, Spark Committer and Open Source Engineer at Netflix. In this brief talk Holden will cover some of the best practices from trying to deploy both small and large scale Spark on Kube.

Watch Video

Unsticking Ourselves from Glue: Migrating PayIt’s Data Pipelines to Argo and Hera

Dec 20, 2023

At PayIt, we’ve been deploying applications to Kubernetes almost since the beginning of the company. Our data workloads, however, have run instead in AWS Glue. This has worked well enough for the reporting use cases that have been the main focus of this team historically. However, at the beginning of 2022, the PayIt data team began building out a new data platform, and in the process, ran into a number of challenges with Glue. In this talk, I will share the difficulties that we encountered with building, deploying, and orchestrating ETL pipelines in AWS Glue, our decision process for moving those workloads into Kubernetes, and the ELT architecture that we’ve arrived at today.

Watch Video

Repel Boarders! How to find a Kubernetes operator that really protects your data

Dec 20, 2023

Operators are a godsend for managing data in Kubernetes. But how about protecting it? We'll explore security threats to cloud native databases and show what protection you should look for in operators. Finally, we'll introduce a new Data on Kubernetes Community project to develop security standards for database operators in Kubernetes.

Watch Video

Implementing Data & Databases on K8s within the Dutch Government

Dec 20, 2023

A small walkthrough of projects within the Dutch government running databases on OpenShift. This talk shares success stories, provides a proven recipe to "get it done", and debunks some of the FUD.

Watch Video

Persistence at the Edge for Thousands of Chick-fil-A Restaurants

Dec 20, 2023

"Kubernetes is being deployed outside of cloud and datacenter environments, at the Edge. In this sessions you will learn about how Chick-fil-A has been running Kubernetes in ~2,800 restaurants for the past 4.5 years. We'll discuss why this is necessary and useful, what types of data are being used, what is our approach to persistence, and what tradeoffs have we made between persistence guarantees and complexity of solution. "

Watch Video

Get started with AI on AWS with MLFlow and Notebooks on K8s

Dec 15, 2023

In this hands-on workshop, we’ll run an end-to-end project for beginners using open-source machine learning tools on the public cloud. Anyone can follow along easily by accessing the existing documentation and following the steps that we are going to provide.

Watch Video

Batch Workloads in Multi-tenant Environment with Apache YuniKorn

Nov 10, 2023

By Sunil Govindan, Wilfred Spiegelenburg

You will get an introduction to Apache YuniKorn, an open-source resource scheduler that aims to redefine resource scheduling in the cloud, and ultimately learn how you can schedule large-scale Apache Spark jobs efficiently on Kubernetes in the cloud.

Watch Video

DoKC Town Hall #1 – Comcast and Netflix

Nov 10, 2023

By Greg Otto, Charles Ju, Holden Karau

This video features talks from both Comcast and Netflix. Learn how both gained value in running data on Kubernetes.

Watch Video

DoKC Town Hall #2 – PayItGov & Altinity

Nov 10, 2023

By Robert Hodges, Altinity // Matt Menzenski, Payitgov

This month's town hall featured two speakers. Hear from Robert Hodges of Altinity about how to find a Kubernetes operator that really protects your data. Matt Menzenski of PayItGov shares the difficulties his team encountered with building, deploying, and orchestrating ETL pipelines in AWS Glue, their decision process for moving those workloads into Kubernetes, and the ELT architecture they have arrived at today.

Watch Video

Data on Kubernetes Day Europe 2024 talks are now available for streaming!

