DoKC Day Schedule

May 3rd, 2021 (All times are CEST)

Join us live!
13:15-13:35 BREAK

Time Speaker Topic Description
10:00-10:10 Bart/Nellie Welcome A warm welcome to the DoK Community Day. Schedule, interaction, prizes, and more!
10:10-10:25 Patrick McFadin Going from DBA to SRE: Time to make the move As a database administrator, you have been in the middle of some amazing stories of scale and digital transformation. Now our industry is moving quickly into cloud-native architectures and the need for skills to get there has never been greater. The role of Site Reliability Engineer (SRE) is one of the fastest growing job fields in IT, and DBAs have the right combination of background and skills to make the transition. I’m here to make the case. Time to make the move from DBA to SRE.
10:25-10:55 Feynman Zhou Introduction to KubeSphere: an Open Source Container Platform and its Ecosystem KubeSphere is a multi-tenant enterprise-grade container platform with full-stack automated IT operation and streamlined DevOps workflows. It provides a developer-friendly wizard web UI, helping enterprises to build out a more robust and feature-rich platform, which includes the most common functionalities needed for enterprise Kubernetes strategies. In this talk, we would like to introduce KubeSphere and its open source ecosystem.
Agenda
– Pain points for enterprises to implement Kubernetes in production
– Introduction to the community open source ecosystem: KubeSphere, KubeKey, PorterLB, KubeEye
– Demo: Implementing K8s Multi-cluster management and Cloud Native Observability on KubeSphere
– KubeSphere Community Growth: Present and Future
10:55-11:10 Aitor Artola Kubernetes storage: The last mile Performance of BigData technologies running on Kubernetes is tied to the type of workload and the chosen storage system.
11:10-11:25 Tim Van de Keer Apparently I’m doing DataOps A journey:
How solving a data ingestion problem for a client turned into a data engineering journey over 2 years where at the end of those 2 years we had built a data platform on Google Cloud + Kubernetes and I realized: Apparently I’m doing DataOps?
11:25-11:50 Eric Zietlow Persist Your Data In an Ephemeral K8 Ecosystem Kubernetes and persistent storage go together like oil and water. Kubernetes is inherently an ephemeral system and persistent storage by definition must survive. As a member of the Data On Kubernetes community, Eric will go into the what, why, and how to best design your architecture. He will cover emerging OSS technology solutions like OpenEBS. After his talk, you should have a clear understanding of the path to successfully managing a persistent data storage solution on your Kubernetes cluster.
11:50-12:00 Nellie Tobey Bringing K8s into port with visual learning- Coloring Book Launch! Data on Kubernetes isn’t always the easiest thing to explain. Nellie has been creating artwork for the DoK Community in order to make the concepts more accessible. She’ll show how she does it and officially launch the DoK Community’s first coloring book!
12:00-12:35 Neeraj Bisht and Praveen Kumar GT eCommerce giant Flipkart on data on Kubernetes at scale Flipkart is India’s largest e-commerce company serving millions of Indian customers across thousands of product categories. Flipkart applications run out of its private data centers and most of the applications traditionally used to run on Virtual Machine based architecture. One year back we decided to modernize our infrastructure using Docker and Kubernetes for its portability and agility.
Two senior engineers from Flipkart will talk about the work that is ongoing in supporting Flipkart’s stateful applications on Kubernetes. They will describe and explain how and why they leverage OpenEBS to build their K8s stateful story. They will also share where they are with their journey of shifting petabytes of data and dozens of stateful workloads including many NoSQL, SQL, logging, machine learning, and other workloads onto Kubernetes in collaboration with OpenEBS and MayaData.
12:35-13:00 Jeff Carpenter Data Services for the Masses Hey there SRE teams. Your dev teams want data services with a lot less detail about the underlying database. They just want to know that the data API service can scale to meet their needs, be online when needed, and be usable in the code they are writing. In this talk, you will see some of the amazing options now available in Kubernetes using the Stargate open source project. Helping your dev teams should be as easy as any other service you deploy.
13:00-13:15 Alvaro Hernandez Why you should be deploying Postgres primarily on Kubernetes Running a Postgres installation, with or without containers, is trivial. However, setting up a production environment is a whole different matter.
Postgres is not by itself production-ready software: it requires a set of side tools to complement its functionality, such as connection pooling, monitoring, backup tools, high availability software, you name it. This is called the “Stack Problem”.
Join this brief talk to discuss the Stack Problem, understand how Kubernetes is the platform that best solves it, and what are the main advantages (and disadvantages!) of running Postgres on Kubernetes.
13:35-14:00 Chris Bradford Finding peaceful co-existence between Cassandra and Kubernetes There are bad things that databases do to Kubernetes and there are bad things that Kubernetes does to databases. On top of that, when deploying two highly opinionated distributed systems like Cassandra and Kubernetes, conflict is inevitable. Is the effort even worth the trouble? This work is getting done, and in this talk you will learn about the rapid changes in both projects to make data on Kubernetes easy by default. You will see that when they work well together, they create an incredible balance.
14:00-14:15 Kunal Kushwaha Scaling Communities to be more Inclusive In this talk I will be sharing my experiences in the community from being a student, to starting Code for Cause and scaling it to 50K+ community members. I’ll also share the importance of getting started and contributing to Open Source software and why Open Source is for everyone and every contribution counts.
I’ve done a few collaborations with DoK Community that were pretty great and taking that as an example I want to share the importance of getting more young people involved in the community.
In addition to this, I’ll also share my experiences of getting involved in the CNCF ecosystem right from my freshman year of college.
Links: https://www.youtube.com/channel/UCfv8cds8AfIM3UZtAWOz6Gg
14:15-14:30 Mario Loria Selling Cloud Native Internally We’ll dive into a few reasons why it’s so difficult to make internal stakeholders realize the value in adopting cloud-native technology, methods, and community.
14:30-14:55 Sergey Pronin Percona XtraDB Cluster Operator architecture decisions Percona XtraDB Cluster Operator is a drop-in replacement for MySQL Enterprise with sync replication running on Kubernetes. It automates the creation, alteration, or deletion of members in your Percona XtraDB Cluster environment. It can be used to instantiate a new Percona XtraDB Cluster replica set, or to scale an existing environment.
In this talk we will cover various architecture decisions we made when building PXC Operator. There are lots of differences between how it can be done on regular VMs and in k8s: PITR implementation, autorecovery, retention policies, haproxy/proxysql & proxy protocol.
14:55-15:20 Eric Zietlow and Aleks Volochnev Deploying open cloud-native data using K8ssandra K8ssandra is an open source project that is trying to perfect the deployment of production Apache Cassandra on Kubernetes. Learn how to deploy across different Kubernetes environments and be successful at deploying Cassandra on Kubernetes.
15:20-15:45 Divya Mohan The art of breaking things (intentionally) Tired of broken links & customer complaints about slowness for applications that you help develop or maintain? You aren’t alone & yes, things don’t need to get worse before they get better! Introduced as a tool to test the resiliency of its infrastructure in 2011 by Netflix, Chaos Engineering is one of the top 5 technologies to watch out for in 2021 per CNCF.

But where do data and Kubernetes fit into this picture? And what does Chaos Engineering mean for you as an infrastructure person handling large volumes of data or as a data scientist working with that data? Hop on board as we try navigating the murky waters of Chaos Engineering wherever you are in your cloud native journey.

15:45-16:00 Sébastien Pahl Why we need an Open Source Observability Distribution Like Linux distributions, Opstrace simplifies the packaging, installation and maintenance of open source observability projects that are otherwise a highly complex stack for you to operate. For example, building tools to improve the critical alert creation/management workflow and to test upgrades so you can confidently—and regularly—upgrade versions of projects to stay up-to-date with security patches and features. We believe that teams big and small need and deserve a better open source experience. An open source distribution is a place everyone can come to participate in that vision. More at: https://github.com/opstrace/opstrace
16:00-16:15 Jean-Yves Stephan Spark on Kubernetes Performance Tuning Session In this talk, we will go over concrete performance tuning improvements for Spark pipelines, using the insights of Delight, an open-source monitoring dashboard for Spark. This will be a technical talk with code examples and a live demo. We will cover the performance of shuffle, dynamic allocation, memory tuning, and parallelism tuning.
16:15-16:30 Rick Vasquez A Call for DBMS to modernize on K8’s Kubernetes is maturing at a rapid pace. While it made sense a few years ago to see the key open source vendors and corporate sponsors of projects flatly dismiss the Kubernetes style of deployment as nothing more than a fad, or as something you do only for non-production environments, it’s time for the largest and most popular database engines to start developing the database with cloud native paradigms.
16:30-16:45 Abhi Vaidyanatha Kubernetes as an Architectural Canvas Airbyte is a customizable data integration platform optimized for single node execution. Kubernetes will enable an infrastructural shift in making our architecture flexible, composable, and scalable, allowing us to explore a variety of futures for what the ideal Airbyte deployment could look like. Today, we will explore those futures together to give context on architectural trade-offs in Kubernetes.
16:45-17:00 Andrea Henkel The Last Mile- Paving the Road to Success The Last Mile prepares incarcerated individuals for successful reentry through business and technology training.
17:00-17:30 Evan Powell & Patrick McFadin DoK Governance An overview of how the Data on Kubernetes Community is structured. Trainings, projects, how to get involved, ambassadors, and more!
17:30-17:35 Closing Performance- Maserati E