Sai Parvathaneni, in Towards Dev. Building a Streaming Data Pipeline: Spark vs. Flink Comparison with Kafka Integration. When handling streams of data, two prominent frameworks that often come into play are Apache Spark and Apache Flink. Both are commonly used… (Sep 13, 2024)
Koushik Dutta, in Dev Genius. Airflow with Git-Sync in Kubernetes. Install Airflow with Git-Sync in Kubernetes Cluster, so that once DAG is Pushed to Git Repository, it appears in airflow web-server UI. (Aug 8, 2024)
Tim Spann. Creating Apache NiFi — Apache Pulsar — Apache Flink Apps (Citibikenyc data). create-nifi-pulsar-flink-apps. (Jan 18, 2023)
MEHMET ARİF EMRE ŞEN, in Yazilim VIP. Change Data Capture Magic: Streaming with Debezium, Kafka, and Docker. Introduction. (Nov 18, 2023)
Batuhan Orhon. Efficient Data Streaming: Implementing Kafka Connect and Debezium with Docker. Introduction. (Dec 28, 2023)
Başak Tuğçe Eskili, in Marvelous MLOps. Creating Vector Database with OpenSearch. A vector database is a certain type of database designed to store and search vectors. Vectors, in other words, embeddings are set if… (Jan 27, 2024)
Kadriye Taylan. OpenSearch Node Management and Certificates in Docker Rootless. Management of OpenSearch Nodes Network and Certificates using Docker Rootless. (Feb 16, 2024)
WesleyBos, in Analytics Vidhya. Kafka and Nifi integration. Using Docker, Kafka, Nifi and Python. (Oct 19, 2020)
Martin Hynar. Kafka, NiFi, Schema Registry … all in Docker. This is description of setup to get working lab environment with following components: (Nov 11, 2020)
Nijat Mursali. Configuration of NIFI and Kafka Docker. In my previous articles, I have talked about the basic concepts of Spark, NIFI and Kafka. However, the main problem here is how we can… (Aug 5, 2019)
Najma Bader. 13. Connecting Airflow to a local Postgres Database. My personal notes from the book “Data Pipelines with Apache Airflow” by Bas Harenslak and Julian de Ruiter — Chapter 4, Part 3. (Oct 31, 2022)
Oscar García. How to start with Apache Airflow in Docker (Windows). Start with Apache Airflow in a docker container. (Feb 22, 2022)
Saurabh Chawla. Spark-Radiant is now available! Spark-Radiant is Apache Spark Performance and Cost Optimizer. The product, Spark-Radiant will help optimize performance and cost… (Sep 19, 2021)
Simon Grah, in TDS Archive. 6 recommendations for optimizing a Spark job. A guideline of six recommendations that are quickly actionable for optimizing a Spark job. (Nov 24, 2021)