Data Engineering

"Initials" by "Florian Körner", licensed under "CC0 1.0". / Remix of the original. - Created with dicebear.com (https://github.com/dicebear/dicebear)
Data Engineering ephemeral404 7 months ago 99%
The job description vs reality
164
2
Data Engineering fritz_astro 12 months ago 44%
3 Key Takeaways from Airflow Summit 2023 www.astronomer.io
-1
0
Data Engineering fritz_astro 12 months ago 50%
Airflow Summit 2023 - Recordings Now Available https://www.youtube.com/playlist?list=PLGudixcDaxY29qXIXhd90htHp_BFk-Bqf
0
0
Data Engineering nydas 1 year ago 100%
Unified Star Schema https://towardsdatascience.com/the-new-unified-star-schema-paradigm-in-analytics-data-modeling-review-a245b2641dc8

Hi all, I was recently reading about the Unified Star Schema and the Puppini Bridge. I’m curious whether anyone here has experience with it and what their thoughts are. TIA

4
1
Data Engineering ephemeral404 1 year ago 100%
Who wants clean data?
19
1
Data Engineering bahmanm 1 year ago 100%
Visualising Kafka Internals

I plan to run a few tests to determine whether Kafka is suitable for a certain use case I have in mind. My idea is to run a local cluster of Kafka servers (either VMs or containers), produce/consume a series of messages, and observe a set of metrics (Prometheus & Grafana) along with custom business logic outcomes. What are some good tools to record and visualise the internals of a Kafka cluster? I'm looking for things like consumer lag, topic replication, possibly tracing messages, ... *Originally posted on https://mastodon.social/@bahmanm/110662538718523380*

3
2
Data Engineering clumsydonkey 1 year ago 100%
One big table vs a dimensional model

Hi fellow data engineers, I’m currently restructuring a pipeline written with PySpark on Databricks. It involves a lot of transformations, which results in an extensive DAG, but we can afford to spend some extra processing resources to build a standard dimensional model (on top of the necessary transformations). I was wondering what real benefits you have seen from a star schema design over the “one big table” approach that I could preach to my team? (My main goal would be a smaller resulting Power BI model.) And as a side question, what tools do you use to create a dimensional model such as a star schema with code? Thanks a lot!

3
1
Data Engineering nydas 1 year ago 100%
Free resource books books.goalkicker.com

Thought I’d share this link. I’m not affiliated in any way.

8
2
Data Engineering nydas 1 year ago 100%
Data Vault 2.0 Advanced Material

Hey there community! Does anyone have any resources they could share relating to Data Vault 2.0, specifically the joining of SAL and PIT tables? The two main books on the architecture are very sparse on this area, which I would have thought would be a fairly key component for any mid-to-large organisation.

2
1
Data Engineering ephemeral404 1 year ago 83%
33 data offerings by AWS, Azure, Google Cloud

![](https://lemmy.ml/pictrs/image/9fd9e352-6cb8-413b-b6f9-1e40ab4d78c1.webp)

4
0
Data Engineering ephemeral404 1 year ago 100%
This one never gets old

![](https://lemmy.ml/pictrs/image/d703acc5-18c0-47a6-b4b1-4c5869b845e2.webp)

3
1
Data Engineering ephemeral404 1 year ago 92%
Data Engineering roadmap

What needs to be added for 2023?

12
6
Data Engineering ephemeral404 1 year ago 100%
Welcome to the community, want to join the mod team?

Fellow data engineers, looking forward to your contributions to and participation in the community. If you want to help manage the community, get in touch to join the team.

6
1