this post was submitted on 28 Jun 2023

I get questions like this a lot:

  • Where did this data come from?
  • How do I know I can trust the source?
  • What types of QA checks were applied to this data?

Data lineage is a chronic issue in data engineering. This blog post from Airbyte gives a good overview and mentions some interesting products and projects that might help with it.

Unfortunately, I have limited flexibility to purchase or install tools for this in my current role. Has anyone rolled their own solution?

[–] [email protected] 1 points 1 year ago

Apache NiFi maintains a lineage table for its data movement and transformations.
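
If installing a tool isn't an option, the same idea can be rolled by hand with nothing but the standard library. Here is a minimal sketch, assuming each pipeline step writes one row saying which dataset it produced, what it read from, which transform ran, and which QA checks passed. The table layout and the names `init_lineage`, `record_step`, and `trace` are made up for illustration and are not NiFi's schema; the bucket path and check names are placeholders too.

```python
import json
import sqlite3
from datetime import datetime, timezone


def init_lineage(conn):
    # Hypothetical schema: one row per pipeline step.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS lineage (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            dataset TEXT NOT NULL,      -- what the step produced
            source TEXT NOT NULL,       -- what the step read from
            transform TEXT NOT NULL,    -- name of the job or function
            qa_checks TEXT NOT NULL,    -- JSON map of check name -> result
            recorded_at TEXT NOT NULL   -- UTC timestamp, ISO 8601
        )
        """
    )


def record_step(conn, dataset, source, transform, qa_checks):
    # Call this at the end of every pipeline step.
    conn.execute(
        "INSERT INTO lineage (dataset, source, transform, qa_checks, recorded_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            dataset,
            source,
            transform,
            json.dumps(qa_checks),
            datetime.now(timezone.utc).isoformat(),
        ),
    )
    conn.commit()


def trace(conn, dataset):
    # Walk upstream from a dataset using the latest row for each hop.
    # Stops at a source with no recorded row, or if a cycle is detected.
    steps, seen, current = [], set(), dataset
    while current not in seen:
        seen.add(current)
        row = conn.execute(
            "SELECT source, transform, qa_checks FROM lineage "
            "WHERE dataset = ? ORDER BY id DESC LIMIT 1",
            (current,),
        ).fetchone()
        if row is None:
            break
        source, transform, qa_checks = row
        steps.append(
            {
                "dataset": current,
                "source": source,
                "transform": transform,
                "qa_checks": json.loads(qa_checks),
            }
        )
        current = source
    return steps


# Example usage with an in-memory database and placeholder names.
conn = sqlite3.connect(":memory:")
init_lineage(conn)
record_step(conn, "raw_orders", "s3://example-bucket/orders.csv", "ingest",
            {"row_count_nonzero": True})
record_step(conn, "clean_orders", "raw_orders", "dedupe",
            {"no_null_ids": True})
steps = trace(conn, "clean_orders")
```

Calling `trace` on a dataset answers "where did this come from?" and "what QA checks were applied?" one hop at a time. This sketch assumes each dataset has a single upstream source; steps that join several inputs would need one row per input and a tree walk instead of a simple loop.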