Cloud-native data onboarding architecture for Google Cloud Datasets
Updated May 6, 2026 - Python
Modern serverless lakehouse implementing HOOK methodology, Unified Star Schema (USS), and Analytical Data Storage System (ADSS) principles on Adventure Works. Features programmatic model generation, event-enhanced Puppini bridges, and temporal resolution across DAS/DAB/DAR layers.
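From data to intelligence: building and evolving an enterprise data platform, and Agentic BI in practice. A data developer's notes from eight years of data engineering: building an enterprise pharmaceutical data platform from 0 to 1, then moving to Agentic BI under the Data + AI transformation.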
Data Tools Subjective List
DE or DIE meetup made by data engineers for data engineers. Currently in Russian only.
Sonnet Scripts are a collection of pre-built data architecture patterns that you can quickly spin up on a local machine, along with examples of real-world data that you can use with them.
Data Engineering Digest
Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and pipelines.
Codebase for CCAO data infrastructure construction and management
Watcher is an open-source ETL metadata framework built with FastAPI. It provides a structured way to track pipeline executions, manage lineage, measure data freshness and timeliness, and log anomalies — giving data teams a reliable foundation for monitoring and debugging workflows.
2023 · Automated grant-data integration on Databricks — PySpark, Delta & Great Expectations quality gates feeding downstream consumers.
xDBML (eXtended Database Markup Language) is an open markup language for describing the shape of structured data and its semantics in the declarative metadata attached to that shape -- across heterogeneous storage and exchange technologies.
Master's Final Degree Project on Artificial Intelligence and Big Data
📚 An End-to-End Advanced SQL Project covering Data Warehousing, ETL Pipeline (Bronze → Silver → Gold), Star Schema Modeling, EDA, and Advanced SQL Analytics. Built using PostgreSQL, this project simulates a real-world Data Engineering + Data Analytics workflow using raw ERP & CRM data to generate production-ready customer and product insights.
elevata is an Architecture Runtime for modern data platforms — turning metadata into discoverable, controllable, auditable, and deterministic data architecture.
Development of a deep learning project in the field of agriculture. We will create a simple image classification model that categorizes potato leaf disease using a simple convolutional neural network architecture.
Data Management Plan: Design and architecture for ON LiMiT data
Global Cycling Company Data Warehouse! We gather data from AdventureWorks 2016, utilize tools like SSMS for efficient storage and retrieval, and employ ETL processes for seamless integration and analysis. 🚴‍♂️🛠️
This repository contains all the assignments, workshops, and mini-projects for the course DAMG7370 Designing Advanced Data Architectures for Business Intelligence
ArcKit test project: National Highways Data Architecture