Tim Zhang

Seeking a Data & AI Engineering role where I can apply end-to-end pipeline design and LLM-powered system development to deliver measurable business impact.

Data Engineering · Agentic RAG · Python · SQL · GCP / Cloud · LangChain

About Me

I'm Weijian (Tim) Zhang — a data engineer and AI-driven pipeline architect who bridges the gap between scalable data infrastructure and production-grade AI applications. I hold an MSc in Artificial Intelligence and Business Analytics (Lingnan University, Hong Kong) and bring 8+ years of hands-on delivery across financial services, government technology, and global retail.

My career has followed a clear trajectory: from designing enterprise data warehouses processing billions of daily transactions for banking clients, to building real-time data quality monitoring for Walmart's B2B platform, and most recently architecting Agentic LLM systems for Hong Kong government slope maintenance operations.

I specialize in the full lifecycle — from raw data ingestion and warehouse modeling to LLM orchestration, retrieval-augmented generation, and cloud-native deployment on GCP. Whether it's a 5-layer dimensional model or a 7-layer Agent architecture, I focus on one thing: delivering systems that produce measurable improvements.

Education

MSc in Artificial Intelligence and Business Analytics — Lingnan University, HK (2025–2026)
Bachelor of Management — Shenzhen University, China (2014–2017)

Domain Experience

GovTech (HK) · Banking & Finance · Global Retail (Walmart) · Enterprise BI/DW

Current Focus

Agentic RAG systems · Multi-Agent orchestration · Cloud-native AI deployment · Data pipeline modernization

Skills

Technical Engineering

  • Python (Pandas, NumPy, scikit-learn) → SRR system, LingO, Walmart monitoring
  • SQL & Data Modeling (Star Schema, 5-layer EDW) → Bank of Ningbo DW, Walmart B2B
  • Big Data: Hadoop, Spark, Hive, StarRocks → Walmart real-time analytics
  • LLM / RAG: LangChain, OpenAI API, Gemini, pgvector → SRR Agentic system, LingO RAG
  • FastAPI, Docker, Linux, CI/CD → SRR microservice deployment
  • GCP (Cloud Run, Cloud SQL), Tencent Cloud → SRR cloud deployment, LingO hosting
  • PostgreSQL, MySQL, SQL Server, Kafka → Cross-project database delivery

Business & Domain

  • Enterprise Data Warehouse Design → Bank of Ningbo 5-layer EDW
  • Financial Data Quality & Compliance → Walmart financial monitoring
  • Government Tech & Public Services → SRR slope maintenance for HK Gov
  • B2B / Retail Analytics & Reporting → Walmart B2B contract & credit reporting
  • Requirements Analysis & Stakeholder Alignment → Cross-team delivery across all projects

Tools & Frameworks

  • Orchestration: DolphinScheduler, Airflow → Walmart pipeline scheduling
  • Visualization: Power BI, Tableau, Matplotlib → Walmart financial dashboards
  • Version Control: Git, GitHub → All project repos
  • Frameworks: LangChain, FastAPI, Streamlit → SRR, LingO, RAGagent
  • Monitoring: Custom alerting (Feishu/Lark), logging → Walmart data quality automation

Projects

Projects are organized by relevance to my target role (Hybrid Data + AI Engineer), strength of quantifiable outcomes, verifiability, and technical diversity across GovTech, enterprise data, and LLM systems.

Featured Projects

SRR — Agentic Case Processing System

Hong Kong · 2025–2026
Technical Lead (99.4% contribution) · Lingnan Cup Competition Project

Problem

Hong Kong Architectural Services Department (ArchSD) case processing relied on manual entry and routing, with average processing cycles lasting days and high error rates.

Approach

  • Designed 7-layer Agentic processing architecture with 17 pluggable atomic capabilities
  • Built multi-format document parsing pipeline with field extraction (A-Q categories) for ICC 1823, TMO, RCC channels
  • Integrated OpenAI GPT-4o with text-embedding-3-small for intelligent classification and response generation
  • Implemented pgvector-based semantic case retrieval with RRF fusion ranking
  • Developed 3-tier quality assessment system (L1 keyword → L2 rules → L3 RAGAS LLM-as-Judge)
  • Created self-healing mechanism with Best-of-N selection and differential rollback across 4 failure categories
  • Deployed on Google Cloud Run with Cloud SQL and Docker for auto-scaling
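The RRF fusion ranking in the retrieval step can be sketched as follows — a minimal illustration of Reciprocal Rank Fusion combining a pgvector similarity ranking with a keyword ranking (function and ID names are hypothetical, not taken from the SRR codebase):

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: merge several best-first ranked lists of case IDs.

    k=60 is the conventional smoothing constant; each list contributes
    1 / (k + rank) to a case's fused score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, case_id in enumerate(ranking, start=1):
            scores[case_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# e.g. fuse a vector-similarity ranking with a keyword ranking
vector_hits = ["case-17", "case-03", "case-88"]
keyword_hits = ["case-03", "case-88", "case-17"]
print(rrf_fuse([vector_hits, keyword_hits]))  # → ['case-03', 'case-17', 'case-88']
```

Because RRF works on ranks rather than raw scores, it needs no score normalization across the dense and keyword retrievers.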

Outcome

67 commits, 80,430 lines of code across two iterations. 12/12 requirements fully implemented with end-to-end automation. Reduced processing time by 75% (40 min → 10 min). Projected annual cost savings of HK$6.39M.

Tech Stack

Python 3.11+ · FastAPI · React + TypeScript · PostgreSQL 15 + pgvector · OpenAI GPT-4o · Google Cloud Run + Cloud SQL · Docker

Agentic AI · RAG · LLM-as-Judge · GovTech · Cloud-Native

LingO — Multi-Agent University Consultation System

Lingnan University · 2025–2026
System Architect & Lead Developer

Problem

University students, faculty, and administrators struggled with dispersed policy documents across 30+ departments, leading to 70%+ retrieval failure rates and inconsistent information delivery.

Data

Ingested 200+ policy documents, academic regulations, administrative guidelines, and FAQ corpora across three user personas (student, faculty, admin).

Approach

Built AgenticRAG with Hybrid Retrieval (dense + sparse + reranking), Adaptive User Memory for personalized dialogue, RBAC-based access control, and LLM-as-Judge for answer quality assurance.
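The RBAC gate sits between retrieval and generation, so restricted policy text never reaches the LLM prompt for an unauthorized persona. A minimal sketch of that filtering step (all names and documents are hypothetical, not from the LingO codebase):

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    doc_id: str
    text: str
    allowed_roles: frozenset  # e.g. {"student", "faculty", "admin"}

def rbac_filter(chunks, role):
    """Drop retrieved chunks the requesting persona may not see."""
    return [c for c in chunks if role in c.allowed_roles]

chunks = [
    Chunk("grading-policy", "Grades are released ...", frozenset({"student", "faculty", "admin"})),
    Chunk("salary-scale", "Faculty pay bands ...", frozenset({"faculty", "admin"})),
]
print([c.doc_id for c in rbac_filter(chunks, "student")])  # → ['grading-policy']
```

Filtering at the chunk level (rather than post-generation) keeps the access-control decision auditable and independent of model behavior.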

Outcome

Policy Q&A accuracy >95%. Average retrieval time reduced by >70%. System response within 10 seconds. Demonstrated at Lingnan Innovation Showcase.

Tech Stack

Python · LangChain · OpenAI API · PostgreSQL + pgvector · FastAPI · Streamlit · Docker · GCP

Multi-Agent · AgenticRAG · Personalization · Education · RBAC

Walmart B2B Data Platform Modernization

Walmart China · 2023–2025
Data Engineer — Shenzhen RenruiHR

Problem

Walmart China's B2B financial reporting relied on fragmented data sources across SQL Server, StarRocks, and legacy systems, causing delayed reporting, undetected data anomalies, and manual reconciliation overhead.

Data

414,000+ contract records, monthly B2B transaction partitions (9,000–10,000 records/month), cross-source financial metrics including order amounts, fulfillment types, and credit ledger balances.

Approach

Built cross-source financial data quality monitoring with automated anomaly detection (e.g., fulfillment amount exceeding order amount). Designed B2B reporting iterations covering contract lifecycle, countdown tracking, and credit ledger reconciliation. Implemented Feishu alerting for downstream BI consumers.
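The anomaly rule mentioned above (fulfillment amount exceeding order amount) can be sketched as a configurable-threshold check over rows joined from the two sources — a simplified illustration with hypothetical field names, not the production monitoring code:

```python
def find_fulfillment_anomalies(rows, tolerance=0.0):
    """Flag records where the fulfilled amount exceeds the ordered amount.

    rows: dicts joined from the two sources (e.g. StarRocks and SQL Server);
    tolerance allows a small rounding slack before alerting.
    """
    return [
        r for r in rows
        if r["fulfilled_amount"] > r["order_amount"] * (1 + tolerance)
    ]

rows = [
    {"contract_id": "C-1001", "order_amount": 500.0, "fulfilled_amount": 480.0},
    {"contract_id": "C-1002", "order_amount": 300.0, "fulfilled_amount": 315.0},
]
print([r["contract_id"] for r in find_fulfillment_anomalies(rows)])  # → ['C-1002']
```

In a scheduled pipeline, the flagged records would feed an alerting hook (such as a Feishu webhook) so downstream BI consumers learn of the inconsistency before reports are published.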

Outcome

Reduced data incident detection latency from days to minutes. Achieved automated cross-source validation across StarRocks and SQL Server. Enabled consistent financial reporting for B2B operations across multiple business units.

Tech Stack

Python · SQL · StarRocks · SQL Server · DolphinScheduler · Feishu API · Power BI

Data Engineering · Data Quality · Real-Time Monitoring · Enterprise BI · Retail

Supporting Projects

Walmart Financial Data Quality Monitoring Automation

Walmart China · 2023–2024

Designed and deployed automated cross-source data validation pipelines comparing StarRocks and SQL Server records. Built Python-based monitoring scripts with configurable thresholds, automated Feishu (Lark) alerting, and downstream notification workflows — reducing manual reconciliation effort and improving data incident response time.

Data Quality · Automation · StarRocks · Python

Bank of Ningbo — Enterprise Data Warehouse Optimization

Banking · 2021–2022

Delivered data warehouse optimization and dimensional modeling for Bank of Ningbo, including a 5-layer EDW architecture (ODS → DWD → DWS → ADS → RPT). Optimized SQL performance for large-scale transaction queries using Hive on Hadoop. Supported new credit system data modeling and migration.

Data Warehouse · Hive · Dimensional Modeling · Banking

Resume & Experience

Download my full resume or browse the career timeline below. My experience spans enterprise data engineering, cloud-native AI deployment, and cross-team delivery across banking, retail, and government sectors.

Download Resume (PDF)

AI Engineer / System Architect

Lingnan University — SRR & LingO Projects
Sep 2024 – Present · Hong Kong
  • Designed and built a 7-layer Agentic architecture for HK government slope repair processing, achieving 75% time reduction and HK$6.39M projected annual savings.
  • Architected LingO multi-agent consultation system with AgenticRAG, Adaptive User Memory, and RBAC, achieving >95% policy Q&A accuracy.
  • Deployed containerized AI services on GCP Cloud Run with PostgreSQL + pgvector for semantic retrieval.

Data Engineer

Shenzhen RenruiHR
Sep 2023 – Aug 2024 · Shenzhen
  • Led migration of IDC data warehouse to Tencent Cloud, restructuring into 5-layer architecture (ODS/DIM/DWD/DWS/ADS) and achieving 87.5% processing time improvement with SparkSQL optimization.
  • Built cross-source financial data quality monitoring across StarRocks and SQL Server with automated Feishu alerting, reducing detection latency from days to minutes.
  • Designed B2B reporting iterations covering 414,000+ contract records — contract lifecycle, countdown tracking, and credit ledger reconciliation.
  • Achieved 99.95% data consistency via Apache SeaTunnel incremental syncs and reduced storage consumption by 38% through ORC format optimization.

Data Engineer

Hangzhou Yatop Technology
Apr 2021 – Aug 2023 · Hangzhou
  • Delivered enterprise data warehouse for Bank of Ningbo: designed 5-layer EDW architecture (ODS → DWD → DWS → ADS → RPT) processing billions of daily transactions.
  • Built and optimized Hive/Spark data pipelines for large-scale financial analytics, improving query performance for downstream reporting.
  • Owned data modeling and migration for the new credit system, coordinating requirements across business and technology teams.

Data Engineer

Shenzhen Owned Technology
Feb 2019 – Mar 2021 · Shenzhen
  • Developed and maintained data model layer of Bank of Ningbo's data warehouse, supplying data for downstream financial reporting.
  • Processed data requirements from data market stakeholders with SQL, directly liaising with banking clients for issue resolution.

Engineer

Chengdu HQtimes Technology
Jul 2017 – Nov 2018 · Chengdu
  • Troubleshot system errors for IBM software applications and developed operational scripts with Bash/Python.

Contact