Inside Data Engineering with Erfan Hesami
Join Erfan Hesami as he shares his experience in the world of data engineering, offering insights, exploring challenges, and highlighting emerging industry trends.
Today, we're joined by Erfan Hesami, who’s been working in Data and Analytics Engineering for the last 5 years.
To recap: the series follows a Q&A format, featuring professionals who share their journeys, insights, and challenges.
What to Expect:
Behind the Scenes – Get a close-up view of the real work, rhythms, and responsibilities of data engineers in action.
Getting Started – Dive into the essential skills, tools, and entry points that open doors to a data engineering career.
Industry Watch – Stay informed on emerging trends, evolving tech stacks, and shifts driving the future of data engineering.
The Real Work – Go beyond the theory to explore the gritty, unexpected challenges engineers solve in the wild.
Debunking the Hype – Clear up common myths and misconceptions about what data engineers actually do.
From the Trenches – Learn from the experiences, lessons, and advice of seasoned professionals working in the field.
⭐ If you're curious about data engineering or considering it as a career, this series is for you!
Let’s dive into Inside Data Engineering:
How would you describe Data Engineering?
In one sentence, data engineering means making data accessible for stakeholders to create business value. Although it sounds simple, there are many factors involved: why the data matters, what data is needed, and how it should be handled. The how is especially important, which is why we need to understand how to design systems, architect solutions, and model data in a way that maximises business value while keeping costs optimised.
How did you end up being a Data Engineer?
I have a Bachelor's degree in IT Engineering and worked for about two years as a programmer. Later, I decided to pursue a Master's in Business Analytics with a specialisation in Data Science. After graduating, I started working as a Data Analyst. One of the companies I worked for needed me to maintain and improve their existing data pipelines and implement a data quality framework to address data quality challenges, and that’s what sparked my interest. I realised that the work I was doing closely resembled that of a Data Engineer, and it motivated me to dive deeper. I started learning from the data engineers in the company, observing how they sourced data and which technologies they used, and even asking to explore their code repositories. Over time, I kept learning more and gradually began contributing like a Data Engineer.
You can read the full story in My Journey from Data Analyst to Data Engineer, along with my advice and recommendations for others looking to make the same transition.
What does your day-to-day look like?
My day-to-day typically involves gathering requirements to understand business problems, identifying whether the necessary data sources exist, and determining if any new automation is needed. I also maintain and improve existing pipelines and help address issues related to data quality, governance, or infrastructure. I regularly attend meetings with different stakeholders to understand their needs and explore how our team can support them.
Who do you typically work with across teams?
Software Engineers: We work together to understand how internal/external applications function and explore ways to improve data capture, especially for storing historical data and enhancing data quality at the source. I also support them in identifying what information needs to be collected to solve specific business problems and assist with data modelling for transactional databases.
Vendors: External data engineers or teams who share data with us from outside the company. I collaborate with them to ensure the data is reliable, well-documented, and integrated smoothly into our systems.
Engineering Executives: I consult with them on architectural decisions, technology choices, and long-term solutions to align with strategic goals.
BI Developers: I collaborate with them to provide the necessary data in the right format and at the right frequency for reporting and analytics.
ML/AI Engineers: I work with them to understand what data they need, in what format, and how often. Their focus is on building machine learning models and AI applications to solve business problems.
What real-world business problems do you solve through data?
One of the projects I'm most proud of was creating a 360-degree view of customer data as a single source of truth. Previously, data was scattered across different systems, and teams were generating reports based on inconsistent sources. There was no clear data dictionary, no data lineage, and very limited governance; many people didn’t even know where certain data lived. I took the initiative to centralise and model this data, creating a unified customer view that could be used company-wide. This helped teams build their own data marts more confidently and consistently.
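To make the idea of a unified customer view concrete, here is a minimal Python sketch. It is not the implementation from the project described above: the source systems (`crm`, `billing`), the field names, and the merge rule (earlier sources win, later sources only fill gaps) are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical extracts from two source systems; every name here is illustrative.
crm_rows = [
    {"customer_id": 1, "name": "Acme Ltd", "email": "ops@acme.example"},
    {"customer_id": 2, "name": "Globex", "email": None},
]
billing_rows = [
    {"customer_id": 1, "plan": "enterprise", "email": None},
    {"customer_id": 2, "plan": "starter", "email": "billing@globex.example"},
    {"customer_id": 3, "plan": "starter", "email": "hello@initech.example"},
]


def build_customer_view(sources):
    """Merge per-system rows into one record per customer_id.

    Earlier sources win on conflicts; later sources only fill missing values.
    Each record also lists the systems that contributed to it, which is a
    very crude stand-in for data lineage.
    """
    view = defaultdict(lambda: {"sources": []})
    for source_name, rows in sources:
        for row in rows:
            record = view[row["customer_id"]]
            record["sources"].append(source_name)
            for field, value in row.items():
                # Only fill a field that is still missing in the unified record.
                if record.get(field) is None and value is not None:
                    record[field] = value
    return dict(view)


customer_view = build_customer_view([("crm", crm_rows), ("billing", billing_rows)])
```

In practice this kind of modelling would more likely live in SQL or a transformation tool, and matching customers across systems is rarely as simple as sharing one key; the sketch only shows the shape of the result, one record per customer with a note of where each came from.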
What kind of projects do you work on?
I work on building and optimising data infrastructure, ensuring reliable data pipelines and high data quality, and designing systems that support analytics and AI. This includes integrating data from various sources, managing complex transformations, and improving data accessibility for business teams. Some examples include sourcing and modelling data for an AI Copilot project to support account managers, and helping stakeholders detect unprofitable contracts so they can take timely action.
What kind of data do you work with?
I work with both structured and semi-structured data, including traditional fields and columns, as well as open text.
What data size do you work with?
I have experience working with data at terabyte and even petabyte scale, though in my current role, the data volume is typically a few terabytes.
What tech stack do you use?
Azure: All open-source components are hosted and managed within the Azure cloud environment.
Azure storage: Used to store raw and processed data for analytics and AI use cases.
Postgres: Used as a storage layer for analytical purposes.
Data load tool (dlt): Used to ingest data from various sources efficiently.
Dagster: Used to automate and schedule data workflows.
Data build tool (dbt): Used for transforming and modelling data.
Great Expectations: Used for data quality.
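As a rough illustration of what the data quality step in a stack like this checks, here is a small sketch in plain Python. It deliberately does not use the Great Expectations API; the function names, the `contracts` sample, and the column names are hypothetical and only mirror the kind of rules such a tool expresses (not null, unique, value in range).

```python
def check_not_null(rows, column):
    """Return the indexes of rows where `column` is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


def check_unique(rows, column):
    """Return the indexes of rows whose non-null `column` value was already seen."""
    seen, duplicates = set(), []
    for i, row in enumerate(rows):
        value = row.get(column)
        if value is None:
            continue
        if value in seen:
            duplicates.append(i)
        seen.add(value)
    return duplicates


def check_between(rows, column, low, high):
    """Return the indexes of rows whose non-null `column` is outside [low, high]."""
    return [
        i for i, row in enumerate(rows)
        if row.get(column) is not None and not (low <= row[column] <= high)
    ]


def run_checks(rows):
    """Run every check and map check name -> failing row indexes (failures only)."""
    results = {
        "contract_id_not_null": check_not_null(rows, "contract_id"),
        "contract_id_unique": check_unique(rows, "contract_id"),
        "margin_pct_in_range": check_between(rows, "margin_pct", -100, 100),
    }
    return {name: failed for name, failed in results.items() if failed}


# Hypothetical sample data with one violation per rule.
contracts = [
    {"contract_id": "C-1", "margin_pct": 12.5},
    {"contract_id": "C-2", "margin_pct": 250.0},
    {"contract_id": "C-1", "margin_pct": -4.0},
    {"contract_id": None, "margin_pct": 8.0},
]
failures = run_checks(contracts)
```

An orchestrator could run a step like `run_checks` after each load and stop downstream transformations when the returned dictionary is not empty. A dedicated data quality tool adds things this sketch leaves out, such as reporting and reusable rule definitions.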
What programming languages do you use?
Python and SQL are my primary languages for transformation, tooling, and related work.
What tools do you leverage for GenAI?
I use Claude for programming assistance, as well as Ollama and Llama models for running and experimenting with generative AI locally.
What is your favorite area of Data Engineering?
My favourite areas of data engineering are data quality and data modelling because they are fundamental to building reliable, trustworthy, and scalable data systems.
How can Data Engineering benefit from GenAI?
I use generative AI models to create test cases and generate documentation. Some open-source tools have model-powered documentation that allows you to quickly search and find relevant information, making development and troubleshooting much more efficient.
What advice would you give your past self as a beginner Data Engineer?
For beginners, I’d suggest focusing on the fundamentals first and not getting lost chasing certifications alone. Early on, I thought that collecting certifications for specific platforms would make me stand out. But I soon realised that not all companies use or require those platforms. More importantly, it’s crucial to understand how these technologies work and to explore the open-source ecosystem. Working on side projects can really help build practical skills. I’m not saying certifications aren’t valuable, but always ask yourself why you’d choose one technology over another; understanding your options is key.
What are some challenging aspects of Data Engineering?
One of the biggest challenges is making decisions across many areas, especially when it comes to architecture. It’s critical to think carefully and always consider the overall system design. Choosing the right tools can also be difficult because new tools constantly emerge, and deciding which one fits best requires experience and judgment. Additionally, there are many ways to implement solutions, and every engineer or team tends to find their own approach, which adds complexity and variability.
What are some common misconceptions about data engineering?
Data engineering is just about creating data pipelines.
Data engineering is only writing SQL.
Data engineers should only focus on engineering tasks.
Reality:
Data engineers should also understand analytics and be good analysts, especially in smaller teams.
When stakeholders bring data problems, data engineers often analyse feasibility and provide insights.
Creating dashboards, charts, and telling data-driven stories can also be part of the role.
Data governance and data quality are critical responsibilities that data engineers must manage to ensure trustworthy and compliant data.
What advice do you have for beginners?
Don’t jump into trending technologies just for the sake of it. Some tools are powerful, but not every company uses them, and not every data problem requires them.
Focus on mastering the fundamentals first; don’t sacrifice core understanding just to follow hype.
Learn from others by reading articles and books and by attending meetups and conferences. A book like Fundamentals of Data Engineering by Joe Reis and Matt Housley really helped me connect the dots and build a strong foundation.
Most importantly, apply what you learn through hands-on projects to build real understanding and confidence.
I hope this article was helpful for the readers. Thanks to Erfan for sharing his experience with my audience. Stay tuned for more!
Please reach out if you would like:
To be a guest and share your experiences and journey.
To provide feedback and suggestions on how we can improve the quality of the questions.
To suggest guests for future articles.








