The post-modern data stack is coming. Are we ready? – The article discusses the evolving landscape of Data Engineering and how emerging trends, such as Zero-ETL, ChatGPT, and data product containers, are poised to shape its future.
Zero-ETL: Aiming to disrupt data ingestion, Zero-ETL shifts cleaning and normalization into the transactional database itself, so data can be loaded into the data warehouse without a separate pipeline. This works best when the transactional database and data warehouse come from the same vendor, as with AWS (Aurora to Redshift), GCP (BigTable to BigQuery), and Snowflake (Unistore). Despite challenges such as handling database schema changes, Zero-ETL simplifies infrastructure and enables no-copy data sharing.
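To make the disruption concrete, here is a minimal sketch of the conventional extract-transform-load steps that Zero-ETL aims to collapse into the database layer. All names (the three functions, the sample rows) are illustrative, not part of any vendor's API.

```python
# The three classic ETL stages that a Zero-ETL integration would absorb
# into the transactional database / warehouse pair.

def extract(transactional_rows):
    """Pull raw rows from the transactional database."""
    return list(transactional_rows)

def transform(rows):
    """Clean and normalize: drop incomplete rows, standardize casing."""
    cleaned = []
    for row in rows:
        if row.get("email"):
            cleaned.append({"user_id": row["user_id"],
                            "email": row["email"].strip().lower()})
    return cleaned

def load(rows, warehouse):
    """Append transformed rows to the warehouse table."""
    warehouse.extend(rows)
    return warehouse

warehouse = []
raw = [
    {"user_id": 1, "email": "  Ada@Example.com "},
    {"user_id": 2, "email": None},  # incomplete row, dropped in transform
]
load(transform(extract(raw)), warehouse)
print(warehouse)  # → [{'user_id': 1, 'email': 'ada@example.com'}]
```

Under Zero-ETL, the middle step runs inside the source database, which is also why schema changes there can break downstream consumers.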
ChatGPT: The article also explores the potential of Large Language Models (LLMs) such as ChatGPT to revolutionize data transformation. With AI-assisted natural-language queries, LLMs could enable self-service analytics over data stored in a single wide table rather than many normalized, linked tables. The AI would perform the analysis and extract insights directly, disrupting infrastructure by reducing the need for normalized relational schemas. Significant challenges remain, however, before this approach can be fully implemented.
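The pattern described above can be sketched as follows: a natural-language question plus the schema of one wide table is sent to an LLM, which returns SQL to run against it. The model call is stubbed here (`fake_llm` returns a canned query); in practice it would be a real model endpoint, and the function names are assumptions for illustration.

```python
# Sketch of AI-assisted natural-language querying over a single
# denormalized table, with the LLM call stubbed out.

import sqlite3

def fake_llm(prompt):
    # Stand-in for a real LLM; returns a canned query for this demo.
    return ("SELECT country, SUM(amount) AS total "
            "FROM orders GROUP BY country ORDER BY country")

def ask(question, schema, conn):
    prompt = (f"Table schema: {schema}\n"
              f"Question: {question}\n"
              "Respond with a single SQL query.")
    sql = fake_llm(prompt)
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
# One wide table instead of several linked, normalized ones.
conn.execute("CREATE TABLE orders (order_id INT, country TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "US", 10.0), (2, "DE", 5.0), (3, "US", 7.5)])
print(ask("Total sales by country?",
          "orders(order_id, country, amount)", conn))
# → [('DE', 5.0), ('US', 17.5)]
```

The open challenges the article alludes to live mostly in the stubbed step: getting the model to generate correct, safe SQL reliably.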
Data Product Containers: Part of the cloud-native microservices movement, these containers layer functionality over data tables. Because they are portable and abstract away the underlying infrastructure, they scale well in the cloud. Data product containers make data more manageable and accessible, offering a promising way to enhance data management.
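One way to picture the idea is a small wrapper class that layers ownership metadata, a quality check, and a controlled read interface over a plain table. The class and field names below are assumptions for illustration, not an existing framework's API.

```python
# Illustrative "data product container": functionality layered over a
# data table, accessed only through the container's interface.

class DataProductContainer:
    def __init__(self, name, owner, rows, required_fields):
        self.name = name
        self.owner = owner                    # accountable team
        self._rows = rows                     # the underlying data table
        self.required_fields = required_fields

    def quality_check(self):
        """Every row must carry all required fields."""
        return all(all(f in row for f in self.required_fields)
                   for row in self._rows)

    def read(self):
        """Consumers get data only if the quality check passes."""
        if not self.quality_check():
            raise ValueError(f"{self.name}: quality check failed")
        return list(self._rows)

product = DataProductContainer(
    name="daily_orders",
    owner="sales-analytics",
    rows=[{"order_id": 1, "amount": 10.0}],
    required_fields=("order_id", "amount"),
)
print(product.read())  # → [{'order_id': 1, 'amount': 10.0}]
```

Packaging the table with its checks and metadata is what makes the unit portable: consumers depend on the interface, not on where the data physically lives.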
The article concludes by emphasizing the continued importance of human data engineers in extracting value from data, even with the ongoing technological advancements in the field. Human expertise will remain indispensable as the landscape of data engineering continues to change and adapt to these new disruptors.