Azure

Prompt Engineering - Productivity in Hyperdrive

In my recent data engineering quest, I had to wrangle data from seven sources and fill our data warehouse using Azure Data Factory. The end result? Twenty-one pipelines doing a synchronized dance, populating twenty-six tables. Back in the pre-ChatGPT era, this Herculean task would’ve demanded three full-time heroes for six months. Enter ChatGPT: in just five months, with the power of 1.25 FTEs, Mission Accomplished. But that’s not all! ChatGPT Playground’s prompts turned me into a coding sorcerer....

February 3, 2024 · 3 min
MIT logo

API Pagination in the wild

In the dynamic API economy, service providers acknowledge the crucial role of API monetization in creating revenue streams. APIs have evolved into the backbone of modern software development, facilitating seamless data exchange between applications. As their complexity and usage increase, the need for efficient data retrieval becomes paramount. Recognizing the symbiotic relationship between API monetization and effective data handling is key for service providers to navigate the evolving landscape successfully, ensuring not only a seamless user experience but also the maximization of revenue potential....

January 31, 2024 · 3 min
DataBricks-AutoLoader

Declarative Programming - Put ROI on steroids by focusing on person-hours, not compute costs

Declarative programming will expedite the Return on Investment in your modern data warehouse implementation. I recently watched Michael Armbrust’s insightful presentation at the Data + AI Summit 2023, which he opened by emphasizing the advantages of declarative programming. Throughout his talk, he delved into the innovative features of Delta Live Tables, DLT Pipelines, HMR, serverless computing, Enzyme, and more. His primary message, “Declarative programs specify what should be done, not how to do it,” left a lasting impression....

November 12, 2023 · 3 min
Platform scale

The Invisible Hand is the new Iron Fist

In “Platform Scale for a Post-Pandemic World”, Sangeet Paul Choudary sets out a platform manifesto: The Invisible Hand is the new Iron Fist. “In a networked age, we are moving from a world of command and control to a self-serve world where participation is encouraged through an invisible hand powered by data, API and algorithms.” (Chapter 1.4, page 45) Adam Smith’s concept of the Invisible Hand, outlined in his renowned 1776 work “The Wealth of Nations”, proposes that individuals pursuing their own self-interest inadvertently contribute to the greater good of society....

October 21, 2023 · 3 min
MLFlow

MLflow 101 - Dev to Production Environment in a single notebook

MLflow is an open-source platform designed to streamline the machine learning (ML) lifecycle. It was developed by Databricks and has gained widespread adoption within the data science and machine learning communities. MLflow provides a comprehensive set of tools and libraries for managing the end-to-end process of developing, deploying, and maintaining machine learning models. Key features of MLflow include: Experiment Tracking: MLflow allows data scientists and engineers to keep track of their experiments....

September 16, 2023 · 2 min
Azure

Azure Data Factory Copy Activity: The Swiss Army Knife of Data Integration

Think of Azure Data Factory’s Copy Activity as the Swiss Army knife of data integration. Just as a Swiss Army knife combines various tools into one compact design, Copy Activity seamlessly combines multiple data integration functions into a single, powerful feature. Versatility: Just like a Swiss Army knife’s ability to handle various tasks, Copy Activity connects to diverse data sources and destinations, whether they’re in the cloud or on-premises. Efficiency: Like a Swiss Army knife’s efficiency in handling different tasks, Copy Activity streamlines data movement and transformation in a single step....

August 28, 2023 · 3 min
Azure

Digital Twins: Bridging the Physical and Virtual Worlds

In a world fueled by technology, the concept of Digital Twins has emerged as a groundbreaking bridge between the physical and digital realms. Imagine having a virtual counterpart for real-world objects, processes, or systems that mimics their behavior and characteristics in real time. This technology is not just about static models – it’s about creating dynamic, synchronized virtual entities that offer a wealth of benefits across industries. Digital Twins rely on a mix of technologies like IoT, data analytics, machine learning, and cloud computing....

August 12, 2023 · 2 min
Azure API Management

Secure Azure APIs using Azure Active Directory B2C - Sequence Diagram using Mermaid.js

For an enterprise architect, text-based diagrams can be a smarter approach for several reasons. They offer superior portability and accessibility: they are lightweight and can be shared as plain text files through various communication channels, reaching a broader audience without the need for specialized software. Furthermore, because they can be version-controlled with tools like Git, text-based diagrams facilitate seamless collaboration among team members, enabling efficient tracking of changes and supporting geographically dispersed teams....

July 24, 2023 · 6 min
MIT logo

Non-Linear Regression models

Non-linear regression is a powerful technique used in statistics and machine learning to model and analyze complex relationships between variables. Unlike linear regression, which assumes a linear relationship between the independent and dependent variables, non-linear regression algorithms can capture more intricate, non-linear patterns. This makes them particularly useful when dealing with real-world phenomena where linear relationships are insufficient. The following non-linear regression models are covered in the attached Jupyter notebook...

July 2, 2023 · 3 min
MIT logo

MIT-GreatLearning Applied Data Science program Capstone Project

During my participation in the MIT-GreatLearning Applied Data Science Program, I experienced an intensive curriculum that encompassed diverse subjects such as Statistics, Python programming, Machine Learning, Neural Networks, and Recommendation Systems. The culmination of the program was the Capstone project, which aimed to predict Loan Default. This involved crucial steps like Data Preparation, Exploratory Data Analysis, constructing models, and generating a comprehensive performance report for the various models employed. The final outcome is a binary classification....

June 17, 2023 · 2 min