Data Architecture: Challenges with Azure
For the past 8 months, I have been dealing with Azure, solving Data Management problems: ETL, Validation and Governance, the usual.
At first, I was excited. Azure is Azure: opinionated, developer friendly and intuitive.
I did a couple of Udemy courses on basic data management setups with Azure Data Factory and Azure Synapse, and felt confident that I could handle any challenges.
I also felt fairly confident that I could build this Data architecture that Azure recommends.
Then I saw her face. Now I'm a believer..
But.. some things in Azure were overly complex, and the documentation just led us down paths that ended in dead ends.. sometimes.
Here are some recent challenges I had with Azure (initially), especially when setting up Data and ETL pipelines..
The Challenges
IAM setup is.. wow.. it wasn't straightforward
Set up users. Set up user groups. Set up Resource groups. Set up so many groups. I was spending a lot of time just setting up groups. And access rights. And privileges. Only to find out that the permission I thought would give Data Factory access to Blob Storage didn't actually give it access to Blob Storage.
Now I know what you might be thinking: no, it isn't tough, we have been setting up Users, Roles and permissions in Azure for ages now, it's so easy. Well, I didn't find it as intuitive as I hoped it would be. I figured it out eventually, but it wasn't.. well.. intuitive!
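If you hit the same wall, one common culprit is assigning a control-plane role like Contributor when Blob Storage's data plane needs its own role, such as Storage Blob Data Contributor. Here's a rough sketch of that role assignment in Python, assuming a recent azure-mgmt-authorization; every ID in it is a placeholder, not my actual setup.

```python
# Rough sketch: grant a Data Factory's managed identity data-plane access to
# Blob Storage. Assumes a recent azure-mgmt-authorization (flat create
# parameters); all IDs below are placeholders.
import uuid

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters

subscription_id = "<subscription-id>"
storage_scope = (
    f"/subscriptions/{subscription_id}/resourceGroups/<resource-group>"
    "/providers/Microsoft.Storage/storageAccounts/<storage-account>"
)
adf_principal_id = "<object-id-of-the-data-factory-managed-identity>"

client = AuthorizationManagementClient(DefaultAzureCredential(), subscription_id)

# Control-plane roles like Contributor don't grant access to blob data;
# look up the data-plane role by name instead of hard-coding its GUID.
role = next(
    client.role_definitions.list(
        storage_scope, filter="roleName eq 'Storage Blob Data Contributor'"
    )
)

client.role_assignments.create(
    scope=storage_scope,
    role_assignment_name=str(uuid.uuid4()),
    parameters=RoleAssignmentCreateParameters(
        role_definition_id=role.id,
        principal_id=adf_principal_id,
        principal_type="ServicePrincipal",  # managed identities are service principals
    ),
)
```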
Setting up Azure Purview looked easy at first.. but it wasn't
I kept getting cryptic permission errors when I wanted to see the Reports from a scan I had run. I was puzzled: I had given myself permissions in Purview to see just about everything, yet it was blocking me. The documentation wasn't giving me any hints.
Then I asked GPT, and it seemed to have an idea of what I had to do. I learnt that when you create Data Domains in Purview, those get their own permissions. At the very least, the error message could have told me where I was being blocked. But nope. I wasted a couple of hours figuring this out.
Trying to create a Python-based Azure Function took me.. 2 days!
I tried to follow the documentation to use VS Code to develop the function. Installed the Azure extensions in VS Code. Logged in. Tried to push my function code and got some weird ENOENT error that I couldn't make heads or tails of. Googling and GPTing didn't help. I scoured the forums, Stack Overflow, the documentation, and finally ended up on a Medium article that alluded to the fact that we need a function.json file. Also, I had my code in the root folder, when in fact the function's Python file should be in its own folder. What the actual f#$k!
Finally, after toiling for a couple of days, I got my function to work. No thanks to the documentation. Why can't they just help me out with a simple wizard to create functions? They know the settings. It doesn't have to be this hard!
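For anyone else fighting the same layout, here's roughly what the working structure looks like: the function's Python file sits in its own folder, named after the function, next to a function.json that declares its bindings, with host.json and requirements.txt at the root. A minimal sketch, assuming the v1 Python programming model and an HTTP trigger (the trigger type, folder and names are placeholders for illustration, not necessarily what I was building):

```python
# MyFunction/__init__.py: minimal sketch of an HTTP-triggered function in the
# v1 Python programming model. The trigger type is an assumption; use whatever
# binding your function.json declares.
#
# Expected layout:
#   host.json
#   requirements.txt        # must include azure-functions
#   MyFunction/
#       __init__.py         # this file; the folder name is the function name
#       function.json       # declares an "httpTrigger" in-binding named "req"
#                           # and an "http" out-binding named "$return"
import azure.functions as func


def main(req: func.HttpRequest) -> func.HttpResponse:
    # Echo a query parameter back, just to prove the bindings are wired up.
    name = req.params.get("name", "world")
    return func.HttpResponse(f"Hello, {name}!", status_code=200)
```

(For what it's worth, the newer v2 Python programming model moves the bindings into decorators and drops function.json entirely.)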
Azure Data Factory cannot use a Self-Hosted Integration Runtime
Well, it can, but only for Copy Data activities.
They also don't tell you that the Azure Integration Runtime and AutoResolve Integration Runtime get IP addresses allocated at random, from an exhaustive list of IP ranges per region, and that list gets updated from time to time.
So what's the problem?
You see, if your Data Factory needs to use the integration runtime to talk to a third-party system, and they are hell-bent on whitelisting IP addresses, then there is a huge list of CIDR blocks that they would have to whitelist.
This is the link to that list: Azure IP Ranges and Service Tags
Have at it. So we couldn't use AIRs to connect to the third-party system, which was Snowflake btw. Instead, I had to use Self-Hosted Integration Runtimes, which don't really perform as well as AIRs.
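If you want a feel for just how long that whitelist gets, here's a rough sketch that counts the Data Factory CIDR blocks for a single region from the downloaded Service Tags file (the file name and region tag below are placeholders; the actual download has a date stamp in its name):

```python
# Rough sketch: count the CIDR blocks a third party would have to whitelist
# for Azure Data Factory in one region. Assumes you've downloaded the
# "Azure IP Ranges and Service Tags - Public Cloud" JSON; the file name and
# region tag are placeholders.
import json

with open("ServiceTags_Public.json") as f:
    service_tags = json.load(f)["values"]

region_tag = "DataFactory.WestEurope"  # pick the tag for your region
prefixes = next(
    tag["properties"]["addressPrefixes"]
    for tag in service_tags
    if tag["name"] == region_tag
)

print(f"{region_tag}: {len(prefixes)} CIDR blocks to whitelist")
print(prefixes[:5], "...")
```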
Now, I know you will argue that AIRs are meant for internal pipelines, pipelines internal to Azure. However, this isn't evident, and they haven't made it easy to find out either.
Also, they should support scalable integration runtimes for sources like Databricks and Snowflake that can handle that scale. I dunno, I wouldn't want the source system to be a bottleneck.
Running Data Factory in Debug mode is.. well.. slow
At least, it was for a few days. It took time for the pipeline to start up and run in Debug mode. (It waited a while before getting the compute.)
Even then, the logs wouldn't tell me the status, not specifically anyway. It was a generic status update, and we weren't really sure what was going on with the pipeline.
A very good example was a Mapping Data Flow I was trying to debug one day. It hung for about 9 minutes before telling me it had failed, over a logic error. 9 minutes for me to find out I had made a mistake!
And on another occasion, this happened:
It was stuck like this for 2.5 hours; no logs to tell me what in God's name it was trying to do. I had to kill the pipeline and restart it, only for it to execute within 15 minutes the next time, successfully.
Make sure your Storage and Pipelines are in the same region
We learnt the hard way that the pipeline expects the storage to be in the same region, or you get a weird "IP not found" error. No idea whether this is by design.
(An unrelated quote btw)
We know you've tried because you've had problems
Azure has a way to go before it becomes, well, forgiving and friendly. I guess maybe this isn't such a bad thing after all. If platforms remain this complicated, ain't no way AI is taking over our jobs.
However, let me come right out and say it: I still love Azure as a platform. For everything it gets wrong, it gets twice as many things right.
Since we got past the teething issues, we have built some extremely complex systems in Azure. At scale. It performs flawlessly, and it has great instrumentation with Monitor and App Insights.
Once you get the hang of it, Purview is fun. It actually is a lot of fun: looking for your data, cataloguing it and classifying it. There's something about neatly organizing your data into hierarchies and folders that feels.. satisfying!
In hindsight, I might have gotten some things wrong. The documentation doesn't tell me if I did. Maybe some of you will tell me if I was just being paranoid.. or delusional. I would love to know your opinions!
Leave a comment and let me know if you guys faced similar issues with Azure.
Follow me, Ritesh Shergill,
for more articles on
AI/ML
Tech
Career advice
User Experience
Leadership
I also do
Career Guidance counselling: https://topmate.io/ritesh_shergill/149890
Mentor Startups as a Fractional CTO: https://topmate.io/ritesh_shergill/193786