The roles in data

For those who are not familiar with the different roles in the data field, this is what I’ve learned about them so far. There are data analysts, data architects, data engineers and data scientists.

A data analyst works on, no surprise, analysing data. Often these are people preparing reports (for instance on sales performance) for management. When working in Microsoft Azure, you would spend a lot of time using Power BI to create reports, using polished data sets to perform transformations and calculations. Your goal is to create visualizations that anyone else in your organisation can easily understand.

To create those polished data sets for analysts to work with, you need a data infrastructure. That’s where the data architect comes in. An architect thinks through what the data needs are on one end, knows the quirks of the raw data coming in at the other end and then designs the infrastructure in between.

The data engineer then uses the input from the data architect to build the infrastructure.

Sometimes you want to dig deeper into your data and discover more complex patterns. That’s where the data scientist comes in. These math wizards apply statistical, machine learning and AI models to data and are able to tweak these models using their mathematical knowledge.

I was rather surprised to learn that working in data, a relatively new field, has already split up into so many roles. And that is without even mentioning all the specialisations hardcore programmers can choose within these roles. For instance, my trainer spent many years working solely on optimising SQL statements for a living.

My training mainly prepares me for two roles: the analyst and the engineer. I’m most definitely happy with all the skills I learned about using Power BI. They will help me a lot when I start digging for stories in data sets. The data engineering part is absolutely not my cup of tea. It’s really theoretical: an in-depth crash course on database management and Azure cloud infrastructure. You can compare it to fitting the wiring and pipes in your home. It needs to be done, otherwise you can’t live in your home, but it’s not as exciting as decorating your home. At least I get more excited about decoration and design than fitting pipes. That said, I’m still very happy to get a solid understanding of the inner workings of databases and the cloud tools one needs to create usable data from raw data. It helps me to instruct others to click the right buttons, write the proper SQL statements and build the pipelines for me, so that I can dig for the data story gold 😉

Posted 23 June 2021 | datascience, flow

Transforming and visualising data using Power BI

Over the past two weeks I was introduced to the ins and outs of Power BI. For four full training days I practised transforming columns, creating calculated measures and dragging columns and measures into visualisations. For those who are not into data analysis, Power BI is a piece of software developed by Microsoft to handle data sets. When spreadsheets are no longer sufficient to handle your data, you can step up your game by using Power BI.

Before this training I practised with SQL and Python to create scatter plots and calculate summations, and I have to admit that after using Power BI I finally understand what kind of actions I was doing to data sets when using Python. Power BI is a visual tool, so you click on the transformations you need to do to prepare your data and the results are immediately visible. And you can easily undo a step with one click.
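To make that a bit more concrete, here is a minimal sketch in plain Python of the two kinds of actions I mean: adding a calculated column and summing it per group. The sample rows and the names `rows`, `add_revenue` and `total_by_region` are made up for illustration and are not taken from the course material.

```python
from collections import defaultdict

# Hypothetical sample data: every value here is invented for illustration.
rows = [
    {"region": "North", "units": 3, "price": 10.0},
    {"region": "South", "units": 5, "price": 8.0},
    {"region": "North", "units": 2, "price": 12.5},
]


def add_revenue(records):
    """Return new rows with a calculated 'revenue' column (units * price)."""
    # Build new dicts so the original rows stay untouched.
    return [{**r, "revenue": r["units"] * r["price"]} for r in records]


def total_by_region(records):
    """Sum the 'revenue' column per region."""
    totals = defaultdict(float)
    for r in records:
        totals[r["region"]] += r["revenue"]
    return dict(totals)


with_revenue = add_revenue(rows)
totals = total_by_region(with_revenue)
print(totals)
```

Each function is one transformation step. Because `add_revenue` returns new rows instead of changing the originals, going back a step simply means using the earlier list again.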

I wouldn’t say Power BI is data analysis for dummies, because you still need to understand conceptually what you’re doing to the data, but I totally see why many people prefer using Power BI over messing about with Python. It is visual and quicker, and it can create interactive reports and dashboards. The reporting part is (for now) the least interesting to me, as I don’t work in a big company with lots of (sales) data that needs to flow through the organisation. However, after the past weeks I do feel more confident that I’m capable of getting meaningful information from data sets. And that was the whole point of investing in this course.

Posted 4 May 2021 | datascience, flow