Read: Het Smelt (The Melting), Lize Spit

It’s rare for a book to leave echoes in my head after I finish reading it. This one did, though. For a number of days.

The book is set in a small village in rural Flanders (Belgium). The story is told from the perspective of a woman who grew up there. Only three kids were born in the village in her year of birth: the woman and two boys. They formed a class of three and bonded for lack of others. The woman looks back on the events of one summer while returning to the village years later to attend a birthday reunion for a lost brother of one of her friends.

I read a reader review of the English translation of this book, and the title of that review was ‘Just another coming of age story.’ It’s not. It’s about parental neglect. It’s about siblings dealing with that reality in their own pre-adulthood way. It’s about loneliness. It’s about adolescent play that can unexpectedly turn into something far beyond play. It’s about the long-term consequences of living through all those things combined.

It took me a while to finish this book (480 pages). The first half seemed to drag on forever, especially since I had to read it in small instalments. The lonely quarantine evening hours on the couch, while both Daughter and Man were already in bed, gave me the opportunity to finish the second half more quickly. I can now conclude that everything written does have a point to make at the very end of the book. Just keep on reading when you think you want to quit. My advice: read this book when you have plenty of uninterrupted time to immerse yourself in it.

Posted 25 November 2021 | flow | 0 comments

Writing understandable letters really is a craft

Daughter got to visit the school doctor for the first time for the standard growth check by the GGD (the municipal health service). You know, weight, height, hearing and eyesight are measured. This is the first visit without a parent or carer present, so I was sent a letter with the results. Reading the letter, I had to chuckle now and then. Its design and content are a mishmash of jargon and data processing on the one hand and an attempt to keep things understandable on the other.

In a table titled “0 – Contactmoment 5-6 jaar DA (registratie, dinsdag 26 oktober 2021)” (contact moment age 5-6, registered on Tuesday 26 October 2021) the results of the examinations are recorded. In bold you see the headings “Groei” (growth), “Oogonderzoek verzien” (eye test, distance vision) and “Gehoor” (hearing). That is easy to understand.

Under the heading “Groei” are weight in kg, height in cm and BMI in kg/m². I had to laugh at the mention of kg/m² for the BMI. Technically it is correct: you calculate a BMI by dividing the weight in kilos by the square of the body height in metres. But nobody else ever uses the unit of BMI. Incidentally, the number comes with no further explanation of what healthy values are. Only further on in the letter can I conclude that there is no reason for a follow-up appointment. So her BMI will be fine.
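As a small illustration of where that kg/m² unit comes from, here is a minimal sketch of the BMI calculation described above. The weight and height are made-up values, not the ones from the letter, and the function name `bmi` is only chosen for this example.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight in kg divided by the square of height in metres."""
    height_m = height_cm / 100  # the letter lists height in cm, the formula needs metres
    return weight_kg / height_m ** 2


# Hypothetical measurements, purely for illustration.
example_bmi = bmi(weight_kg=20.0, height_cm=115.0)
print(f"BMI: {example_bmi:.1f} kg/m²")
```

The division of kilograms by square metres is exactly why the unit ends up as kg/m², even though it is rarely written out.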

After the BMI comes the next part, the eyes. The description reads “Conclusie visusbepaling” (conclusion of the visus assessment). I know enough about language to work out what ‘visus’ must mean in this context, but even my eyes stumbled over this word.

The last row in the table is about hearing. “Uitslag drempelonderzoek” (result of the threshold test), the description says. Now, I don’t immediately associate thresholds with hearing, but to the ‘insiders’ it will be perfectly obvious which test was done.

Below the table is a new heading, “Vervolg:” (follow-up). Four short paragraphs, or long sentences, follow. The first two sentences start with an O, the last two with an X. After reading the first two sentences, which are about making a follow-up appointment, combined with the two instances of “prima” (fine) in the table about Daughter’s hearing and sight, I understand that no alarm bells went off during Daughter’s check-up. And indeed, the paragraph marked with the first X begins with “Er is geen vervolgafspraak nodig.” (No follow-up appointment is needed.) Clear. And right after that they confuse me again: “Het vervolgonderzoek van groep 1-2 vindt plaats in groep 7 en het vervolgonderzoek van de 10/11 jarigen vindt plaats in klas 2 van de middelbare school.” (The follow-up examination for group 1-2 takes place in group 7 and the follow-up examination for the 10/11-year-olds takes place in year 2 of secondary school.) I reread the sentence twice. How many more visits to the school doctor will there be? Aha! What they mean to say is: the next examination will not take place until group 7, when your child is 10 or 11. In secondary school there is one more (final?) check, when she is in the second year.

Then the front page ends with an X in front of “Zie extra toelichting” (see additional notes). Apparently there is something on the back of the sheet as well. I turn the page over and see, in a box, “Opmerkingen en/of toelichting:” (comments and/or notes). I read “Gezellig, vlot meisje!” (A sociable, lively girl!). I wholeheartedly agree with that, of course. Nice that the doctor left a sign of personal interaction in this spot. Now let’s hope that a colleague from the communications department will someday come up with the idea of giving the whole letter a more understandable (and warmer) touch.

Posted 3 November 2021 | flow | 0 comments

Lessons learned while researching data to find an answer.

Yesterday I published a data story on this blog. That was a first of its kind for me. Of course I’ve used graphs before in posts, but that was always reusing other people’s work. This time I did the data work myself. Here is an unstructured list of the things that I learned while doing it.

  • You start with downloading one dataset, but you’ll always need more data. My starting point was to find data on total houses in the country. The institute CBS has plenty of data available on their Statline website. I quickly found a data set with exactly what I needed: ‘Voorraad woningen; standen en mutaties vanaf 1921’. But of course, when you’re trying to find an answer to the question why housing is so expensive, you’ll need to compare it to population size. Therefore you need to download other data sets as well. For instance population growth;
  • Statline doesn’t always give you all the data available. In my exploration I first downloaded a dataset with numbers on population size starting in 1950. I used this mostly for compiling the graphs, only to find out later that there is another data set available that provides population data starting in 1900. My lesson here is to always dig for more when it comes to using CBS’s data;
  • Exploring data becomes messy rather quickly. I downloaded several data sets and used PowerBI to create a dimension table for ‘year’ and added this column to all tables, so that I could use all data across the tables. This phase is needed to discover what’s happening, but it gets more difficult to keep track of which columns you used from which table with each data set you add;
  • PowerBI is a very handy tool for exploring and combining data sets;
  • After the exploration phase, when I discovered the story the data was telling me, I created a new data set only containing the data that I needed. This way I couldn’t pick the wrong column when making the visuals;
  • To create relationships between the tables I used a ‘Year’ dimension table but only used it as a whole number column. I should have created a proper date dimension table to make it even easier to create relationships between the tables (as my teacher already told me to do with every new data model);
  • PowerBI Desktop is not the best tool for creating output outside the Microsoft PowerBI sphere. PowerBI is mainly meant for building ‘live’ dashboards used inside companies via PowerBI service, the online platform accompanying PowerBI. You can publish a report to the service so that others inside your company can look at it. However, I want to publish the visuals on my blog. The only thing I can use from PowerBI Desktop is a PDF export. Luckily I know how to use Photoshop and was able to transform each PDF page into a PNG rather quickly, but that means extra steps between producing and publishing. Rather annoying when you have many graphs;
  • It’s easier to create new columns using a simple calculation in a spreadsheet than to use PowerBI’s DAX formulas to get the same result. In PowerBI I only succeeded in doing calculations on columns within the same table, not across tables;
  • You need reflection time on what you’re doing with the data. I started exploring the data more than two weeks ago, and only after I showed someone my unpublished post did I discover a flaw in my thinking. In one of my graphs I plotted three lines, two of which were cumulative totals of population and houses, while the third was a yearly count of the migration surplus. I was comparing apples and pears to make a point. I corrected this and created a new graph comparing births, deaths and migrants, all cumulative since 1950.
  • I want to learn how I can create interactive SVG-plots on my website so readers can see the actual data behind the graphs.
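
Two of the lessons above, linking tables through a shared ‘Year’ column and calculating a new column across tables, can be sketched in a few lines of plain Python. This is only an illustration and not the PowerBI model from the post: the numbers are made up, and the names `join_on_year` and `people_per_house` are hypothetical.

```python
# Hypothetical yearly figures, NOT real CBS data.
houses = {1950: 100, 1951: 104, 1952: 109}
population = {1951: 410, 1952: 418, 1953: 425}


def join_on_year(**tables):
    """Inner-join several {year: value} tables on the years they all share."""
    shared_years = set.intersection(*(set(table) for table in tables.values()))
    return {
        year: {name: table[year] for name, table in tables.items()}
        for year in sorted(shared_years)
    }


combined = join_on_year(houses=houses, population=population)

# A calculated column that needs values from both source tables.
for row in combined.values():
    row["people_per_house"] = row["population"] / row["houses"]

for year, row in combined.items():
    print(year, row)
```

Only the years present in every table survive the join, which makes gaps between data sets visible early instead of hiding them in a graph.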
Posted 2 November 2021 | dataanalyses, datascience, flow | 0 comments

Exploration: why are rents and housing prices going through the roof?

One of the recurring themes of the past few months, while we’re waiting for a new cabinet to form (since March 2021, the longest formation in Dutch history), is rising housing prices. Many people who want to move house simply can’t find a place to live unless they’re willing to pay more than they can afford. Too few houses are for sale compared to the number of people looking for a new home, resulting in people bidding well beyond the asking price. Rents are going through the roof, resulting in ridiculous prices in the bigger cities for a ‘house’ that includes a kitchen, a bedroom, a bathroom and a living area, all on less than 20 m². It’s like living in an Ikea cubicle and you pay half your starter salary for it.

Many reasons for rising prices are given. Professional and amateur real estate investors are blamed. They take houses off the market to redo them, break them up into smaller units (rooms even) and then rent them out for ridiculous prices. And then there are more and more migrants coming into our country who occupy our homes. Or it’s the ridiculously low interest rate for mortgages, so people can borrow more and thus pay more for a house. All these reasons certainly contribute to rising prices. But I also know that no sane 25-year-old envisions themselves paying half their salary for a room no bigger than the one they had in their parents’ home. So why is it that those types of tiny apartments still get tenants? That can only happen when there is a huge shortage of housing. But why are too few houses being built?

I wanted to understand the supply of houses better and turned to the Dutch institute CBS for data. I will write a more polished version of what I found on my other (Dutch) website, so I used Dutch in the graphics. I hope you can forgive me for that.

Here’s a summary.

I first wanted to know how many houses are built every year. There is a dataset available with the total number of houses built and demolished each year, starting in 1920. It also has a total of houses available at the start of a year. I plotted this against the total population.

Green = Total Population, Yellow = Number of houses

When you look at the yellow line, you clearly see a decline during the Second World War. Many houses were destroyed. The 1950s are known for the ‘Woningnood’ (housing shortage). People who owned a large home had to take in others, and many cheap houses were built to accommodate as many people as quickly as possible. In this graph you can clearly see that the number of households in the fifties is much higher than the number of houses available.

Green = Total Households, Yellow = Number of houses

That can only mean several households living together under one roof. As you can see the two lines are much closer together in 2020 than in 1950.

Then I got thinking. Who actually needs a house? Adults. Not kids. So I plotted the same graph, but then using the numbers for the adult (20 and older) population since 1950.

Green = Total Adult (20+) Population, Yellow = Number of houses

I saw something interesting happening. Starting in 1966 the number of adults rises more quickly than the number of houses. Why is that? Twenty years earlier WWII ended, resulting in the well-known baby boom. In the mid-sixties the first of that group became adults. You can also see that the trend line for growth in the number of adults is slightly steeper than that for growth of the housing supply. I zoomed in on this further. I calculated year-on-year growth of the population and housing supply.
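
Year-on-year growth is simply the change compared to the previous year, expressed as a percentage of that previous year. A minimal sketch in plain Python, assuming the series has one value per consecutive year; the figures and the function name `year_on_year_growth` are made up for illustration and are not CBS data:

```python
def year_on_year_growth(series):
    """Return {year: growth in %} relative to the previous year in the series.

    Assumes the series holds one value per consecutive year; the first
    year has no predecessor and therefore gets no growth figure.
    """
    years = sorted(series)
    return {
        year: (series[year] - series[previous]) / series[previous] * 100
        for previous, year in zip(years, years[1:])
    }


# Hypothetical housing stock per year, purely for illustration.
housing_stock = {1965: 200, 1966: 204, 1967: 210}
growth = year_on_year_growth(housing_stock)

for year, percentage in growth.items():
    print(f"{year}: {percentage:.2f}%")
```

Running the same function on the population series gives two lines that can be compared directly, because both are percentages rather than absolute numbers.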

Yellow = growth percentage houses, Green = growth percentage total population

Based on this graph it seems that growth of newly built houses keeps up with population growth. But adults need homes. Therefore I included the adult population in the graph as well.

Yellow = growth percentage houses, Green = growth percentage total population, Purple = growth percentage adult (20+) population

From this graph you could conclude that enough houses were being built. But you have to remember that the market was already short of homes in 1950, the start date of this graph. Then there was a baby boom, and although more houses were being built, it didn’t really make up for the existing shortage. Also notice that since 2007 the number of adults is growing again. The third generation (the grandchildren of the baby boomers) is entering adulthood.

Then there is a totally different trend adding pressure to the housing market. Look at the average number of people living together in a household.

Average size of households in number of people.

The number of singletons living in a house has risen quickly since 1980.

Yellow = number of houses, Green = single person households, Purple = multiple person households

You can clearly see the added pressure when you look at the year-on-year growth of single households.

Yellow = growth percentage houses, Green = growth percentage total population, Purple = growth percentage adult (20+) population, Pink = growth percentage single person households

Year-on-year growth of single people looking for a home far exceeds the growth of extra homes on the market.

A shortage of houses to begin with, a baby boom generation, more single households, a new generation of adults looking for a home to start a family in. That seems to be the cocktail that drives prices up right now.

I also wanted to know how big the shortage could be based on the available data. I therefore calculated the difference each year between the number of new homes available and the number of extra people each year.

Yellow = number of extra houses available, Green = population growth, Pink = difference between the two

As you can see, in most years fewer houses were built than people were added to the total population. A rough calculation of the built-up shortage since 1950, using the most recent number of people forming a household (2.14), comes to about 776,000 houses.

The actual shortage will be bigger, as there already was a shortage before 1950. Recent reports talk of more than 900,000 homes that need to be built in the coming years to make sure homes become affordable and accessible again. My crude calculation comes close to that number, even though it only takes into account the years from 1950 onwards and projects neither future population growth nor a further decline in the number of people per household.
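
The post does not spell out the exact formula behind this rough calculation, so the sketch below is one possible reading and not necessarily the method used: per year, divide the population growth by the household size to get the homes needed, subtract the houses added, and sum the result. The yearly figures are made up and the name `built_up_shortage` is hypothetical; only the household size of 2.14 comes from the text.

```python
PEOPLE_PER_HOUSEHOLD = 2.14  # most recent average household size quoted in the post

# Hypothetical figures per year: (population growth, net houses added).
yearly_figures = {
    2018: (100_000, 40_000),
    2019: (107_000, 45_000),
}


def built_up_shortage(figures, people_per_household):
    """Sum, over all years, the homes needed for new people minus homes added.

    ASSUMPTION: every `people_per_household` extra people need one extra home.
    """
    shortage = 0.0
    for year in sorted(figures):
        extra_people, extra_houses = figures[year]
        homes_needed = extra_people / people_per_household
        shortage += homes_needed - extra_houses
    return shortage


shortage = built_up_shortage(yearly_figures, PEOPLE_PER_HOUSEHOLD)
print(f"Built-up shortage: about {shortage:,.0f} homes")
```

Using a single, recent household size for every year is a simplification: households were larger in the past, so this reading overstates the number of homes needed in earlier decades.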

I also discovered something interesting. In 2020 almost exactly the same number of people died as were born. There was a steep increase in the number of deaths in 2020 causing this parity earlier than expected. The result of a pandemic.

Yellow = number of live births, Green = number of deceased.

The consequence is that for the first time in history the total population growth in 2020 can be credited to a migration surplus.

Yellow = number of extra houses available, Green = population growth, Pink = difference between the two, Purple = migration balance

But migration surplus is not as big as some politicians want us to believe.

Green = total of people died since 1950, Yellow = total of people born since 1950, Purple = total of migrant surplus since 1950
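
All three lines in this last graph are running totals since 1950, which is what makes them comparable. A minimal sketch of turning yearly counts into such running totals, using `itertools.accumulate` from the Python standard library; the yearly counts are made up for illustration and the helper name `running_total` is hypothetical:

```python
from itertools import accumulate

# Hypothetical yearly counts, NOT real CBS data.
years = [1950, 1951, 1952]
births = [10, 12, 11]
deaths = [4, 5, 5]
migration_balance = [1, -2, 3]  # can be negative in years with net emigration


def running_total(values):
    """Cumulative sum: element i is the total of values[0..i]."""
    return list(accumulate(values))


cumulative = {
    "births": running_total(births),
    "deaths": running_total(deaths),
    "migration_balance": running_total(migration_balance),
}

for name, totals in cumulative.items():
    print(name, dict(zip(years, totals)))
```

Plotting a running total next to a plain yearly count would mix two different kinds of numbers, so every series goes through the same transformation first.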

Clearly too few homes were built over a long period to keep up with population growth and declining household size in the Netherlands. When there is more demand for a product than supply, prices will go up. Therefore, investing money in the housing market makes a lot of sense, especially in a situation where having large sums of money (more than €100,000) costs money when left in a bank account. And every new inhabitant, either by being born or by moving from another country, adds pressure to the market.

Building, building, building is the only solution. For every ten houses one extra needs to be built, at least. But that’s easier said than done in the complex world of permits, land owners, borders between municipalities and provinces, and a huge shortage of technically skilled personnel to build us those homes (whom, ironically, we already need to ‘import’ from Poland and further to the East). And then I’m not even thinking about future implications of rising sea levels. The majority of inhabitants live below sea level already…

Isn’t there any hope for the under-thirties currently longing for a proper place to live and start a family in? There is. Those born just after the war, the baby boom generation, are all older than 70. As much as we want our (grand)parents to live forever, they won’t. So be prepared to compromise on where you are going to live for the next decade at least, keep the pressure on politics to reduce carbon emissions AND build more houses (in a sustainable manner of course), and be welcoming to migrants who can build houses and take care of your (grand)parents while you are working your ass off to pay your current rent/mortgage. And don’t forget to make babies along the way. They’re a lot of work, but also adorable and great teachers of living in the moment. By the time they become smelly teenagers you’ll be able to afford that big home with a separate floor for them.

Posted 1 November 2021 | dataanalyses, flow | 1 comment

The daunting task of getting your mac ready for data science

Today I did something that I have been postponing for some months now: creating an environment on my iMac to be able to do my own data analysis projects.

I learned to use Python and data science packages such as pandas and matplotlib, but all in the safe environment of DataCamp. Now that I’m contemplating running my own projects, I had to install Python3 and the packages on my own computer. I started searching online last week, got a bit overwhelmed by all the variations in what other people listed they had installed, felt too confused to continue and closed my browser without installing anything.

This afternoon I was ready to try again. I was mentally prepared this time, so I took my time to compare the variations on installing Python3. I quickly discovered that I should refine my search on using Python for data science, as otherwise I would be installing tools geared towards developers.

I first manually installed Python3 and then read on several data science websites about Anaconda, ‘your data science toolkit’ and ‘developed for solo practitioners’. That sounds like me. I installed it, created a new environment using the latest Python version (3.10.0) and ran straight into trouble when installing some packages. Of course the error messages were very human readable (not), but they mentioned lots of version numbers and greater than, equal to, or smaller than signs. I clearly chose the wrong version of Python to work with. I trashed the environment I created, made a new one using the auto-suggested Python version and automagically everything I needed was in there.

As a final step I installed Jupyter Notebook, by clicking on the install button within Anaconda’s GUI, tested whether it worked with a bit of example code one of the helpful instruction sites had and it worked! I now have a fully functioning data science environment waiting for me to do some awesome projects.
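
The example code used for that test is not shown here, so the following is a separate, minimal sketch of a similar sanity check: it reports which of the packages mentioned above can be found in the active environment, without crashing when one is missing. The package list and the function name `check_packages` are assumptions for illustration; `notebook` is assumed to be the package that provides Jupyter Notebook.

```python
from importlib import metadata, util

# Packages named in this post; adjust the list to your own needs.
PACKAGES = ["pandas", "matplotlib", "notebook"]


def check_packages(names):
    """Map each package name to its installed version, or None if it is missing."""
    report = {}
    for name in names:
        if util.find_spec(name) is None:
            report[name] = None  # not importable in this environment
            continue
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable, but no matching distribution metadata was found.
            report[name] = "installed (version unknown)"
    return report


report = check_packages(PACKAGES)
for name, version in report.items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it from a Jupyter Notebook cell or a terminal inside the environment you created. A line saying NOT INSTALLED usually means the notebook is running in a different environment than the one the packages were installed in.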

Posted 7 October 2021 | datascience, flow | 0 comments