
At 50 Years Old, Is SQL Becoming a Niche Skill?

SQL has just turned 50. Only a little bit older than me. Sigh.

This post was originally triggered - and I choose that word carefully - by a recent experience on a cloud cost-optimisation project. That experience prompted me to consider how things have changed since I started working in software.

As part of the project that provoked me, I was researching cloud cost analysis and was struck by how many people complained that the tools the big three providers give you are not adequate to the task. I had a look myself and indeed found that the questions I wanted answered were difficult to answer using the GUI tools.

No problem, I thought: I'll just download the raw CSVs and find an open source project that sticks the data in a database, allowing me to write whatever queries I want on the data. After some searching I couldn't find any such project, so I wrote one myself and stuck it on GitHub. Once this was built I could write all sorts of queries like 'Compare costs per GCP service/project/sku with previous month, showing any percentage change greater than x', modifying them at will to home in on the specific data I was interested in much faster and more effectively than could be achieved with GUIs. Some of the more generic ones are in the repo.
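To give a flavour, here's roughly what that first query looks like in Postgres-flavoured SQL. The table and column names below are illustrative, not the exact schema the project uses:

    -- Month-over-month cost change per service/project/sku,
    -- showing anything that moved by more than 10% (the 'x' above).
    -- 'gcp_billing' and its columns are invented for illustration.
    WITH monthly AS (
        SELECT service, project, sku,
               date_trunc('month', usage_date) AS month,
               SUM(cost) AS cost
        FROM gcp_billing
        GROUP BY service, project, sku, date_trunc('month', usage_date)
    )
    SELECT cur.service, cur.project, cur.sku,
           prev.cost AS last_month,
           cur.cost  AS this_month,
           round((100.0 * (cur.cost - prev.cost) / prev.cost)::numeric, 1) AS pct_change
    FROM monthly cur
    JOIN monthly prev
      ON  prev.service = cur.service
      AND prev.project = cur.project
      AND prev.sku     = cur.sku
      AND prev.month   = cur.month - INTERVAL '1 month'
    WHERE prev.cost > 0
      AND abs(cur.cost - prev.cost) / prev.cost > 0.10
    ORDER BY pct_change DESC;

Tweaking the threshold, the grouping, or the date range is a one-line change, which is exactly the kind of iteration the GUIs make painful.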

While working on this with a very bright and knowledgeable younger colleague, I was struck by the fact that he'd never needed to learn SQL, and he made sense of these queries by comparing them to other JSON-based *QLs that he'd come across (PromQL, GraphQL etc). This surprised me.

The Good Old Days

I'm going to grossly simplify matters here, but when I was coming up in the industry (around 2001) three-tier backend applications were where it was at. There were webservers (usually Apache), application servers, and databases. Knowledge and experience were thin on the ground, so you had to know a bit of everything: HTML, JavaScript (mostly form submission fiddling), a language to do business logic in the application server layer (usually Java in those days, but I'd sworn off it for life), and SQL for the database. Different people had different preferences and specialisms, but you had to have a basic knowledge of all those technologies to be an effective developer.

The entire industry was about 'dev' at this point. 'Ops' was a challenge that was, at the time, usually either done in a half-hearted way by the developers themselves (in smaller orgs), or passed over to the sysadmins to manage (in bigger orgs). The smaller orgs grew into a devops mindset ("we're not hiring a separate team to support the operation of the system, it's cheaper to get the devs to do it"), and the bigger ones embraced SRE ("we need capable engineers to support and manage the reliability of live systems, and there are economies of scale in centralising that"). Data science was also not really a thing then.

There were DBAs but they were there to ensure backups were done, and punish those who polluted their treasured databases with poorly written queries or badly defined indexes. They therefore sometimes doubled up as SQL tutors to the devs. 

Since almost everyone was a dev, and devs did a bit of everything, engineers were often asked SQL questions in job interviews, usually starting with a simple join and then asking about indexes and performance characteristics for the more advanced candidate.
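For the avoidance of doubt about the level involved, the opener I mean was nothing more exotic than this (table names invented for illustration):

    -- 'Which employees are in the Engineering department?'
    SELECT e.name
    FROM employee e
    JOIN department d ON d.id = e.department_id
    WHERE d.name = 'Engineering';

The follow-up for the more advanced candidate would then be along the lines of: which index makes that join cheap (one on employee.department_id), and how would you check the query plan was actually using it?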

The Rise of the Specialist

Along with the rise of ops in the noughties came the continuation of increasing specialisation in the IT industry. We'd had the DBA for some time, but the 'dev' was gradually supplanted by the front-end engineer, the tester, the data scientist, the DevOps engineer, and later the SRE, the security engineer... the list goes on.

I say this was a continuation because this process has been going on since computers existed, from the original physicists and technicians that maintained and programmed machines the size of rooms bifurcating into COBOL and FORTRAN programmers in the 60s and 70s, leading to system admins in the 80s, and then network admins, and so on.

Parallels to this can be seen in the medical profession. Once upon a time we had a 'ship's surgeon', who had to deal with pretty much any ailment you might expect from having cannon balls fired at you while lugging your heavy cannons to face the enemy, and (in more peaceful moments) disease-ridden sailors sharing whatever diseases they'd brought back from shore leave. Fast-forward hundreds of years and we now have surgeons who specialise in just ear surgery (otologists).

And a similar thing seems to be happening with computing jobs. Whereas once upon a time pretty much everyone who interacted with a computer at work would likely have had to type in (or even write) a few SQL queries, now there's a data scientist for that. The dev now attends scrum meetings and complains about JIRA (and doesn't even write much JQL).

Is SQL Becoming a Niche Skill?

So back to the issue at hand. Opinion seems to be divided on the subject. These interactions were typical when I asked on Twitter:

[Screenshot: Twitter replies, 2024-05-02]

There's no doubt that demand for SQL has always been strong. While SQL's elder sibling COBOL clings to life on the mainframe, SQL remains one of the most in-demand skills in the industry. According to the TIOBE index, in the 20 years since 2004 it's gone from 7th to 8th most popular language, alongside mainstays like C and upstarts like C++, Java, and Visual Basic.

Not only that, but SQL is still a widely used data querying language outside its traditional domains. Those who prefer SQL to jq (and who among us without a PhD in mathematics do not?) might be interested in DuckDB, a SQL-fronted alternative to jq, and Facebook's osquery, which gives a common SQL interface for querying different operating systems' state for compliance, security, or performance purposes.
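A quick taste of each - the file name and column choices here are mine, purely for illustration:

    -- DuckDB: point SQL at a JSON file instead of reaching for jq
    -- ('billing.json' and its fields are invented)
    SELECT service, SUM(cost) AS total_cost
    FROM read_json_auto('billing.json')
    GROUP BY service
    ORDER BY total_cost DESC;

    -- osquery: ask the operating system which processes use the most memory
    SELECT name, pid, resident_size
    FROM processes
    ORDER BY resident_size DESC
    LIMIT 10;

Same language, wildly different backends: one is a columnar analytics engine reading a file, the other is a daemon exposing OS internals as virtual tables.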

'Schemas Are Bad, M'Kay?'

In another recent project with a similar demographic of engineer, I was horrified to be voted down on using a relational database, because the consensus was 'schemas are bad' and 'MongoDB is more flexible'. This, I thought, is a world gone topsy-turvy. To gain dominion over your data, you need to wrestle it to the floor with a schema. Without that, you won't have the power to range over it with whatever questions you have to ask of it; you won't have a database, just a set of documents or a fancier filesystem.
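To make that concrete, here's a toy example of the kind of 'wrestling' I mean (names invented). Once the shape of the data is pinned down, questions you never anticipated at write-time are a single query away:

    CREATE TABLE orders (
        id        INTEGER PRIMARY KEY,
        customer  TEXT    NOT NULL,
        placed_at DATE    NOT NULL,
        amount    NUMERIC NOT NULL CHECK (amount >= 0)
    );

    -- An ad-hoc question nobody thought of when the data was written:
    -- which customers have spent over 1000 so far this year?
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE placed_at >= DATE '2024-01-01'
    GROUP BY customer
    HAVING SUM(amount) > 1000
    ORDER BY total DESC;

With a bag of schemaless documents, answering that means either writing code to trawl every document or bolting the structure on after the fact anyway.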

To this it's often objected that schemas are 'hard to change', but that's only true once you have many users and code dependencies on the database, or huge amounts of data, not when you're initially building up a system. They can be annoying to change, because the syntax varies between vendors and you often end up looking up whether it's ALTER TABLE ... ADD INDEX or CREATE INDEX. LLMs all but banish this problem, however, as they are incredibly good at saving you time on exactly this sort of thing.
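For instance (table and index names invented), the 'same' index is spelled differently depending on which database is in front of you:

    -- MySQL
    ALTER TABLE orders ADD INDEX idx_orders_customer (customer);

    -- PostgreSQL (and the more standard form most other databases accept)
    CREATE INDEX idx_orders_customer ON orders (customer);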

Poking my head out of that world, I've been shocked at how few people in IT really understand data programming and management these days. npm and JavaScript, sure; Python, probably; but data, its management, and any programming around it seem to have been pushed to the sides for the 'data scientists' to deal with, as though data were not an essential aspect of pretty much any computing work.

But perhaps I'm just an old ship's surgeon who might know how to saw a leg off and administer smelling salts, but lacks the deft skill of an otologist who practises their specialism daily. I'll just call myself a 'full stack' engineer and hope no-one notices.
