In the spirit of the last couple of years, we review developments in what we have identified as the key technology drivers for the 2020s in the world of databases, data management, and AI. We are looking back at 2021, trying to identify patterns that will shape 2022.
Open source and cloud
Open source software has been on the rise for a while, and we don’t see any signs of this growth slowing down. According to Gartner’s 2021 Hype Cycle for Open-Source Software (OSS): “Through 2025, more than 70% of enterprises will increase their IT spending on OSS, compared with their current IT spending. Plus, by 2025, software as a service (SaaS) will become the preferred consumption model for OSS due to its ability to deliver better operational simplicity, security, and scalability”.
Gartner’s predictions become even bolder and more specific when it comes to open source in the database and data management world. As far back as 2019, Gartner predicted that the future of databases is the cloud, and that it is also open source.
By 2022, Gartner predicted, more than 70% of new in-house applications will be developed on an open-source database, and 50% of existing proprietary relational database instances will have been converted or be in the process of converting. At the dawn of 2022, it’s hard to verify the accuracy of these predictions.
What we can do, however, is offer a reasonable explanation, even if it is a bit of a cliche by now. Gartner also hints at this by linking OSS and the cloud via SaaS. OSS can mobilize people beyond the boundaries of a single organization to contribute to high-quality software, and it can skip the line in enterprise sales and establish a presence by winning developer hearts and minds. Those are widely acknowledged facts by now.
In the database and data management world, especially when it comes to analytics and AI, the focal point is the data. The reason why those databases and data management systems operate in the first place is to collect the data that will be used to build analytics and AI applications.
For many organizations, databases and data management systems become something of a commodity that is best operated in the cloud, where resilience and elasticity are someone else’s job. This way, organizations can focus on their core mission, which is to use the data to deliver value.
While it’s hard to verify the accuracy of Gartner’s predictions concerning the prevalence of OSS in the world of data, we can look at some indications as a proxy for this. First, in January 2021, OSS databases surpassed closed-source rivals in popularity on db-engines.com, the popular website that keeps track of database metrics.
In addition, how well databases and data management systems score among the fastest-growing OSS projects offers another hint. The ROSS (Runa OSS) Index is an index created and maintained by Runa Capital, a venture capital firm supporting founders who are building disruptive companies across B2B SaaS, deep tech, and software for regulated industries. The people at Runa look for promising companies with a fast-growing army of fans and keep track of them on GitHub as part of their investment plans.
Throughout 2021, about 35% of the OSS projects included in the ROSS Index were databases and data management systems, including the likes of Appwrite, Prisma, and SeMI Technologies, which we have covered in this column. In an OSS ecosystem that includes everything from front-end development to blockchain applications, databases and data management systems are over-represented. And all of those projects apply what is by now the standard playbook for OSS: a free-to-use baseline version, plus an enterprise version offered via the SaaS model in the cloud.
Another related piece of evidence as to the prevalence of OSS and the cloud as operating models is the amount of funding received by companies built on this combination in 2021. 2021 has seen OSS behemoth Databricks raise a $1B Series G round in February, and a $1.6B Series H in August, bringing its valuation to $38 billion. Plus, Confluent, another OSS behemoth, filed for an IPO.
2021 has also seen a few more unicorns in the OSS data world. Graph database Neo4j raised a $325 million Series F funding round, the biggest in database history, bringing its valuation to over $2 billion. Apollo GraphQL raised a $130 million Series D round at a $1.5 billion valuation, Yugabyte raised a $188 million Series C funding round at a $1.3 billion valuation, and Cockroach Labs, the company behind CockroachDB, raised a couple of rounds too, with the latest being a $278 million Series F at a valuation of $5 billion.
And that’s not even considering all the aspiring data and AI OSS unicorns out there, from OctoML and Edge Impulse to Superconductive and Startree. As Luis Ceze, OctoML CEO and founder, told ZDNet recently, there is a lot of capital flying around and being invested in OSS companies creating value. We expect this trend to continue in 2022.
What saw very little mainstream uptake in 2021 was a more fine-grained way to account for the value generated via OSS. This is a topic we first started exploring in 2019, and in 2021 we featured the CHAOSS project, the most elaborate effort we have seen to capture the value OSS communities generate. Balancing, or even defining, makers and takers in the OSS world remains controversial, and 2021 saw two more commercial OSS vendors, Elastic and Grafana, change their licenses.
Two, Three, Many Blockchains
Blockchain platforms are by and large open source too, but although data-related, theirs is a different story. Let’s get that out of the way: was 2021 a breakout year for blockchain? No, not really. Will 2022 be a breakout year for blockchain? Probably not. But that’s not the point. Blockchain’s sudden rise to stardom in 2017 was rather abrupt and premature. The concepts and the technology are still under development, while mainstream adoption is still tentative.
To speak in hype cycle terms, blockchain is going through the Trough of Disillusionment. But that does not mean it’s without significance. To reiterate: the transformational potential is there, but there’s still a long way to go, both on the technical and on the organizational and operational side of things.
In 2020, blockchain-powered DeFi, short for Decentralized Finance, rose to prominence. In 2021, DeFi hit the reality wall. In short, DeFi’s promise is to cut out middlemen from all kinds of transactions. In 2020, DeFi saw lots of growth, some of it warranted, as we noted last year.
Unsurprisingly, in what is becoming a pattern in the blockchain world after ICOs, scammers flocked in. ‘Rug pulls’, a scam scheme associated with DeFi, took in $2.8 billion in 2021 and accounted for 37% of all scam revenue in blockchain, compared to just 1% in 2020.
Plus, as Gartner notes, cryptocurrency prices have crashed in recent months. However, as the statement continues, it’s important to not conflate the value of blockchain with the most recent price of various coins. Volatility (and scams, and failures, we might add) is to be expected as crypto markets sort themselves out. Meanwhile, blockchain innovation is moving steadily forward. Let’s see what’s happening on that front.
Bitcoin, the most widespread blockchain-based cryptocurrency, saw an upgrade code-named Taproot. Taproot is seen as an enabler for developers to integrate new features that will improve privacy, scalability, and security.
Ethereum, the second-largest blockchain-based cryptocurrency, continued down its long and winding path to break away from proof-of-work and transition to proof-of-stake. The so-called Beacon chain was released in December 2020, after years of research and development. On August 5, 2021, Ethereum’s “London” upgrade launched successfully on mainnet as the last hard fork before the transition to proof-of-stake / ETH 2.0.
Perhaps more importantly, however, we have seen the notion of multiple blockchains take off in various Ethereum alternatives. In March 2021, we covered IOTA’s big comeback. In October 2021, IOTA released smart contracts — the base for DeFi — while also supporting Ethereum interoperability and multiple blockchain networks.
Polkadot, the proof-of-stake blockchain that Ethereum co-founder Gavin Wood started building to address flaws in Ethereum, launched parachains in 2021. Polkadot leverages a network of parallel blockchains called parachains that can run their own assets and ecosystems while maintaining interchain interoperability. This is also Polkadot’s way of supporting smart contracts.
Just like his co-founder Gavin Wood, Charles Hoskinson went on to do his own thing after co-founding Ethereum. His project, Cardano, is another proof-of-stake blockchain, which was at some point the “third coin” in crypto. Cardano recently added smart contract capabilities too.
Last but not least, blockchain oracle service Chainlink has also moved towards smart contract territory, after introducing off-chain reporting, a new general-purpose secure compute framework.
Join us in the coming days for our review of AI and knowledge graphs.