Neo4j is Open Core – Now What?

The word responsibility can be thought of in a simple way. If you have an ability what is your response to that ability? If there’s a need and you’re able to meet it, what is your response? Will you act and use your ability to improve the situation for the betterment of everyone? If keeping the best open source graph database on the market today fully open source and truly enabling “Graphs for Everyone” is a mission for you and/or your organization then keep reading through to the end. If you care but don’t need to know all the details right now then skip down to the TL;DR.

We’ve been in the graph community for a while now (circa 2011) and we know a bit about the origins of the open source neo4j graph database. We’ve seen first-hand the powerful transformation that graph brings to organizations and its potential to enhance the way complex data is stored, processed and queried now and in the future. We know that the success of an open source database goes well beyond the commits made to the repository. Engineering is important, yes, and there is plenty of that needed when building a database, but it’s not the only contributing factor to enabling a great open source database, especially in the emerging graph database space. There are many key roles including entrepreneurs to build supporting businesses and ecosystem products, early customer adopters, implementations that stress and push the technology to the edge, testing and reporting defects from usage, promotion, discussion and exposure at events, meetups and conferences, community support, training and skill building across the user base, a sharing of vision and community involvement and likely many more that we don’t even see.

The open source neo4j graph database (hereafter the lowercase “neo4j”) has thrived since 2011 because it has been fully open source. In the very early days it’s difficult to get adoption of a new database technology, especially one that is closed source. Many organizations have open source initiatives, especially around databases, to avoid the lock-in and closed ecosystems that occur with relational databases from Oracle, SAP and Microsoft. The open stack initiative was born out of the need to scale database technologies to achieve business solutions without facing unknown and often staggering licensing costs. The Open Source Initiative created momentum around the adoption of open source databases and neo4j has benefitted from this by being fully open source.

Unfortunately, neo4j has always had a split licensing personality. In the early days, Emil (founder and CEO of Neo Technology, Inc.) faced challenges in how to commercialize neo4j and figure out what features were enterprise vs. community. In the end, the line was drawn to designate clustering, backups, restore, monitoring, metrics, security, constraints, the scalable database format for big graphs, and enhanced cypher as enterprise features, while the core kernel, cypher and server capabilities would be designated as community. With the separate designations for enterprise and community came separate licenses: AGPLv3 and GPLv3 respectively. Emil’s choice of these licenses stems from the beginning, when neo4j was mostly used as an embedded database and the less permissive enterprise AGPLv3 license would have required some projects embedding neo4j enterprise to be open source or get a commercial license (in the early days we worked on some of these embedded cases).

When server mode emerged in 1.8 (2012) and folks stopped embedding neo4j, the AGPLv3 open source license allowed usage of the standalone server without a commercial license. The introduction of server mode was essential, and if the license had changed at that time to restrict enterprise features (such as clustering in server mode), it would have greatly impacted the adoption of a still very immature database. Instead, what followed was the start of Neo Technology’s split licensing personality. On the one hand, if you visited the GitHub neo4j open source project you would see an AGPLv3 license that allowed enterprise usage, and at the major GraphConnect conference Neo Technology would promote these enterprise features and the open source contribution of them. On the other hand, sales and marketing would create confusion among commercial users by suggesting that neo4j enterprise needed a license to avoid open sourcing their project code (which was only true for embedded use cases, of which there are only a handful still to date).

It was only recently that Neo4j, Inc. came into existence. Before that it was Neo Technology, Inc. and the commercial license was appropriately called the Neo Technology Commercial License (NTCL). This meant community and enterprise could be used as-is without warranty under GPLv3 and AGPLv3 respectively, or enterprise could be used with support exclusively under the NTCL. Any neo4j user had the freedom to check out the neo4j community and enterprise source from one GitHub repository, compile and run all the tests, see the open development of the project, report issues and build distributions of both community and enterprise under GPLv3 and AGPLv3 respectively. This was good for the community as it showed a vibrant ecosystem where everyone understood that core freedoms were being promoted, and Neo Technology was able to commercialize enterprise under the NTCL for enterprise customers that needed support.

At some point in neo4j’s growth and adoption, around year 2015, neo4j became a clearly viable database – this wasn’t the case from 2011-2015 when it suffered heavily from failed deployments as the result of scalability and reliability issues for many now common use cases. With neo4j becoming increasingly popular and more commercial licenses in hand, Neo Technology accelerated commercialization efforts around neo4j enterprise which included an increase in marketing rhetoric that focused on commercial customers needing a license (NTCL) to use enterprise. At the same time, Neo Technology continued to promote the open source nature and contributions of the company. This dual position led to confusion in the community and frustrations among commercial neo4j users but ultimately allowed Neo Technology to grow its commercial customer base while appealing to open source communities giving it all the benefits of riding the open source wave.

It was between 2016 and 2018 that Neo Technology made the big changes. Unknown to the community, they acquired trademarks on Neo4j and Cypher (both were in common usage until that point) and Neo Technology, Inc. changed its name to Neo4j, Inc. The build tooling that had allowed neo4j enterprise users to build enterprise distributions was removed from the open source repository. The enterprise license changed from AGPLv3 to AGPLv3 + Commons Clause and was applied to the then mainline 3.4 release and backported to patch releases for 3.2 and 3.3 in an attempt to block subsequent patch releases. And when Neo4j, Inc. realized that the additional Commons Clause didn’t prevent usage of enterprise features, it made the ultimate decision to remove all enterprise modules, including tools and tests, from open source and declare publicly that Neo4j, Inc. is now an Open Core organization, leaving neo4j community alone as open source.

So why does it matter so much that the enterprise features were taken closed? Well, for any like-minded architects out there with open source initiatives, we know that a database we choose needs to be open source so we can see it, build it, and know there is a community engaged around it, with good commercial support so that our organization has the option to reduce risk and improve business continuity as desired. We also know that any viable database needs to be performant (enhanced cypher runtime), scale vertically and horizontally (clustering, sharding, query parallelization), be easy to operate and administer (backups, restore, monitoring, metrics, logging), and provide security to lock it down and constraints to ensure integrity. All of these are core features for a modern database to be in a modern data architecture. It was the enterprise features and the fact they were open that enabled neo4j to enter the modern architecture. The reason we say modern architecture is because it’s not just an enterprise architecture. These days, any initiative worth undertaking requires a scalable architecture. Any product or platform worth building demands an underlying scalable foundation and very often an open source architecture in addition. The remaining community features do not meet these needs, and while enterprise features can still be accessed through Neo4j, Inc., they are proprietary and do not meet open initiative requirements.

Organizations that adopt a proprietary enterprise should be aware of the specialized data format in enterprise today. The database format for building large graphs (bigger than 34 billion nodes/relationships) was designated as an enterprise feature and is required when using enterprise. When users move from community to enterprise, an irreversible data format migration is performed. This means that all enterprise users today, after they upgrade to 3.5, will be using a proprietary data format that will forever have them locked in. Based on Neo4j, Inc.’s trajectory, this may be intentional so they can continue to move closer to their vision of becoming the Oracle of graph databases as they think of it.

We see a bigger vision for graph unfolding where there is going to be a much bigger need for scalability, performance and compute in a distributed manner as graphs begin to play a central role in machine learning, artificial intelligence and the new age definitions of both as they continue to be redefined to move beyond the definitions from the 1960s. The way we see it, clustering, sharding, parallel cypher and big graph performance will be essential in this coming age and cannot be proprietary, closed source enterprise features. Every organization has a vested interest in this future mission and it’s essential that the handling of this critical component is done in the open for everyone.

The only way forward for neo4j enterprise users and those community users needing enterprise features is to make a bold move to The Graph Foundation. We started Graph Foundation, Inc. (referred to as The Graph Foundation) in June 2018 when we noticed Neo4j’s position beginning to change and the implications of this for the community and ecosystem. The Graph Foundation is a nonprofit with 501(c)(3) status and its goal is to take over neo4j enterprise development and continue forward under a model that closely resembles that of The Linux Foundation.

We see that one of the most important aspects of neo4j is that it’s an open source native graph database. We call it native to separate it from other graph layers built on RDBMS and NoSQL databases that use indexes or joins to form relationships, while native graphs like neo4j store nodes and relationships as fast mapped pointers at the store level, giving peak performance when traversing complex, densely connected data. In 2003, we were obsessed with learning about open source and the world that it had created. As big users of Linux we learned of the efforts of the Free Software Foundation that gave us not only the freedoms we came to know and love but the tools to build and thrive without limitations. As a throwback to those early days and our beginnings on GNU (remember GNU’s Not Unix), we decided on the name ONgDB (oh-n-gee-db) which stands for Open Native Graph DB but also ONgDB’s Neo4j Graph DB. We just couldn’t resist the double meaning and the shout-out to those that have come before us in pioneering the Free Open Source Software movement.

In order to move ONgDB forward in a significant way, we’re going to need a lot of help. If keeping the best graph database on the market today open source and truly enabling “Graphs for Everyone” is a mission for you and/or your organization, there are a few things you can do to be an activist in this mission and help the cause.

First, help spread the word and make every individual and organization aware of what’s happening. Don’t let the conversation end at a press release from Neo4j, Inc. This is bigger than that, and Neo4j does not represent the collective voice of the graph community now or of those that will need an open community and its solutions in the future. The fate of many is wrongly in the hands of a few whose interests are not aligned with the future in store for us all.

Second, if you have the skill, ability or just sheer passion for learning and working with a foundation of inspiring individuals ready to teach and help you grow, this is one of the greatest technology initiatives of your time to be part of in ushering in a change for the ages. We are going to take graph to the next level and you will be part of this historical movement.

Third, if you have open initiatives for data architecture at your organization then you need to cut your licenses from Neo4j, Inc. Don’t invest monetarily in a proprietary future. If you need commercial support beyond the community, and many organizations will, The Graph Foundation has commercial sponsors working and contributing to the open source initiative, and you should contact us to be connected with commercial sponsors that will work with you to give you the support, liability and warranty you need to succeed. Many of the engineers working on ONgDB alongside foundation engineers are coming from commercial sponsors that are donating time and resources to move the mission forward. Sponsors such as GraphGrid, the top contributor to date, have the most talented pool of engineers and the commercial momentum to give enterprise customers the best choice for commercial support.

Lastly, The Graph Foundation does have operating costs and is being funded heavily by sponsors like GraphGrid. In order for the foundation to move forward we need to have more capacity within the foundation to continue development and road mapping. Our desire is to have many commercial entities influencing the future decisions of The Graph Foundation. The Graph Foundation is a 501(c)(3) with nonprofit status from the IRS, so donations are tax deductible for any enterprise that makes one. Corporate sponsorship will help ensure The Graph Foundation thrives and has the backing to correctly move the broader graph vision forward in a way that serves those most invested in the future of graph.

We see the future ahead; it’s connected and built on graph. We want to connect with you right now to ensure that this future remains free and open to all. To this end, we will not stop until we have brought to fruition this vision which stands before us all. Will you join this mission?

 

TL;DR

       Neo4j, Inc. made key performance, scalability and security features proprietary as of 3.5

       We must protect the freedoms of neo4j enterprise users by keeping enterprise open source to ensure a vibrant future for everyone, not the few

       Under current enterprise designations, the biggest gains in big graphs to power AI/ML/DeepGL, distributed/parallel Cypher and large clusters with sharding will be Neo4j Enterprise Edition features and require lock-in to a proprietary database

       The Graph Foundation, a nonprofit 501(c)(3), is taking over enterprise open source development and with the help of sponsors and the community will push forward a vision of graph that will be for everyone

       The Graph Foundation needs help! Consider sponsoring and getting commercial support from approved commercial sponsors to ensure enterprise development continues forward
