Apoorva is very excited to announce a partnership with MarkLogic! So, for our next blog, we’d like to share a little bit about what makes MarkLogic so great, and why we’re thrilled to be partnered with them.
MarkLogic is an enterprise NoSQL (‘Not Only’ SQL, nowadays) database. In the past, we’ve mentioned some of the advantages of NoSQL databases over their relational, SQL counterparts. In particular, these benefits include a level of scalability and flexibility not readily available in relational database models. These qualities are particularly relevant to the big data challenges many businesses face today. NoSQL is a strong fit for companies whose database management systems must scale and deliver performance across enormous data sets and numbers of users. Think Facebook, Google, Amazon. Of course, nowadays, NoSQL is applied much more widely than by just the tech behemoths. In fact, many real-time web and cloud applications employ NoSQL databases.
That’s not to say that relational database management systems (RDBMS) don’t still have an important role to play in the IT world; they do. When you have a known (or mostly known) data model, with centralized applications and structured data, where there is an emphasis on complex querying, analysis, and reporting, an RDBMS can still be the best option. But in a data-intensive environment, where decentralized applications receive both structured and unstructured data and demand continuous availability and horizontal scalability, a NoSQL database is the likely solution.
But saying ‘NoSQL solution’ is about as vague as saying ‘back in the day…’ In truth, NoSQL databases can (and do!) vary a lot from one another, and NoSQL is really not a wholly descriptive term. Unlike SQL relational databases, which are all based on tables, primary and foreign keys, and their interrelationships, there is no standard architecture for NoSQL databases. They are united simply by the fact that they are not relational databases.
The four main NoSQL database types are document store, key-value, column family store, and graph database. In a document store database, each record and its associated data are considered a ‘document.’ All data associated with a particular document object is stored together, allowing unstructured data to be easily stored. We’ll look at document store databases in more detail below.
Next, there are key-value databases. These have the simplest data model of the four main NoSQL database types: every value is stored and retrieved by a unique key. They work best when most access to the data is done using a key, and they are generally used for very specific purposes, often for caching website session data, or storing and managing user profiles and preference settings.
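As a minimal sketch of the key-value model, assume a plain in-memory Python dictionary stands in for the store. The key format and the profile fields are invented for illustration and do not reflect any particular product.

```python
# Hypothetical in-memory stand-in for a key-value store.
store = {}

def put(key, value):
    """Store a value under a key, overwriting any previous value."""
    store[key] = value

def get(key, default=None):
    """Look a value up by its key; lookups by value contents are not supported."""
    return store.get(key, default)

# Save and fetch a user's preference settings by key.
put("user:42:prefs", {"theme": "dark", "language": "en"})
prefs = get("user:42:prefs")

# A key that was never written simply returns the default.
missing = get("user:99:prefs")
```

Everything goes through the key, which is what keeps the model simple and fast, and also why it suits narrow use cases such as caches and profile lookups.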
Column family databases rely on rows and columns in a fashion similar to a relational database. However, a column family database is capable of scaling to as many rows as necessary, and each row can have any number of columns. Again, fairly specific in its use, this type of NoSQL database is frequently employed for event monitoring and content management systems, as well as blogging platforms.
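To illustrate the idea that each row can carry its own set of columns, here is a rough sketch using nested Python dictionaries. The row keys, column names, and the `write` helper are hypothetical; real column family databases add persistence, partitioning, and much more.

```python
from collections import defaultdict

# Hypothetical column-family table: row key -> {column name -> value}.
events = defaultdict(dict)

def write(row_key, column, value):
    """Set one column on one row; rows need not share the same columns."""
    events[row_key][column] = value

# One row accumulates timestamped readings as columns...
write("sensor-1", "2024-01-01T00:00", 20.5)
write("sensor-1", "2024-01-01T00:05", 20.7)
# ...while another row has a completely different column.
write("sensor-2", "status", "offline")

columns_per_row = {row: sorted(cols) for row, cols in events.items()}
```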
Graph databases focus on the relationships between the data, relying on nodes, edges, and properties. They are fundamentally relationship-oriented and therefore excellent for managing highly connected data and querying those connections quickly and efficiently. Even from these brief overviews of the main types of NoSQL databases, it should be clear that NoSQL is by no means a uniform term, and that there are very real, very appreciable reasons why you might select one architecture over another.
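The graph model of nodes, edges, and properties can be sketched in a few lines of plain Python. The node names, the `follows` label, and both helper functions are made up for this example and are not taken from any graph database API.

```python
# Hypothetical graph: nodes with properties, and labelled directed edges.
nodes = {
    "alice": {"type": "person"},
    "bob": {"type": "person"},
    "carol": {"type": "person"},
}
edges = [
    ("alice", "follows", "bob"),
    ("bob", "follows", "carol"),
]

def neighbours(node, label):
    """Return the nodes reachable from `node` over one edge with `label`."""
    return [dst for src, lbl, dst in edges if src == node and lbl == label]

def two_hops(node, label):
    """Follow the same relationship twice, e.g. 'who do the people I follow, follow?'."""
    return sorted({far for near in neighbours(node, label)
                   for far in neighbours(near, label)})

result = two_hops("alice", "follows")
```

The query walks relationships directly instead of joining tables, which is the kind of access pattern a graph database is designed to make fast.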
For now, we will focus predominantly on document store databases, as MarkLogic is, at its core, a document store database. Among the four main NoSQL database types listed, document store databases are the most versatile and the best equipped for general purpose use, as they are both powerful and flexible. Additionally, document store databases tend to be easy to use, and they structure data in an intuitive way. When processing information, humans naturally tend to think in terms of structural hierarchies and groupings, which happens to be the structure of a document. So a document store model tends to coincide well with the human perspective. Also important to note, a document can contain all of the information typically contained in a row of a relational table. Moreover, documents, in the case of MarkLogic, can include XML, JSON, text, and large binaries (such as PDFs and Microsoft Office documents).
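As a small illustration of the document model, the sketch below holds an order, its customer, and its line items in one hierarchical JSON document, rather than spreading them across several relational tables. The field names and values are invented for the example, and it uses only Python's standard `json` module rather than any MarkLogic API.

```python
import json

# Hypothetical customer order held as one self-contained document.
order = {
    "order_id": 1001,
    "customer": {"name": "Ada Example", "email": "ada@example.com"},
    "items": [
        {"sku": "A-1", "quantity": 2, "unit_price": 9.50},
        {"sku": "B-7", "quantity": 1, "unit_price": 20.00},
    ],
}

# The whole hierarchy serializes and restores as a single unit.
serialized = json.dumps(order)
restored = json.loads(serialized)

# Everything needed to total the order lives inside the document itself.
total = sum(item["quantity"] * item["unit_price"] for item in restored["items"])
```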
Document stores promote easier application development, as relational modeling for semi-structured, unstructured, or aggregated data can be time-consuming, costly, and unresponsive to evolving business needs. Because document store databases don’t rely on a schema the way an RDBMS does, they are more flexible and better able to adapt to changing business demands. In the case of MarkLogic, the database is ‘schema agnostic’ but ‘structure-aware’, meaning that while it does not require an established schema, one can be enforced (and quickly changed) when necessary. Also great about document stores is that all data within a document is self-contained. There is no need for foreign keys, normalization, or transformation between tiers. Such factors allow this type of NoSQL database to work across a wide variety of industries including media, financial services, and healthcare.
So now you’re feeling comfortable with why you might want to use a document store NoSQL database. But why would you want to use MarkLogic, specifically? There are a number of attributes that we believe make MarkLogic superior to its competitors. To begin with, while a document store database at heart, MarkLogic offers the ability to store and manage RDF (Resource Description Framework) triples, thus incorporating some graph database capabilities. This data model, known as semantics, links different entities based on the relationships between them, with each link expressed as a triple (consisting of subject, predicate, and object). Such a data model is very flexible in handling heterogeneous data and has been leveraged to build smarter applications. MarkLogic uses RDF triples alongside its document store to link data, provide context, and describe metadata, as well as to integrate disparate data without corrupting the original source. What’s more, the use of semantics enhances MarkLogic’s search capabilities, which brings us to our next point.
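To make the subject, predicate, object idea concrete, here is a minimal sketch that stores triples as Python tuples and answers a question by pattern matching. The identifiers (`doc:report-1`, `writtenBy`, and so on) and the `match` helper are hypothetical. This is plain Python, not RDF syntax, SPARQL, or MarkLogic's own interface.

```python
# Hypothetical triples: (subject, predicate, object).
triples = {
    ("doc:report-1", "writtenBy", "person:ada"),
    ("person:ada", "worksFor", "org:acme"),
    ("doc:report-2", "writtenBy", "person:bob"),
}

def match(subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )

# Which documents were written by someone who works for org:acme?
employees = {s for s, _, _ in match(predicate="worksFor", obj="org:acme")}
docs = sorted(s for s, _, o in match(predicate="writtenBy") if o in employees)
```

The triples sit alongside the documents and describe how they relate, so the question is answered without modifying either report.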
Compared to many NoSQL database systems, MarkLogic has extremely advanced search capabilities. And, unlike many competitors, MarkLogic’s search is built in rather than an add-on. MarkLogic relies on indices to create powerful query capabilities. Because data is indexed on load, it becomes searchable immediately, and multiple indices can be used at once to enhance the search. The use of semantics, as mentioned above, further augments MarkLogic’s advanced search.
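The general idea of indexing on load can be sketched with a simple inverted index, where each document is indexed in the same step that stores it. This is a generic illustration under our own assumptions: the `load` and `search` functions are hypothetical, and MarkLogic's actual indexes are far more sophisticated.

```python
from collections import defaultdict

# Hypothetical inverted index: term -> set of document IDs containing it.
index = defaultdict(set)
documents = {}

def load(doc_id, text):
    """Store a document and index its terms in the same step."""
    documents[doc_id] = text
    for term in text.lower().split():
        index[term].add(doc_id)

def search(*terms):
    """Return IDs of documents that contain every given term."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*sets)) if sets else []

load("d1", "NoSQL document store")
load("d2", "relational document model")

# Searchable immediately after load, with no separate indexing pass.
hits = search("document", "store")
```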
Finally, for us, what really makes MarkLogic superior is the fact that it affords enterprise capabilities. Where most NoSQL databases forfeit consistency in order to gain high availability, MarkLogic does not. Instead, MarkLogic boasts all of the features that enterprises demand to run and maintain mission-critical applications, namely ACID transactions (Atomicity, Consistency, Isolation, Durability), disaster recovery, government-grade security, performance monitoring tools, and elasticity and scalability, all without sacrificing high availability. Document store databases may be functionally suitable for a wide range of industries, but without the added enterprise benefits offered by MarkLogic, they may not be an option in areas such as financial services, healthcare, or government. ACID compliance ensures that database transactions are processed reliably, and it is a prerequisite for guarding against data corruption, stale reads, and inconsistent data. Moreover, we’ve written previously about the importance of disaster recovery. MarkLogic’s inclusion of the enterprise features necessary to keep mission-critical activities running serves to differentiate it from the competition. Couple that with our personal experience of their knowledgeable, productive, and friendly team members, and we can’t help but be excited about working with them.
If you’re interested in learning more about MarkLogic, or about our partnership with them, contact us at 844-500-DATA or through our website, apoorva.com. Here at Apoorva, we are looking forward to a long and fruitful partnership!