SpectroDynamics LLC

Distributed Object-Oriented Database

OkeanosTM   
   
OkeanosTM is a very high-performance object store that can be accessed concurrently by multiple users and by multiple threads, running either on a single computer or on multiple computers connected over a local-area or wide-area network such as the Internet.
  
The concept of an "object store", or "object datastore", is a modern approach to data storage and management. It is also known in the industry as an "object-oriented database" (OODB) or an "object-oriented database management system" (OODBMS). It represents a radically different and much more efficient way of storing and retrieving data than traditional relational database management systems (RDBMS), which are still based on rather simplistic table-driven, column-structured data representations and which require the use of query languages such as SQL, with all their limitations.
 
HOW DID DATABASES EVOLVE?
    
Decades ago, traditional databases appeared as ways of storing large hierarchical data structures. With the evolution of hardware and systems software, they rapidly evolved along the so-called relational path. In a relational database (RDB), data to be archived must be structured into records, which are essentially rows composed of precisely defined fields, and then inserted into tables.
      
For example, an employee database may be made to understand first and last names, address, salary, etc. for every single employee stored (as a row) in its fold. Multiple tables can then be logically combined when they share one or more common columns. For example, a Human Resources database can have the names of employees per department along with salary grades, name of supervisor, etc. Two such tables can be combined (a.k.a. "joined") by using, e.g., the last-name field as the common column.
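As a small illustration of such a join, here is a sketch using Python's built-in sqlite3 module. The table names, column names and sample rows are invented for this example and are not taken from any real system.

```python
import sqlite3

# In-memory database with two hypothetical tables that share a last_name column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (last_name TEXT, first_name TEXT, salary INTEGER);
    CREATE TABLE departments (last_name TEXT, department TEXT, supervisor TEXT);
    INSERT INTO employees VALUES ('Smith', 'Ann', 72000), ('Jones', 'Bob', 65000);
    INSERT INTO departments VALUES ('Smith', 'Research', 'Lee'), ('Jones', 'Sales', 'Kim');
""")

# Join the two tables on the common column.
rows = conn.execute("""
    SELECT e.first_name, e.last_name, d.department, d.supervisor
    FROM employees AS e
    JOIN departments AS d ON e.last_name = d.last_name
    ORDER BY e.last_name
""").fetchall()

for row in rows:
    print(row)
```

Note that joining on last names only works while last names are unique, which is one reason real schemas normally introduce an explicit employee key.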
      
Structured front-end languages, e.g. SQL, have been developed for users to query a relational database either online or as part of batch programming. Such query languages access an RDB and retrieve or store records. Batch programs that access such databases around the clock are either procured by an enterprise from third-party software vendors or developed internally, usually by the departmental IT development resources of the enterprise.
   
OBJECT-ORIENTED (OO) THINKING IN PROGRAMMING LANGUAGES
  
In the last ~25 years, and for whatever historic reason, a massive move has occurred in the software industry towards object-oriented (OO) programming languages and methodology. Typical examples (among a multitude) are some well-known languages, e.g. Java, C++, Python, Perl 5, etc., and some less well-known languages, e.g. Smalltalk, OCaml, Common Lisp, etc. OO programmers apparently have found that mapping a real-life project onto an object-oriented framework of concepts, which they can then code in an OO language, is much more natural and therefore less complex to develop, document, test, deliver, and maintain.
  
OBJECT-ORIENTED THINKING IN DATABASES
  
Database thinking, however, has not kept pace with this advancement. Vast stores of data remain in old-fashioned relational databases that can only be accessed sequentially (if not painfully) as inflexible rows or columns. Yet the very programs that the IT departments of the 21st century write to access such databases are written more and more in modern object-oriented languages, which force the programmer to think not in terms of rows and columns but in terms of standalone objects, i.e. specific instances of a class of similar objects.
   
Objects are characterized: (i) by slots (with each slot carrying potentially radically disparate types of information that describes the object, i.e. alphanumeric text, boolean data, a piece of audio or video recording, an image file, a piece of software source code, etc.), (ii) by methods (i.e. detailed software procedures of how to deal with specific events, stimuli, or triggers to which the object must react or respond), (iii) by the object inheriting properties from a plethora of higher-level classes with a superset of characteristics, and (iv) by the object encapsulating much behavior that must on some occasions remain hidden from some pieces of software.
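The four characteristics above can be seen in a few lines of any OO language. The following Python sketch is generic and is not OkeanosTM code; the class and slot names are invented for illustration.

```python
class Person:
    def __init__(self, name):
        self.name = name              # (i) a slot holding text


class Employee(Person):               # (iii) inherits the slots of Person
    def __init__(self, name, salary, photo=b""):
        super().__init__(name)
        self._salary = salary         # (iv) hidden by convention (leading underscore)
        self.photo = photo            # (i) a slot holding binary data, e.g. an image

    def give_raise(self, percent):    # (ii) a method reacting to an event
        self._salary = round(self._salary * (1 + percent / 100))

    @property
    def salary(self):                 # read-only view of the hidden slot
        return self._salary


ann = Employee("Ann", 50000)
ann.give_raise(10)
print(ann.name, ann.salary)
```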
    
A COUPLE OF EXAMPLES
   
If a typical company wants to build, say, its employee database, that is obviously a quite predictable scenario. But, for the sake of argument, what happens if, e.g., a government intelligence agency or a law enforcement agency wants to upgrade a database to include, beyond the names and addresses of people, their high-resolution fingerprints? Or their iris scans? Or voice sample recordings? Or telephone intercepts from wiretapping the enemy? Or images from space? Or video taken from surveillance cameras?
      
What if all these multimedia files are also annotated and cross-indexed by analysts from several other intelligence agencies who wish to exchange and cross-pollinate information beyond organizational borders? What if this was a pharmaceutical company, where complex chemical formulas, experiments, clinical trials, reports, video of measurement instruments output, 3-dimensional graphics representation of molecules etc. must be combined with researcher names, project budgets, regulatory compliance reports, etc. all in a seamless expansion framework?
    
Most importantly, what happens if the example's intelligence agency, or pharmaceutical company cannot foresee what other new information in the future might be desirable or required inside an evolved form of the very same database? How do you go about expanding in such an unpredictable way a relational database that has been "proudly" designed instead to be rigid and inflexible?
    
HOW TO THEN BRIDGE THIS GAP?
    
RDBMS vendors have predictably taken the easy way out. It is the equivalent of "pushing a square peg into a round hole". They have fashioned, and now propose, translation modules that MAP objects, as external programmers devise them, to and from DB-internal relational table representations. Complexity begets complexity, and programs become longer and more complicated to write, debug, document and maintain. Armies of people with complementary skills need to be hired for all the stages of this elaborate scheme, and that inflates budgets directly and indirectly. Cost, delays, and lack of performance are the natural outcomes of such an approach. There has to be another, better way!
   
The OkeanosTM way!
 
HOW DOES OKEANOSTM ADDRESS SUCH CONCERNS?
 
In OkeanosTM objects are self-describing and retrieval is self-reliant. Classes can evolve smoothly. The meaning of self-reliance as implemented in OkeanosTM however is dual and UNIQUE in the industry.
 
First, whenever new classes are defined, or existing classes are redefined, and new instances of them must be saved in the database, the OkeanosTM system detects whether the database has already seen the latest definition or whether it needs to be automatically updated for evolutionary changes. If the database is found to be up to date, then just the new instances are saved. Otherwise, sufficient information is retained about the new class and about its superclasses so that they can be automatically reconstructed as needed in the future.
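One conceivable way to detect whether a store has already seen the latest definition of a class is to compare a fingerprint of the definition on every save. The Python sketch below only illustrates that idea; it is not the OkeanosTM mechanism, and ToyStore, class_fingerprint and the field names are hypothetical.

```python
import hashlib
import json


def class_fingerprint(name, superclasses, slots):
    """Stable hash of a class definition (name, superclass names, slot names)."""
    payload = json.dumps(
        {"name": name, "supers": list(superclasses), "slots": list(slots)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToyStore:
    def __init__(self):
        self.class_defs = {}   # class name -> list of (fingerprint, definition), oldest first
        self.instances = []    # (class name, fingerprint, slot values)

    def save(self, name, superclasses, slots, values):
        """Save an instance; record the class definition only if it is new or changed."""
        fp = class_fingerprint(name, superclasses, slots)
        versions = self.class_defs.setdefault(name, [])
        definition_was_new = not versions or versions[-1][0] != fp
        if definition_was_new:
            versions.append((fp, {"supers": list(superclasses), "slots": list(slots)}))
        self.instances.append((name, fp, dict(values)))
        return definition_was_new


store = ToyStore()
first = store.save("Employee", ["Person"], ["name", "salary"], {"name": "Ann", "salary": 1})
second = store.save("Employee", ["Person"], ["name", "salary"], {"name": "Bob", "salary": 2})
third = store.save("Employee", ["Person"], ["name", "salary", "grade"], {"name": "Cy"})
```

Here only the first and third saves record a class definition; the second finds the store up to date and saves just the instance.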
 
Second, and most important, when OkeanosTM retrieves an object of a class for which the application program has never seen a definition, that class and all of its superclasses are automatically retrieved and recreated from the object store before the instance object is retrieved. Instances of the new class can be readily viewed and updated, and new objects of the class can be created. A user in, e.g., Boston could create new data classes, and users elsewhere in the country could immediately begin retrieving stored instances of them without ever needing to know what the Boston programmer had done.
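The idea of recreating an unknown class, superclasses first, before materializing an instance can be sketched in Python with the built-in type() constructor. This is a toy model under stated assumptions, not OkeanosTM code: the dictionaries stand in for the object store, and all names (STORED_CLASSES, ensure_class, retrieve) are hypothetical.

```python
# Stand-ins for what an object store might hold: class definitions and instances.
STORED_CLASSES = {
    "Person": {"supers": [], "slots": ["name"]},
    "Analyst": {"supers": ["Person"], "slots": ["agency"]},
}
STORED_INSTANCES = {
    "oid-1": ("Analyst", {"name": "Ann", "agency": "Boston office"}),
}

KNOWN_CLASSES = {}  # classes this application has already seen or rebuilt


def ensure_class(name):
    """Return the class called `name`, rebuilding it and its superclasses if unknown."""
    if name in KNOWN_CLASSES:
        return KNOWN_CLASSES[name]
    definition = STORED_CLASSES[name]
    # Superclasses are recreated first, recursively.
    bases = tuple(ensure_class(s) for s in definition["supers"]) or (object,)
    cls = type(name, bases, {"own_slots": tuple(definition["slots"])})
    KNOWN_CLASSES[name] = cls
    return cls


def retrieve(oid):
    """Recreate the class if needed, then build the instance from stored slot values."""
    class_name, values = STORED_INSTANCES[oid]
    obj = ensure_class(class_name)()
    for slot, value in values.items():
        setattr(obj, slot, value)
    return obj


analyst = retrieve("oid-1")
print(type(analyst).__name__, analyst.name, analyst.agency)
```

The application never declared Person or Analyst; both exist after the first retrieval.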
        
Other databases must have full advance knowledge of stored object classes. Application programs based on such alternative databases must be extensively rewritten and recompiled to incorporate new data classes. At best, these other databases present a jumble of unintelligible bytes for retrieved objects of unknown classes. Not so, though, with OkeanosTM.
      
SOME SNAPSHOTS UNDER THE HOOD
  
OkeanosTM uses persistent classes to represent stored data. Object classes represent much more than a "dynamic schema". A schema is a fairly rigid specification of the data residing in objects. A true dynamic class describes data, inherits properties from superclasses, allows instant redefinition at any time, and participates fully in method dispatch, that is, in the selection of the appropriate class-specific functions to perform any kind of data handling. Previously stored instances are automatically updated, upon retrieval, to reflect class definition changes.
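Updating previously stored instances upon retrieval can be pictured as follows. This Python sketch is a simplified assumption about how such an update might behave (new slots receive a default value, removed slots are dropped); it does not describe the actual OkeanosTM rules, and all names are hypothetical.

```python
# Toy stand-ins for a class catalog and for stored instances.
CLASS_SLOTS = {"Employee": {"name": None, "salary": 0}}
STORED = {"oid-1": {"class": "Employee", "data": {"name": "Ann", "salary": 50000}}}


def redefine(class_name, slots_with_defaults):
    """Replace the class definition; stored instances are not touched yet."""
    CLASS_SLOTS[class_name] = dict(slots_with_defaults)


def retrieve(oid):
    """Bring the stored instance in line with the current class definition, then return it."""
    record = STORED[oid]
    current = CLASS_SLOTS[record["class"]]
    record["data"] = {
        slot: record["data"].get(slot, default) for slot, default in current.items()
    }
    return record["data"]


redefine("Employee", {"name": None, "salary": 0, "grade": "unassigned"})
employee = retrieve("oid-1")
print(employee)
```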
  
Every object in the database is assigned an OID (Object Identification) number, which is universally unique and comprises the time of creation and the MAC address of the computer that originated the transaction. When objects are retrieved from the database, their OIDs are first retrieved based on the specific ad hoc or fixed query criteria.
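An identifier built from a creation time and a MAC address can be sketched in Python as below. The bit layout (nanosecond timestamp in the high bits, 48-bit node address in the low bits) is an assumption made for this example, not the actual OkeanosTM OID format, and make_oid and split_oid are hypothetical names.

```python
import time
import uuid


def make_oid(timestamp_ns=None, node=None):
    """Combine a creation time and a 48-bit hardware address into one integer."""
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    if node is None:
        # uuid.getnode() returns the hardware address as a 48-bit integer,
        # or a random 48-bit number if no address can be obtained.
        node = uuid.getnode()
    return (timestamp_ns << 48) | (node & 0xFFFFFFFFFFFF)


def split_oid(oid):
    """Recover (timestamp_ns, node) from an identifier made by make_oid."""
    return oid >> 48, oid & 0xFFFFFFFFFFFF


oid = make_oid()
created_ns, origin_node = split_oid(oid)
print(hex(oid), created_ns, hex(origin_node))
```

A production scheme would also need a tie-breaker, such as a counter, for two objects created on the same machine within one clock tick. Python's standard uuid.uuid1() is a well-known identifier built on the same time-plus-node principle.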
    
Only when the set of collected OIDs has been narrowed down to the exact subset of objects that satisfy the specific, elaborate query conditions are the real objects extracted from the database for presentation to the user. This vastly reduces the initial amount of data to look at and vastly improves performance.
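The benefit of narrowing down OIDs before touching the objects themselves can be shown with a toy index. In this Python sketch the index maps (slot, value) pairs to sets of OIDs; the data, the index layout and the function names are all invented for illustration and say nothing about how OkeanosTM indexes are organized.

```python
# Toy object table and a toy index from (slot, value) to the OIDs that match.
OBJECTS = {
    1: {"name": "Ann", "dept": "Research", "grade": 7},
    2: {"name": "Bob", "dept": "Sales", "grade": 7},
    3: {"name": "Cy", "dept": "Research", "grade": 5},
}
INDEX = {
    ("dept", "Research"): {1, 3},
    ("dept", "Sales"): {2},
    ("grade", 7): {1, 2},
    ("grade", 5): {3},
}
fetch_count = 0  # counts how many full objects were actually loaded


def fetch(oid):
    """Stand-in for the expensive step of extracting a full object."""
    global fetch_count
    fetch_count += 1
    return OBJECTS[oid]


def query(**criteria):
    """Intersect OID sets for all criteria first; fetch only the survivors."""
    oid_sets = [INDEX.get(item, set()) for item in criteria.items()]
    matching = set.intersection(*oid_sets) if oid_sets else set(OBJECTS)
    return [fetch(oid) for oid in sorted(matching)]


result = query(dept="Research", grade=7)
print(result, "objects fetched:", fetch_count)
```

Three objects are stored, but only the single object whose OID survives the intersection is ever loaded.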
    
Any kind of data can be stored and modified in the persistent store, which is kept resident in memory and cached for immediate access, and new objects automatically register with the database upon creation. However, they are written to the actual physical medium only by subsequent COMMIT-CHANGES transactions. OkeanosTM uses a distributed 2-phase COMMIT. First, all databases involved in a transaction are asked to verify the integrity of the proposed changes. If all goes well, then all these databases are asked to commit their individual changes, and each of them then updates its own transaction log file. If anything is wrong, then a ROLLBACK is issued and the request is cancelled.
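The two phases described above follow the textbook two-phase commit pattern, which can be sketched in a few lines of Python. This is a generic in-process illustration, not OkeanosTM code: the Participant class, its fail_prepare switch and the commit_changes function are hypothetical, and a real implementation would also have to handle network failures and durable logging.

```python
class Participant:
    """One database taking part in a distributed transaction (toy model)."""

    def __init__(self, name, fail_prepare=False):
        self.name = name
        self.fail_prepare = fail_prepare  # simulate a failed integrity check
        self.data = {}
        self.pending = None
        self.log = []

    def prepare(self, changes):
        """Phase 1: verify the proposed changes and hold them as pending."""
        if self.fail_prepare:
            return False
        self.pending = dict(changes)
        return True

    def commit(self, txn_id):
        """Phase 2: apply the pending changes and record them in the log."""
        self.data.update(self.pending)
        self.log.append(("COMMIT", txn_id))
        self.pending = None

    def rollback(self, txn_id):
        """Discard any pending changes."""
        self.pending = None
        self.log.append(("ROLLBACK", txn_id))


def commit_changes(txn_id, work):
    """work is a list of (participant, changes); returns True if committed."""
    participants = [p for p, _ in work]
    if all(p.prepare(changes) for p, changes in work):
        for p in participants:
            p.commit(txn_id)
        return True
    for p in participants:
        p.rollback(txn_id)
    return False


a, b = Participant("a"), Participant("b")
ok = commit_changes(1, [(a, {"x": 1}), (b, {"y": 2})])

c = Participant("c", fail_prepare=True)
failed = commit_changes(2, [(a, {"x": 99}), (c, {"z": 3})])
print(ok, failed, a.data, a.log)
```

In the second transaction one participant rejects the changes, so nothing is applied anywhere and the value of x in the first database stays at 1.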
   
OkeanosTM works in a distributed environment, where the actual data may reside in multiple databases located on multiple host computers, and where multiple threads and users are envisioned as trying to access this generalized datastore. During a COMMIT, a new running-transaction time stamp is negotiated with each database, and multi-database updates are coordinated in such a way that they can be identified in the transaction logs.
  
OkeanosTM ensures data consistency through strict supervision of Transaction Time-Order (TTO), whereby time differences between systems and conflicting moments of reading or writing by multiple threads do not compromise the integrity of data and locks. Elaborate lock management systems are in place to ensure that remote systems that may be susceptible to loss of link, e.g. due to a communication disruption, do not compromise the operability of the rest of the database on other systems.
   
Under the hood, OkeanosTM runs a very high-performance, direct memory-mapped image of the database files, its B-Trees and heaps. The underlying operating system already offers a highly polished page management system, as well as a file caching system, and OkeanosTM takes full advantage of both, seamlessly and invisibly to the user.
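For readers unfamiliar with memory-mapped files, the following Python sketch shows the general technique using the standard mmap module: the file is mapped into memory, and reads and writes become plain memory accesses that the operating system pages to and from disk. The file name and the single 8-byte record are invented for this example; this is not the OkeanosTM file format.

```python
import mmap
import os
import struct
import tempfile

# Create a small scratch file, one page long, filled with zeros.
path = os.path.join(tempfile.mkdtemp(), "toy.db")
with open(path, "wb") as f:
    f.write(b"\x00" * mmap.PAGESIZE)

# Map the whole file and write an 8-byte integer directly into the mapping.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as view:
        struct.pack_into("<Q", view, 0, 12345)
        view.flush()  # ask the OS to write the dirty page back to the file
        value_from_map = struct.unpack_from("<Q", view, 0)[0]

# Read the same bytes back through ordinary file I/O.
with open(path, "rb") as f:
    value_from_file = struct.unpack("<Q", f.read(8))[0]

print(value_from_map, value_from_file)
```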
 
HIGH PERFORMANCE, EFFICIENCY & EASE OF USE
  
SpectroDynamics engineers have measured the performance of our direct memory-mapped operation and have found it to be about 100 times faster than classical buffered streaming I/O. Adding to this high performance, we have also implemented our own advanced version of sophisticated B-Tree algorithms, as well as advanced proprietary techniques such as OK-Maps, OK-Sets, etc., to further facilitate the efficient and versatile use of the database.
  
OkeanosTM implements its extraordinary distributed architecture by fully and transparently relying on the SpectroDynamics ButterflyTM platform for distributed computing.
  
Object-oriented datastores implemented using OkeanosTM technology have no capacity limits imposed upon them.
    

     

Please contact SpectroDynamics today for a no-obligation consultation. We will be happy to discuss your needs and to show you how OkeanosTM technology can help transform your business by improving your operations and efficiency while reducing your risks and costs.

For more information, please contact us at:

info AT spectrodynamics DOT com

(C) Copyright 2007-2010 SpectroDynamics, LLC. All rights reserved.