Friday, Aug 23, 2013

What is the difference between SAP HANA and a traditional RDBMS like Oracle?


Yeshua Ben Elohim: Don't mix the old wine with the new wine, or put the new wine in the old bottles
- it will break!

Reading blogs about in-memory database computing can be confusing and surprising, because the term is often misunderstood. So right at the beginning of this blog I must make a claim: even if a traditional RDBMS processed all of its data in memory (all data cached), you would still not have an in-memory database at all!

Did I look too deep into the wine bottle?
Definitely not!

How traditional RDBMSs work

When traditional RDBMSs were introduced, memory was very, very expensive and disk space was a lot cheaper by comparison, so the disk-based RDBMS was born. Looking back from today, this was always meant to be an intermediate solution.

Understanding the Buffer Cache and its nature

Even though the data of a traditional RDBMS is disk based, the only place to operate on that data is in the CPU registers. So the buffer cache is needed to bring a subset of the data nearer to the CPU without the need for I/O on every block access.

The buffer cache itself is nothing more than a small, virtual and logical memory window onto the complete disk-based data. Data blocks read into the buffer cache replace other cached blocks (ones that have already been flushed), blocks changed in the buffer cache are written to disk at checkpoint, and committed changes are continuously logged as a byte stream by the log writer.
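
To make the "window" idea concrete, here is a minimal sketch of such a cache in Python. It is a toy under stated assumptions, not how any particular RDBMS implements it: the class name `BufferCache`, the plain dict standing in for the data files and the simple LRU policy are all hypothetical, and the log writer is left out.

```python
from collections import OrderedDict


class BufferCache:
    """Toy buffer cache: a fixed-size LRU window over blocks that live on 'disk'."""

    def __init__(self, disk, capacity):
        self.disk = disk              # dict block_id -> contents (stands in for data files)
        self.capacity = capacity      # maximum number of cached blocks
        self.blocks = OrderedDict()   # cached blocks, least recently used first
        self.dirty = set()            # cached blocks changed but not yet written back
        self.disk_reads = 0           # counts physical reads

    def get(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)   # cache hit: now most recently used
            return self.blocks[block_id]
        self.disk_reads += 1                    # cache miss: physical read needed
        if len(self.blocks) >= self.capacity:
            self._evict()
        self.blocks[block_id] = self.disk[block_id]
        return self.blocks[block_id]

    def put(self, block_id, contents):
        self.get(block_id)                      # the block must be cached before it changes
        self.blocks[block_id] = contents
        self.dirty.add(block_id)

    def _evict(self):
        victim, contents = self.blocks.popitem(last=False)
        if victim in self.dirty:                # a dirty block is flushed before its slot is reused
            self.disk[victim] = contents
            self.dirty.discard(victim)

    def checkpoint(self):
        for block_id in sorted(self.dirty):     # write all changed blocks down to disk
            self.disk[block_id] = self.blocks[block_id]
        self.dirty.clear()
```

Even in this toy, every single access pays for a lookup, an LRU update and possibly an eviction, whether or not the block is already cached.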

Because the buffer cache is a virtual window onto file- and block-oriented data, there is no direct memory access to the data, or more precisely to a specific row of a table, once it is loaded into the buffer cache. You need to maintain a lot of lists, semaphores and address translation machinery to get a specific row from the buffer cache, because the unique identifier of a row is the rowid. The rowid is not a memory-based construct but a file-based one: it contains no direct information about where a specific row is located in the buffer cache, that is, the row's starting address in memory. A lot of CPU cycles are needed to translate this virtual file cache into memory-addressable data.
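
The translation can be sketched roughly like this in Python. The rowid layout (file number, block number, slot number) and the names `RowId`, `CachedBlock` and `fetch_row` are assumptions made for illustration only; a real engine would also take latches and walk hash chains along the way.

```python
from collections import namedtuple

# Hypothetical rowid layout: data file number, block number in that file, slot in the block.
RowId = namedtuple("RowId", "file_no block_no slot_no")


class CachedBlock:
    """A block image held in the cache, with a directory mapping slots to rows."""

    def __init__(self, rows):
        self.rows = rows
        self.row_directory = list(range(len(rows)))  # slot number -> position inside the block


def fetch_row(rowid, buffer_headers):
    """Translate a file-based rowid into a row that currently sits in the cache."""
    # Step 1: use (file, block) as the key to find the buffer holding the block, if any.
    block = buffer_headers.get((rowid.file_no, rowid.block_no))
    if block is None:
        raise LookupError("block not cached: a disk read would be needed here")
    # Step 2: follow the block's row directory to locate the row inside the block.
    position = block.row_directory[rowid.slot_no]
    return block.rows[position]


# Usage: the rowid names a place in a file, never an address in memory.
buffer_headers = {(4, 17): CachedBlock(["alice", "bob"])}
row = fetch_row(RowId(4, 17, 1), buffer_headers)
```

Note that nothing in the rowid changes when the block is cached; the indirection is paid on every access.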

Back to the intro: if you resized the buffer cache to hold the complete data, you would still have all these virtual, file-based mechanisms. There is no direct memory access to a row, and you are still dealing with an RDBMS that behaves like a disk-based one. This is not in-memory databasing!
Do not mix up the old stuff with the new.

Real In-Memory databasing - SAP HANA

Now that CPU capabilities and memory capacity have grown at a stellar rate, even large amounts of data can be held completely in memory.

Hence, on startup of an SAP HANA database all data is loaded into memory, and there is no longer any need to check whether a piece of data is already in memory or has to be read from disk. Thanks to the column store (vertical, column-wise storage, meaning the values of one attribute are stored sequentially in memory) the data is CPU-aligned:
no expensive virtual calculation of LRU lists or logical block addresses, but direct (pointer) addressing of data.
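
A small Python sketch of the column-wise idea follows. The sample table and the names `order_ids`, `countries`, `amounts`, `total_amount` and `row_at` are made up for illustration, and Python's `array` type only stands in for a contiguous memory region; it says nothing about SAP HANA's actual internal layout.

```python
from array import array

# Hypothetical sample table, row-wise: (order_id, country, amount).
rows = [(1, "DE", 120), (2, "US", 80), (3, "DE", 200)]

# Column-wise: the values of one attribute sit next to each other.
order_ids = array("q", (r[0] for r in rows))   # contiguous 64-bit integers
countries = [r[1] for r in rows]
amounts = array("q", (r[2] for r in rows))     # contiguous 64-bit integers


def total_amount(amount_column):
    """Aggregate one attribute by scanning only that column."""
    return sum(amount_column)


def row_at(position):
    """Rebuild a row from its position: the position is the address in every column."""
    return (order_ids[position], countries[position], amounts[position])
```

Here a sum over `amounts` never touches the other attributes, and a value is reached by its position alone, with no block lookup in between.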

Additionally, SAP HANA data is dictionary compressed, which means the table itself is modelled as a micro star schema: the table data contains only integers (CPU-friendly and compact) or bitmaps that reference the values maintained in the column's dictionary. On top of that, native advanced features of the CPU, for example SIMD (single instruction, multiple data), are supported.
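
Dictionary compression itself is easy to sketch in Python. This is only an illustration of the principle: the function names `dictionary_encode` and `scan_equals` are hypothetical, the sorted dictionary is an assumption of the sketch, and the SIMD part is not shown.

```python
from bisect import bisect_left


def dictionary_encode(values):
    """Split a column into a sorted dictionary of distinct values and a list of integer codes."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    codes = [code_of[value] for value in values]
    return dictionary, codes


def scan_equals(dictionary, codes, wanted):
    """Return the row positions whose value equals `wanted`."""
    # One binary search in the dictionary translates the value into its integer code ...
    code = bisect_left(dictionary, wanted)
    if code == len(dictionary) or dictionary[code] != wanted:
        return []  # the value does not occur in this column at all
    # ... and the scan itself compares small integers only.
    return [position for position, c in enumerate(codes) if c == code]


# Usage with a made-up city column.
cities = ["Berlin", "Paris", "Berlin", "Rome", "Paris", "Berlin"]
dictionary, codes = dictionary_encode(cities)
paris_rows = scan_equals(dictionary, codes, "Paris")
```

The scan loop works on small integers of equal width, which is the kind of data that vector instructions handle well.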

The main database storage is now RAM instead of disk.

With this in mind, SAP HANA can be many times faster than a traditional RDBMS, even if the data of the old-style RDBMS fits completely in the buffer cache.

In a real in-memory database you won't find any rowids anymore ;)

/KR



Reader Comments (3)

Thank you for this simple explaination :)

February 19, 2014 | Unregistered Commenter Ola

Ola you are welcome !

April 5, 2014 | Registered Commenter Karl Reitschuster

Simple , to the point crispy explanation. One can easily understand the underlying functional difference b/w the two.

February 6, 2017 | Unregistered Commenter Anil
