Sunday, September 29, 2013

Database-- 12C New In-Memory Option -- Flip a switch and all your applications run much faster!

This article was written based on Larry Ellison's speech in Oracle Open World 2013..
Larry Ellison 's keynote was about Oracle Database 12c In-Memory Database, M6 Big Memory Machine, Oracle 's backup system and Oracle 's advantages in Cloud.
You can reach the key note from the following url ;  http://ermanarslan.blogspot.com/2013/09/larry-ellison-oracle-openworld-keynote.html


The reason to prefer the in memory option is to make the system go faster, ofcourse. The design goal should be at least 100x faster.. In database world it primarily means "queries run 100x faster..", as Ellison states.
Accelerating the queries is the purpose that directly and mainly affects the Data warehouse systems, as we use analytical queries to obtain the meaningful data.. Doing things in memory provides the query results at the speed of thought.. That is, the answers are coming back faster than you can come up with the questions...
So, with In Memory Option queries are faster, but also the transactions should be not slowed down, too. As there are bunch of Oltp and Erp applications in Oracle, the design goal should be to speed up the transactions, as well. By using In Memory Option, the transactions running on these Enterprise Level mixed workload type (Oltp/Anaytical/Erp) applications should be at least x2 faster..



 So Oracle 's new solution for In memory Operations is to store the data images in memory in different shapes which fit the type of the operation....The implementation of this idea will accelarate both the queries and transactions. Ofcouse, the implementation of this idea seems so hard because Oracle needed to protect the transaction recovery, logging and consistency while storing the data in memory in two different shapes..

Traditionally, In Rdbms, the data is stored in rows. Storing the data in rows makes processing very fast but for few rows with a lot of columns.. Alternatively, data can be stored in columns, too. This method makes query processing become very fast. But again, this is faster in few columns with a lot of rows..

Oracle 's new In Memory technology is based on a better idea..
In Oracle Database 12c In Memory option, the data is stored in both format, and put in the memory.. There are two stores in this method named Column Store, and Row Store. (One store for the transactions and one for the queries/analytical queries).
The data stored in the Row Store is in Row Format , and the data stored in the Column Store is in column format, which is 90 degree rotated version of the row format.
When you insert a data, Oracle will update row store and column store..
When you update one, you will update the other. The data is consistent. Transactional integrity is protected..
Column store makes x100 query performance. Row store makes transactions go faster..

Following is prepared by Oracle to show the speed of the Oracle 's 12c In Memory Option.
In this example, Oracle runs a query on a table which has 3,167,531,763 rows.. The data in the table is coming from the real life. The data is based on the Wikipedia searches.. The user searches that is done in the last week of August..



The question is "how can transactions go faster?".. This question comes to mind, because in this technology, there are two stores in memory, and both of them should be updated.. So there is an additional work...
Oracle explains this as follows,
Normally we create 2 index for oltp , 10 index (maybe more) for analytic queries. Even, we create the indexes for queries, we speed up them, but then our transactions are slowing down..
Inserting a row into a table means , insert row + update index + update index... + update index..  Maintaining those indexes is a very expensive operation and slows down OLTP..
So, what Oracle suggest here is, actually dropping the analytical indexes, as the column store technology is developed to replace the analytical indexes..
When the analytical indexes are dropped, the need to update these indexes is taken away..
So, no need to update analytical indexes, and this is where Oracle new In-Memory Option takes advantage in OLTP performance in comparison to the traditional In-Memory systems..
In addition, there is no transactional logging for the column store.. Column are highly compressed.
So OLTP runs dramatically faster with these enhancements. When Oracle updates the data, it goes to row store and makes the update there, without updating any indexes..
If Oracle executes a query, it goes to the column store, which by its nature, supplies x100 query performance..
Also, as Larry Ellison states, there is very little overhead in maintaining the column store. The data is put in the column store on startup or first access, but this doesnt mean that we need to store the entire database in memory..  Consider Exadata, there is a memory hierachy on Exadata. Oracle understands the data and put the data in memory or flash or disks, based on the acess frequency.
In addition to that, columns never had any indexes will benefit from that.
The column processing is optimized, as each cpu core scans columns at billions of rows per second. It s parallelized..
Cpu 's are able to do that because they use vector instructions, known as simd (cpu instruction)
So, with this single instruction, Oracle can scan multiple values, speeding up the scan..

Single instruction, multiple data (SIMD), is a class of parallel computers in Flynn's taxonomy. It describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously.
With a SIMD processor there are two improvements in loading the data and operating on it. A number of values can be loaded all at once. Instead of a series of instructions saying "get this, now get the next", a SIMD processor will have a single instruction that effectively says "get n " (where n is a number that varies from design to design). For a variety of reasons, this can take much less time than "getting" each data individually, as with traditional CPU design.


Flynn's taxonomy
Single instructionMultiple instruction
Single dataSISDMISD
Multiple dataSIMDMIMD
SIMD instructions are widely used to process 3D graphics, although modern graphics cards with embedded SIMD have largely taken over this task from the CPU.

Joins are also faster. Joins are converted into fast columns scans..
In addition to that, the compilicated report objects can be created in memory, and fast scans can be used to populate the report objects in memory..
These options make the Database software Cloud-Ready..

To turn on in memory option in Oracle Database 12c,

Set the parameter  -> inmemory_size=XXX GB
alter table | partition ... inmemory;
Drop analytic indexes ...

No changes required in sql, in application..No data migration is required.
Every application ,everything runs without a single change. RAC works..

Larry Ellison also announced the The Big Memory Machine, as the fastest machine in the World for In Memory Databases..
Big Memory Machine has 32 TB DRAM, brand new processor Sparc M6-32 processor (x2 cores of M5). It has 96 threads per processor and interconnect which is made of silicon(3Tbytes per second) and much much faster than infiniband..
It s avaliable today..  It s also available in SuperCluster form connected with infiniband to Exadata IO Subsystem..
You can gather detailed information about M6-32 from the following Oracle links.
M6-32 -> http://www.oracle.com/us/products/servers-storage/servers/sparc/oracle-sparc/m6-32/overview/index.html
M6-32 supercluster -> http://www.oracle.com/us/products/servers-storage/servers/sparc/supercluster/supercluster-m6-32/overview/index.html

Here is a comparison with IBM p795 gathered from the Ellison's speech.





Information based on this article is based on the Larry Ellison's keynote in Open World 2013..
We could not be there, but thanks to Oracle, we are following remotely :)




No comments :

Post a Comment