Wednesday, April 1, 2015

Moore's Law, Cloud Computing and DW/BI

In this post, I will explore the latest DW/BI trends in the context of Moore's law and cloud computing. So without further ado, let us first understand what these terms mean.

Moore's Law:


Moore’s law states that the number of transistors per square inch on an integrated circuit doubles about every two years. In practice, this has meant that the overall processing power of computers has also roughly doubled every two years. This can be seen in the picture on the left. Since 1971, the number of transistors and the available processing power have kept increasing, and this trend is expected to continue.
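As a rough worked example of the doubling rule, the sketch below projects a transistor count forward from a starting year. The function name projected_transistors is made up for illustration, and the 1971 baseline of about 2,300 transistors (roughly the Intel 4004) is my own assumption, not a value read from the picture.

```python
def projected_transistors(year, base_year=1971, base_count=2300, doubling_years=2.0):
    """Project a transistor count assuming one doubling every `doubling_years`.

    The defaults (about 2,300 transistors in 1971) are an illustrative
    baseline chosen for this sketch, not data from the post.
    """
    doublings = (year - base_year) / doubling_years
    return base_count * 2 ** doublings


# Print the projection for a few sample years.
for sample_year in (1971, 1981, 1991, 2001, 2011):
    print(sample_year, round(projected_transistors(sample_year)))
```

Twenty years at this rate is ten doublings, which is a factor of 1,024. Real chips only follow the rule approximately, so treat the numbers as an order-of-magnitude guide.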


Cloud Computing:


Cloud computing is the delivery of on-demand computing resources over the Internet on a pay-for-use basis. It relies on sharing computing resources rather than having local servers or personal devices handle applications.







From the above definitions, we can see that both Moore's law and cloud computing bear on how companies meet their processing power and storage needs. If a company relies on Moore's law, that is, buys its own increasingly powerful hardware, it owns its data storage but runs the risk of having a lot of unused storage and processing capability. On the other hand, if it uses cloud computing, it rents storage and processing power and pays only for what it uses. Of course, the cloud can cost more per unit of capacity. Hence, both approaches have their pros and cons, and a company needs to decide which one to adopt in order to handle its increasing data storage needs.
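The trade-off between owning and renting can be made concrete with some simple arithmetic. The sketch below is only illustrative: the function names and all prices are hypothetical, and it assumes the cloud charges a higher price per unit of capacity than owned hardware costs.

```python
def owned_cost(capacity_units, owned_unit_cost):
    """Owning hardware: pay for the full capacity, whether it is used or not."""
    return capacity_units * owned_unit_cost


def cloud_cost(used_units, cloud_unit_cost):
    """Renting in the cloud: pay only for the units actually used."""
    return used_units * cloud_unit_cost


def break_even_utilization(owned_unit_cost, cloud_unit_cost):
    """Fraction of owned capacity that must be in use before owning is cheaper."""
    return owned_unit_cost / cloud_unit_cost


# Hypothetical prices: 1.00 per owned unit, 2.50 per rented unit, per month.
OWNED_PRICE = 1.00
CLOUD_PRICE = 2.50
CAPACITY = 100  # units of capacity the company would have to buy up front

for used in (30, 60):
    print(
        used,
        owned_cost(CAPACITY, OWNED_PRICE),
        cloud_cost(used, CLOUD_PRICE),
    )
```

With these made-up prices, owning only pays off once more than 40% of the purchased capacity is actually in use. Below that level, renting is cheaper even though each rented unit costs more.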

Now, let us see how the classical data warehouse has evolved due to the advent of cloud computing.


The Elastic Data Warehouse



The Snowflake Elastic Data Warehouse is a new data warehouse built for the cloud. Its architecture separates data storage from compute, which lets it take advantage of the elasticity, scalability and flexibility of the cloud. It is a relational database with full support for standard SQL. Hence, Snowflake gives analysts self-service access to data and lets organizations take advantage of the tools and skills that they already have.

This elastic data warehouse provides:
  • Data warehousing as a service. With Snowflake's data warehouse, analysts can focus on getting value from data rather than on managing hardware and software.
  • Multidimensional elasticity. Elastic scaling makes it possible to load and query data at the same time, because every user and workload can have exactly the resources it needs, without contention.
  • Single service for all business data. Analysts can query structured and semi-structured data in a single system without compromise.
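To illustrate the idea of separating storage from compute, here is a toy model. It is not Snowflake's API or implementation: the SharedStorage and ComputeCluster classes are hypothetical names invented for this sketch, and everything runs in memory.

```python
class SharedStorage:
    """Toy stand-in for a storage layer that every compute cluster can read."""

    def __init__(self):
        self.tables = {}

    def append(self, table, rows):
        self.tables.setdefault(table, []).extend(rows)

    def scan(self, table):
        return list(self.tables.get(table, []))


class ComputeCluster:
    """Toy compute cluster: sized independently, holds no data of its own."""

    def __init__(self, name, storage, nodes=1):
        self.name = name
        self.storage = storage
        self.nodes = nodes

    def resize(self, nodes):
        # Scaling compute changes only the node count; no data is moved.
        self.nodes = nodes

    def load(self, table, rows):
        self.storage.append(table, rows)

    def total(self, table, column):
        return sum(row[column] for row in self.storage.scan(table))


storage = SharedStorage()
loader = ComputeCluster("loading", storage, nodes=2)
analysts = ComputeCluster("analysts", storage, nodes=4)

loader.load("sales", [{"amount": 10}, {"amount": 5}])
analysts.resize(8)  # scale the query side without touching the loader or the data
print(analysts.total("sales", "amount"))
```

The point of the sketch is that the loading cluster and the analyst cluster are sized separately while both work against the same stored data, which is how the "without contention" claim above can hold.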


Amazon Redshift



Amazon Redshift is a petabyte-scale data warehouse solution that makes it simple to analyze a company's data using existing business intelligence tools. Amazon Redshift is fast, cheap, secure, fully managed and scalable.

It delivers fast query performance by using columnar storage technology to improve I/O efficiency and parallelizing queries across multiple nodes. Amazon Redshift uses standard PostgreSQL JDBC and ODBC drivers, allowing companies to use a wide range of familiar SQL clients. With a few clicks of the AWS Management Console or a simple API call, it is easy to change the number and type of nodes in the cloud data warehouse as the company's performance or capacity needs change.
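The I/O benefit of columnar storage can be shown with a small in-memory sketch. This is not how Redshift is implemented; the sample rows and function names are made up for illustration, and "values read" simply counts how many stored values each layout has to touch to sum one column.

```python
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 80.0},
    {"order_id": 3, "region": "east", "amount": 50.0},
]


def to_columns(row_list):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in row_list:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_store(row_list, column):
    """Row layout: every field of every row is read to reach one column."""
    values_read = sum(len(row) for row in row_list)
    return sum(row[column] for row in row_list), values_read


def sum_column_store(columns, column):
    """Column layout: only the requested column is read."""
    values = columns[column]
    return sum(values), len(values)


print(sum_row_store(rows, "amount"))
print(sum_column_store(to_columns(rows), "amount"))
```

Both layouts give the same total, but the column layout touches a third as many values here because the table has three columns. On wide analytic tables, where a query typically needs only a few columns, the saving is correspondingly larger.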

Personally, I believe it is better to pay for what we use than to own a lot of data storage and processing power that is rarely used. So, I think cloud computing is the way forward for building and scaling data warehouses.


References:

http://www.ibm.com/cloud-computing/us/en/what-is-cloud-computing.html
http://www.webopedia.com/TERM/C/cloud_computing.html