In this blog post, I explore the latest data warehousing and business
intelligence (DW/BI) trends in the context of Moore's law and cloud computing.
So without further ado, let us first understand what these terms mean.
Moore's Law:
Moore's law is the observation that the number of transistors per square inch
on an integrated circuit doubles roughly every two years. This is often read as
saying that the overall processing power of computers doubles about every two
years. This can be seen in the picture on the left. Since 1971, the number of
transistors and the processing power have both been on the increase, and this
trend is expected to continue.
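
To make the doubling concrete, here is a minimal sketch of the arithmetic behind Moore's law. The baseline is an assumption that is not part of this post: the Intel 4004 of 1971, with about 2,300 transistors. The function name and its parameters are made up for illustration, and the result is an idealized projection, not measured data.

```python
def projected_transistors(year, base_year=1971, base_count=2300, doubling_years=2.0):
    """Project a transistor count, assuming one doubling every `doubling_years`.

    base_year and base_count are illustrative assumptions (Intel 4004, 1971).
    """
    doublings = (year - base_year) / doubling_years
    return base_count * 2 ** doublings


# Print the idealized projection for a few decades.
for year in (1971, 1981, 1991, 2001, 2011):
    print(year, f"{projected_transistors(year):,.0f}")
```

Ten years means five doublings, so the count grows by a factor of 32 per decade under this assumption.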
Cloud Computing:
Cloud computing is the delivery of on-demand computing resources over the
Internet on a pay-for-use basis. It relies on sharing computing resources
rather than having local servers or personal devices handle applications.
We can see from the above definitions that Moore's law and cloud computing
both bear on how companies meet their processing power and storage needs. If a
company relies on Moore's law, that is, keeps buying its own increasingly
powerful hardware, it owns its data storage but runs the risk of having a lot
of unused storage and processing capability. On the other hand, if it uses
cloud computing, it rents the storage and pays only for what it uses. Of
course, the cloud can work out more expensive for the same capacity. Hence,
both options have their pros and cons, and companies need to decide which one
to adopt in order to handle their increasing data storage needs.
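
The trade-off between owning and renting can be sketched with some simple arithmetic. Everything in the example below is hypothetical: the monthly usage figures, the two prices, and the function names are invented for illustration and do not come from any vendor. It assumes that owned capacity has to be sized for the busiest month, while rented capacity is billed per unit actually used, at a higher unit price.

```python
def owned_cost(monthly_usage, price_per_unit_month):
    """Owning: capacity is sized for the busiest month and paid for every month."""
    peak = max(monthly_usage)
    return peak * len(monthly_usage) * price_per_unit_month


def rented_cost(monthly_usage, price_per_unit_month):
    """Renting: pay only for the units actually used in each month."""
    return sum(monthly_usage) * price_per_unit_month


def utilization(monthly_usage):
    """Fraction of the owned (peak-sized) capacity that is actually used."""
    return sum(monthly_usage) / (max(monthly_usage) * len(monthly_usage))


# Hypothetical figures: storage units used per month, with one busy month.
usage = [20, 25, 30, 100, 35, 30]
OWNED_PRICE = 1.0   # hypothetical cost per unit per month when owning
RENTED_PRICE = 1.5  # hypothetical, deliberately higher cost when renting

print("owned:      ", owned_cost(usage, OWNED_PRICE))
print("rented:     ", rented_cost(usage, RENTED_PRICE))
print("utilization:", utilization(usage))
```

Under these assumptions, renting comes out cheaper whenever utilization is below the ratio of the owned price to the rented price (here 1.0 / 1.5, or about 0.67). A company with steady, high utilization would see the opposite result.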
Now, let us see how the classical data warehouse
has evolved due to the advent of cloud computing.
The Elastic Data Warehouse
The Snowflake Elastic Data Warehouse is a new data warehouse built for the cloud. Its architecture separates data storage from compute, which lets it take advantage of the elasticity, scalability and flexibility
of the cloud. It is a relational database with full support for standard
SQL, so Snowflake gives analysts self-service access to data and lets
organizations take advantage of the tools and skills that they already have.
This elastic data warehouse provides:
- Data warehousing as a service. With Snowflake's data warehouse, analysts can focus on getting value from data rather than on managing hardware and software.
- Multidimensional elasticity. Elastic scaling makes it possible to simultaneously load and query data because every user and workload can have exactly the resources needed, without contention.
- Single service for all business data. Analysts can query structured and semi-structured data in a single system without compromise.
Amazon Redshift
Amazon Redshift is a petabyte-scale data warehouse solution that makes it simple to analyze a company's data using existing business intelligence
tools. Amazon Redshift is fast, cheap, secure, fully managed and scalable.
It delivers fast query performance by using columnar storage technology to
improve I/O efficiency and by parallelizing queries across multiple nodes.
Amazon Redshift uses standard PostgreSQL JDBC and ODBC drivers, allowing
companies to use a wide range of familiar SQL clients. With a few clicks in
the AWS Management Console or a simple API call, it is easy to change the
number and type of nodes in the cloud data warehouse as the company's
performance or capacity needs change.
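
To see why columnar storage helps I/O efficiency, consider the toy sketch below. It is not how Amazon Redshift is implemented; the sample orders, the field names and the "values read" counter are all invented for illustration, with the counter standing in for disk I/O. The idea is that an analytic query such as "total order amount" needs only one column, and a column-oriented layout lets it skip the rest.

```python
# Hypothetical sample data: each dict is one record in a row-oriented layout.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 45.5},
    {"order_id": 4, "region": "APAC", "amount": 200.0},
]

# The same data in a column-oriented layout: one list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_amount_row_store(rows):
    """Sum `amount` when whole records must be read to reach one field."""
    total, values_read = 0.0, 0
    for row in rows:
        values_read += len(row)  # every field of the record is touched
        total += row["amount"]
    return total, values_read


def total_amount_column_store(columns):
    """Sum `amount` by reading only the `amount` column."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


print(total_amount_row_store(rows))
print(total_amount_column_store(columns))
```

Both functions return the same total, but the row-oriented version touches every field of every record, while the column-oriented version touches only the values it needs. The gap widens as tables get more columns.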
Personally, I believe it is better to pay for what we use than to own a lot of data storage and processing power that is rarely used. So, I think cloud computing is the way forward for building and scaling data warehouses.
References:
http://www.ibm.com/cloud-computing/us/en/what-is-cloud-computing.html
http://www.webopedia.com/TERM/C/cloud_computing.html