Recently I became curious about Hadoop and Cassandra and decided to read more about them. The current trend in information technology calls for different approaches to data storage. With an unimaginable amount of data flowing through the Internet (think Facebook, Twitter and Google), there is a need for distributed data storage that is also reliable and fast.
Hadoop is, simply put, a framework for creating a cluster of worker nodes managed by a master node, which distributes tasks among the workers. It is maintained by the Apache Software Foundation and is written in Java.
Cassandra is a distributed and scalable database system, open-sourced by Facebook in 2008 and now maintained by the Apache Software Foundation. It is totally different from the relational database systems that were all the hype in the ’90s. Nothing I had learned about databases applied to Cassandra.
So I decided to buy a book – the O’Reilly book on Cassandra – in Japanese.