# Preface

There have been many developments in distributed systems, databases, and the applications built on top of them. Several forces are driving them:

1. Handling huge volumes of data.
2. Businesses need to be agile, test hypotheses cheaply, and respond quickly to markets.
3. Free & open source software has become very successful and is now often preferred to commercial or in-house solutions.
4. CPU clock speeds are barely increasing, but multi-core processors are standard and networks are getting faster. Parallelism is only going to increase.
5. Even small teams can now build systems that are distributed across machines and regions, thanks to **IaaS** (think *AWS*).
6. Many services are expected to be highly available. Extended downtime is unacceptable.

An application is *data-intensive* if data is its primary challenge:

- The quantity of data.
- The complexity of data.
- The speed at which data changes.

This is opposed to a *compute-intensive* application, where the CPU is the bottleneck.