The term "commodity hardware" is often cited, but what is described doesn’t sound much like the romantic notion of dirt-cheap infrastructure to me! Here, "commodity" refers to:
- mid-level rack servers with dual sockets
- as much error-correcting (ECC) RAM as is affordable
- SATA drives optimised for RAID storage (think IOPS)
SATA data transfer rates:

| Version | Gbits/sec | MBytes/sec | Year |
|---|---|---|---|
| 1.0 (I) | 1.5 | 150 | 2001 |
| 2.0 (II, 3G) | 3.0 | 300 | 2004 |
| 3.0 (III, 6G) | 6.0 | 600 | 2009 |
| 3.2 (Express) | 16.0 | 1969 | 2013 |
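The MBytes/sec figures are lower than Gbits/sec divided by 8 because part of the line rate is spent on encoding overhead. The sketch below reproduces the numbers, assuming 8b/10b encoding for SATA 1.0 to 3.0 and 128b/130b encoding for SATA Express (which runs over two PCIe 3.0 lanes). The function and constant names are my own, purely for illustration.

```python
from fractions import Fraction

# Payload bits carried per transmitted bit (assumed encodings, see lead-in).
ENCODING_8B10B = Fraction(8, 10)        # SATA 1.0, 2.0, 3.0
ENCODING_128B130B = Fraction(128, 130)  # SATA Express over PCIe 3.0 lanes


def usable_mbytes_per_sec(line_rate_gbits, encoding):
    """Convert a raw line rate in Gbit/s into usable MByte/s."""
    payload_bits_per_sec = Fraction(line_rate_gbits) * 10**9 * encoding
    return float(payload_bits_per_sec / 8 / 10**6)


versions = [
    ("1.0 (I)", 1.5, ENCODING_8B10B),
    ("2.0 (II, 3G)", 3.0, ENCODING_8B10B),
    ("3.0 (III, 6G)", 6.0, ENCODING_8B10B),
    ("3.2 (Express)", 16.0, ENCODING_128B130B),
]

for name, rate, encoding in versions:
    print(f"{name}: {usable_mbytes_per_sec(rate, encoding):.0f} MBytes/sec")
```

These are interface ceilings; a single spinning disk will normally deliver far less than the link allows.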
Using RAID on the DataNode filesystems that store HDFS content is a bad idea, because HDFS already has replication and error-checking built in. RAID is strongly recommended on the NameNode, however, as additional protection against disk failure (HDFS uses disks to durably store metadata about the FS).
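To see what layering RAID on top of HDFS replication costs in capacity, here is a rough back-of-the-envelope sketch. It assumes the default HDFS replication factor of 3 and compares plain disks with a RAID-1 style mirror (2 copies). The node size (12 disks of 4 TB) and the function name are made up for illustration, and the calculation ignores filesystem overhead and space reserved for non-HDFS use.

```python
def usable_capacity_tb(raw_tb, hdfs_replication=3, raid_copies=1):
    """Unique data that fits in raw_tb of disk.

    hdfs_replication: number of block replicas HDFS keeps (default 3).
    raid_copies: physical copies RAID keeps per block (1 = no mirroring).
    """
    return raw_tb / (hdfs_replication * raid_copies)


raw_tb = 12 * 4  # hypothetical DataNode: 12 disks x 4 TB

print(usable_capacity_tb(raw_tb))                 # plain disks -> 16.0
print(usable_capacity_tb(raw_tb, raid_copies=2))  # mirrored    -> 8.0
```

Mirroring halves the usable space while protecting against a failure that HDFS replication already covers.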
Topology: All of the master and slave nodes must be able to open connections to each other. Client nodes need to be able to talk to all of the master and slave nodes.
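A quick way to sanity-check this requirement is to try opening TCP connections from one machine to a list of cluster endpoints. The sketch below uses only the Python standard library. The helper names are mine, and the hosts and ports you pass in depend entirely on your own deployment.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def unreachable(endpoints, timeout=2.0):
    """Return the (host, port) pairs that could not be reached."""
    return [(host, port) for host, port in endpoints
            if not can_connect(host, port, timeout)]
```

For example, `unreachable([("master.example.internal", 8020)])` returns an empty list if that (hypothetical) endpoint accepts connections. The check only covers connections leaving the machine it runs on, so it needs to be run from every master, slave and client node to cover the whole topology.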