What is a cluster file system?
A cluster file system joins multiple independent nodes (servers) over the network and presents them as a single storage unit. Just as a RAID group combines multiple disks to increase capacity, speed, and stability, a cluster file system groups multiple storage servers to provide high capacity (up to hundreds of TB), high bandwidth (up to hundreds of Gbps), and high availability (24/7 service), which is difficult to accomplish with a RAID group alone.
What are the potential problems in applying this system in the Internet service environment?
Typical cluster file systems emphasize high capacity and stability, because they are designed primarily to serve applications. When they are applied in the Internet service environment, where traffic (delivery) is the key issue, they can therefore have trouble absorbing the natural growth in traffic as the number of users increases while still maintaining high availability (24/7 service).
What is CDNetworks' cluster file system for digital content?
The cluster file system used by CDNetworks is optimized for digital content services. It handles high traffic volumes and offers high availability, while still ensuring large capacity and stability.
The following table shows the characteristics of digital content. For user-generated content (UGC) video, a typical example of digital content, content uploaded (written) by a relatively small number of users is viewed (read simultaneously) by a large number of users. CDNetworks' cluster file system reflects this by accepting a higher resource cost when processing writes and prioritizing content that is read simultaneously, so that the Internet service environment runs more efficiently.
| Type | Typical File | Digital Content |
| --- | --- | --- |
| File Size | Typically hundreds of KB | Up to hundreds of MB |
| Capacity | Gradual increase | Explosive growth |
| Usage | Write and modify oriented | Read oriented |
| Read | Random access | Sequential access |
| Simultaneity | Low simultaneity | High simultaneity |
| Request Characteristics | Spread (generally balanced requests) | Hot-Spot (requests concentrated on some content) |
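The difference between "Spread" and "Hot-Spot" request characteristics can be made concrete by measuring how much of the read traffic goes to the most requested items. The following is a minimal sketch, not part of CDNetworks' product: the function name `top_share`, the `top_fraction` parameter, and both request logs are hypothetical and chosen only for illustration.

```python
from collections import Counter


def top_share(requests, top_fraction=0.1):
    """Return the fraction of all requests that hit the most requested items.

    requests: iterable of content identifiers, one entry per read request.
    top_fraction: share of distinct items treated as "hot" (at least one item).
    """
    counts = Counter(requests)
    if not counts:
        return 0.0
    n_hot = max(1, int(len(counts) * top_fraction))
    hot_total = sum(count for _, count in counts.most_common(n_hot))
    return hot_total / sum(counts.values())


# Hypothetical "Spread" log: 10 files, 10 requests each.
spread_log = [f"file{i}" for i in range(10) for _ in range(10)]

# Hypothetical "Hot-Spot" log: one video gets 91 of 100 requests.
hot_spot_log = ["video0"] * 91 + [f"video{i}" for i in range(1, 10)]

print(top_share(spread_log))    # 0.1
print(top_share(hot_spot_log))  # 0.91
```

In the balanced log the top 10% of items receive 10% of the requests, while in the hot-spot log they receive 91%, which is the kind of skew a read-oriented content service has to absorb.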
Improved efficiency for digital content
Storage node grouping, with traffic distribution and balancing between the groups
Expandable storage nodes, with a linear increase in traffic processing capability
Automatic recovery from node problems
Automatic and semi-automatic balancing between nodes
Direct node access, which eliminates the relay node
CDN cache extension function (Hot-Spot, event traffic, and CDN interface)
Support for standard interface protocols (WebDAV and HTTP)
Interfaces with various systems (program API, mount, and network drive connection)
User-friendly, powerful administration tools (CLI administration tool and Web administration support)
Various statistics functions (Hot-Content, traffic trend, and node-specific trace data)
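To illustrate what balancing traffic between nodes and continuing service after a node problem can look like, here is a minimal sketch under stated assumptions. It is not CDNetworks' implementation: the `Node` class, the `pick_node` and `serve_read` functions, and the "fewest active reads" policy are all hypothetical, chosen only because they are a simple, common way to spread read traffic.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A hypothetical storage node inside one group."""
    name: str
    healthy: bool = True
    active_reads: int = 0


def pick_node(group):
    """Return the healthy node with the fewest active reads, or None."""
    candidates = [node for node in group if node.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda node: node.active_reads)


def serve_read(group):
    """Assign one read request to a node and return that node's name."""
    node = pick_node(group)
    if node is None:
        raise RuntimeError("no healthy node in group")
    node.active_reads += 1
    return node.name


group = [Node("node-a"), Node("node-b"), Node("node-c")]

# Reads are spread across all three nodes.
first_three = [serve_read(group) for _ in range(3)]

# Simulate a node problem: later reads skip the failed node.
group[1].healthy = False
after_failure = [serve_read(group) for _ in range(2)]

print(first_three)
print(after_failure)
```

Because the client is sent straight to the chosen node, no relay node sits in the read path in this sketch. A real system would also need health checks and re-replication of the failed node's data, which are omitted here.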