| SCI and Competitive Interconnects for Cluster Computing | |
| The SCI Standard and Applications of SCI | p. 3 |
| Introduction | p. 3 |
| SCI Overview | p. 4 |
| Background | p. 4 |
| Goals | p. 4 |
| Concepts | p. 6 |
| Discussion | p. 11 |
| The SCI Standard and Some Extensions | p. 11 |
| Logical Layer | p. 12 |
| Cache Coherence Layer | p. 19 |
| Extensions | p. 22 |
| Applications of SCI | p. 23 |
| System Area Network for Clusters | p. 23 |
| Memory Interconnect for Cache-Coherent Multiprocessors | p. 26 |
| I/O Subsystem Interconnect | p. 30 |
| Large-Scale Data Acquisition System | p. 31 |
| Related Communication Networks and Concepts | p. 31 |
| Concluding Remarks | p. 34 |
| A Comparison of Three Gigabit Technologies: SCI, Myrinet and SGI/Cray T3D | p. 39 |
| Introduction | p. 39 |
| Levels of Comparison | p. 40 |
| Direct Deposit | p. 41 |
| Message Passing (MPI/PVM) | p. 42 |
| Protocol Emulation (TCP/IP) | p. 44 |
| Gigabit Network Technologies | p. 45 |
| The Intel 80686 Hardware Platform | p. 46 |
| Myricom Myrinet Technology | p. 47 |
| Dolphin PCI-SCI Technology | p. 48 |
| The SGI/Cray T3D - A Reference Point | p. 48 |
| ATM: QoS - But Still Short of a Gigabit/s | p. 50 |
| Gigabit Ethernet - An Outlook | p. 50 |
| Transfer Modes | p. 51 |
| Overview | p. 51 |
| "Native" and "Alternate" Transfer Modes in the Three Architectures | p. 54 |
| Performance Evaluation | p. 56 |
| Performance of Local Memory Copy | p. 58 |
| Performance of Direct Transfers to Remote Memory | p. 58 |
| Performance of MPI/PVM Transfers | p. 61 |
| Performance of TCP/IP Transfers | p. 64 |
| Discussion and Comparison | p. 65 |
| Summary | p. 67 |
| SCI Hardware | |
| Dolphin SCI Adapter Cards | p. 71 |
| Introduction | p. 71 |
| Overview of the Adapter Cards | p. 71 |
| Operating Modes of the SCI Cards | p. 73 |
| SCI Requester | p. 74 |
| Address Mapping | p. 74 |
| SCI Transaction Handling | p. 75 |
| SCI Packet Requester | p. 77 |
| SCI Responder | p. 78 |
| Mailbox | p. 79 |
| Access Protection | p. 79 |
| Atomic Access | p. 79 |
| Host Bridge Capabilities | p. 80 |
| DMA Transfers | p. 80 |
| DMA Transfers on the SBus Card | p. 80 |
| DMA Transfers on the PCI Card | p. 80 |
| Interrupter | p. 81 |
| Concurrency Issues | p. 81 |
| Write Assembly | p. 81 |
| Efficient Store Barrier | p. 81 |
| Performance | p. 82 |
| Applications and Topologies | p. 82 |
| SAN Interface Adapter | p. 83 |
| Remote I/O Connection and Data Acquisition | p. 83 |
| Switches and Topologies | p. 83 |
| Cluster Software | p. 85 |
| The TUM PCI/SCI Adapter | p. 89 |
| Introduction | p. 89 |
| The PCI/SCI Adapter Architecture | p. 90 |
| SCI Packet Encoding and Decoding | p. 92 |
| Overview of Packet Processing | p. 92 |
| Choosing the Technology | p. 92 |
| Internal Structure of the FPGA | p. 93 |
| Structure of the Packet Manager as a Microcode Sequencer | p. 95 |
| Microcode Examples | p. 97 |
| Benefits of the Micro Sequencer | p. 98 |
| The SCI Unit | p. 99 |
| Preliminary Results for the PCI/SCI Adapter | p. 99 |
| Related Work | p. 100 |
| Conclusion | p. 100 |
| Interconnection Networks with SCI | |
| Low-Level SCI Protocols and Their Application to Flexible Switches | p. 105 |
| Introduction | p. 105 |
| Data Format of SCI Packets | p. 105 |
| Flow Control | p. 107 |
| Flow Control in Rings | p. 107 |
| Packet Sequence in SCI | p. 108 |
| Determination of State Transitions | p. 109 |
| Bandwidth Multiplexing | p. 110 |
| Bandwidth Management in One Ring | p. 110 |
| Idle Symbols | p. 112 |
| Time-Out Determination | p. 113 |
| Network Interface | p. 113 |
| Requirements | p. 114 |
| Products | p. 114 |
| Routers | p. 115 |
| Requirements | p. 115 |
| Products and Challenges | p. 116 |
| Flexible Router | p. 117 |
| Strip-off Decision | p. 118 |
| Routing Decision and Topology | p. 119 |
| Rule-Based Routing | p. 120 |
| Conclusion and Outlook | p. 121 |
| SCI Rings, Switches, and Networks for Data Acquisition Systems | p. 125 |
| Introduction | p. 125 |
| SCI-based Data Acquisition Systems | p. 126 |
| SCINET Test Beds | p. 127 |
| Measurement Results | p. 129 |
| SCI Switches | p. 134 |
| Efficient Use of SCI Switches | p. 136 |
| Multistage SCI Networks | p. 139 |
| Simulation Results | p. 141 |
| Summary and Conclusions | p. 146 |
| Scalability of SCI Ringlets | p. 151 |
| Do SCI Ringlets Scale in Number of Nodes? | p. 151 |
| Ringlet Bandwidth Model | p. 152 |
| Transaction Formats | p. 152 |
| Packet Generation | p. 155 |
| Address Distribution | p. 155 |
| Locality | p. 156 |
| Bypass Rate | p. 157 |
| Echo Packet Rate | p. 158 |
| Output Link Utilization Factor | p. 160 |
| Scalability Evaluation | p. 160 |
| Common Assumptions | p. 161 |
| Uniform Ringlet Traffic | p. 162 |
| Non-uniform Ringlet Traffic | p. 162 |
| Changing Packet Lengths | p. 163 |
| Discussion | p. 163 |
| Conclusion | p. 165 |
| Affordable Scalability Using Multi-Cubes | p. 167 |
| Introduction | p. 167 |
| Interconnect Overview | p. 168 |
| Methodology | p. 168 |
| Analysis | p. 170 |
| "Hot-Link" Analysis | p. 170 |
| "Hot-B-Link" Analysis | p. 171 |
| Results | p. 172 |
| Conclusions | p. 174 |
| Device Driver Software and Low-Level APIs | |
| Interfacing SCI Device Drivers to Linux | p. 179 |
| Introduction | p. 179 |
| Layers of Functionality | p. 180 |
| Address Spaces | p. 180 |
| Levels of Hardware Abstraction | p. 180 |
| Resource Management | p. 182 |
| Virtual Mapping | p. 183 |
| Robustness | p. 184 |
| Why Linux? | p. 185 |
| Interfaces of the Driver | p. 186 |
| Hardware | p. 186 |
| Linux | p. 187 |
| User Processes | p. 188 |
| SCI Drivers on Other Nodes | p. 188 |
| Conclusions | p. 189 |
| SCI Physical Layer API | p. 191 |
| Introduction | p. 191 |
| Scope of the Standard | p. 192 |
| SCI Physical Layer API Architecture and Features | p. 193 |
| Exception Handling | p. 195 |
| Endianness | p. 195 |
| Supported Data Types | p. 196 |
| Miscellaneous Procedures | p. 196 |
| Address Translation Model | p. 197 |
| Global Object Identifier | p. 199 |
| SCI Global Address Resolution | p. 200 |
| Shared Memory Transactions | p. 200 |
| Packet Transactions | p. 202 |
| Block Transactions | p. 202 |
| Message Passing Transactions | p. 203 |
| Cache Transactions | p. 204 |
| Conclusions | p. 205 |
| Message Passing Libraries | |
| SCI Sockets Library | p. 209 |
| Introduction | p. 209 |
| Rationale | p. 209 |
| Overview | p. 210 |
| Features and Design | p. 210 |
| Features | p. 210 |
| Components | p. 211 |
| Communication via the SSLib | p. 212 |
| Connection Setup | p. 214 |
| Handling Special System Calls | p. 216 |
| Other Calls Intercepted and Handled by the SSLib | p. 218 |
| Out of Band Data | p. 218 |
| Implementation Aspects | p. 218 |
| Communication Among Components | p. 218 |
| SSLib Layers | p. 219 |
| Choice of Most Efficient Communication Mechanism | p. 220 |
| SSLib Implementations | p. 221 |
| Control Transfers | p. 221 |
| Functional Tests and Performance | p. 222 |
| Related Work | p. 224 |
| Conclusions | p. 227 |
| TCP/IP over SCI under Linux | p. 231 |
| Introduction | p. 231 |
| SCIP Structure | p. 232 |
| Packet Driver Interface | p. 232 |
| Hardware Address Resolution | p. 232 |
| Other Implementation Issues | p. 233 |
| Performance | p. 234 |
| Configuration | p. 234 |
| Latency | p. 234 |
| Throughput | p. 235 |
| Conclusion | p. 237 |
| PVM for SCI Clusters | p. 239 |
| Overview | p. 239 |
| Parallel Virtual Machine | p. 239 |
| PVM Implementations | p. 240 |
| Models for Zero-Memory-Copy Data Transfer | p. 241 |
| SCI Communication Model | p. 242 |
| PVM-SCI | p. 243 |
| System Architecture | p. 243 |
| Supporting Multiple Interconnects | p. 245 |
| Reducing Memory Copies | p. 245 |
| Ring Buffer Management | p. 246 |
| Performance Results | p. 247 |
| Conclusions | p. 247 |
| ScaMPI - Design and Implementation | p. 249 |
| Introduction | p. 249 |
| Scali Systems | p. 249 |
| The SCI Memory Model | p. 250 |
| Coordinating Use of Shared Locations | p. 251 |
| Ensuring Safe Data Transport in SCI - Checkpointing | p. 252 |
| Shared Address Space Programming without the Drawbacks | p. 252 |
| ScaMPI Design Goals | p. 253 |
| ScaMPI Implementation | p. 254 |
| Fault Tolerance | p. 254 |
| User Friendliness | p. 256 |
| Third Party Software | p. 256 |
| Performance Results | p. 257 |
| Barrier | p. 258 |
| All-to-All Communication | p. 259 |
| Conclusions | p. 260 |
| Shared Memory Programming Models and Runtime Mechanisms | |
| Shared Memory vs Message Passing on SCI: A Case Study Using Split-C | p. 267 |
| Introduction | p. 267 |
| Introduction to Split-C | p. 268 |
| Introduction to Active Messages | p. 269 |
| Message-Passing Implementation | p. 269 |
| Active Messages on Top of SCI | p. 269 |
| Split-C on Top of Active Messages | p. 272 |
| Shared Memory Implementation | p. 273 |
| Split-C on Top of SCI | p. 273 |
| Experimental Evaluation | p. 274 |
| Micro-benchmarks | p. 274 |