| The Anatomy of the Grid: Enabling Scalable Virtual Organizations | p. 1 |
| Software Component Technology for High Performance Parallel and Grid Computing | p. 5 |
| Macro- and Micro-parallelism in a DBMS | p. 6 |
| An Introduction to the Gilgamesh PIM Architecture | p. 16 |
| High Performance Computing and Trends: Connecting Computational Requirements with Computing Resources | p. 33 |
| Support Tools and Environments | p. 34 |
| Dynamic Performance Tuning Environment | p. 36 |
| Self-Organizing Hierarchical Cluster Timestamps | p. 46 |
| A Tool for Binding Threads to Processors | p. 57 |
| VizzScheduler - A Framework for the Visualization of Scheduling Algorithms | p. 62 |
| A Distributed Object Infrastructure for Interaction and Steering | p. 67 |
| Checkpointing Facility on a Metasystem | p. 75 |
| Optimising the MPI Library for the T3E | p. 80 |
| Performance Evaluation and Prediction | p. 84 |
| Optimal Polling for Latency-Throughput Tradeoffs in Queue-Based Network Interfaces for Clusters | p. 86 |
| Performance Prediction of Oblivious BSP Programs | p. 96 |
| Performance Prediction of Data-Dependent Task Parallel Programs | p. 106 |
| The Tuning Problem on Pipelines | p. 117 |
| The Hardware Performance Monitor Toolkit | p. 122 |
| VIA Communication Performance on a Gigabit Ethernet Cluster | p. 132 |
| Performance Analysis of Intel's MMX and SSE: A Case Study | p. 142 |
| Group-Based Performance Analysis for Multithreaded SMP Cluster Applications | p. 148 |
| Scheduling and Load Balancing | p. 154 |
| On Minimising the Processor Requirements of LogP Schedules | p. 156 |
| Exploiting Unused Time Slots in List Scheduling Considering Communication Contention | p. 166 |
| An Evaluation of Partitioners for Parallel SAMR Applications | p. 171 |
| Load Balancing on Networks with Dynamically Changing Topology | p. 175 |
| A Fuzzy Load Balancing Service for Network Computing Based on Jini | p. 183 |
| Approximation Algorithms for Scheduling Independent Malleable Tasks | p. 191 |
| The Way to Produce the Quasi-workload in a Cluster | p. 198 |
| Compilers for High Performance | p. 204 |
| Handling Irreducible Loops: Optimized Node Splitting vs. DJ-Graphs | p. 207 |
| Load Redundancy Eliminations on Executable Code | p. 221 |
| Loop-Carried Code Placement | p. 230 |
| Using a Swap Instruction to Coalesce Loads and Stores | p. 235 |
| Data-Parallel Compiler Support for Multipartitioning | p. 241 |
| Cache Models for Iterative Compilation | p. 254 |
| Data Sequence Locality: A Generalization of Temporal Locality | p. 262 |
| Efficient Dependence Analysis for Java Arrays | p. 273 |
| Parallel and Distributed Databases, Data Mining and Knowledge Discovery | p. 278 |
| An Experimental Performance Evaluation of Join Algorithms for Parallel Object Databases | p. 280 |
| A Classification of Skew Effects in Parallel Database Systems | p. 291 |
| Improving Concurrency Control in Distributed Databases with Predeclared Tables | p. 301 |
| Parallel Tree Projection Algorithm for Sequence Mining | p. 310 |
| Parallel Pruning for K-Means Clustering on Shared Memory Architectures | p. 321 |
| Experiments in Parallel Clustering with DBSCAN | p. 326 |
| Complexity Theory and Algorithms | p. 332 |
| Beyond External Computing: Analysis of the Cycle Structure of Permutations | p. 333 |
| Heaps Are Better than Buckets: Parallel Shortest Paths on Unbalanced Graphs | p. 343 |
| Efficient Synchronization of Asynchronous Processes | p. 352 |
| Applications on High-Performance Computers | p. 358 |
| Scanning Biosequence Databases on a Hybrid Parallel Architecture | p. 360 |
| A Parallel Computation of Power System Equations | p. 371 |
| Level-3 Trigger for a Heavy Ion Experiment at LHC | p. 375 |
| Experiences in Using MPI-IO on Top of GPFS for the IFS Weather Forecast Code | p. 380 |
| Instruction-Level Parallelism and Computer Architecture | p. 385 |
| Branch Prediction Using Profile Data | p. 386 |
| An Efficient Indirect Branch Predictor | p. 394 |
| The Behavior of Efficient Virtual Machine Interpreters on Modern Architectures | p. 403 |
| Improving Conditional Branch Prediction on Speculative Multithreading Architectures | p. 413 |
| Instruction Wake-Up in Wide Issue Superscalars | p. 418 |
| Execution Latency Reduction via Variable Latency Pipeline and Instruction Reuse | p. 428 |
| Memory Bandwidth: The True Bottleneck of SIMD Multimedia Performance on a Superscalar Processor | p. 439 |
| Macro Extension for SIMD Processing | p. 448 |
| Performances of a Dynamic Threads Scheduler | p. 452 |
| Distributed Systems and Algorithms | p. 457 |
| Self-stabilizing Neighborhood Unique Naming under Unfair Scheduler | p. 458 |
| Event List Management in Distributed Simulation | p. 466 |
| Performance Evaluation of Plausible Clocks | p. 476 |
| Building TMR-Based Reliable Servers Despite Bounded Input Lifetimes | p. 482 |
| Fractional Weighted Reference Counting | p. 486 |
| Parallel Programming: Models, Methods and Programming Languages | p. 491 |
| Accordion Clocks: Logical Clocks for Data Race Detection | p. 494 |
| Partial Evaluation of Concurrent Programs | p. 504 |
| A Transparent Operating System Infrastructure for Embedding Adaptability to Thread-Based Programming Models | p. 514 |
| Nepal - Nested Data Parallelism in Haskell | p. 524 |
| Introduction of Static Load Balancing in Incremental Parallel Programming | p. 535 |
| A Component Framework for HPC Applications | p. 540 |
| Towards Formally Refining BSP Barriers into Explicit Two-Sided Communications | p. 549 |
| Solving Bi-knapsack Problem Using Tiling Approach for Dynamic Programming | p. 560 |
| Numerical Algorithms | p. 566 |
| Parallel Implementation of a Block Algorithm for Matrix 1-Norm Estimation | p. 568 |
| Eigenvalue Spectrum Estimation and Photonic Crystals | p. 578 |
| Polynomial Preconditioning for Specially Structured Linear Systems of Equations | p. 587 |
| Parallel Application of a Novel Domain Decomposition Preconditioner for the Stable Finite Element Solution of Three-Dimensional Convection-Dominated PDEs | p. 592 |
| Performance of High-Accuracy PDE Solvers on a Self-Optimizing NUMA Architecture | p. 602 |
| Routing and Communication in Interconnection Networks | p. 611 |
| An Analytical Model of Deterministic Routing in the Presence of Hot-Spot Traffic | p. 613 |
| Improving the Accuracy of Reliability Models for Direct Interconnection Networks | p. 621 |
| On Deadlock Frequency during Dynamic Reconfiguration in NOWs | p. 630 |
| Analysis of Broadcast Communication in 2D Tori | p. 639 |
| Optimal Many-to-One Routing on the Mesh with Constant Queues | p. 645 |
| Multimedia and Embedded Systems | p. 651 |
| A Software Architecture for User Transparent Parallel Image Processing on MIMD Computers | p. 653 |
| A Case Study of Load Distribution in Parallel View Frustum Culling and Collision Detection | p. 663 |
| Parallelisable Zero-Tree Image Coding with Significance Maps | p. 674 |
| Performance of the Complex Streamed Instruction Set on Image Processing Kernels | p. 678 |
| A Two Dimensional Vector Architecture for Multimedia | p. 687 |
| Multiprocessor Clustering for Embedded Systems | p. 697 |
| Cluster Computing | p. 702 |
| Prioritizing Network Event Handling in Clusters of Workstations | p. 704 |
| Fault Tolerance for Cluster Computing Based on Functional Tasks | p. 712 |
| PAPI Message Passing Library: Comparison of Performance in User and Kernel Level Messaging | p. 717 |
| Implementing Java on Clusters | p. 722 |
| Predictive Coscheduling Implementation in a Non-dedicated Linux Cluster | p. 732 |
| Self-Adjusting Scheduling of Master-Worker Applications on Distributed Clusters | p. 742 |
| Smooth and Efficient Integration of High-Availability in a Parallel Single Level Store System | p. 752 |
| Optimal Scheduling of Aperiodic Jobs on Cluster | p. 764 |
| HMM: A Cluster Membership Service | p. 773 |
| Dynamic Processor Allocation in Large Mesh-Connected Multicomputers | p. 783 |
| A New Communication Mechanism for Cluster Computing | p. 793 |
| Isolated Dynamic Clusters for Web Hosting | p. 801 |
| Metacomputing and Grid Computing | p. 805 |
| Cactus Application: Performance Predictions in Grid Environments | p. 807 |
| Cactus Grid Computing: Review of Current Development | p. 817 |
| UNICORE: A Grid Computing Environment | p. 825 |
| Portable Parallel CORBA Objects: An Approach to Combine Parallel and Distributed Programming for Grid Computing | p. 835 |
| CORBA Lightweight Components: A Model for Distributed Component-Based Heterogeneous Computation | p. 845 |
| Building Computational Communities from Federated Resources | p. 855 |
| Scalable Causal Message Logging for Wide-Area Environments | p. 864 |
| From Cluster Monitoring to Grid Monitoring Based on GRM | p. 874 |
| Use of Agent-Based Service Discovery for Resource Management in Metacomputing Environment | p. 882 |
| Parallel I/O and Storage Technology | p. 887 |
| Optimal Partitioning for Efficient I/O in Spatial Databases | p. 889 |
| Improving Network Performance by Efficiently Dealing with Short Control Messages in Fibre Channel SANs | p. 901 |
| Improving MPI-I/O Performance on PVFS | p. 911 |
| Problem Solving Environments | p. 916 |
| Remote Visualization of Distributed Electro-Magnetic Simulations | p. 918 |
| Solving Initial Value Problems with Parallel Maple Processes | p. 926 |
| Design of Problem-Solving Environment for Contingent Claim Valuation | p. 935 |
| Author Index | p. 939 |