QCD with dynamical fermions on the connection machine
We have implemented Quantum Chromo-Dynamics (QCD) on the massively parallel Connection Machine in *Lisp. The code uses dynamical Wilson fermions and the Hybrid Monte Carlo Algorithm (HMCA) to update the lattice. We describe our program and give ...
Vectorization on Monte Carlo particle transport: an architectural study using the LANL benchmark “GAMTEB”
Fully vectorized versions of the Los Alamos National Laboratory benchmark code Gamteb, a Monte Carlo photon transport algorithm, were developed for the Cyber 205/ETA-10 and Cray X-MP/Y-MP architectures. Single-processor performance measurements of the ...
Parallelizing a large scientific code - methods, issues, and concerns
Objectives of this study were to develop techniques and methods for effective analysis of large codes; to determine the feasibility of parallelizing an existing large scientific code; and to estimate potential speedups attainable, and associated ...
Benchmark calculations with an unstructured grid flow solver on a SIMD computer
An unstructured grid flow solver was implemented on a massively parallel computer, and benchmark computations were performed. The solver was a two-dimensional computational fluid dynamics (CFD) code that performs first-order, steady-state solutions of ...
Implementation of a hypersonic rarefied flow particle simulation on the connection machine
A very efficient direct particle simulation algorithm for hypersonic rarefied flows is presented and its implementation on a Connection Machine is described. The implementation is capable of simulating up to 4 x 10^6 hard sphere diatomic molecules using ...
Computational aerothermodynamics
Aerothermodynamics is defined¹ as “the study of the relationship of heat and mechanical energy in gases, especially air”. To those familiar with fluid dynamics (the study of the flow properties of liquids and gases) this means that we must consider ...
Practical parallel supercomputing: examples from chemistry and physics
We use two large simulations, the chemical reaction dynamics of H + H2 and the collision of two galaxies to show that current parallel machines are capable of large supercomputer level calculations. We contrast the different architectural tradeoffs for ...
Capability of current supercomputers for the computational fluid dynamics
The computer code named LANS3D, one of the representative Navier-Stokes codes in Japan, is taken as an example, and the capability of current CFD technology is discussed. This code was developed for the numerical simulation of high-Reynolds number ...
Supercomputing of circuits simulation
Circuit analysis is very important in the development of LSI. We have been working to speed up the circuit analysis program SPICE-GT. SPICE-GT is based on the SPICE 2G.6 program from the University of California, Berkeley. We achieved speed-up ...
Computations of soil temperature rise due to HVDC ground return
The purpose of this paper is to present an application which, historically, did not make use of computing methodology in the solution of design problems. The design of High Voltage Direct Current (HVDC) ground electrodes involves the careful selection of ...
A radar simulation program for a 1024-processor hypercube
We have developed a fast parallel version of an existing synthetic aperture radar (SAR) simulation program, SRIM. On a 1024-processor NCUBE hypercube it runs an order of magnitude faster than on a CRAY X-MP or CRAY Y-MP processor. This speed advantage ...
Parallel MIMD programming for global models of atmospheric flow
Modeling atmospheric flow is one application of supercomputers. In this paper we present some concepts for implementing global flow algorithms on shared memory multiprocessors. We describe how an analysis of the algorithms combined with the appropriate ...
Computational fluid dynamic-current capabilities and directions for the future
Computational fluid dynamics (CFD) has made great strides in the detailed simulation of complex fluid flows, including some of those not before understood. It is now being routinely applied to some rather complicated problems, and starting to impact the ...
Parallel algorithm and VLSI architecture for a robot's inverse kinematics
The inverse solutions of a robotic system are generally produced by a serial process. Because the computing time of processing geometry data and generating an inverse solution corresponding to a specified point in a Cartesian trajectory is larger than the ...
Supercomputers in computational ocean acoustics
In this paper, we report on some computational experience in solving ocean acoustic propagation problems in three dimensions on supercomputers. The underlying Helmholtz equation is transformed into a parabolic-type equation in the Lee-Saad-Schultz model ...
A study of dissipation operators for the Euler equations and a three-dimensional channel flow
Explicit methods for the solution of fluid flow problems are of considerable interest in supercomputing. These methods parallelize well. The treatment of the boundaries is of particular interest both with respect to the numeric behavior of the solution, ...
A computer assisted optimal depth lower bound for sorting networks with nine inputs
It is demonstrated that there is no nine-input sorting network of depth six. The proof was obtained by executing on a supercomputer a branch-and-bound algorithm which constructs and tests a critical subset of all possible candidates. Such proofs can be ...
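As a small illustration of what testing a candidate network involves, the sketch below checks whether a comparator network sorts every input by using the zero-one principle (a network sorts all inputs if and only if it sorts all 0/1 inputs). This is not the branch-and-bound search from the paper; the function names and the four-input example network are assumptions chosen for illustration.

```python
from itertools import product


def apply_network(layers, values):
    """Run values through a comparator network given as a list of layers.

    Each layer is a list of (i, j) pairs with i < j; a comparator places
    the smaller value on wire i and the larger on wire j.
    """
    out = list(values)
    for layer in layers:
        for i, j in layer:
            if out[i] > out[j]:
                out[i], out[j] = out[j], out[i]
    return out


def is_sorting_network(layers, n):
    """Zero-one principle: testing all 2**n binary inputs is sufficient."""
    return all(
        apply_network(layers, bits) == sorted(bits)
        for bits in product((0, 1), repeat=n)
    )


# Illustrative four-input network of depth three (hypothetical example data).
FOUR_INPUT = [[(0, 1), (2, 3)], [(0, 2), (1, 3)], [(1, 2)]]

if __name__ == "__main__":
    print(is_sorting_network(FOUR_INPUT, 4))       # full network
    print(is_sorting_network(FOUR_INPUT[:2], 4))   # last layer removed
```

For nine inputs the same check needs only 2^9 = 512 binary vectors per candidate, so the cost of such a proof lies in the number of candidate networks, not in testing any single one.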
Realities associated with parallel processing
At the T. J. Watson Research Center, there is a very active Condensed Matter Physics Group engaged in the study of semiconductors such as silicon (Si) and gallium-arsenide (Ga-As)¹. One of the most important computer codes developed at Watson is a ...
How a SIMD machine can implement a complex cellular automata? a case study: von Neumann's 29-state cellular automaton
This study is a part of an effort to simulate the 29-state self-reproducing cellular automaton described by John von Neumann in a manuscript that dates back to 1952. We are interested in the programming of very large SIMD arrays which, as a consequence ...
Automatic vectorization of character string manipulation and relational operations in Pascal
In our Supercomputing '88 paper, an overview of V-Pascal, an automatic vectorizing compiler for Pascal, was presented with a focus on its Version 1. In that paper, as one of the higher functions to be added to Version 2 of V-Pascal, vector-mode ...
Neural network simulation on shared-memory vector multiprocessors
We simulate three neural networks on a vector multiprocessor. The training time can be reduced significantly, especially when the training data size is large. These three neural networks are: 1) the feedforward network, 2) the recurrent network and 3) ...
Concurrent and vectorized Monte Carlo simulation of the evolution of an assembly of particles increasing in number
Parallel Monte Carlo techniques for simulating the evolution of an assembly of charged particles interacting with a background gas medium under the influence of the electrical field are presented. This simulation problem has inherent parallelism in ...
Protein structure prediction by a data-level parallel algorithm
We have developed a software system, PHI-PSI, on the Connection Machine that uses a parallel algorithm to retrieve and use information from a database of 112 known protein structures (selected from the Brookhaven Protein Databank) to predict the ...
Vector and parallel algorithms for Cholesky factorization on IBM 3090
In many engineering applications, a solution of Fx = b is required, where F is a positive definite symmetric matrix. This is usually done by the Cholesky factorization, F = RR^T, where R is the lower triangular Cholesky factor. This is a compute ...
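A minimal sketch of solving Fx = b through the factorization F = RR^T with R lower triangular, as the abstract describes. It is a plain serial version in pure Python, not the vector or parallel IBM 3090 algorithm from the paper; the function names `cholesky_lower` and `solve_spd` are assumptions for illustration.

```python
import math


def cholesky_lower(F):
    """Return lower triangular R with F = R R^T for symmetric positive definite F."""
    n = len(F)
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: what is left of F[j][j] after earlier columns.
        s = F[j][j] - sum(R[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        R[j][j] = math.sqrt(s)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            t = F[i][j] - sum(R[i][k] * R[j][k] for k in range(j))
            R[i][j] = t / R[j][j]
    return R


def solve_spd(F, b):
    """Solve F x = b via R y = b (forward) then R^T x = y (backward)."""
    R = cholesky_lower(F)
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(R[i][k] * y[k] for k in range(i))) / R[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(R[k][i] * x[k] for k in range(i + 1, n))) / R[i][i]
    return x


if __name__ == "__main__":
    F = [[4.0, 2.0], [2.0, 3.0]]
    print(cholesky_lower(F))
    print(solve_spd(F, [2.0, 1.0]))
```

The factorization costs O(n^3) operations while each triangular solve costs O(n^2), which is why the factorization step is the natural target for vectorization and parallelization.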
FFTs in external or hierarchical memory
Conventional algorithms for computing large one-dimensional fast Fourier transforms (FFTs), even those algorithms recently developed for vector and parallel computers, are largely unsuitable for systems with external or hierarchical memory. The ...
Macrotasking the singular value decomposition of block circulant matrices on the Cray-2
A parallel algorithm to compute the singular value decomposition (SVD) of block circulant matrices on the Cray-2 is described. For a block circulant form described by M blocks with m x n elements in each block, the computation time using an SVD ...
A block QR factorization algorithm using restricted pivoting
This paper presents a new algorithm for computing the QR factorization of a rank-deficient matrix on high-performance machines. The algorithm is based on the Householder QR factorization algorithm with column pivoting. The traditional pivoting strategy ...
Tuning the rank-n update in a wavefront solver for peak performance
The wavefront solver is a type of linear equation solver that is suitable for solving the system of linear equations that arises in many finite-element applications. A new version of the wavefront solver was recently introduced into the ANSYS® program ...
Load balancing and task decomposition techniques for parallel implementation of integrated vision systems algorithms
Integrated vision systems employ a sequence of image understanding algorithms in which the output of an algorithm is the input of the next algorithm in the sequence. Algorithms that constitute an integrated vision system exhibit different ...
Efficient computation of the singular value decomposition on cube connected SIMD machine
The singular value decomposition (SVD) has many real-time applications. Recently, there has been much interest in developing efficient methods to compute the SVD on parallel machines. This paper presents an efficient method for computing SVD in a cube ...
Index Terms
- Proceedings of the 1989 ACM/IEEE conference on Supercomputing
Acceptance Rates
Year | Submitted | Accepted | Rate |
---|---|---|---|
SC '17 | 327 | 61 | 19% |
SC '16 | 442 | 81 | 18% |
SC '15 | 358 | 79 | 22% |
SC '14 | 394 | 83 | 21% |
SC '13 | 449 | 91 | 20% |
SC '12 | 461 | 100 | 22% |
SC '11 | 352 | 74 | 21% |
SC '10 | 253 | 51 | 20% |
SC '09 | 261 | 59 | 23% |
SC '08 | 277 | 59 | 21% |
SC '07 | 268 | 54 | 20% |
SC '06 | 239 | 54 | 23% |
SC '05 | 260 | 62 | 24% |
SC '04 | 200 | 60 | 30% |
SC '03 | 207 | 60 | 29% |
SC '02 | 230 | 67 | 29% |
SC '01 | 240 | 60 | 25% |
SC '00 | 179 | 62 | 35% |
Supercomputing '95 | 241 | 69 | 29% |
Supercomputing '93 | 300 | 72 | 24% |
Supercomputing '92 | 220 | 75 | 34% |
Supercomputing '91 | 215 | 83 | 39% |
Overall | 6,373 | 1,516 | 24% |