Bifrost: a Python/C++ Framework for High-Throughput Stream Processing in Astronomy

Radio astronomy observatories with high-throughput back-end instruments require real-time data processing. While computing hardware continues to advance rapidly, development of real-time processing pipelines remains difficult and time-consuming, which can limit scientific productivity. Motivated by this, we have developed Bifrost: an open-source software framework for rapid pipeline development. Bifrost combines a high-level Python interface with highly efficient reconfigurable data transport and a library of computing blocks for CPU and GPU processing. The framework is generalizable, but initially it emphasizes the needs of high-throughput radio astronomy pipelines, such as the ability to process data buffers as if they were continuous streams, the capacity to partition processing into distinct data sequences (e.g., separate observations), and the ability to extract specific intervals from buffered data. Computing blocks in the library are designed for applications such as interferometry, pulsar dedispersion and timing, and transient search pipelines. We describe the design and implementation of the Bifrost framework and demonstrate its use as the backbone in the correlation and beamforming back end of the Long Wavelength Array station in the Sevilleta National Wildlife Refuge, NM.

Miles D. Cranmer, Benjamin R. Barsdell, Danny C. Price, Jayce Dowell, Hugh Garsden, Veronica Dike, Tarraneh Eftekhari, Alexander M. Hegedus, Joseph Malins, Kenneth S. Obenberger, Frank Schinzel, Kevin Stovall, Gregory B. Taylor, Lincoln J. Greenhill

JAI publication

arXiv:1708.00720

Further details

GitHub

Documentation and Tutorials

Example usage:

import bifrost as bf
import bifrost.blocks as blocks
import bifrost.views as views

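# Read the WAV file in gulps of 4096 frames.
raw_data = blocks.read_wav(['heyjude.wav'], gulp_nframe=4096)
# Copy the data into GPU memory.
gpu_raw_data = blocks.copy(raw_data, space='cuda')
# Split the time axis into 256-sample chunks along a new 'fine_time' axis.
chunked_data = views.split_axis(gpu_raw_data, 'time', 256, label='fine_time')
# FFT over 'fine_time'; the transformed axis is relabeled 'freq'.
fft_output = blocks.fft(chunked_data, axes='fine_time', axis_labels='freq')
# Form the detected (squared) power from the complex spectrum.
squared = blocks.detect(fft_output, mode='scalar')
# Reorder the axes to (time, pol, freq).
transposed = blocks.transpose(squared, ['time', 'pol', 'freq'])
# Copy the result back to host memory (the 'cuda_host' space).
host_transposed = blocks.copy(transposed, space='cuda_host')
# Quantize to 8-bit signed integers.
quantized = blocks.quantize(host_transposed, 'i8')
# Write the quantized data out in Sigproc format.
blocks.write_sigproc(quantized)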

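# Fetch the pipeline that the blocks above were implicitly added to.
pipeline = bf.get_default_pipeline()
# Shut down cleanly on signals such as SIGINT, then start processing.
pipeline.shutdown_on_signals()
pipeline.run()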
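
For readers without Bifrost or a GPU at hand, the signal chain in the example can be approximated in plain Python. The sketch below is illustrative only and does not use or reproduce the Bifrost API: every function in it (`gulps`, `split_axis`, `dft`, `detect`, `quantize_i8`, `process`) is a hypothetical stand-in, a naive O(n^2) DFT replaces the GPU FFT, a single real-valued channel replaces the WAV input, and the scale-round-clip behavior of the quantizer is an assumption rather than a description of `blocks.quantize`. It also drops a trailing partial gulp for simplicity.

```python
import cmath
import math


def gulps(samples, gulp_nframe):
    """Yield consecutive fixed-size gulps, treating a buffer as a stream.

    Simplification: a trailing partial gulp is dropped.
    """
    for start in range(0, len(samples) - gulp_nframe + 1, gulp_nframe):
        yield samples[start:start + gulp_nframe]


def split_axis(gulp, nfine):
    """Split one gulp into rows of `nfine` samples (coarse time x fine time)."""
    return [gulp[i:i + nfine] for i in range(0, len(gulp), nfine)]


def dft(chunk):
    """Naive discrete Fourier transform, standing in for an FFT."""
    n = len(chunk)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(chunk))
        for k in range(n)
    ]


def detect(spectrum):
    """Power per frequency channel: squared magnitude of each complex value."""
    return [abs(z) ** 2 for z in spectrum]


def quantize_i8(values, scale):
    """Scale, round, and clip to the signed 8-bit range (assumed behavior)."""
    return [max(-128, min(127, round(v * scale))) for v in values]


def process(samples, gulp_nframe=16, nfine=8, scale=1.0):
    """Run gulp -> split -> DFT -> detect -> quantize; return one row per chunk."""
    out = []
    for gulp in gulps(samples, gulp_nframe):
        for chunk in split_axis(gulp, nfine):
            out.append(quantize_i8(detect(dft(chunk)), scale))
    return out


# A pure tone in frequency bin 2 of an 8-point transform.
tone = [math.cos(2 * math.pi * 2 * t / 8) for t in range(32)]
spectra = process(tone, gulp_nframe=16, nfine=8)
print(len(spectra))   # 4 spectra: 2 gulps x 2 chunks each
print(spectra[0])     # [0, 0, 16, 0, 0, 0, 16, 0]
```

The tone appears in bins 2 and 6 because the input is real, so its spectrum is mirrored about the Nyquist bin. In the Bifrost example the same steps run as separate blocks connected by the framework's data transport, rather than as nested function calls in a single loop.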