In addition to their role as simulation engines, modern supercomputers can be harnessed for scientific visualization. Their extensive concurrency, parallel storage systems, and high-performance interconnects can mitigate the expanding size and complexity of scientific datasets and prepare for in situ visualization of these data. In ongoing research into testing parallel volume rendering on the IBM Blue Gene/P (BG/P), we measure performance of disk I/O, rendering, and compositing on large datasets, and evaluate bottlenecks with respect to system-specific I/O and communication patterns. To extend the scalability of the direct-send image compositing stage of the volume rendering algorithm, we limit the number of compositing cores when many small messages are exchanged. To improve the data-loading stage of the volume renderer, we study the I/O signatures of the algorithm in detail. The results of this research affirm that a distributed-memory computing architecture such as BG/P is a scalable platform for large visualization problems.