Since gem5 (Ruby) models the memory hierarchy, future revisions of GPGPU-Sim could more rigorously define the interface between the shader pipeline and the memories. In particular, neither the fetch nor the memory access (ldst_unit) pipeline stage in GPGPU-Sim exports a clear API for connecting to caches. We propose a more rigorous use of pipeline registers to communicate to and from the fetch and ldst units (e.g., see gem5/src/cpu/inorder/*). These units can then be implemented either within or outside GPGPU-Sim, depending on integration aims. If the aim is to implement the fetch and/or ldst units within GPGPU-Sim while still allowing tight integration with gem5, we recommend that GPGPU-Sim export the gem5 port interface (gem5/src/mem/port.*) so that these units can be connected to the gem5 memory system.
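To make the proposal above more concrete, the following is a minimal sketch of an ldst unit that talks to the rest of the pipeline only through a pipeline register and to memory only through an abstract port. All names here (`MemRequest`, `PipelineRegister`, `MemPortIface`, `LdstUnit`, `RecordingPort`, `run_ldst_example`) are hypothetical and chosen for illustration; they are not existing GPGPU-Sim or gem5 types, and `MemPortIface` is only a stand-in for whatever interface gem5/src/mem/port.* actually provides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical memory request; not a GPGPU-Sim or gem5 type.
struct MemRequest {
    std::uint64_t addr;
    bool is_write;
};

// Single-entry pipeline register (latch) between two stages.
template <typename T>
class PipelineRegister {
  public:
    bool full() const { return slot_.has_value(); }

    // Producer side: returns false (stall) if the latch is occupied.
    bool push(T value) {
        if (full()) return false;
        slot_ = std::move(value);
        return true;
    }

    // Consumer side: hands out the entry, if any, and empties the latch.
    std::optional<T> pop() {
        std::optional<T> out = std::move(slot_);
        slot_.reset();
        return out;
    }

  private:
    std::optional<T> slot_;
};

// Abstract stand-in for a memory-system port (assumption, not gem5's API).
class MemPortIface {
  public:
    virtual ~MemPortIface() = default;
    // Returns false if the memory system cannot accept the request this cycle.
    virtual bool sendRequest(const MemRequest& req) = 0;
};

// An ldst unit that sees only its input latch and a port.
class LdstUnit {
  public:
    LdstUnit(PipelineRegister<MemRequest>& in, MemPortIface& port)
        : in_(in), port_(port) {}

    // One simulated cycle: retry a stalled request first, else read the latch.
    void cycle() {
        if (!pending_) pending_ = in_.pop();
        if (pending_ && port_.sendRequest(*pending_)) pending_.reset();
    }

  private:
    PipelineRegister<MemRequest>& in_;
    MemPortIface& port_;
    std::optional<MemRequest> pending_;
};

// Test double that accepts and records every request.
class RecordingPort : public MemPortIface {
  public:
    bool sendRequest(const MemRequest& req) override {
        seen.push_back(req);
        return true;
    }
    std::vector<MemRequest> seen;
};

// Drives one request through the latch and returns how many reached the port.
std::size_t run_ldst_example() {
    PipelineRegister<MemRequest> latch;
    RecordingPort port;
    LdstUnit ldst(latch, port);

    bool accepted = latch.push({0x1000, false});
    bool stalled = !latch.push({0x2000, true});  // latch is still full
    assert(accepted && stalled);

    ldst.cycle();
    return port.seen.size();
}
```

Because `LdstUnit` depends only on the latch and the port interface, the same unit could in principle be driven by a port implemented inside GPGPU-Sim or by an adapter onto the gem5 memory system, which is the flexibility the paragraph above argues for.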
All of the files in gem5-gpu/gpu/gpgpu-sim are thin wrappers around GPGPU-Sim objects. We foresee generalizing these interfaces with C++ inheritance. Depending on how this is implemented, these wrappers could live in either gem5-gpu or GPGPU-Sim.
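One way this generalization might look is sketched below, assuming a simulator-neutral base class that each wrapper derives from. The names `GPUCoreModel`, `GpgpuSimCoreWrapper`, `StubShaderCore`, `run_for`, and `make_default_model` are hypothetical; `StubShaderCore` merely stands in for a real GPGPU-Sim object and does not reflect its actual interface.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

// Stand-in for a GPGPU-Sim object (hypothetical; not the real class).
struct StubShaderCore {
    std::uint64_t cycles = 0;
    void step() { ++cycles; }
};

// Simulator-neutral interface that the rest of gem5-gpu would program against.
class GPUCoreModel {
  public:
    virtual ~GPUCoreModel() = default;
    virtual void tick() = 0;
    virtual std::uint64_t cyclesSimulated() const = 0;
    virtual std::string backendName() const = 0;
};

// Thin wrapper: forwards each call to the wrapped simulator object.
class GpgpuSimCoreWrapper : public GPUCoreModel {
  public:
    void tick() override { core_.step(); }
    std::uint64_t cyclesSimulated() const override { return core_.cycles; }
    std::string backendName() const override { return "gpgpu-sim"; }

  private:
    StubShaderCore core_;
};

// Client code depends only on the base class, not on the backend.
std::uint64_t run_for(GPUCoreModel& model, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) model.tick();
    return model.cyclesSimulated();
}

// Factory hiding the concrete backend choice from callers.
std::unique_ptr<GPUCoreModel> make_default_model() {
    return std::make_unique<GpgpuSimCoreWrapper>();
}
```

With this shape, the base class could live in gem5-gpu while each derived wrapper lives next to the simulator it adapts, or both could sit in one repository; the sketch does not settle that choice.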
While generalizing these interfaces, we will likely follow the Future gem5-gpu Software Architecture Vision. In particular, the CudaContext, a construct orthogonal to the choice of GPU simulator, will be moved out of CudaGPU to slim CudaGPU down.