Limited off-chip memory bandwidth is increasingly the bottleneck of the entire stream processing system. Many techniques have been adopted in stream memory systems to alleviate this problem, but current designs do not adequately consider the relationship between application-specific memory access patterns and the utilization of off-chip bandwidth. This paper first estimates, through analysis and experiments, the effect of the primary design parameters on different access patterns. Based on these results, several architectural modifications are proposed for stream accesses with various degrees of parallelism. By widening the address generators and adding short-task priority scheduling, the locality and parallelism among memory accesses are exploited more fully, and load balance improves. These optimizations can significantly improve the utilization of DRAM bandwidth and, in turn, the performance of the entire streaming program.
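As a rough illustration of the short-task priority idea, the sketch below compares issuing stream memory tasks in arrival order against issuing the shortest task first. It is a generic shortest-first model and not the scheduler proposed in the paper: the function name `schedule`, the notion of a task "length" in abstract time units, and the rule of handing each task to the earliest-free address generator are all assumptions made for the example.

```python
import heapq


def schedule(task_lengths, num_generators, short_first=True):
    """Hypothetical model: assign stream memory tasks to address generators.

    task_lengths   -- service time of each task, in abstract time units (assumed)
    num_generators -- number of address generators working in parallel (assumed)
    short_first    -- issue the shortest task first instead of arrival order

    Returns the completion time of every task, indexed like task_lengths.
    """
    indices = list(range(len(task_lengths)))
    if short_first:
        # Short-task priority: sort by length, ties keep arrival order.
        indices.sort(key=lambda i: task_lengths[i])

    # Min-heap of (time at which the generator becomes free, generator id).
    free_at = [(0, g) for g in range(num_generators)]
    heapq.heapify(free_at)

    completion = [0] * len(task_lengths)
    for i in indices:
        start, gen = heapq.heappop(free_at)   # earliest-free generator
        finish = start + task_lengths[i]
        completion[i] = finish
        heapq.heappush(free_at, (finish, gen))
    return completion


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    tasks = [64, 4, 8, 32]  # made-up task lengths, not measured data
    print(mean(schedule(tasks, 1, short_first=False)))
    print(mean(schedule(tasks, 1, short_first=True)))
```

In this toy model, shortest-first lowers the mean completion time because short tasks no longer wait behind a long one. It says nothing about DRAM row locality or bank parallelism, which a real memory scheduler would also have to take into account.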