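As big data technology is rapidly applied across different fields, the big data industry urgently needs an integrated development and operations platform for big data application systems, one that lowers the barrier to adopting big data technology in every industry and provides methods, tools, and platform support for the rapid development and efficient operation of domain-oriented big data application systems. Because of the inherent complexity, dynamism, and diversity of big data, and the uniqueness of its value, no systematic big data software development methodology has yet emerged, which makes it difficult to meet the diverse requirements that different domains place on full-life-cycle big data processing. The challenges facing software engineering in the big data era show up in two interdependent aspects: an integrated design and development environment oriented to the full big data life cycle, and system run-time analysis tools based on the software life cycle. Driven by the practical needs of specific domains, this work studies platform technology for the integrated development and operation of domain-oriented big data application systems, covering the big data life cycle (acquisition, cleaning, integration, analysis, presentation) and the software life cycle (design, development, operation, optimization), to form a self-managing, self-adaptive, self-optimizing platform solution. On this basis, demonstration big data applications for the equipment Internet of Things and for meteorological public services are carried out to validate the effectiveness of the platform.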
Big data technology is being adopted across many disciplines. To build sustainable big data application systems and to support their rapid development and delivery of the expected value with minimal effort, an innovative software engineering methodology and an integrated development and management platform for big data applications are urgently needed. Big data is by nature complex, volatile, weakly correlated, and sparse in value, which makes it difficult to form standardized, systematic technical solutions that meet the diverse requirements of big data life cycle management in different application domains. Software engineering in the big data era has to address two major challenges: data life cycle management with an integrated development environment, and software life cycle management using run-time behavior analysis tools. This paper proposes a domain-requirements-driven approach to a development and run-time support platform for big data application systems, covering the entire big data life cycle, including data collection, storage, computation, analysis, and visualization, as well as the software system life cycle. The platform forms a self-managing, self-adaptive, self-optimizing solution. The proposed techniques are applied in specific application domains, such as Industry 4.0 and meteorological engineering, to illustrate and validate the new platform.