Data exploration is a third technical approach to realizing the value of big data, distinct from data serving and data analysis. Data serving emphasizes retrieving precise information that meets users' needs at the micro level; data analysis emphasizes providing insight into data at the macro level and thereby supporting decision making. Exploratory search, in contrast, lets users switch freely between the micro level and the macro level and discover the value of data interactively, in a manner that is both in-depth and accessible. First, the traditional approaches to discovering the value of big data and their characteristics are briefly reviewed, and exploratory search is introduced. Second, the definition and model of exploratory search are described in detail, and its characteristics are summarized. Third, following a component-based design, an architecture for exploratory search systems is proposed, and the challenges and key techniques associated with each component are reviewed. Finally, the authors' preliminary work on exploratory search in RDF knowledge bases is briefly presented.