This paper briefly presents the author's understanding of the new challenges that Big Data poses for science and technology; these observations may change as the Big Data era evolves. It first describes three technical problems that need to be solved in the Big Data era: 1) how to transform unstructured and semi-structured data into a form that can be analyzed with known data mining techniques; 2) how to describe and model the complexity of Big Data; and 3) how to address data heterogeneity and decision-making heterogeneity. The paper then discusses other basic Big Data issues, such as Big Data streams and data-driven decision-making. In conclusion, it raises some open research problems, including those of Data Science, which is more fundamental and may eventually exceed the scope of Big Data.