Functional dependencies (FDs) are key constraint knowledge in relational databases and data modeling. However, noisy data and low efficiency hinder the mining of functional dependencies from massive data sets. To reduce the effect of noise and improve efficiency, this paper adopts the notion of functional dependencies with degrees of satisfaction, together with an algorithm for mining such dependencies (MFDD), which measures and expresses the noise and efficiently discovers the minimal set of functional dependencies. Using a degree of diversity that measures how dispersed the values of an attribute are, three optimization strategies are applied within the theoretical framework of functional dependencies with degrees of satisfaction, yielding an attribute pre-scanning algorithm. Both theoretical analysis and test results show that the algorithm significantly improves mining efficiency.
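As an illustration of how a degree of satisfaction can quantify noise, the following minimal sketch scores a candidate dependency X → Y on a small relation. The abstract does not give the definition, so the sketch assumes a pair-based one: the fraction of unordered tuple pairs that do not violate the dependency. The function name `satisfaction_degree`, the sample relation `rows`, and its attribute names are hypothetical, and the code is not the MFDD algorithm itself.

```python
from collections import Counter


def _agreeing_pairs(rows, attrs):
    """Count unordered tuple pairs that agree on every attribute in attrs."""
    groups = Counter(tuple(row[a] for a in attrs) for row in rows)
    return sum(c * (c - 1) // 2 for c in groups.values())


def satisfaction_degree(rows, lhs, rhs):
    """Assumed definition: share of tuple pairs that do not violate lhs -> rhs.

    A pair violates the dependency when it agrees on lhs but differs on rhs.
    """
    n = len(rows)
    total_pairs = n * (n - 1) // 2
    if total_pairs == 0:
        return 1.0  # fewer than two tuples: nothing can be violated
    lhs, rhs = tuple(lhs), tuple(rhs)
    agree_lhs = _agreeing_pairs(rows, lhs)
    # Pairs agreeing on lhs and on rhs are exactly those agreeing on lhs + rhs.
    agree_both = _agreeing_pairs(rows, lhs + rhs)
    violating = agree_lhs - agree_both
    return 1.0 - violating / total_pairs


# Hypothetical relation; the last tuple acts as noise for dept -> city.
rows = [
    {"emp": 1, "dept": "A", "city": "X"},
    {"emp": 2, "dept": "A", "city": "X"},
    {"emp": 3, "dept": "B", "city": "Y"},
    {"emp": 4, "dept": "A", "city": "Z"},
]

clean = satisfaction_degree(rows, ["emp"], ["dept"])
noisy = satisfaction_degree(rows, ["dept"], ["city"])
```

Under this assumed definition, `emp -> dept` holds with degree 1.0, while `dept -> city` holds with degree 2/3, because two of the six tuple pairs agree on `dept` but differ on `city`. A dependency could then be accepted when its degree reaches a user-chosen threshold, which is one way a single noisy tuple need not rule out an otherwise valid dependency.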