In this paper, we present a novel approach that exploits attribute correlation for sampling nonuniform hidden databases. We propose a method for calculating attribute dependency and construct a sampling template from that dependency. We then use the sampling template to generate initial sampling queries and propose a bottom-up algorithm to search the sampling template. We also conduct extensive experiments on real deep Web sites and controlled databases, which illustrate that our sampling method performs well in both quality and efficiency.