The goal of this paper is to identify potential research directions and challenges that need to be addressed to perform privacy-preserving data integration. Increasing privacy and security consciousness has led to increased research (and development) of methods that compute useful information in a secure fashion. Data integration and sharing have been a long-standing challenge for the database community. The need for them has become critical in numerous contexts, including integrating data on the Web and at enterprises, building e-commerce marketplaces, sharing data for scientific research, exchanging data among government agencies, monitoring health crises, and improving homeland security.
Unfortunately, data integration and sharing are hampered by legitimate and widespread privacy concerns. Companies could exchange information to boost productivity, but are deterred by fear of exploitation by competitors or by antitrust concerns. Sharing healthcare data could improve scientific research, but the cost of obtaining consent to use individually identifiable information can be prohibitive. Sharing healthcare and consumer data enables early detection of disease outbreaks, but without provable privacy protection it is difficult to extend these surveillance measures nationally or internationally. Fire departments could share regulatory and defense plans to enhance their ability to fight terrorism and provide community defense, but fear that loss of privacy could lead to liability. The continued exponential growth of distributed personal data could further fuel data integration and sharing applications, but may also be stymied by a privacy backlash. It is therefore critical to develop solutions that enable widespread integration and sharing of data without loss of privacy, especially in domains of national priority, while allowing easy and effective privacy control by users. A comprehensive framework that handles the fundamental problems underlying privacy-preserving data integration and sharing is necessary. The framework should be validated by applying it to several important domains and evaluating the results.
Concurrently, various privacy-preserving distributed data mining methods have been developed that mine global data while protecting the privacy and security of the underlying data sites. However, all of these methods assume that data integration (including record linkage) has already been done. While data integration is related to privacy-preserving data mining, it is significantly different: privacy-preserving data mining deals with gaining knowledge after the integration problems are solved. A framework and methods for performing such integration are therefore required first.