In real life, any information disclosure entails some privacy loss, so we need reliable metrics for quantifying it. Instead of simple 0-1 metrics (whether an item is revealed or not), we need probabilistic notions of partial loss, such as narrowing the range of values an item could take or increasing the confidence of an estimate. A starting classification could measure the following: the probability of complete disclosure of all data, the probability of complete disclosure of a specific item, and the probability of complete disclosure of a random item. Privacy-preserving methods can then be evaluated by how they fare under these metrics. Some existing measures can also be reused in this direction. For example, a popular metric from database security, Infer(x → y), can easily be applied to measure privacy loss in the schema matching phase. In the original definition, H(y) is the entropy of y and H_x(y) is the conditional entropy of y given x; the privacy loss due to the revelation of x is then:
Infer(x → y) = (H(y) − H_x(y)) / H(y)
Note that in the schema matching phase, what is revealed to the human for verification can be modeled as revealing x. Although this measure can be used in many different settings, the conditional entropies are hard to calculate in practice, so there is a need to develop different privacy metrics.
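To make the metric concrete, here is a minimal sketch of computing Infer(x → y) from a known joint distribution P(x, y); the `joint` dictionary encoding and the helper names are my own illustration, not part of any particular system:

```python
import math
from collections import defaultdict

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def infer(joint):
    """Infer(x -> y) = (H(y) - H_x(y)) / H(y), given a joint
    distribution joint[(x, y)] = P(x, y)."""
    # Marginals P(x) and P(y)
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    h_y = entropy(py.values())
    # Conditional entropy H_x(y) = sum over x of P(x) * H(y | X = x)
    h_y_given_x = 0.0
    for x, p_x in px.items():
        cond = [p / p_x for (xv, _), p in joint.items() if xv == x]
        h_y_given_x += p_x * entropy(cond)
    return (h_y - h_y_given_x) / h_y

# Example: y is fully determined by x, so revealing x discloses y completely
joint = {("a", 0): 0.5, ("b", 1): 0.5}
print(infer(joint))  # 1.0 (complete disclosure)
```

When x and y are independent, the conditional entropy equals H(y) and the metric is 0 (no privacy loss); full determinism gives 1. The practical difficulty mentioned above is precisely that the joint distribution needed here is rarely known.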