ResearchHub | Open Science Community

Continual Observation of Joins under Differential Privacy

Wei Dong et al., May 29, 2024

The problem of continual observation under differential privacy has been studied extensively in the literature. However, all existing works, with the exception of [28,51], have only studied the simple counting query and its derivatives. Join queries, which are arguably the most important class of queries in relational databases, have only been considered in [28,51], but the solutions offered there have two limitations. First, they only support a few specific graph pattern queries, which are special cases of joins. Second, they require hard degree/frequency constraints on the graph/database instance, and the privatized query answers have errors proportional to these constraints. In this paper, we propose a new differentially private mechanism for continual observation of joins that overcomes these two limitations. Our mechanism supports arbitrary joins and predicates, and does not require any constraints to be given in advance, even over an infinite stream. More importantly, it yields an error that is proportional to the actual maximum degree/frequency in the graph/database instance at the current time of observation. Such an instance-specific utility guarantee is much preferred for the continual observation problem, where the database size and the query answer may change significantly over time.
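For context on the "simple counting query" that the abstract contrasts with joins, the following is a minimal sketch of the classical binary (dyadic-block) mechanism for continual counting over a finite 0/1 stream. It is a well-known baseline, not the mechanism proposed in this paper. The function names, the `add_noise` switch, and the assumption that the stream length is known in advance are all illustrative choices.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def binary_mechanism(stream, epsilon, rng=None, add_noise=True):
    """Return a running count after every item of a 0/1 stream of known length.

    Hypothetical helper: the stream is a list, so its length T is known up front.
    """
    rng = rng or random.Random()
    T = len(stream)
    levels = max(1, T.bit_length())   # number of dyadic levels needed
    scale = levels / epsilon          # each item falls in at most `levels` blocks
    alpha = [0.0] * levels            # exact partial sums, one per level
    noisy = [0.0] * levels            # their noisy counterparts
    outputs = []
    for t, x in enumerate(stream, start=1):
        i = (t & -t).bit_length() - 1  # index of the lowest set bit of t
        # Merge all lower-level blocks plus the new item into one level-i block.
        alpha[i] = sum(alpha[:i]) + x
        for j in range(i):
            alpha[j] = 0.0
            noisy[j] = 0.0
        noisy[i] = alpha[i] + (laplace(scale, rng) if add_noise else 0.0)
        # The prefix [1, t] is the union of the blocks at the set bits of t.
        outputs.append(sum(noisy[j] for j in range(levels) if (t >> j) & 1))
    return outputs


# With noise switched off, the outputs are exactly the prefix sums.
print(binary_mechanism([1, 0, 1, 1, 0, 1, 1], epsilon=1.0, add_noise=False))
```

Each released count sums at most `levels` noisy blocks, so the error grows polylogarithmically in the stream length rather than linearly. This sketch covers plain counting only, where one item changes the answer by at most 1; a join answer has no such fixed bound, which is the gap the abstract addresses.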

Artificial Intelligence

Theoretical Computer Science

Instance-optimal Truncation for Differentially Private Query Evaluation with Foreign Keys

Wei Dong et al., Sep 26, 2024

Answering SPJA queries under differential privacy (DP), including graph pattern counting under node-DP as an important special case, has received considerable attention in recent years. The dual challenge of foreign-key constraints combined with self-joins is particularly tricky to deal with, and no existing DP mechanism can correctly handle both. For the special case of graph pattern counting under node-DP, the existing mechanisms are correct (i.e., satisfy DP), but they either do not offer nontrivial utility guarantees or are very complicated and costly. In this paper, we propose two mechanisms for solving this problem with both efficiency and strong utility guarantees. The first mechanism, called R2T, is simple and efficient, while achieving down-neighborhood optimality with a logarithmic optimality ratio. Down-neighborhood optimality is a new notion of optimality that we introduce for measuring the utilities of DP mechanisms, which can be considered as a natural relaxation of instance optimality, and it is especially suitable for functions with a large or unbounded sensitivity. Our second mechanism further reduces the optimality ratio to a double logarithm, which is also known to be optimal; we therefore call this mechanism OPT². While OPT² also runs in polynomial time, it does have a higher computational cost than R2T in practice. Both R2T and OPT² are simple enough that they can be easily implemented on top of any RDBMS and an LP solver. Experimental results show that they offer order-of-magnitude improvements in terms of utility over existing techniques, even those specifically designed for graph pattern counting.
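To make the idea of truncation concrete, here is a minimal sketch for the easy, self-join-free case: counting the results of a Customer-Orders join where each customer is the private entity. It is not R2T or OPT², and it sidesteps the self-join difficulty that the abstract describes. The table layout, the names `truncated_join_count` and `noisy_truncated_count`, and the fixed threshold `tau` are assumptions made for illustration.

```python
import random
from collections import Counter


def truncated_join_count(orders, tau):
    """Count Customer-Orders join results, keeping at most tau orders per customer."""
    per_customer = Counter(customer_id for customer_id, _ in orders)
    return sum(min(n, tau) for n in per_customer.values())


def noisy_truncated_count(orders, tau, epsilon, rng=None):
    """Truncated count plus Laplace noise of scale tau / epsilon.

    After truncation, adding or removing one customer changes the count by at
    most tau, so tau bounds the sensitivity in this self-join-free setting.
    """
    rng = rng or random.Random()
    scale = tau / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return truncated_join_count(orders, tau) + noise


# Hypothetical data: (customer_id, order_id) pairs.
orders = [("alice", 1), ("alice", 2), ("alice", 3), ("alice", 4),
          ("bob", 5), ("carol", 6), ("carol", 7)]

for tau in (1, 2, 4, 8):
    print(tau, truncated_join_count(orders, tau))
```

A small `tau` biases the answer downward, while a large `tau` inflates the noise. Once `tau` reaches the largest per-customer count, the truncated answer equals the true one, which is why choosing the threshold privately is the central question.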

Artificial Intelligence

Machine Learning

Almost Instance-optimal Clipping for Summation Problems in the Shuffle Model of Differential Privacy

Wei Dong et al., Dec 2, 2024

Differentially private mechanisms achieving worst-case optimal error bounds (e.g., the classical Laplace mechanism) are well-studied in the literature. However, when typical data are far from the worst case, instance-specific error bounds (which depend on the largest value in the dataset) are more meaningful. For example, consider the sum estimation problem, where each user has an integer x_i from the domain {0, 1, ..., U} and we wish to estimate ∑_i x_i. This has a worst-case optimal error of O(U/ε), while recent work has shown that the clipping mechanism can achieve an instance-optimal error of O(max_i x_i · log log U / ε). Under the shuffle model, known instance-optimal protocols are less communication-efficient. The clipping mechanism also works in the shuffle model, but requires two rounds: round one finds the clipping threshold, and round two does the clipping and computes the noisy sum of the clipped data. In this paper, we show how these two seemingly sequential steps can be done simultaneously in one round using just 1+o(1) messages per user, while maintaining the instance-optimal error bound. We also extend our technique to the high-dimensional sum estimation problem and sparse vector aggregation (a.k.a. frequency estimation under user-level differential privacy).
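The gap between the worst-case and instance-specific bounds can be seen in a minimal central-model sketch of clipped sum estimation. It assumes the threshold `tau` is already given; finding `tau` privately, and doing so in one round of the shuffle model, is the paper's contribution and is not shown here. The names `clipped_sum`, `clipping_bias`, and `noisy_clipped_sum`, along with the sample data, are hypothetical.

```python
import random


def clipped_sum(xs, tau):
    """Sum of the inputs after clipping each one to at most tau."""
    return sum(min(x, tau) for x in xs)


def clipping_bias(xs, tau):
    """Amount by which clipping at tau underestimates the true sum."""
    return sum(xs) - clipped_sum(xs, tau)


def noisy_clipped_sum(xs, tau, epsilon, rng=None):
    """Clipped sum plus Laplace noise of scale tau / epsilon.

    Each user changes the clipped sum by at most tau, so the noise scales
    with tau instead of the domain bound U.
    """
    rng = rng or random.Random()
    scale = tau / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return clipped_sum(xs, tau) + noise


# Hypothetical data: values far below the domain bound U.
xs = [3, 1, 4, 1, 5, 9, 2, 6]
U, epsilon = 1024, 1.0

print("noise scale without clipping:", U / epsilon)
for tau in (4, 16):
    print("tau =", tau, "bias =", clipping_bias(xs, tau), "noise scale =", tau / epsilon)
```

With `tau` just above the largest input, the bias vanishes and the noise scale drops from U/ε to roughly max_i x_i / ε, which is the instance-specific behavior the abstract refers to.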

Philosophy

Artificial Intelligence
