Special Focus: Privacy

Personal data protection has surged to the center of the policy debate, leading to the introduction of the EU General Data Protection Regulation (GDPR). Several BCCP Fellows analyzed the issue of data privacy from different angles. This post summarizes some of their findings.

Privacy in the Sharing Economy — or “Prism is a Dancer”

Peer-to-peer resource sharing platforms have exhibited considerable growth and are expected to continue doing so in the future. Importantly, online marketplaces, such as Airbnb, have started to obliterate the boundaries between private and economic spheres. Marketing personal resources online is inherently associated with the disclosure of personal, potentially intimate, information, which raises unprecedented questions of privacy. Yet, thus far, privacy considerations have received little attention in the sharing economy literature. Applying the privacy calculus framework, BCCP Fellow Timm Teubner (TU Berlin) and Christoph M. Flath (University of Würzburg) investigate how privacy concerns and economic prospects jointly shape potential providers’ intentions to offer resources through different channels.

In their forthcoming article, they argue that an individual’s privacy concerns about disclosing detailed information on her apartment through different communication channels follow a curvilinear pattern: information is readily shared within small communities (e.g., among close circles of friends or family) as well as on large-scale platforms that are publicly accessible and address potentially any Internet user. Conversely, privacy concerns are most pronounced at intermediate levels, with medium-sized audiences and limited anonymity. The authors relate this conjecture back to providers’ assessment of their audience, suggesting that privacy concerns emerge as an intricate product of personal connection to the audience and perceptions of its size.
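Such a curvilinear hypothesis is typically tested with a quadratic specification. The following sketch uses simulated survey-style data and illustrative variable names (audience_scope, privacy_concern) that are not taken from the authors’ instrument; it only shows how a significant negative coefficient on the squared term would indicate the hypothesized inverted-U pattern.

```python
# Minimal sketch (not the authors' model): testing a curvilinear relationship
# between audience scope and privacy concerns with a quadratic regression term.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500

# Hypothetical survey responses: audience scope on a 1-7 scale, with privacy
# concern peaking at intermediate scope levels (inverted-U by construction).
audience_scope = rng.uniform(1, 7, n)
privacy_concern = -(audience_scope - 4) ** 2 + 9 + rng.normal(0, 1, n)

# Regress concern on scope and scope^2; a negative, significant coefficient
# on the squared term indicates the curvilinear (inverted-U) pattern.
X = sm.add_constant(np.column_stack([audience_scope, audience_scope ** 2]))
model = sm.OLS(privacy_concern, X).fit()
print(model.params)     # [intercept, linear term, quadratic term]
print(model.pvalues[2]) # p-value of the quadratic term
```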

The authors evaluate their hypotheses using data from an online survey, finding support for the proposed effects. They discuss these findings in view of potential strategies from the platform sponsor’s perspective, including social media integration and the provision of privacy-management tools.

The article is available here: Teubner and Flath (2019, in press). Privacy in the Sharing Economy. Journal of the Association for Information Systems.

Uncertainty is Key to Privacy-Preserving Data Analysis

Data is considered one of the most valuable assets in modern economies. Its analysis plays a key role in making sound decisions and helping businesses operate effectively. A common analysis task is counting the number of distinct elements in a large data stream, with applications in network monitoring, web analytics, and location-based services. However, large-scale data collection has a clear drawback: it can reveal personal details that should remain private. New technologies therefore need to be developed to enhance privacy.

Probabilistic data structures, i.e., algorithms that reduce the complexity of data and provide approximate answers with quantifiable uncertainty, have been identified as a privacy-enhancing technology. Due to their probabilistic nature, they inherently follow the principle of data minimization. In a recent article, BCCP Fellows Björn Scheuermann and Florian Tschorsch propose a novel and versatile privacy-enhancing technology that aims to balance privacy, accuracy, and computational efficiency. Based on cardinality estimators, they develop a method for so-called distributed counting that yields accurate estimates with a high level of privacy protection. To this end, they also contribute a novel probabilistic analysis approach that compares an attacker’s a-priori and a-posteriori knowledge to transparently assess the privacy properties of the proposed algorithms. A remaining challenge in obtaining more complex statistics is combining individual estimates, e.g., to calculate correlations, which can suffer from poor accuracy due to error propagation. To address this, the authors develop a novel method for efficient set intersection cardinality estimation.
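For intuition, the sketch below implements a plain (non-private) k-minimum-values estimator for distinct counting together with a simple intersection estimate via Jaccard similarity. It illustrates the underlying cardinality-estimation idea only and is not the authors’ P2KMV construction, which adds the privacy protections discussed above.

```python
# Minimal, non-private k-minimum-values (KMV) sketch for illustration only.
import hashlib

def kmv_sketch(items, k=256):
    """Keep the k smallest hash values of the input, mapped to the unit interval.
    (Kept simple: a streaming implementation would retain only k values at a time.)"""
    hashes = set()
    for item in items:
        h = int.from_bytes(hashlib.sha256(str(item).encode()).digest()[:8], "big")
        hashes.add(h / 2**64)
    return sorted(hashes)[:k]

def estimate_distinct(sketch, k=256):
    """Cardinality estimate: (k - 1) divided by the k-th smallest hash value."""
    if len(sketch) < k:
        return len(sketch)            # fewer than k distinct items were seen
    return (k - 1) / sketch[k - 1]

def estimate_intersection(sk_a, sk_b, k=256):
    """Estimate |A ∩ B| via the Jaccard similarity observed on the merged sketch."""
    set_a, set_b = set(sk_a), set(sk_b)
    merged = sorted(set_a | set_b)[:k]            # KMV sketch of A ∪ B
    overlap = sum(1 for v in merged if v in set_a and v in set_b)
    return overlap / len(merged) * estimate_distinct(merged, k)

a = kmv_sketch(range(0, 100_000))                 # stream A: 100k distinct items
b = kmv_sketch(range(50_000, 150_000))            # stream B: 50k overlap with A
print(round(estimate_distinct(a)), round(estimate_intersection(a, b)))
```

Because only k small hash values are ever retained or exchanged rather than the raw identifiers, such sketches already embody data minimization; the authors’ approach adds further protections on top of this basic construction.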

In two research projects, they then apply and extend their findings. Björn Scheuermann investigates a privacy-by-design approach to loyalty programs and electronic payment systems in the Goodcoin project. Florian Tschorsch explores ways to realize privacy-preserving human resource analysis with transparent and secure access logs in the project on Anonymous Predictive People Analytics.

The full paper "P2KMV: A Privacy-preserving Counting Sketch for Efficient and Accurate Set Intersection Cardinality Estimations" is published IACR Cryptology ePrint Archive.

The Price of Privacy—An Evaluation of the Economic Value of Collecting Clickstream Data

The digital economy offers a plethora of opportunities. Every business transaction and stakeholder interaction leaves a digital footprint, providing a major opportunity for the systematic analysis of corresponding data assets in order to gain managerial insight, enhance firm operations, and improve decision-making. This is the value proposition of business analytics, a fact-based management paradigm increasingly adopted by industry. In consumer-facing business processes, such as marketing and sales, excessive data gathering and processing by enterprises raise concerns about privacy and possible consumer exploitation. Intuitively, a firm may have an incentive to collect as much (consumer) data as possible, but how much data is actually needed?

In their recent article, BCCP Fellow Stefan Lessmann and colleagues Annika Baumann, Johannes Haupt, and Fabian Gebert consider a specific type of data, called clickstream data, and examine the extent to which more data gathering and higher degrees of privacy invasion actually create utility for the firm. Clickstream data comprises information about how a visitor interacts with a website. Baumann et al. identify multiple data items that firms can derive from the clickstream and categorize these according to their privacy friendliness. For example, static data items related to the device or operating system a visitor uses to access the website facilitate analysis of user behavior at an aggregated level. Baumann et al. argue that such information is less privacy-invasive than dynamic pieces of information that enable attitudinal profiling by tracing a consumer’s webpage usage behavior in the form of, e.g., page scrolling actions, mouse movements, the time spent on a given page, etc. The most privacy-sensitive group of clickstream features comprises data items that facilitate the disclosure of personally identifiable information.
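To make the categorization concrete, the following sketch groups hypothetical clickstream data items into the three tiers described above. The concrete feature names are illustrative assumptions, not the paper’s exact taxonomy.

```python
# Illustrative grouping of clickstream data items by privacy sensitivity,
# loosely following the three tiers described in the text.
CLICKSTREAM_FEATURES = {
    "static_low_sensitivity": [      # device/context data, aggregate-level analysis
        "device_type", "operating_system", "browser", "screen_resolution",
    ],
    "dynamic_behavioral": [          # enables attitudinal profiling of a visitor
        "pages_viewed", "scroll_depth", "mouse_movement_count", "time_on_page",
    ],
    "personally_identifiable": [     # can disclose personally identifiable information
        "login_id", "email_hash", "shipping_region", "payment_method",
    ],
}

def select_features(max_tier):
    """Return all feature names up to and including the given privacy tier."""
    tiers = list(CLICKSTREAM_FEATURES)
    allowed = tiers[: tiers.index(max_tier) + 1]
    return [f for tier in allowed for f in CLICKSTREAM_FEATURES[tier]]

print(select_features("dynamic_behavioral"))   # excludes the most sensitive tier
```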

Baumann et al. consider a transactional e-commerce website and assume a profit-maximizing site owner. Using the framework of cost-sensitive machine learning, they estimate the marginal utility of targeted marketing actions, such as digital coupons, when leveraging clickstream data items that increasingly invade consumer privacy. Empirical results from real-world digital marketing data suggest diminishing returns to privacy invasion: a targeting model that uses the full set of clickstream features performs only marginally better than an alternative model developed on the basis of less privacy-sensitive data items. The results shed light on the privacy-profitability trade-off and offer a new perspective on e-commerce analytics. If firms can successfully market greater respect for consumer privacy, abstaining from gathering and using certain types of privacy-sensitive data might be the best strategy.
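The flavor of this evaluation can be sketched as follows: train one targeting model on privacy-friendly features only and one on the full feature set, then compare the expected campaign profit of a simple cost-based targeting rule. The data, profit parameters (MARGIN, COUPON_COST), and model choice below are illustrative assumptions, not the study’s dataset or its exact cost-sensitive framework.

```python
# Minimal sketch with simulated data: profit comparison of a privacy-friendly
# targeting model versus one that also uses privacy-sensitive clickstream features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 20_000
X_friendly = rng.normal(size=(n, 5))    # e.g., device/OS-level features
X_sensitive = rng.normal(size=(n, 5))   # e.g., behavioral tracking features
# Simulated purchase response driven mostly by the privacy-friendly features,
# so the richer model adds little -- mirroring diminishing returns by construction.
logit = X_friendly @ [1.0, .8, .6, .4, .2] + X_sensitive @ [.1, .1, .05, 0, 0] - 2
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

MARGIN, COUPON_COST = 20.0, 2.0         # hypothetical profit per sale, cost per coupon

def expected_profit(X, y):
    """Average per-customer profit of targeting whenever expected margin exceeds cost."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    p = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    target = p * MARGIN > COUPON_COST   # simple cost-based targeting rule
    return np.mean(np.where(target, y_te * MARGIN - COUPON_COST, 0.0))

print("friendly features only:", expected_profit(X_friendly, y))
print("full clickstream:      ", expected_profit(np.hstack([X_friendly, X_sensitive]), y))
```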

The full paper “The price of privacy: An evaluation of the economic value of collecting clickstream data” is published in Business & Information Systems Engineering.