• vulnerabilities
• malware info + signatures
• malicious resources (IP, domain, NS, etc.)
• security logs and alerts
• incident data and meta-data
• impact evaluations
• defense efficiency data
• ...
9.4. A little history...
• problem of viruses on DOS
• "secure" OS, Windows NT introduced
• Virus Bulletin for quick dissemination of incidents
• CERT - limited disclosure policy
• info is shared, but no source code
• still the case today
• bugtraq mailing lists
9.5. Disclosure arguments (1/2)
• full disclosure - security view
• more info helps to make the best decision
• disclosure forces software vendors to improve quality
• expose info for a wide audience, allow them to contribute to solutions
• liability of disclosure
• does it lead to better security?
• used for exploits
9.6. Disclosure arguments (2/2)
• no disclosure - anti-virus view
• vulnerabilities should be kept secret - fear of public exploitation
• disclosure of malware code does not reveal more about the vulnerability
• better to keep "good security practices"
• partial disclosure
• warn users, but don't give full info
• mostly practiced
• used for malware
9.7. Incentives in info sharing
• interconnectivity increased exposure to attacks
• externality
• despite technology defenses, firms remain vulnerable
• attacks decrease firms' values
• interconnectivity also enables coordinated defenses
• distribute security info for enhanced protection
• federal initiative to share security info
• incentive issues
• reporting game model
9.8. Assumptions
• type of information shared - common or private
• value associated with infosharing
• type of game - Bertrand or Cournot model
• nature of products - substitute or complement
• firms' market share
9.9. Infosharing model
• based on Gordon-Loeb (Chapter 4)
• two firms
• probability of breach for firm i depends on
• its own security investment
• the other firm's investment
• the other firm's information sharing
• simplification (sketched below):
9.10. Sharing is exogenous
• defined by an external agency
• two security breach functions
• in the Nash equilibrium (NE), the security level is independent of the level of sharing
• but the security spending is lower
• firms also spend less on security (free-riding on shared information)
• information sharing is beneficial only if the marginal security spending satisfies a condition
• social welfare is increased with sharing
• if enforcement is costless
9.11. Sharing is endogenous (1/2)
• each firm selects its own sharing level
• the sharing choice is public information before the firms make their security investments
Result: in a Nash equilibrium, the firms will not share!
9.12. Sharing is endogenous (2/2)
Solution:
• regulator can measure losses
• cost: the actual losses of others
• subsidy: optimal losses at full sharing
• firm has incentive to minimize social costs
• in practice, this looks different
9.13. Sharing security info in practice
• incident reporting databases
• US DoE - national labs
• coordinated effort to share incident logs
• general resistance
• DShield - share firewall logs
9.14. DShield for log sharing
• SANS Internet Storm Center - DShield database
• over 500,000 sensors in 50 countries
• supported by SANS tuition money + volunteering incident handlers
• most active attacking IPs
• top ports scanned
• purpose:
• early warning
• detection of large-scale attacks
• improved firewall configuration
Critiques of DShield:
• uncontrolled and messy
9.15. Blacklisting
• compile a list of known bad resources
• first DNSBL was the Real-time Blackhole List (RBL), created in 1997, at first as a BGP feed by Paul Vixie
• the DNSBL form of the list was invented by Eric Ziegast, working for Vixie
9.16. Black/white/grey/red listings
• blacklist: known spammers and other badness; target is zero false positives (FP)
• redlist: alert list for domains before blacklisting (currently monitored or using privacy services)
• greylist: aggressive senders, not necessarily bad
• whitelist: list of proven legitimate domains that show up in feeds for some reason
9.17. Email handling behavior
• blacklist: block
• redlist: ?
• greylist: require proof of work
• whitelist: let pass
9.18. DNS blacklists (DNSBLs)
• uses DNS to distribute blacklisting info
• query for the client's IP address with its octets reversed, prefixed to the blacklist's DNS zone (see the sketch below)
• a TXT record can describe the reason for the listing
9.19. Blacklisting criticism
• the method for listing can be questioned
• listing dynamic addresses or ranges
• delisting might be cumbersome
• predictive lists have high FP rate
9.20. How useful is the information?
• first-hand info: observed malicious event E at location X
• add E to a blacklist
• distribute the blacklist
• if there is E' similar to E, block it
How relevant is a spammer in Malaysia for a user in Hungary?
9.21. Highly Predictive Blacklisting (HPB)
• global worst offender list (GWOL)
• compiled from a log database (DShield)
• global view on offenders
• list too big, firewall overload
• misses targeted attacks (of low global volume)
• sensor coverage might be limited
• local worst offender list (LWOL)
• from local observation at the firewall, IDS
• entirely reactive, no predictive power
• both:
• the attacker needs to "hit the bar" (enough observed volume) before being listed
Which information is better for blocking/filtering?
9.22. HPB properties
• log pre-filtering to reduce alerts
• relevance-based attack source ranking
• compute similarity of attack sources
• algorithm similar to PageRank (toy sketch after this list)
• iterative application to establish relevance
9.23. Results
9.24. Phishing
• phishing: produce a fake website to get login info
9.25. Cooperative defense against phishing
• malicious websites
• banks, registrars, auction sites, multi-player online games, online merchants
• takedown as a defense
• needs to be quick to avoid exposure (email)
• needs cooperation from hosting companies
• often outsourced to third parties - brand-protection firms
• special campaigns
• rock-phish - domain has to be taken down
• fast-flux
9.26. Phishing defense
• collect malicious URLs
• information is distributed (sent to everyone)
• organizations collect URLs and other data
• PhishTank
• Anti-Phishing Workgroup (APWG)
• CastleCops
• anti-spam companies (due to volume)
• takedown companies create their own feeds
Approach: measure takedown time based on data feeds
• all data contains "verified" URLs
9.27. Results (1/3)
• takedown company
• non-cooperation results in longer phishing site lifetime
9.28. Results (2/3)
• comparison: large banks vs. small banks and credit unions
• large banks would benefit more from cooperation!
9.29. Results (3/3)
• damage due to non-cooperation
• calculated from average losses and exposure time
• might be strongly biased
9.30. Sharing phishing URLs
• Phish-market protocol to securely share URLs
• modeling takedown companies
• interest only in URLs for a certain client circle
• share only URLs the other party is interested in
• ideal solution: a trusted third party (TTP)
• distributed crypto solution is usually much slower
9.31. Phish-market protocol
• PMP - execute single transactions
• some room for cheating
• quality of URLs
• DoS
• mapping interest of buyers
9.32. Protocol overview
9.33. Sharing phishing URLs
• Phish-market protocol to securely share URLs
• shares only what the receiver wants
• does not reveal to the provider which URLs are sent
• secure counting of the URLs given (to allow compensation of the provider later)
• does not count URLs already at the receiver [fragile]
9.34. Reading for next time
• Odlyzko, A., "Privacy, Economics and Price Discrimination on the Internet" Proceedings of the 5th international conference on Electronic commerce, 2003
optional:
• Acquisti, A., "The Economics of Privacy," draft presentation, website:
http://www.heinz.cmu.edu/~acquisti/economics-privacy.htm
10. Underground economy
10.1. What is privacy?
• Freedom to develop (Scoglio 1994)
• Aspect of human dignity (Bloustein 1964)
• Right to be left alone (Warren and Brandeis 1890)
• Ability to control own space (Sweeney 2002)
• Tort (Prosser 1960)
• Disclosure of intimate facts
• False light
• Misappropriation
• Intrusion into somebody's solitude
• Ability to control access to one's information (in/out) (Noam 1996, Samarajiva 1998 - among many others)
10.2. Anonymity or identifiability?
10.3. Anonymity or identifiability? (2)
10.4. Anonymity or identifiability? (3): the anonymity paradox
10.5. Anonymity or identifiability? (4)
10.6. Anonymity or identifiability? (5)
10.7. Anonymity or identifiability? (6): pro and contra
Anonymity:
• "being anonymous"
• Fine-tunes information sharing
• No customized observation or surveillance
• Speaking without consequences
• "Tabula rasa"
Identifiability:
• "Transparency"
• Everyone sees everything
• Meaningful functions, two-way communication
• Accountability, verifiability
• Building a reputation
10.8. Anonymity or identifiability? (7): solving the anonymity paradox
10.9. Anonymity or identifiability? (8): solving the anonymity paradox
10.10. Anonymity
• Anonymous: a subject that cannot be identified within a set of potential subjects.
• Anonymity set!
• any participant
• similar features
• many facets
• in general
• from an attacker's perspective
10.11. Anonymity (2)
• Example: the green sender is anonymous if it's anonymous within the set of potential senders.
10.12. Anonymity (3)
• Anonymity subject
• Individual anonymity
• Global anonymity
• not always the same!
• context-dependent:
• observer,
• duration of observation,
• properties,
• etc.
• Characterizing anonymity:
• numerical: probability of uncovering
• robustness: sensitivity of numerical change
• not time-dependent
10.13. Unlinkability
• unlinkability
• An attacker cannot decide if there is a connection between two entities
10.14. Unlinkability (2)
• Anonymity can build on this, ex. sender anonymity
10.15. Privacy related notions
• Anonymity: hiding who performed a given action
• Untraceability: making it difficult for an adversary to identify that a given set of actions was performed by the same subject
• Unlinkability: generalization of the two former notions: hiding information about the relationships between any items
• Unobservability: hiding the items themselves (e.g., hiding the fact that a message was sent at all)
• Pseudonymity: making use of a pseudonym instead of the real identity
10.16. Privacy metrics (1/2)
• Anonymity set: set of subjects that might have performed the observed action
• Is a good measure only if all the members of the set are equally likely to have performed the observed action
• Entropy-based measure of anonymity (standard form sketched below):
10.17. Privacy metrics (2/2)
• Entropy-based measure for unlinkability:
10.18. Why is privacy a question of economics?
• better services
• customized information
• price discrimination
• improve security
• collect more information
• controlling information collection is difficult
• users seek service - sacrifice privacy
• data leaks = massive compromises
• Questions:
• What info to collect?
• How much to collect?
10.19. Off-line vs. on-line identities
• On-line identity
• Carries information about an individual's tastes, her purchase history, etc. (e.g.: Amazon account)
• Off-line identity
• The persistent identity of an individual, as revealed by identifiers such as credit card numbers and social security numbers
• Linked on-line/off-line identities
• Different needs
• Externalities
10.20. Personal Information as an Economic Good
• Asymmetric information
• Individual does not know how, how often, for how long her information will be used
• Intrusions invisible and ubiquitous
• Externalities and moral hazard
• Ex-post
• Value uncertainty
• Keeps on affecting individual after transaction
• Imagine: lump sum vs. negative annuity
10.21. Personal Information as an Economic Good
• Context-dependent (states of the world)
• Anonymity sets
• Recombinant growth
• Sweeney (CMU): 87% of Americans uniquely identifiable from ZIP code, birth date, and sex
• Subjective
• "Willingness to pay" affected by considerations beyond traditional market reasoning
10.22. Personal Information as an Economic Good
• Private and public good aspects
• As information, it is non-rival and non-excludable
• The more other parties use that personal information, the higher the risks for original data owner
• Buy vs. sell
• Individuals value differently protection and sale of same piece of information
• Like insurance, but:
10.23. User profiling techniques
• cookies
• super-cookies
• browser fingerprinting
10.24. Web tracking: Cookies
HTTP is stateless = no info on former visits
Most websites use cookies
four parts:
• Cookie-header in the HTTP-Request
• Cookie-header in the HTTP-Response
• Cookie data placed in the user's browser
• Back-end database on the server
10.25. Web tracking: Cookies
10.26. Web tracking: Cookies
• Set-Cookie: CO=kiskacsa;
• Set-Cookie: CO=kiskacsa; Expires=Wed, 09 Jun 2021 10:18:14 GMT
• Set-Cookie: CO=kiskacsa; Domain=.google.com; Path=/; Expires=Wed, 09 Jun 2021 10:18:14 GMT; HttpOnly
10.27. Web tracking: Cookies
Using Cookies:
• Authorization
• Shopping cart
• Recommendations
• Session state (e.g. web email)
other alternatives to keep state:
• on the hosts:
• the state is stored in the application and used across several transactions
• cookies: HTTP messages contain the state
10.28. Web tracking: Super Cookies
• users are aware of and concerned about cookies
• new methods to follow users
• super-cookies - much more dangerous than cookies
• Adobe Flash cookies
• DOM storage
10.29. Web tracking: Browser Fingerprinting
• browser fingerprinting: combine many individually harmless browser attributes into a near-unique identifier, no cookie needed (crude sketch below)
10.30. Inducing customers to try new goods
• Cookies-like technology vs. anonymizing technology
• Questions
• Will cookies-like technology bring more profits?
• Will buyers use the anonymizing technology?
• Results
• No larger profits from cookies-like technology...
• ...unless something more is offered
• Enhanced services based on gathered information
• Anonymizing technologies could make society worse off
10.31. Collecting and controlling information
• info collection produces
• indirect externalities
• some consumers are pushed out of the market
• cross-market use of information - sell user info
• direct externality
• spam and direct marketing
• controlling personal information
• no control - free collection
• user controls and releases
10.32. The value of private information: theory (1/2)
• Individuals should be able to control the dissemination of their private information: treat private information as (intellectual) property
• Model as coalitional games: the core and the Shapley value (a standard definition is sketched below)
• Scenario 1: Marketing survey
• Players are a vendor and users
• Users choose between product samples
• Vendor adopts the majority choice
• Utility: the vendor's payoff equals the size of the majority group; each user's payoff is binary
10.33. The value of private information: theory (2/2)
• Scenario 2: Recommendation systems
• Each player knows a set of items
• The total payoff for the coalition is the total number of distinct items known by its members
• Scenario 3: Collaborative filtering
• N players and K items
• N × K preference matrix: 1 (likes), -1 (dislikes), 0 (doesn't know)
• A player gets advice on an item from another player, who aggregates the opinions of the remaining players; if the advice agrees with the player's own opinion the payoff is 1, otherwise -1
10.34. The value of private information: practice
• Cambridge campus experiment
• Pretended tracking of location via cellphones
• Question: How much money do you want to get?
• Fixed number of people can participate
• Second-price (reverse) auction: the accepted participants are each paid the bid of the first bidder who did not make the cut (toy sketch below)
10.35. The value of private information: practice
• Results:
• People bid round numbers
• People increase their bids when a commercial interest in the data is mentioned
• Those who travel outside Cambridge more than once a week, or who communicate with a partner, ask for more for their privacy
10.36. Problem with anonymity
• Anonymity cannot be provided by itself
• Each node provides and consumes anonymity
• Lazy users want to consume anonymity but provide little, to reduce their costs
• Trust that others also seek anonymity
• Trust not well distributed → trust bottleneck attacks
• Trust not properly verified → dishonest nodes can sniff traffic
• Many users do not realize they want anonymity protection
• Users do not know the value of their privacy
• Results in an anonymity equilibrium
10.37. Incentives for anonymity
• Need incentive for participants to offer and use anonymity services
• System must attract cover traffic before it can attract high-sensitivity users
• Weak security parameters may produce stronger anonymity by bringing more users
• Simplify usability to increase user base
• Optimal level of free-riding
• Accept the cost of offering anonymity service to gain cover traffic
• Usage fee
• Fee depends on the sensitivity of the nodes' traffic
• Reputation
10.38. Anonymity systems take-away
In anonymity systems usability, efficiency, reliability and cost become security objectives as they affect the size of the user base, thus the anonymity set.
10.39. Price discrimination
• different people have different types (valuations)
• separate the types by gathering information
• offer them essentially the same service at different prices
• increases social welfare (tiny numeric example below)
10.40. Price Discrimination vs. Privacy
• Internet provides privacy: "On the Internet, nobody knows you're a dog" (Peter Steiner, The New Yorker, 1993)
• Internet provides a loss of privacy (so far)
• The push is from the private sector:
• Price discrimination to increase profit
• Learn more about customers' willingness to pay
• General resistance from the public
• Dilemma: Price discrimination is not fair, but might increase system efficiency (share the cost to produce something, but not equally)
10.41. Price Discrimination vs. Privacy
• Many examples:
• Versioning: hardcover vs. paperback books
• Scholarly publishing: price for a library depends on its size (e.g., JSTOR)
• How to hide it:
• Do not charge in cash: frequent flier miles, discount by purchase history
• Use bundles: site licensing (might increase usage in addition)
• Summary:
• Incentives to do price discrimination and reduce privacy
• Public resistance → firms hide these actions
• Governments must be involved, but no good strategy for them (since price discrimination improves general welfare)
10.42. Who should protect your privacy?
• Self-regulation?
• Individual responsibility?
• Policy/legislation?
• EU vs. US
• Samuelson 2003: The social costs of confusing privacy policies
• Hu, Smith, Tang 2004
10.43. Solutions
• regulation
• on collecting user information
• free markets will regulate info collection (Posner 1978, Stigler 1980)
• technology
• private browsing
• proxy servers
• user education
• privacy concerns (e.g. for cookies)
10.44. Reading for next time
• The Privacy Jungle: On the Market for Data Protection in Social Networks [Bonneau and Preibusch, WEIS 2009]
optional:
• Privacy by ReDesign: Alleviating Privacy Concerns for Third-Party Apps [Xu, Wang, and Grossklags, ICIS 2012]
11. Economics of privacy
11.1. Social networks and privacy
• social networks
• information sharing in relevant circles
• maybe unwillingly
• users have poor control over their private information
• controls
• privacy-enhancing technologies (PET)
• user choice and control
• not a selling point
• diverse privacy policies
11.2. Economics issues in privacy
• implementation errors
• interfaces to improve privacy controls
• systemic issues of privacy
• problems from the social graph
• enabler for phishing attacks
• user valuation: the privacy paradox
11.3. Survey of social networking sites
• privacy practices of 45 social networking sites
• full view - not only the big sites
• privacy problems
• categories
• general-purpose sites (Facebook, Orkut, etc.)
• only English-language sites
• niche sites
• business networking (LinkedIn)
• media recommendation (Last.fm, Flixster)
• reunion (classmates.com)
• activity-focused (CouchSurfing)
• privacy-specific (Imbee, Kaioo)
11.4. Evaluation methodology
• data collection
• general info about the site
• user info requested at signup
• technically consistent
11.5. Promotion techniques
11.6. Data collection during sign-up
• too much data gathered
11.7. Privacy controls
• profile visible by default
• privacy controls
11.8. Invasive features
• features enable to discover private data
11.9. Comparison of privacy policies
• privacy policies are not comprehensive
• most general-purpose sites used it
• users not encouraged to read it
• many features are not properly implemented
• problems with access
• readability
11.10. Privacy affecting factors
• positive
• functionality
• size
• age
• growth
• negative
• promotion of privacy
11.11. Summary of the survey
• not an oligopoly, but thousands of niche sites
• users and social networking sites care about privacy
• behavioral economics: users do not make rational decisions
• most sites do not implement adequate privacy
• need for non-textual privacy policies
11.12. User control over private information
• mostly in social networking sites
• customized ads are more effective
• users might also consider them more intrusive
• solution:
• give users more privacy control
• side-effect:
• users get more aware of privacy
11.13. Facebook privacy experiment
• business model
• measurement study of social apps on Facebook and qualitative observations
• field study on permissions dialogues
• displaying ads with different settings
11.14. Third-party apps on FB
• Unprecedented scale of Facebook network
• 901 million monthly active Facebook users at the end of March 2012
• 488 million monthly active users who used Facebook mobile products in March 2012
• Development API for third-party apps and other exogenously developed content in May 2007
• Vast amount of social network and personal data at developers' disposal (privacy issues)
11.15. A two-sided market
11.16. Monetizing Facebook's app market
• Payments and other fees
• Negotiated fee from platform developers when users make purchases using Facebook's payments infrastructure
• Mandated use of payments infrastructure for game apps (i.e., fees related to payments are dominantly from games)
• Zynga: the most prominent and lucrative example:
• Twelve percent of Facebook's total revenue in 2011
• Sales of virtual goods and direct advertising
• Facebook retains a fee of up to 30% of the face value of user purchases in Zynga's games on the Facebook Platform
11.17. Facebook's and apps' informational market power
• Access to developer API (for developers) and apps (for users) is free and relatively unrestricted
• Eagerness for growth likely leads to exploitation of asymmetric information
• Users are imperfectly informed and cannot explore privacy and security consequences of apps (that may occur over time) without harm
• Companies have no incentive to reveal revenue streams that exploit privacy and lead to potential dissatisfaction with product
• Consequences:
• Easy-to-observe monetary costs are low or zero
• Hard-to-observe privacy impact/cost may be high
11.18. Frictionless adding of apps until 2010
• Facebook granted third-party apps broad access to users' (and their friends') profiles without a granular permissions system
• Study by Felt and Evans (2008) on 150 popular apps:
• A narrower set of permissions would have sufficed to accomplish the apps' (subjectively evaluated) goals → apps are vastly over-privileged
11.19. Privacy and security issues in the past
• Study by Wall Street Journal
• Numerous third-party apps were found extracting identifiable user information
• Sharing this bounty with advertising companies
11.20. Regulatory intervention
• For example: Investigation by Canada's Privacy Commissioner started in 2008
• Facebook introduced the currently deployed system of granular permissions in September 2010
11.21. Measurement study
• Gathered comprehensive list of third-party apps from AppData (appdata.com)
• Apps with more than 1000 monthly active users
• Recorded profile (app ID, name, active users, rating and reviews, category, app URL redirection)
• Merged data for four samples from AppData
• List included 29,020 apps of which 27,404 apps (i.e., 94.43% of our initial dataset) were accessible
• "Disappearance" of more than 1,500 apps indicates high volatility on the market
• Google Chrome Extension used for data collection
11.22. Five app categories
• 9,411 apps (34 % of the total sample)
• Presence of a typical authentication and authorization dialog page