• vulnerabilities
• malware info + signatures
• malicious resources (IP, domain, NS, etc.)
• security logs and alerts
• incident data and meta-data
• impact evaluations
• defense efficiency data
• ...
9.4. A little history...
• problem of viruses on DOS
• "secure" OS, Windows NT introduced
• Virus Bulletin for quick dissemination of incidents
• CERT - limited disclosure policy
• info is shared, but no source code
• still the case today
• bugtraq mailing lists
9.5. Disclosure arguments (1/2)
• full disclosure - security view
• more info helps to make the best decision
• disclosure forces software vendors to improve quality
• expose info for a wide audience, allow them to contribute to solutions
• liability of disclosure
• does it lead to better security?
• used for exploits
9.6. Disclosure arguments (2/2)
• no disclosure - anti-virus view
• vulnerabilities should be kept secret - fear of public exploitation
• disclosure of malware code does not reveal more about the vulnerability
• better to keep "good security practices"
• partial disclosure
• warn users, but don't give full info
• mostly practiced
• used for malware
9.7. Incentives in info sharing
• interconnectivity increased exposure to attacks
• externality
• despite technology defenses, firms remain vulnerable
• attacks decrease firms' values
• interconnectivity also enables coordinated defenses
• distribute security info for enhanced protection
• federal initiative to share security info
• incentive issues
• reporting game model
9.8. Assumptions
• type of information shared - common or private
• value associated with infosharing
• type of game - Bertrand or Cournot model
• nature of products - substitute or complement
• firms' market share
9.9. Infosharing model
• based on Gordon-Loeb (Chapter 4)
• two firms
• probability of breach for firm i depends on
• its own security investment
• the other firm's investment
• the other firm's information sharing
• simplification (sketched below):
9.10. Sharing is exogenous
• defined by an external agency
• two security breach functions
• in the Nash equilibrium (NE), the security level is independent of the level of sharing
• but the security spending is lower
• firms also spend less on security (free-riding on shared information)
• information sharing is beneficial only if the marginal security spending satisfies a condition
• social welfare is increased with sharing
• if enforcement is costless
9.11. Sharing is endogenous (1/2)
• each firm selects its own sharing level
• the sharing choice is public information before the firms make their security investments
Result: in a Nash equilibrium, the firms will not share!
9.12. Sharing is endogenous (2/2)
Solution:
• regulator can measure losses
• cost: the actual losses of others
• subsidy: optimal losses at full sharing
• firm has incentive to minimize social costs
• in practice, this looks different
9.13. Sharing security info in practice
• incident reporting databases
• US DoE - national labs
• coordinated effort to share incident logs
• general resistance
• DShield - share firewall logs
9.14. DShield for log sharing
• SANS Internet Storm Center - DShield database
• over 500,000 sensors in 50 countries
• supported by SANS tuition money + volunteering incident handlers
• most active attacking IPs
• top ports scanned
• purpose:
• early warning
• detection of large-scale attacks
• improved firewall configuration
Critiques of DShield:
• uncontrolled and messy
9.15. Blacklisting
• compile a list of known bad resources
• first DNSBL was the Real-time Blackhole List (RBL), created in 1997, at first as a BGP feed by Paul Vixie
• the DNSBL form of the list was invented by Eric Ziegast, working for Vixie
9.16. Black/white/grey/red listings
• blacklist: known spammers and other badness; target is zero false positives (FP)
• redlist: alert list for domains before blacklisting (currently monitored or using privacy services)
• greylist: aggressive senders, not necessarily bad
• whitelist: list of proven legitimate domains that show up in feeds for some reason
9.17. Email handling behavior
• blacklist: block
• redlist: ?
• greylist: require proof of work
• whitelist: let pass
9.18. DNS blacklists (DNSBLs)
• uses DNS to distribute blacklisting info
• query for the client's IP address with its octets reversed, prefixed to the blacklist's DNS zone (see the sketch below)
• a TXT record can describe the reason for the listing
9.19. Blacklisting criticism
• the method for listing can be questioned
• listing dynamic addresses or ranges
• delisting might be cumbersome
• predictive lists have high FP rate
9.20. How useful is the information?
• first-hand info: observed malicious event E at location X
• add E to a blacklist
• distribute the blacklist
• if there is E' similar to E, block it
How relevant is a spammer in Malaysia for a user in Hungary?
9.21. Highly Predictive Blacklisting (HPB)
• global worst offender list (GWOL)
• compiled from a log database (DShield)
• global view on offenders
• list too big, firewall overload
• misses targeted attacks (of low global volume)
• sensor coverage might be limited
• local worst offender list (LWOL)
• from local observation at the firewall, IDS
• entirely reactive, no predictive power
• both:
• the attacker needs to "hit the bar" (enough observed volume) before being listed
Which information is better for blocking/filtering?
9.22. HPB properties
• log pre-filtering to reduce alerts
• relevance-based attack source ranking
• compute similarity of attack sources
• algorithm similar to PageRank (toy sketch after this list)
• iterative application to establish relevance
9.23. Results
9.24. Phishing
• phishing: produce a fake website to get login info
9.25. Cooperative defense against phishing
• malicious websites
• banks, registrars, auction sites, multi-player online games, online merchants
• takedown as a defense
• needs to be quick to avoid exposure (email)
• needs cooperation from hosting companies
• often outsourced to third parties - brand-protection firms
• special campaigns
• rock-phish - domain has to be taken down
• fast-flux
9.26. Phishing defense
• collect malicious URLs
• information is distributed (sent to everyone)
• organizations collect URLs and other data
• PhishTank
• Anti-Phishing Workgroup (APWG)
• CastleCops
• anti-spam companies (due to volume)
• takedown companies create their own feeds
Approach: measure takedown time based on data feeds
• all data contains "verified" URLs
9.27. Results (1/3)
• takedown company
• non-cooperation results in longer phishing site lifetime
9.28. Results (2/3)
• comparison: large banks vs. small banks and credit unions
• large banks would benefit more from cooperation!
9.29. Results (3/3)
• damage due to non-cooperation
• calculated from average losses and exposure time
• might be strongly biased
9.30. Sharing phishing URLs
• Phish-market protocol to securely share URLs
• modeling takedown companies
• interest only in URLs for a certain client circle
• share only URLs the other party is interested in
• ideal solution: a trusted third party (TTP)
• distributed crypto solution is usually much slower
9.31. Phish-market protocol
• PMP - execute single transactions
• some room for cheating
• quality of URLs
• DoS
• mapping interest of buyers
9.32. Protocol overview
9.33. Sharing phishing URLs
• Phish-market protocol to securely share URLs
• shares only what the receiver wants
• does not reveal to the provider which URLs are sent
• secure counting of the URLs given (to allow compensation of the provider later)
• does not count URLs already at the receiver [fragile]
9.34. Reading for next time
• Odlyzko, A., "Privacy, Economics and Price Discrimination on the Internet" Proceedings of the 5th international conference on Electronic commerce, 2003
optional:
• Acquisti, A., "The Economics of Privacy," draft presentation, website:
http://www.heinz.cmu.edu/~acquisti/economics-privacy.htm
10. Underground economy
10.1. What is privacy?
• Freedom to develop (Scoglio 1994)
• Aspect of human dignity (Bloustein 1964)
• Right to be left alone (Warren and Brandeis 1890)
• Ability to control own space (Sweeney 2002)
• Tort (Prosser 1960)
• Disclosure of intimate facts
• False light
• Misappropriation
• Intrusion into somebody's solitude
• Ability to control access to one's information (in/out) (Noam 1996, Samarajiva 1998 - among many others)
10.2. Anonymity or identifiability?
10.3. Anonymity or identifiability? (2)
10.4. Anonymity or identifiability? (3): the anonymity paradox
10.5. Anonymity or identifiability? (4)
10.6. Anonymity or identifiability? (5)
10.7. Anonymity or identifiability? (6): pro and contra
Anonymity:
• "being anonymous"
• Fine-tunes information sharing
• No customized observation or surveillance
• Speaking without consequences
• "Tabula rasa"
Identifiability:
• "Transparency"
• Everyone sees everything
• Meaningful functions, two-way communication
• Accountability, verifiability
• Building a reputation
10.8. Anonymity or identifiability? (7): solving the anonymity paradox
10.9. Anonymity or identifiability? (8): solving the anonymity paradox
10.10. Anonymity
• Anonymous: a subject that cannot be identified within a set of potential subjects.
• Anonymity set!
• any participant
• similar features
• many facets
• in general
• from an attacker's perspective
10.11. Anonymity (2)
• Example: the green sender is anonymous if it's anonymous within the set of potential senders.
10.12. Anonymity (3)
• Anonymity subject
• Individual anonymity
• Global anonymity
• not always the same!
• context-dependent:
• observer,
• duration of observation,
• properties,
• etc.
• Characterizing anonymity:
• numerical: probability of uncovering
• robustness: sensitivity of numerical change
• not time-dependent
10.13. Unlinkability
• unlinkability
• An attacker cannot decide if there is a connection between two entities
10.14. Unlinkability (2)
• Anonymity can build on this, ex. sender anonymity
10.15. Privacy related notions
• Anonymity: hiding who performed a given action
• Untraceability: making it difficult for an adversary to identify that a given set of actions was performed by the same subject
• Unlinkability: generalization of the two former notions: hiding information about the relationships between any items
• Unobservability: hiding the items themselves (e.g., hiding the fact that a message was sent at all)
• Pseudonymity: making use of a pseudonym instead of the real identity
10.16. Privacy metrics (1/2)
• Anonymity set: set of subjects that might have performed the observed action
• Is a good measure only if all the members of the set are equally likely to have performed the observed action
• Entropy-based measure of anonymity (standard form sketched below):
10.17. Privacy metrics (2/2)
• Entropy-based measure for unlinkability:
10.18. Why is privacy a question of economics?
• better services
• customized information
• price discrimination
• improve security
• collect more information
• controlling information collection is difficult
• users seek service - sacrifice privacy
• data leaks = massive compromises
• Questions:
• What info to collect?
• How much to collect?
10.19. Off-line vs. on-line identities
• On-line identity
• Carries information about an individual's tastes, her purchase history, etc. (e.g.: Amazon account)
• Off-line identity
• The persistent identity of an individual, as revealed by identifiers such as credit card numbers and social security numbers
• Linked on-line/off-line identities
• Different needs
• Externalities
10.20. Personal Information as an Economic Good
• Asymmetric information
• Individual does not know how, how often, for how long her information will be used
• Intrusions invisible and ubiquitous
• Externalities and moral hazard
• Ex-post
• Value uncertainty
• Keeps on affecting individual after transaction
• Imagine: lump sum vs. negative annuity
10.21. Personal Information as an Economic Good
• Context-dependent (states of the world)
• Anonymity sets
• Recombinant growth
• Sweeney (CMU): 87% of Americans uniquely identifiable from ZIP code, birth date, and sex
• Subjective
• "Willingness to pay" affected by considerations beyond traditional market reasoning
10.22. Personal Information as an Economic Good
• Private and public good aspects
• As information, it is non-rival and non-excludable
• The more other parties use that personal information, the higher the risks for original data owner
• Buy vs. sell
• Individuals value differently protection and sale of same piece of information
• Like insurance, but:
10.23. User profiling techniques
• cookies
• super-cookies
• browser fingerprinting
10.24. Web tracking: Cookies
HTTP is stateless = no info on former visits
Most websites use cookies
four parts:
• Cookie-header in the HTTP-Request
• Cookie-header in the HTTP-Response
• Cookie data placed in the user's browser
• Back-end database on the server
10.25. Web tracking: Cookies
10.26. Web tracking: Cookies
• Set-Cookie: CO=kiskacsa;
• Set-Cookie: CO=kiskacsa; Expires=Wed, 09 Jun 2021 10:18:14 GMT
• Set-Cookie: CO=kiskacsa; Domain=.google.com; Path=/; Expires=Wed, 09 Jun 2021 10:18:14 GMT; HttpOnly
10.27. Web tracking: Cookies
Using Cookies:
• Authorization
• Shopping cart
• Recommendations
• Session state (e.g. web email)
other alternatives to keep state:
• on the hosts:
• the state is stored in the application and used across several transactions
• cookies: HTTP messages contain the state
10.28. Web tracking: Super Cookies
• users are aware of and concerned about cookies
• new methods to follow users
• super-cookies - much more dangerous than cookies
• Adobe Flash cookies
• DOM storage
10.29. Web tracking: Browser Fingerprinting
• browser fingerprinting: combine many individually harmless browser attributes into a near-unique identifier, no cookie needed (crude sketch below)
10.30. Inducing customers to try new goods
• Cookies-like technology vs. anonymizing technology
• Questions
• Will cookies-like technology bring more profits?
• Will buyers use the anonymizing technology?
• Results
• No larger profits from cookies-like technology...
• ...unless something more is offered
• Enhanced services based on gathered information
• Anonymizing technologies could make society worse off
10.31. Collecting and controlling information
• info collection produces
• indirect externalities
• some consumers are pushed out of the market
• cross-market use of information - sell user info
• direct externality
• spam and direct marketing
• controlling personal information
• no control - free collection
• user controls and releases
10.32. The value of private information: theory (1/2)
• Individuals should be able to control the dissemination of their private information: treat private information as (intellectual) property
• Model as coalitional games: the core and the Shapley value (a standard definition is sketched below)
• Scenario 1: Marketing survey
• Players are a vendor and users
• Users choose between product samples
• Vendor adopts the majority choice
• Utility: the vendor's payoff equals the size of the majority group; each user's payoff is binary
10.33. The value of private information: theory (2/2)
• Scenario 2: Recommendation systems
• Each player knows a set of items
• The total payoff for the coalition is the total number of distinct items known by its members
• Scenario 3: Collaborative filtering
• N players and K items
• N × K preference matrix: 1 (likes), -1 (dislikes), 0 (doesn't know)
• A player gets advice on an item from another player, who aggregates the opinions of the remaining players; if the advice agrees with the player's own opinion the payoff is 1, otherwise -1
10.34. The value of private information: practice
• Cambridge campus experiment
• Pretended tracking of location via cellphones
• Question: How much money do you want to get?
• Fixed number of people can participate
• Second-price (reverse) auction: the accepted participants are each paid the bid of the first bidder who did not make the cut (toy sketch below)
10.35. The value of private information: practice
• Results:
• People bid round numbers
• People increase their bids when a commercial interest in the data is mentioned
• Those who travel outside Cambridge more than once a week, or who communicate with a partner, ask for more for their privacy
10.36. Problem with anonymity
• Anonymity cannot be provided by itself
• Each node provides and consumes anonymity
• Lazy users want to consume anonymity but provide little, to reduce their costs
• Trust that others also seek anonymity
• Trust not well distributed → trust bottleneck attacks
• Trust not properly verified → dishonest nodes can sniff traffic
• Many users do not realize they want anonymity protection
• Users do not know the value of their privacy
• Results in an anonymity equilibrium
10.37. Incentives for anonymity
• Need incentive for participants to offer and use anonymity services
• System must attract cover traffic before it can attract high-sensitivity users
• Weak security parameters may produce stronger anonymity by bringing more users
• Simplify usability to increase user base
• Optimal level of free-riding
• Accept the cost of offering anonymity service to gain cover traffic
• Usage fee
• Fee depends on the sensitivity of the nodes' traffic
• Reputation
10.38. Anonymity systems take-away
In anonymity systems usability, efficiency, reliability and cost become security objectives as they affect the size of the user base, thus the anonymity set.
10.39. Price discrimination
• different people have different types (valuations)
• separate the types by gathering information
• offer them essentially the same service at different prices
• increases social welfare (tiny numeric example below)
10.40. Price Discrimination vs. Privacy
• Internet provides privacy: "On the Internet, nobody knows you're a dog" (Peter Steiner, The New Yorker, 1993)
• Internet provides a loss of privacy (so far)
• The push is from the private sector:
• Price discrimination to increase profit
• Learn more about customers' willingness to pay
• General resistance from the public
• Dilemma: Price discrimination is not fair, but might increase system efficiency (share the cost to produce something, but not equally)
10.41. Price Discrimination vs. Privacy
• Many examples:
• Versioning: hardcover vs. paperback books
• Scholarly publishing: price for a library depends on its size (e.g., JSTOR)
• How to hide it:
• Do not charge in cash: frequent flier miles, discount by purchase history
• Use bundles: site licensing (might increase usage in addition)
• Summary:
• Incentives to do price discrimination and reduce privacy
• Public resistance → firms hide these actions
• Governments must be involved, but no good strategy for them (since price discrimination improves general welfare)
10.42. Who should protect your privacy?
• Self-regulation?
• Individual responsibility?
• Policy/legislation?
• EU vs. US
• Samuelson 2003: The social costs of confusing privacy policies
• Hu, Smith, Tang 2004
10.43. Solutions
• regulation
• on collecting user information
• free markets will regulate info collection (Posner 1978, Stigler 1980)
• technology
• private browsing
• proxy servers
• user education
• privacy concerns (e.g. for cookies)
10.44. Reading for next time
• The Privacy Jungle: On the Market for Data Protection in Social Networks [Bonneau and Preibusch, WEIS 2009]
optional:
• Privacy by ReDesign: Alleviating Privacy Concerns for Third-Party Apps [Xu, Wang, and Grossklags, ICIS 2012]
11. Economics of privacy
11.1. Social networks and privacy
• social networks
• information sharing in relevant circles
• maybe unwillingly
• users have poor control over their private information
• controls
• privacy-enhancing technologies (PET)
• user choice and control
• not a selling point
• diverse privacy policies
11.2. Economics issues in privacy
• implementation errors
• interfaces to improve privacy controls
• systemic issues of privacy
• problems from the social graph
• enabler for phishing attacks
• user valuation: the privacy paradox
11.3. Survey of social networking sites
• privacy practices of 45 social networking sites
• full view - not only the big sites
• privacy problems
• categories
• general-purpose sites (Facebook, Orkut, etc.)
• only English-language sites
• niche sites
• business networking (LinkedIn)
• media recommendation (Last.fm, Flixster)
• reunion (classmates.com)
• activity-focused (CouchSurfing)
• privacy-specific (Imbee, Kaioo)
11.4. Evaluation methodology
• data collection
• general info about the site
• user info requested at signup
• technically consistent
11.5. Promotion techniques
11.6. Data collection during sign-up
• too much data gathered
11.7. Privacy controls
• profile visible by default
• privacy controls
11.8. Invasive features
• features enable to discover private data
11.9. Comparison of privacy policies
• privacy policies are not comprehensive
• most general-purpose sites used it
• users not encouraged to read it
• many features are not properly implemented
• problems with access
• readability
11.10. Privacy affecting factors
• positive
• functionality
• size
• age
• growth
• negative
• promotion of privacy
11.11. Summary of the survey
• not an oligopoly, but thousands of niche sites
• users and social networking sites care about privacy
• behavioral economics: users do not make rational decisions
• most sites do not implement adequate privacy
• need for non-textual privacy policies
11.12. User control over private information
• mostly in social networking sites
• customized ads are more effective
• users might also consider them more intrusive
• solution:
• give users more privacy control
• side-effect:
• users get more aware of privacy
11.13. Facebook privacy experiment
• business model
• measurement study of social apps on Facebook and qualitative observations
• field study on permissions dialogues
• displaying ads with different settings
11.14. Third-party apps on FB
• Unprecedented scale of Facebook network
• 901 million monthly active Facebook users at the end of March 2012
• 488 million monthly active users who used Facebook mobile products in March 2012
• Development API for third-party apps and other exogenously developed content in May 2007
• Vast amount of social network and personal data at developers' disposal (privacy issues)
11.15. A two-sided market
11.16. Monetizing Facebook's app market
• Payments and other fees
• Negotiated fee from platform developers when users make purchases using Facebook's payments infrastructure
• Mandated use of payments infrastructure for game apps (i.e., fees related to payments are dominantly from games)
• Zynga: the most prominent and lucrative example:
• Twelve percent of Facebook's total revenue in 2011
• Sales of virtual goods and direct advertising
• Facebook retains a fee of up to 30% of the face value of user purchases in Zynga's games on the Facebook Platform
11.17. Facebook's and apps' informational market power
• Access to developer API (for developers) and apps (for users) is free and relatively unrestricted
• Eagerness for growth likely leads to exploitation of asymmetric information
• Users are imperfectly informed and cannot explore privacy and security consequences of apps (that may occur over time) without harm
• Companies have no incentive to reveal revenue streams that exploit privacy and lead to potential dissatisfaction with product
• Consequences:
• Easy-to-observe monetary costs are low or zero
• Hard-to-observe privacy impact/cost may be high
11.18. Frictionless adding of apps until 2010
• Facebook granted third-party apps broad access to users' (and their friends') profiles without a granular permissions system
• Study by Felt and Evans (2008) on 150 popular apps:
• A narrower set of permissions would have sufficed to accomplish the apps' (subjectively evaluated) goals → apps are vastly over-privileged
11.19. Privacy and security issues in the past
• Study by Wall Street Journal
• Numerous third-party apps were found extracting identifiable user information
• Sharing this bounty with advertising companies
11.20. Regulatory intervention
• For example: Investigation by Canada's Privacy Commissioner started in 2008
• Facebook introduced the currently deployed system of granular permissions in September 2010
11.21. Measurement study
• Gathered comprehensive list of third-party apps from AppData (appdata.com)
• Apps with more than 1000 monthly active users
• Recorded profile (app ID, name, active users, rating and reviews, category, app URL redirection)
• Merged data for four samples from AppData
• List included 29,020 apps of which 27,404 apps (i.e., 94.43% of our initial dataset) were accessible
• "Disappearance" of more than 1,500 apps indicates high volatility on the market
• Google Chrome Extension used for data collection
11.22. Five app categories
• 9,411 apps (34 % of the total sample)
• Presence of a typical authentication and authorization dialog page