Clip Man

daniele


Daniel Einspanjer's journal

Data warehousing, ETL, BI, and general hackery


Entries by tag: privacy

A post about personal data
Mitchell Baker, the Chairperson of the Mozilla Foundation and Mozilla Corporation, recently posted a series of blog entries about data:

  • Thinking About Data
  • Framework for discussing “data”
  • Why focus on data?
  • Data Relating to People
  • Data — getting to the point

    This discussion is something I've been looking forward to seeing at Mozilla since I started back in March. In the work that I do, I make every effort to safeguard data and make sure that what I process and store can't turn around and bite me later.

    One thing I felt could use a different approach is how we list out the different forms of personal data that people are likely to generate or come across on the web.

    To me, the best way to categorize these types of personal data is with a matrix. I've created one below that has the origin of the data as the rows and the classification of the data as the columns. Inside each cell, I've placed a few examples that I think represent that intersection of origin and classification.

    I'd encourage anyone interested in this to comment on other origins, classifications, or examples of personal data. The more we have defined, the easier it will be to make sure that our discussions about data don't leave anything out.

    I've also saved this document on docs.google.com (Personal data types matrix).
    If anyone wishes to collaborate with me on enhancing it, please just let me know in the comments and I'll send you a collaboration invitation.


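    Origin      | Identifying                                                      | Characterizing
                | Potential¹               | Definite                              | Self                     | Relationships
    ------------+--------------------------+---------------------------------------+--------------------------+-------------------------
    Elicited²   | Name/Address (partial)   | Contact information (comprehensive)   | Demographics             | Friend invitations
                | IP address               | SSN                                   | Location                 | Friends list
                |                          | E-mail address³                       | Interests                | Friends watched/followed
                |                          | Blog URL                              | Website filters          |
                |                          | Credit card information               |                          |
    ------------+--------------------------+---------------------------------------+--------------------------+-------------------------
    Published   | Blog posts⁴              | PGP key                               | Interests                | Friends list
                |                          | Contact information (comprehensive)   | Blog posts⁵              |
                |                          |                                       | Wishlists                |
    ------------+--------------------------+---------------------------------------+--------------------------+-------------------------
    Harvested   | cookies                  |                                       | Extrapolated interests   | People watched/followed
                | Personal search terms    |                                       | clickstream in site      |
                |                          |                                       | Web history              |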

    ¹ Multiple pieces of potentially identifying information are usually needed to make a definite identification or to contact someone directly.

    ² Data may be elicited as a requirement for interaction with the data collector (e.g. an IP address required to view a web page, or shipping information required for a purchase), or it may be optional (e.g. a blog comment form requesting your URL).

    ³ An e-mail address is a definite identification because it immediately allows a person to contact you directly.

    ⁴ Blog posts talking about who you are or where you live are potentially identifying.

    ⁵ Blog posts talking about topics that interest you or things you do are characterizing.
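
    For anyone who would rather poke at the matrix in code than in a spreadsheet, here is a minimal sketch of one way it could be represented: a mapping from (origin, category, subcategory) to example data types. The names `MATRIX` and `cells_for` are hypothetical, only a handful of examples are included, and the cell each example is placed in reflects one reading of the matrix rather than anything authoritative.

```python
# Hypothetical encoding of part of the matrix:
# (origin, category, subcategory) -> example data types.
# The cell placements are an assumption based on reading the matrix and footnotes.
MATRIX = {
    ("elicited", "identifying", "potential"): ["Name/Address (partial)", "IP address"],
    ("elicited", "identifying", "definite"): ["E-mail address", "SSN"],
    ("published", "identifying", "potential"): ["Blog posts"],
    ("published", "characterizing", "self"): ["Blog posts", "Interests", "Wishlists"],
    ("harvested", "characterizing", "self"): ["Web history"],
    ("harvested", "characterizing", "relationships"): ["People watched/followed"],
}


def cells_for(example):
    """Return every (origin, category, subcategory) cell that lists `example`."""
    return sorted(cell for cell, examples in MATRIX.items() if example in examples)


if __name__ == "__main__":
    # The same kind of data can land in more than one cell (see footnotes 4 and 5).
    for cell in cells_for("Blog posts"):
        print(cell)
```

    Keying on the full (origin, category, subcategory) tuple keeps each cell independent, so adding a new origin or classification is just a matter of adding new keys.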

