Daniel Einspanjer's journal

Data warehousing, ETL, BI, and general hackery

Entries by tag: privacy

A post about personal data
Mitchell Baker, the Chairperson of the Mozilla Foundation and Mozilla Corporation, recently posted a series of blog entries about data:

  • Thinking About Data
  • Framework for discussing “data”
  • Why focus on data?
  • Data Relating to People
  • Data — getting to the point

    This discussion is something I've been looking forward to seeing at Mozilla since I started back in March. In the work that I do, I make every effort to safeguard data and make sure that what I process and store can't turn around and bite me later.

    One thing I felt could use a different approach is the listing of the different forms of personal data that people are likely to generate or come across on the web.

    To me, the best way to categorize these types of personal data is with a matrix. I've created one below that has the origin of the data as the X axis and the classification of the data as the Y axis. Inside each cell, I've placed a few examples that I think represent that intersection of data.

    I'd encourage anyone interested in this to comment on other origins, classifications, or examples of personal data. The more we have defined, the easier it will be to make sure that our discussions about data don't leave anything out.

    I've also saved this document on (Personal data types matrix).
    If anyone wishes to collaborate with me on enhancing it, please just let me know in the comments and I'll send you a collaboration invitation.

    Name/Address (partial)
    IP address
    Contact information (comprehensive)

    E-mail address [3]
    Blog URL
    Credit card information

    Website filters
    Friend invitations
    Friends list
    Friends watched/followed

    Blog posts [4]
    PGP key
    Contact information (comprehensive)

    Blog posts [5]

    Friends list

    Personal search terms
    Extrapolated interests
    Clickstream within a site
    Web history
    People watched/followed

    [1] Multiple pieces of potentially identifying information are usually needed to make a definite identification or to contact someone directly.

    [2] Data may be elicited as a requirement for interaction with the data collector (e.g. an IP address required to view a web page, or shipping information required for a purchase), or it may be optional (e.g. a blog comment form requesting your URL).

    [3] An e-mail address is a definite identification because it immediately allows a person to contact you directly.

    [4] Blog posts talking about who you are or where you live are potentially identifying.

    [5] Blog posts talking about topics that interest you or things you do are characterizing.
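
For anyone who would rather poke at the matrix in code than in a document, here is a minimal Python sketch of the same idea: a lookup keyed by (origin, classification) that holds example data types. The class and method names are hypothetical. Of the origin labels, only "elicited" appears in the footnotes; "volunteered" is an assumed label. The placement of each example in a cell is illustrative rather than a copy of the matrix.

```python
from collections import defaultdict
from enum import Enum


class Origin(Enum):
    # "elicited" comes from footnote 2; "volunteered" is an assumed label.
    ELICITED = "elicited"
    VOLUNTEERED = "volunteered"


class Classification(Enum):
    # These three classifications are the ones named in the footnotes.
    POTENTIALLY_IDENTIFYING = "potentially identifying"
    DEFINITELY_IDENTIFYING = "definitely identifying"
    CHARACTERIZING = "characterizing"


class PersonalDataMatrix:
    """Maps each (origin, classification) cell to a list of example data types."""

    def __init__(self):
        self._cells = defaultdict(list)

    def add(self, origin, classification, example):
        self._cells[(origin, classification)].append(example)

    def cell(self, origin, classification):
        # Return a copy so callers cannot modify the stored list.
        return list(self._cells.get((origin, classification), []))

    def empty_cells(self):
        # Cells with no examples yet: a quick check for gaps in the discussion.
        return [
            (o, c)
            for o in Origin
            for c in Classification
            if not self._cells.get((o, c))
        ]


matrix = PersonalDataMatrix()
# Placements below are illustrative assumptions, not the original matrix.
matrix.add(Origin.ELICITED, Classification.POTENTIALLY_IDENTIFYING, "IP address")
matrix.add(Origin.ELICITED, Classification.DEFINITELY_IDENTIFYING, "E-mail address")
matrix.add(Origin.VOLUNTEERED, Classification.POTENTIALLY_IDENTIFYING,
           "Blog post about where you live")
matrix.add(Origin.VOLUNTEERED, Classification.CHARACTERIZING,
           "Blog post about a topic that interests you")

for origin, classification in matrix.empty_cells():
    print(f"No examples yet for: {origin.value} / {classification.value}")
```

Listing the empty cells is the useful part of this design: it shows at a glance which combinations of origin and classification nobody has supplied an example for yet.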
