- Thinking About Data
- Framework for discussing “data”
- Why focus on data?
- Data Relating to People
- Data — getting to the point
This discussion is something I've been looking forward to seeing at Mozilla since I started back in March. In the work that I do, I make every effort to safeguard data and make sure that what I process and store can't turn around and bite me later.
One thing I felt could use a different approach is listing out the different forms of personal data that people are likely to generate or come across on the web.
To me, the best way to categorize these types of personal data is with a matrix. I've created one below that has the origin of the data on one axis and the classification of the data on the other. Inside each cell, I've placed a few examples that I think represent that intersection of data.
I'd encourage anyone interested in this to comment on other origins, classifications, or examples of personal data. The more we have defined, the easier it will be to make sure that our discussions about data don't leave anything out.
I've also saved this document on docs.google.com (Personal data types matrix).
If anyone wishes to collaborate with me on enhancing it, please just let me know in the comments and I'll send you a collaboration invitation.
Classifications: Identifying (Potential[1], Definite) and Characterizing (Self, Relationships)

Elicited[2]
- Name/Address (partial)
- IP address
- Contact information (comprehensive)
- SSN
- E-mail address[3]
- Blog URL
- Credit card information
- Demographics
- Location
- Interests
- Website filters
- Friend invitations
- Friends list
- Friends watched/followed

Published
- Blog posts[4]
- PGP key
- Contact information (comprehensive)
- Interests
- Blog posts[5]
- Wishlists
- Friends list

Harvested
- Cookies
- Personal search terms
- Extrapolated interests
- Clickstream in site
- Web history
- People watched/followed

[1] Multiple pieces of potentially identifying information are usually needed to make a definite identification or direct contact.
[2] Data may be elicited as a requirement for interaction with the data collector (e.g., an IP address is required to view a web page, or shipping information is required for a purchase), or it may be optional (e.g., a blog comment form requesting your URL).
[3] An e-mail address is a definite identification because it immediately allows a person to contact you directly.
[4] Blog posts talking about who you are or where you live are potentially identifying.
[5] Blog posts talking about topics that interest you or things you do are characterizing.
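
For anyone who would rather poke at the matrix in code than in a spreadsheet, here is a minimal sketch in Python of one way it could be represented: a mapping from an (origin, classification) pair to the examples in that cell. The names `Origin`, `Classification`, `build_matrix`, and `row` are my own placeholders, not part of any existing tool, and I've only filled in the three cells that the footnotes pin down explicitly.

```python
from collections import defaultdict
from enum import Enum


class Origin(Enum):
    """How the data came to exist (hypothetical names for the three origins)."""
    ELICITED = "elicited"
    PUBLISHED = "published"
    HARVESTED = "harvested"


class Classification(Enum):
    """What the data reveals (hypothetical names for the four classifications)."""
    IDENTIFYING_POTENTIAL = "identifying/potential"
    IDENTIFYING_DEFINITE = "identifying/definite"
    CHARACTERIZING_SELF = "characterizing/self"
    CHARACTERIZING_RELATIONSHIPS = "characterizing/relationships"


def build_matrix(entries):
    """Group (origin, classification, example) triples into matrix cells."""
    matrix = defaultdict(list)
    for origin, classification, example in entries:
        matrix[(origin, classification)].append(example)
    return dict(matrix)


def row(matrix, origin):
    """Return every populated cell for one origin, keyed by classification."""
    return {c: examples for (o, c), examples in matrix.items() if o is origin}


# Only the placements stated in the footnotes are included here.
ENTRIES = [
    (Origin.ELICITED, Classification.IDENTIFYING_DEFINITE, "E-mail address"),
    (Origin.PUBLISHED, Classification.IDENTIFYING_POTENTIAL,
     "Blog posts about who you are or where you live"),
    (Origin.PUBLISHED, Classification.CHARACTERIZING_SELF,
     "Blog posts about topics that interest you"),
]

MATRIX = build_matrix(ENTRIES)

if __name__ == "__main__":
    for classification, examples in row(MATRIX, Origin.PUBLISHED).items():
        print(classification.value, "->", examples)
```

Keying on the pair keeps it easy to add a new origin or classification later without reshaping anything, which fits the goal of letting the matrix grow as people suggest additions.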
- A post about personal data
Mitchell Baker, the Chairperson of Mozilla Foundation and Mozilla Corporation, recently posted a series of blog entries about data.

