Daniel Einspanjer's journal

Data warehousing, ETL, BI, and general hackery

Ways to visualize and share data
Mozilla needs to be able to provide useful extracts of data, such as download trends, and allow the community to perform their own analysis on them, so I'm always on the lookout for tools that further that goal.

When Tony Wright posted the blog entry "Just How Important is the Valley? Let’s Look at some Data" on April 17th, 2009, he was kind enough to publish the data set (it needs an attribution / license, though). The data looked interesting, so I thought I'd spend a little time playing with it using some tools that I've been keeping my eye on.

First, I slurped the table into DabbleDB, a website that is very well suited to messing with this type of data (i.e. sourced from the web, might need a bit of cleanup, etc.). You can view and edit the data I imported to DabbleDB here: Acquired Startups Data

DabbleDB does a great job of letting a user sort, filter, group, and modify data through a simple interface, but it does not have a large array of visualizations. For that, I headed over to Many Eyes Wikified, a project from the IBM AlphaWorks lab. I created a quick wiki dashboard for throwing together a few visualizations: Acquired Startups Visualizations

This was just a quick break from the real work I've been doing, so I spent less than an hour on it. I only took about 20 minutes with DabbleDB: importing the data, cleaning the dollar values, then creating two new views that group the data by country or by state for visualization. Then I moved over to Many Eyes, played with a few visualizations to find some interesting views of the data, and threw them into the dashboard and two subpages.
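For readers who would rather script the same cleanup than click through it, here is a rough Python sketch of the two steps described above: normalizing the dollar values and grouping by country or state. This is not what DabbleDB does internally. The column names (`country`, `state`, `price`) and the price formats handled are my own assumptions for illustration, not the actual layout of the data set.

```python
import re
from collections import defaultdict

# Assumed unit suffixes; the real data set may use different notation.
MULTIPLIERS = {
    "k": 1_000, "thousand": 1_000,
    "m": 1_000_000, "million": 1_000_000,
    "b": 1_000_000_000, "billion": 1_000_000_000,
}

def parse_dollars(text):
    """Turn '$1,500,000', '$350M' or '$1.2 billion' into a float.

    Returns None for anything that does not look like a dollar amount
    (for example 'undisclosed').
    """
    cleaned = text.strip().lower().replace(",", "").replace("$", "")
    match = re.match(r"^(\d+(?:\.\d+)?)\s*([a-z]+)?$", cleaned)
    if not match:
        return None
    number, unit = match.groups()
    if unit is not None and unit not in MULTIPLIERS:
        return None
    return float(number) * MULTIPLIERS.get(unit, 1)

def group_totals(rows, key):
    """Sum the parsed 'price' of each row, grouped by rows[i][key].

    Rows whose price cannot be parsed are skipped rather than counted as zero.
    """
    totals = defaultdict(float)
    for row in rows:
        amount = parse_dollars(row["price"])
        if amount is not None:
            totals[row[key]] += amount
    return dict(totals)

if __name__ == "__main__":
    # Made-up sample rows, only to show the shape of the input.
    sample = [
        {"country": "USA", "state": "CA", "price": "$2M"},
        {"country": "USA", "state": "WA", "price": "$3,000,000"},
        {"country": "UK", "state": "", "price": "undisclosed"},
    ]
    print(group_totals(sample, "country"))
    print(group_totals(sample, "state"))
```

Calling `group_totals` once with `"country"` and once with `"state"` roughly corresponds to the two grouped views, and the resulting totals are the kind of table a visualization tool can consume.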

In my opinion, the big win for DabbleDB and Many Eyes is being able to quickly extract, transform, and visualize this data. Both applications make open licensing of the data and collaboration a key focus, so they are tools that I hope to be able to take advantage of at Mozilla soon.

Nice! Now if we only had data like add-on blocklist or auto-update pings or even bouncer downloads of all Mozilla apps in the open, we'd be able to openly mash up those as well...

This is absolutely a goal for the Metrics team. The biggest hurdle right now is how to provide safe and effective access to those data feeds (which aren't small enough to fit on something like DabbleDB).
