

Daniel Einspanjer's journal

Data warehousing, ETL, BI, and general hackery

Good bye Mountain View
It has been a great two weeks out here in the office.  I've gotten to see a lot of people face to face and had some useful meetings about my projects.  I just kicked off another round of massive data loads to run over the weekend while I'm out of pocket.  Hopefully they will run smoothly and deliver high-quality data.

There are some really exciting things coming up this quarter:
  • I'll be working on one of the largest data sets yet, our AMO (addons.mozilla.org) data.  We have several really cool mechanisms for visualizing statistics for individual extension projects hosted on AMO.  The developer has control over whether to make the statistics public or not.  As an example, you can take a look at the statistics for Adblock Plus.  I'll be working on ways to integrate data across projects so we can get a better understanding of the extension community that means so very much to Mozilla.
  • I'll hopefully be blogging a little more about the complexities of processing the large amount of data that I have to crunch through.
  • I'll be making several pieces of my Pentaho Data Integration (Kettle for those of you in the know) ETL scripts available in an open source repository.  That will help with the blogging, the scripts might be useful to other people doing similar things, and who knows, maybe some people will even have suggestions for improvements!
  • Later in the quarter, I'll be working on an exciting new project to take some of the aggregated data that Mozilla has, such as the number of downloads of Firefox for given time periods, and make it publicly available for the community to explore and visualize.  At the moment, I'm leaning toward trying to use the Many-Eyes project from IBM AlphaWorks.  If anyone has any better ideas, please let me know.


