Clip Man

daniele


Daniel Einspanjer's journal

Data warehousing, ETL, BI, and general hackery


Thoughts on Mozilla Community
Jamie Zawinski's 1999 post on leaving AOL/Netscape/mozilla.org

This bit of history was linked today on Hacker News, and re-reading it touched on some points that I have been carefully considering the past few months, prompting me to write this.

I believe most people would agree that the Mozilla of today is a vastly changed beast from the dinosaur of the past.  We are faster, stronger, and we are delivering products and services that are changing the lives of hundreds of millions of people.

Of course, we still have a few of the same problems Jamie mentioned in his post, but we are much more cognizant of them, and both the employees of Mozilla and the community are hard at work every day to solve them.

The piece I wanted to specifically share my thoughts about is community participation.  Jamie mentioned in his post that the release of the Netscape source code was a beacon of hope to him.  It was putting the control of the web browser into the hands of anyone who cared to step up to the task.  By the time he decided to resign, however, he did not feel that particular goal had succeeded.  He stated that the project was not adopted by the outside.  I might argue the details, since it wouldn't have gotten where it is today if there had been zero adoption, but the point is still useful.

Further down was the kicker for me.  In "Excuse #2" Jamie states the following:
"People only really contribute when they get something out of it. When someone is first beginning to contribute, they especially need to see some kind of payback, some kind of positive reinforcement, right away. For example, if someone were running a web browser, then stopped, added a simple new command to the source, recompiled, and had that same web browser plus their addition, they would be motivated to do this again, and possibly to tackle even larger projects."


Wow, that harmonizes with my feelings today.  Everyone who is active in the Mozilla Community today is active because they get something out of it.  I believe that the majority of people who are no longer active in the community reached some point where they stopped getting something useful out of it.  That might have been because something was too hard for them to do by themselves, or because of a bad interaction with another community member or even an employee.  But, for whatever reason, they lost that spark and then we lost them.

The Mozilla employees need to become flint and steel for the community.  We need to be constantly tossing sparks out, flashes of light that will ignite fire in the spirits of our community and will catch the attention of people across the globe and draw them in.

Back in December, I met with a few people including Pedro Alves, David Eaves, Asa Dotzler, and Dan Mosedale. We were discussing a project to quantify and provide actionable insight for community participation.

Since that meeting, the Mozilla Metrics team has been working on the first iteration of this project, a dashboard that visualizes commit activity in Mozilla's Mercurial repository. Within a few weeks, we should have this ready to share with people. Future iterations will also draw data from Bugzilla patches and reviews. There are a ton of other data sources that we could consider, but these are definitely enough to get us started. I am also very pleased to see several other teams working on other community-focused projects. Obviously Mozilla Drumbeat is the flagship there, but the SUMO and Engagement teams have interesting projects in the works as well.

All this activity centered on engaging our community is thrilling to me. Every person that we inspire, assist, and collaborate with is much more likely to "pay it forward" and do the same for someone else. Community is what strengthens Mozilla. There are many open source projects with communities around them, but Mozilla is in the company of a special few whose goals are truly driven by public benefit, rather than using open source as a tool to make a great piece of software.

Throughout 2011, I believe we need to spend as much time focusing on ways to understand and collaborate to make our community an army of awesome as we spend on the more straightforward engineering task of making our services and products awesome. This is what Jamie stated inspired him about Netscape's original choice to embrace community and open source, and it is definitely what inspired me to become involved and make Mozilla's mission my own.

Ksplice is a useful technology for BI and Hadoop
Having infrastructure that is completely redundant and easy to bring down for Linux kernel upgrades is a wonderful thing, but it is pretty hard to achieve, and frequently you end up with systems that still require maintenance windows and downtime for rebooting.
Here are a few examples I can offer:
  • Pentaho BI server
  • ETL processing servers
  • Data warehouse database servers
  • Vertica cluster nodes
  • Hadoop NameNode

What Ksplice does is pretty cool.  It takes your currently running Linux kernel and applies a series of in-memory binary patches to it, keeping you up to date with the latest security patches without requiring a reboot. You can be patched and secure within hours of a critical update.

Harry Potter meets SCIENCE!
I enjoyed reading the Harry Potter series, but at times found myself quite despising Harry as a kid with too big of an ego who didn't listen and didn't think.
I really loved the Ender's Game series because, for the most part, Ender was quite the opposite of Harry.

To anyone who shares these feelings: please go read this fanfic.  Try to read at least through chapter 5, because the author states that the common consensus is that this is where the story hits its stride, and I concur.
Please note that I have only made it through chapter 7 so far so I could be setting you up for disappointment, but given the number of chuckles I've had, I don't think this will be the case.

Harry Potter and the Methods of Rationality

Mozilla has at least four sites in the Google top 1000 sites list
I just found out about Google's top 1000 websites list from a post on the fseek.me blog.
I expected to find mozilla.com near the top (it's at position #10), but when, on a whim, I did a text search for getfirefox, I bumped into getpersonas.com at position #137. That's just plain cool. :)

Update: Here are the ones spotted so far:

10. mozilla.com
74. mozilla.org
137. getpersonas.com
673. mozilla-europe.org

Book Review - Pentaho Data Integration 3.2: Beginner's Guide

On my plane flight yesterday, I was finally able to crack this book open and I ended up reading it from cover to cover in one sitting!
I highly recommend the book to anyone who has to deal with manipulating and moving data from one place to another on a regular basis.  Pentaho Data Integration (PDI or Kettle) is an amazing tool for these types of tasks, and this book is a fantastic way for someone to quickly become familiar with the tool and start producing useful jobs and transformations in it.

The book has a very light and easy tone, and it is filled with lots of practical, real-world examples and screen-shots which make it easy to follow along.  It has quiz questions (a couple of which I had to double check my answer on, and *that* is telling you something!).  I also enjoyed the "have a go, hero" sections where you are given some simple follow-up tasks that help drill in the topic that was just discussed.

I've ordered a few copies to be kept at the Mozilla headquarters so new members of my team can become familiar with the tool that is so much the lifeblood of our team's products. I also want people in Mozilla who currently write little scripts to munge data to take a look, as this tool makes it very easy to turn these processes into something that is visually documented and very extensible.

The book can be ordered directly from the publisher via the link above, or through Amazon.  Check it out!

Microsoft's random browser ballot
When I saw what the page was supposed to do (first five and last five sets displayed in random order), the first thing I thought was, "Man, I hope they tested the fool out of that or someone will make a fool out of them."


Looks like there wasn't quite enough testing. That said, I don't think anyone could argue that the bug was damaging: no browser was ever left out, and none was stuck in the same position every time.


Doing the Microsoft Shuffle


Dear LazyWeb
I'd like to help the Thunderbird community out by posting a series of quick video demonstrations of how I use Thunderbird + Lightning in my daily work.

In order to do this, though, I need an extension/jetpack that can redact, blur, scramble, or one-way pad all the user data (e-mail addresses, message/meeting/task content, etc.). Those are the minimum I need to do what I want. I suppose others might want to redact other fields like dates, categories, and folder names too.

Is there anyone who thinks they might be able to create such a tool without too much trouble?

SSH magic
I use SSH on a daily basis. Most of the machines I connect to can be accessed in one of two ways:
  1. OpenSSL VPN
  2. SSH to a jumphost then SSH from there to the desired machine

I wanted to share the configuration I use to make that easier.
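
For the second route, here is a minimal sketch of collapsing the two hops into a single command. It is only an illustration, not necessarily the setup described in this post: it assumes an OpenSSH client new enough to support the `-J` (ProxyJump) option (7.3 or later), and the host names, user name, and the helper `ssh_via_jumphost` are hypothetical placeholders.

```python
import shlex


def ssh_via_jumphost(target, jumphost, user=None):
    """Build an ssh argv that reaches `target` through `jumphost` in one step.

    Hypothetical helper for illustration; it only builds the command and
    does not run it.
    """
    destination = f"{user}@{target}" if user else target
    # -J is OpenSSH's ProxyJump option: ssh first connects to the jumphost
    # and then tunnels the connection to the final destination through it.
    return ["ssh", "-J", jumphost, destination]


if __name__ == "__main__":
    # Placeholder host names; substitute your own.
    argv = ssh_via_jumphost("db1.internal.example.com", "jump.example.com", user="daniel")
    print(shlex.join(argv))
```

The same effect can be made permanent with a `ProxyJump` line under the relevant `Host` entry in `~/.ssh/config`, so that a plain `ssh` to the target goes through the jumphost automatically.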

Performance of Rhino JS engine and Janino library in Kettle
My friend Roland Bouman made an interesting blog post regarding the performance of a bit of JavaScript for Kettle that he saw on a different blog.

Given the large amounts of data that I shove through Kettle every day, I tend to be extremely concerned about performance. Even a small inefficiency can lead to dramatic slowdowns. So when I saw his post, I got to thinking about how I would approach the problem if it involved the large data sets I work with and therefore required extreme optimization.

I didn't have a lot of spare time to dedicate to this experiment, so I opted for a screen-cast instead of a nicely formatted blog post. That said, I think there is a certain benefit in being able to see the work flow of someone who is very comfortable with Kettle.

The screen-cast is currently in Apple QuickTime format. Bleh. I need to get a new Ogg Theora transcoder because the one that I tried to use last time is not happy with me and I didn't have time to fiddle with it.

So, if you use Kettle and are interested in these things, here is the screen-cast. Be warned it is 30 minutes long and probably not extremely exciting to anyone outside of the ETL field.

Kettle string transformation optimization walk-through

If you are familiar with developing plug-ins for Kettle and you'd like to take a look at the User Defined Java Class plug-in I demonstrated at the end of the screen-cast, you can pick it up from the Pentaho SVN plugins repository. Just wear gloves because it has rough edges.
User Defined Java Class plug-in

Advice regarding using Travelex Cash Passport cards for travel
If you are traveling to Europe and considering getting one of these debit cards to make life easier for you while there, my advice is: don't! Figure out which of your credit cards charges the lowest fees for international usage and just use it.

