<?xml version='1.0' encoding='utf-8' ?>
<!--  If you are running a bot please visit this policy page outlining rules you must respect. http://www.livejournal.com/bots/  -->
<rss version='2.0' xmlns:lj='http://www.livejournal.org/rss/lj/1.0/' xmlns:media='http://search.yahoo.com/mrss/' xmlns:atom10='http://www.w3.org/2005/Atom'>
<channel>
  <title>Daniel Einspanjer&apos;s journal</title>
  <link>http://daniele.livejournal.com/</link>
  <description>Daniel Einspanjer&apos;s journal - LiveJournal.com</description>
  <lastBuildDate>Fri, 11 Feb 2011 16:00:08 GMT</lastBuildDate>
  <generator>LiveJournal / LiveJournal.com</generator>
  <lj:journal>daniele</lj:journal>
  <lj:journalid>454686</lj:journalid>
  <lj:journaltype>personal</lj:journaltype>
  <image>
    <url>http://l-userpic.livejournal.com/18541170/454686</url>
    <title>Daniel Einspanjer&apos;s journal</title>
    <link>http://daniele.livejournal.com/</link>
    <width>74</width>
    <height>100</height>
  </image>

<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/80677.html</guid>
  <pubDate>Fri, 11 Feb 2011 16:00:08 GMT</pubDate>
  <title>Thoughts on Mozilla Community</title>
  <link>http://daniele.livejournal.com/80677.html</link>
  <description>&lt;a href=&quot;http://www.jwz.org/gruntle/nomo.html&quot; target=&quot;_blank&quot; rel=&quot;nofollow&quot;&gt;Jamie Zawinski&apos;s 1999 post on leaving AOL/Netscape/mozilla.org&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This bit of history was linked today on Hacker News, and re-reading it touched on some points that I have been carefully considering the past few months, prompting me to write this.&lt;br /&gt;&lt;br /&gt;I believe most people would agree that the Mozilla of today is a vastly changed beast from the dinosaur of the past.&amp;nbsp; We are faster, stronger, and we are delivering products and services that are changing the lives of hundreds of millions of people.&lt;br /&gt;&lt;br /&gt;Of course, we still have a few of the same problems Jamie mentioned in his post, but we are much more cognizant of them, and both the employees of Mozilla and the community are hard at work every day to solve them.&lt;br /&gt;&lt;br /&gt;The piece I wanted to specifically share my thoughts about is community participation.&amp;nbsp; Jamie mentioned in his post that the release of the Netscape source code was a beacon of hope to him.&amp;nbsp; It was putting the control of the web browser into the hands of anyone who cared to step up to the task.&amp;nbsp; By the time he decided to resign, however, he did not feel that particular goal had succeeded.&amp;nbsp; He stated that the project was not adopted by the outside.&amp;nbsp; I might argue the details, since it wouldn&apos;t have gotten where it is today if there had been zero adoption, but the point is still useful.&lt;br /&gt;&lt;br /&gt;Further down was the kicker for me.&amp;nbsp; In &amp;quot;Excuse #2&amp;quot; Jamie states the following:&lt;br /&gt;&lt;blockquote&gt;&amp;quot;People only really contribute when they get something out of it. When someone is first beginning to contribute, they especially need to see some kind of payback, some kind of positive reinforcement, right away. 
For example, if someone were running a web browser, then stopped, added a simple new command to the source, recompiled, and had that same web browser plus their addition, they would be motivated to do this again, and possibly to tackle even larger projects.&amp;quot;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;Wow, that harmonizes with my feelings today.&amp;nbsp; &lt;em&gt;Everyone who is active in the Mozilla Community today is active because they get something out of it.&lt;/em&gt;&amp;nbsp; I believe that the majority of people who are no longer active in the community reached some point where they stopped getting something useful out of it.&amp;nbsp; That might have been because something was too hard for them to do by themselves, or because of a bad interaction with another community member or even an employee.&amp;nbsp; But, for whatever reason, they lost that spark and then we lost them.&lt;br /&gt;&lt;br /&gt;The Mozilla employees need to become flint and steel for the community.&amp;nbsp; We need to be constantly tossing sparks out, flashes of light that will ignite fire in the spirits of our community and will catch the attention of people across the globe and draw them in.&lt;br /&gt;&lt;br /&gt;Back in December, I met with a few people including &lt;a href=&quot;http://pedroalves-bi.blogspot.com/&quot; target=&quot;_blank&quot; rel=&quot;nofollow&quot;&gt;Pedro Alves&lt;/a&gt;, &lt;a target=&quot;_blank&quot; href=&quot;http://eaves.ca/&quot; rel=&quot;nofollow&quot;&gt;David Eaves&lt;/a&gt;, &lt;a href=&quot;http://weblogs.mozillazine.org/asa/&quot; target=&quot;_blank&quot; rel=&quot;nofollow&quot;&gt;Asa Dotzler&lt;/a&gt;, and &lt;a target=&quot;_blank&quot; href=&quot;http://redpuma.net/blog/&quot; rel=&quot;nofollow&quot;&gt;Dan Mosedale&lt;/a&gt;.  
We were discussing a project to quantify and provide actionable insight for community participation.&lt;br /&gt;&lt;br /&gt;Since that meeting, the Mozilla Metrics team has been working on the first iteration of this project, a dashboard that visualizes commit activity in Mozilla&apos;s Mercurial repository.  Within a few weeks, we should have this ready to share with people.  Future iterations will also draw data from Bugzilla patches and reviews.  There are a ton of other data sources that we could consider, but these are definitely enough to get us started.  I am also very pleased to see several other teams working on other community-focused projects.  Obviously &lt;a target=&quot;_blank&quot; href=&quot;https://www.drumbeat.org/&quot; rel=&quot;nofollow&quot;&gt;Mozilla Drumbeat&lt;/a&gt; is the flagship there, but the SUMO and Engagement teams have interesting projects in the works as well.&lt;br /&gt;&lt;br /&gt;All this activity centered on engaging our community is thrilling to me.  Every person that we inspire, assist, and collaborate with is much more likely to &amp;quot;pay it forward&amp;quot; and do the same for someone else.  Community is what strengthens Mozilla.  There are many open source projects with communities around them, but Mozilla is in the company of a special few whose goals are truly public benefit driven rather than using open source as a tool to make a great piece of software.&lt;br /&gt;&lt;br /&gt;Throughout 2011, I believe we need to spend as much time focusing on ways to understand and collaborate to make our community an army of awesome as we spend on the more straightforward engineering task of making our services and products awesome.  This is what Jamie stated inspired him about Netscape&apos;s original choice to embrace community and open source, and it is definitely what inspired me to become involved and make Mozilla&apos;s mission my own.</description>
  <comments>http://daniele.livejournal.com/80677.html</comments>
  <category>open source</category>
  <category>mozilla</category>
  <category>community</category>
  <lj:mood>optimistic</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/80577.html</guid>
  <pubDate>Tue, 01 Jun 2010 18:13:31 GMT</pubDate>
  <title>Ksplice is a useful technology for BI and Hadoop</title>
  <link>http://daniele.livejournal.com/80577.html</link>
  <description>Having infrastructure that is completely redundant and easy to bring down for Linux kernel upgrades is a wonderful thing, but it is pretty hard to achieve, and you frequently end up with systems that still require downtime maintenance windows for rebooting.&lt;br /&gt;Here are a few examples I can offer:&lt;ul&gt;&lt;li&gt;Pentaho BI server&lt;/li&gt;&lt;li&gt;ETL processing servers&lt;/li&gt;&lt;li&gt;Data warehouse database servers&lt;/li&gt;&lt;li&gt;Vertica cluster nodes&lt;/li&gt;&lt;li&gt;Hadoop NameNode&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;What &lt;a href=&quot;http://www.ksplice.com/&quot; rel=&quot;nofollow&quot;&gt;Ksplice&lt;/a&gt; does is pretty cool.&amp;nbsp; They take your currently installed Linux kernel and apply a series of in-memory binary patches to it, keeping you up to date with the latest security patches without requiring a reboot. You can be patched and secure within hours of a critical update.&lt;br /&gt;</description>
  <comments>http://daniele.livejournal.com/80577.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/80255.html</guid>
  <pubDate>Fri, 28 May 2010 21:38:23 GMT</pubDate>
  <title>Harry Potter meets SCIENCE!</title>
  <link>http://daniele.livejournal.com/80255.html</link>
  <description>I enjoyed reading the Harry Potter series, but at times found myself quite despising Harry as a kid with too big of an ego who didn&apos;t listen and didn&apos;t think.&lt;br /&gt;I really loved the Ender&apos;s Game series because, for the most part, Ender was quite the opposite of Harry.&lt;br /&gt;&lt;br /&gt;To anyone who can associate these feelings with theirs: Please go read this fanfic.&amp;nbsp; Try to read at least through chapter 5, because the author states that the common consensus is that is when the story hits its stride, and I concur with that sentiment.&lt;br /&gt;Please note that I have only made it through chapter 7 so far so I could be setting you up for disappointment, but given the number of chuckles I&apos;ve had, I don&apos;t think this will be the case.&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://www.fanfiction.net/s/5782108/1/Harry_Potter_and_the_Methods_of_Rationality&quot; rel=&quot;nofollow&quot;&gt;&lt;span style=&quot;font-size: larger;&quot;&gt;Harry Potter and the Methods of Rationality&lt;/span&gt;&lt;/a&gt;</description>
  <comments>http://daniele.livejournal.com/80255.html</comments>
  <category>book</category>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/79974.html</guid>
  <pubDate>Fri, 28 May 2010 14:20:43 GMT</pubDate>
  <title>Mozilla has at least four sites in the Google top 1000 sites list</title>
  <link>http://daniele.livejournal.com/79974.html</link>
  <description>I just found out about &lt;a href=&quot;http://www.google.com/adplanner/static/top1000/&quot; rel=&quot;nofollow&quot;&gt;Google&apos;s top 1000 websites list&lt;/a&gt; from a post on the fseek.me blog.&lt;br /&gt;I&amp;nbsp;expected to find mozilla.com near the top (it&apos;s at position #10), but when, on a whim, I did a text search for getfirefox, I bumped into getpersonas.com at position #137. That&apos;s just plain cool. :)&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Update&lt;/strong&gt;:&amp;nbsp;Here are the ones spotted so far:&lt;br /&gt;&lt;br /&gt;10. mozilla.com&lt;br /&gt;74. mozilla.org&lt;br /&gt;137. getpersonas.com&lt;br /&gt;673. mozilla-europe.org</description>
  <comments>http://daniele.livejournal.com/79974.html</comments>
  <category>mozilla</category>
  <lj:security>public</lj:security>
  <lj:reply-count>3</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/79849.html</guid>
  <pubDate>Fri, 21 May 2010 22:46:56 GMT</pubDate>
  <title>Book Review - Pentaho Data Integration 3.2: Beginner&apos;s Guide</title>
  <link>http://daniele.livejournal.com/79849.html</link>
  <description>&lt;div style=&quot;text-align: center;&quot;&gt;&lt;a href=&quot;http://www.packtpub.com/pentaho-3-2-data-integration-beginners-guide/book?utm_source=daniel.yipyip.com&amp;amp;utm_medium=bookrev&amp;amp;utm_content=blog&amp;amp;utm_campaign=mdb_003124&quot; rel=&quot;nofollow&quot;&gt;&lt;span style=&quot;font-size: large;&quot;&gt;Pentaho Data Integration 3.2: Beginner&apos;s Guide&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style=&quot;text-align: center;&quot;&gt;&lt;a href=&quot;http://pics.livejournal.com/daniele/pic/00001xwt/&quot; rel=&quot;nofollow&quot;&gt;&lt;img height=&quot;240&quot; border=&quot;0&quot; width=&quot;194&quot; src=&quot;http://pics.livejournal.com/daniele/pic/00001xwt/s320x240&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;On my plane flight yesterday, I was finally able to crack this book open and I ended up reading it from cover to cover in one sitting!&lt;br /&gt;I highly recommend the book to anyone who has to deal with manipulating and moving data from one place to another on a regular basis.&amp;nbsp; Pentaho Data Integration (PDI&amp;nbsp;or Kettle) is an amazing tool for these types of tasks, and this book is a fantastic way for someone to quickly become familiar with the tool and start producing useful jobs and transformations in it.&lt;br /&gt;&lt;br /&gt;The book has a very light and easy tone, and it is filled with lots of practical, real-world examples and screen-shots which make it easy to follow along.&amp;nbsp; It has quiz questions (a couple of which I had to double check my answer on, and *that* is telling you something!).&amp;nbsp; I also enjoyed the &amp;quot;have a go, hero&amp;quot; sections where you are given some simple follow-up tasks that help drill in the topic that was just discussed.&lt;br /&gt;&lt;br /&gt;I&apos;ve ordered a few copies to be kept at the Mozilla headquarters so new members of my team can become familiar with the tool that is so much the lifeblood of our 
team&apos;s products, and I&amp;nbsp;also want people in Mozilla who currently write little scripts to munge data to take a look as this tool makes it very easy to turn these processes into something that is visually documented and very extensible.&lt;br /&gt;&lt;br /&gt;The book can be ordered directly from the publisher via the link above, or through Amazon.&amp;nbsp; Check it out!</description>
  <comments>http://daniele.livejournal.com/79849.html</comments>
  <category>kettle</category>
  <lj:mood>cheerful</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/79183.html</guid>
  <pubDate>Sun, 28 Feb 2010 23:48:13 GMT</pubDate>
  <title>Microsoft&apos;s random browser ballot</title>
  <link>http://daniele.livejournal.com/79183.html</link>
  <description>When I saw what the page was supposed to do, (first five and last five sets displayed in random order) the first thing I thought was, &quot;Man, I hope they tested the fool out of that or someone will make a fool out of them.&quot;&lt;p&gt;&lt;br /&gt;Looks like there wasn&apos;t quite enough testing.  That said, I don&apos;t think that anyone could argue that the bug was damaging.  No one browser was ever left out or in the same position.&lt;br /&gt;&lt;p&gt;&lt;br /&gt;&lt;a href=&quot;http://www.robweir.com/blog/2010/02/microsoft-random-browser-ballot.html&quot; rel=&quot;nofollow&quot;&gt;Doing the Microsoft Shuffle&lt;/a&gt;</description>
  <comments>http://daniele.livejournal.com/79183.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>1</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/79037.html</guid>
  <pubDate>Mon, 04 Jan 2010 13:51:28 GMT</pubDate>
  <title>Dear LazyWeb</title>
  <link>http://daniele.livejournal.com/79037.html</link>
  <description>I&apos;d like to help the Thunderbird community out by posting a series of quick video demonstrations of how I use Thunderbird + Lightning in my daily work.&lt;br /&gt;&lt;br /&gt;In order to do this though, I need an extension/jetpack that can redact, blur, scramble or one-way pad all the user data (i.e. e-mail addresses, message/meeting/task content, etc.).  Those are the minimum I need to do what I want. I suppose others might want to redact other fields like dates and categories and folder names too.&lt;br /&gt;&lt;br /&gt;Is there anyone who thinks they might be able to create such a tool without too much trouble?</description>
  <comments>http://daniele.livejournal.com/79037.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>2</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/78783.html</guid>
  <pubDate>Tue, 08 Dec 2009 20:34:33 GMT</pubDate>
  <title>SSH magic</title>
  <link>http://daniele.livejournal.com/78783.html</link>
  <description>I use SSH on a daily basis.  Most of the machines I connect to can be accessed in one of two ways:&lt;ol&gt;&lt;li&gt;OpenSSL VPN&lt;/li&gt;&lt;li&gt;SSH to a jumphost then SSH from there to the desired machine&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;I wanted to share the configuration I use to make that easier.&lt;br /&gt;&lt;a name=&quot;cutid1&quot;&gt;&lt;/a&gt;&lt;br /&gt;&lt;span style=&quot;font-size: medium;&quot;&gt;My Bash Aliases&lt;/span&gt;&lt;br /&gt;These aliases make it easy for me to do a few useful things quickly:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;SSH to the vpn box with or without SSH compression&lt;/li&gt;&lt;li&gt;Run the &lt;a href=&quot;http://omnitty.sourceforge.net&quot; rel=&quot;nofollow&quot;&gt;omnitty&lt;/a&gt; terminal multiplexer to be able to interactively work with a cluster of machines&lt;/li&gt;&lt;li&gt;SSH to a particular machine and resume a screen session &lt;strong&gt;with&lt;/strong&gt; my SSH agent variables fixed so I can connect to other machines with my pubkey properly.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;div style=&quot;color: green; background-color: black;&quot;&gt;&lt;code&gt;$ cat .bash_aliases &lt;br /&gt;#!/bin/bash &lt;br /&gt;alias vpn=&apos;ssh vpn&apos; &lt;br /&gt;&lt;br /&gt;# VPN with compression (useful when on cellular modem) &lt;br /&gt;alias zvpn=&apos;ssh -C vpn&apos; &lt;br /&gt;&lt;br /&gt;# omnitty doesn&apos;t work well inside screen so this is a separate alias for running it. 
&lt;br /&gt;alias omnicluster=&apos; ssh -t vpn &amp;quot;ssh -t cluster01 \&amp;quot;omnitty -W 15 -T 125\&amp;quot;&amp;quot;&apos; &lt;br /&gt;&lt;br /&gt;alias h01=&apos; ssh -t vpn &amp;quot;ssh -t cluster01 \&amp;quot;/home/me/bin/grabssh; screen -xRR\&amp;quot;&amp;quot;&apos; &lt;br /&gt;alias h02=&apos; ssh -t vpn &amp;quot;ssh -t cluster02 \&amp;quot;/home/me/bin/grabssh; screen -xRR\&amp;quot;&amp;quot;&apos; &lt;br /&gt;alias h03=&apos; ssh -t vpn &amp;quot;ssh -t cluster03 \&amp;quot;/home/me/bin/grabssh; screen -xRR\&amp;quot;&amp;quot;&apos; &lt;br /&gt;alias h04=&apos; ssh -t vpn &amp;quot;ssh -t cluster04 \&amp;quot;/home/me/bin/grabssh; screen -xRR\&amp;quot;&amp;quot;&apos; &lt;br /&gt;alias h05=&apos; ssh -t vpn &amp;quot;ssh -t cluster05 \&amp;quot;/home/me/bin/grabssh; screen -xRR\&amp;quot;&amp;quot;&apos; &lt;br /&gt;alias h06=&apos; ssh -t vpn &amp;quot;ssh -t cluster06 \&amp;quot;/home/me/bin/grabssh; screen -xRR\&amp;quot;&amp;quot;&apos; &lt;br /&gt;&lt;/code&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style=&quot;font-size: medium;&quot;&gt;My SSH Config&lt;/span&gt;&lt;br /&gt;This config sets up several important SSH features:&lt;br /&gt;&lt;dl&gt;&lt;dt&gt;SSH ControlMaster&lt;/dt&gt;&lt;dd&gt;The ControlMaster feature lets you share just one SSH communication connection among multiple SSH sessions to the same server.  Since all of my sessions are going through my VPN jumphost, this makes all my sessions a little snappier since they aren&apos;t each doing their own encryption, etc.&lt;br /&gt;&lt;/dd&gt;&lt;dt&gt;ServerAliveInterval&lt;/dt&gt;&lt;dd&gt;If I suspend my laptop or otherwise lose connectivity, this option makes sure that my SSH connections terminate rather than hanging for an annoyingly long time.&lt;br /&gt;&lt;/dd&gt;&lt;dt&gt;ForwardAgent&lt;/dt&gt;&lt;dd&gt;Some of the machines I connect to use pubkey and some of my other machines don&apos;t. 
I can configure which groups of machines should use it.&lt;br /&gt;&lt;/dd&gt;&lt;dt&gt;User&lt;/dt&gt;&lt;dd&gt;Same as above. On some machines, I need to log in as a different user.  Specifying it here means I don&apos;t have to remember to type ssh user@host all the time.&lt;/dd&gt;&lt;br /&gt;&lt;dt&gt;HostName&lt;/dt&gt;&lt;dd&gt;I can give a short easy machine name alias here so I don&apos;t have to type the FQDN everywhere else&lt;/dd&gt;&lt;/dl&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style=&quot;color: green; background-color: black;&quot;&gt;&lt;code&gt;$ cat config &lt;br /&gt;Host othercluster*&lt;br /&gt;    ForwardAgent=yes &lt;br /&gt;    User metrics &lt;br /&gt;&lt;br /&gt;Host vpn &lt;br /&gt;    ForwardAgent=yes &lt;br /&gt;    ProxyCommand none &lt;br /&gt;    # ControlMaster is magic that lets you re-use one SSH connection when you connect to the same machine multiple times. &lt;br /&gt;    # Since all my connections to the servers I use go through vpn, if I use ControlMaster on vpn, I only have one encrypted tunnel &lt;br /&gt;    # that all the connections to the different servers use.  This actually makes it feel much snappier to connect and use them remotely. &lt;br /&gt;    ControlMaster auto &lt;br /&gt;    ControlPath=~/.ssh/%r@%h:%p &lt;br /&gt;    HostName my-vpn.domain.net&lt;br /&gt;&lt;br /&gt;Host *.domain.net *.domain.com vpn cluster*&lt;br /&gt;    ForwardAgent=yes &lt;br /&gt;    # Magic so I don&apos;t try to use my machine username by default. &lt;br /&gt;    User otherusername&lt;br /&gt;&lt;br /&gt;# ProxyCommands ended up being a bit flaky in combination with ControlMaster so I&apos;m just using raw bash aliases instead now. 
&lt;br /&gt;#Host cluster??&lt;br /&gt;#    ProxyCommand ssh -t vpn &amp;quot;ssh cluster%h&amp;quot; &lt;br /&gt;&lt;br /&gt;Host * &lt;br /&gt;    # ServerAliveInterval makes sure that if I close my laptop or lose my net connection, the SSH session doesn&apos;t &amp;quot;hang&amp;quot; but rather returns me to a command prompt. &lt;br /&gt;    ServerAliveInterval 15 &lt;br /&gt;    IdentityFile=~/.ssh/id_dsa&lt;br /&gt;&lt;/code&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style=&quot;font-size: medium;&quot;&gt;My grabssh Script&lt;/span&gt;&lt;br /&gt;Found this script on &lt;a href=&quot;http://www.deadman.org/sshscreen.php&quot; rel=&quot;nofollow&quot;&gt;Sam Rowe&apos;s website&lt;/a&gt;. It lets me update my SSH agent environment variables so an existing screen session can still connect to other machines with pubkey authentication.&lt;br /&gt;&lt;div style=&quot;color: green; background-color: black;&quot;&gt;&lt;code&gt;$ cat grabssh &lt;br /&gt;#!/bin/bash &lt;br /&gt;# This magic script helps when using SSH to connect to a preexisting Screen session.  If grabssh is run &lt;br /&gt;# before the screen session is reconnected, then you can run the generated &amp;quot;fixssh&amp;quot; script inside of Screen and it &lt;br /&gt;# will update your SSH agent variables so that you can ssh to other machines without a problem. &lt;br /&gt;SSHVARS=&amp;quot;SSH_CLIENT SSH_TTY SSH_AUTH_SOCK SSH_CONNECTION DISPLAY&amp;quot; &lt;br /&gt;&lt;br /&gt;for x in ${SSHVARS} ; do &lt;br /&gt;    (eval echo $x=\$$x) | sed  &apos;s/=/=&amp;quot;/ &lt;br /&gt;                                s/$/&amp;quot;/ &lt;br /&gt;                                s/^/export /&apos; &lt;br /&gt;done 1&amp;gt;/home/me/bin/fixssh &lt;br /&gt;&lt;/code&gt;&lt;/div&gt;&lt;a name=&apos;cutid1-end&apos;&gt;&lt;/a&gt;</description>
  <comments>http://daniele.livejournal.com/78783.html</comments>
  <category>work</category>
  <category>ssh</category>
  <category>screen</category>
  <lj:mood>artistic</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>5</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/78409.html</guid>
  <pubDate>Tue, 17 Nov 2009 06:26:21 GMT</pubDate>
  <title>Performance of Rhino JS engine and Janino library in Kettle</title>
  <link>http://daniele.livejournal.com/78409.html</link>
  <description>My friend Roland Bouman made an interesting &lt;a href=&quot;http://rpbouman.blogspot.com/2009/11/pentaho-data-integration-javascript.html&quot; rel=&quot;nofollow&quot;&gt;blog post regarding the performance of a bit of JavaScript for Kettle&lt;/a&gt; that he saw on a different blog.&lt;br /&gt;&lt;br /&gt;Given the large amounts of data that I am shoving through Kettle every day, I tend to be extremely concerned about performance.  Even a small inefficiency can lead to dramatic slowdowns.  Hence, when I saw his post, I got to thinking about how I would approach the problem if it were within the realm of the large data sets I work with and hence required extreme optimization.&lt;br /&gt;&lt;br /&gt;I didn&apos;t have a lot of spare time to dedicate to this experiment, so I opted for a screen-cast instead of a nicely formatted blog post.  That said, I think there is a certain benefit in being able to see the work flow of someone who is very comfortable with Kettle.&lt;br /&gt;&lt;br /&gt;The screen-cast is currently in Apple QuickTime format. Bleh.  I need to get a new Ogg Theora transcoder because the one that I tried to use last time is not happy with me and I didn&apos;t have time to fiddle with it.&lt;br /&gt;&lt;br /&gt;So, if you use Kettle and are interested in these things, here is the screen-cast.  Be warned it is 30 minutes long and probably not extremely exciting to anyone outside of the ETL field.&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://bit.ly/PDI_example&quot; target=&quot;_blank&quot; title=&quot;http://people.mozilla.com/~deinspanjer/KettleJSPerformance.mov&quot; rel=&quot;nofollow&quot;&gt;Kettle string transformation optimization walk-through&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If you are familiar with developing plug-ins for Kettle and you&apos;d like to take a look at the User Defined Java Class plug-in I demonstrated at the end of the screen-cast, you can pick it up from the Pentaho SVN plugins repository. 
Just wear gloves because it has rough edges.&lt;br /&gt;&lt;a title=&quot;svn://source.pentaho.org/svnkettleroot/plugins/UserDefinedJavaClass/branches/3.2x&quot; href=&quot;http://bit.ly/UDJC_SVN&quot; rel=&quot;nofollow&quot;&gt;User Defined Java Class plug-in&lt;/a&gt;</description>
  <comments>http://daniele.livejournal.com/78409.html</comments>
  <category>kettle</category>
  <lj:mood>tired</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>3</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/78239.html</guid>
  <pubDate>Thu, 24 Sep 2009 20:10:50 GMT</pubDate>
  <title>Advice regarding using Travelex Cash Passport cards for travel</title>
  <link>http://daniele.livejournal.com/78239.html</link>
  <description>If you are traveling to Europe and considering getting one of these debit cards to make life easier for you while there, my advice is &lt;strong&gt;don&apos;t&lt;/strong&gt;!  Figure out which of your credit cards charges the least amount of fees for international usage and just use it.&lt;br /&gt;&lt;br /&gt;&lt;a name=&quot;cutid1&quot;&gt;&lt;/a&gt;&lt;br /&gt;First, a general complaint:&amp;nbsp; interacting with Travelex is a hassle. Their phone menus are a maze of twisty little passages, all alike.&amp;nbsp; When you finally manage to reach a person, you&apos;ll have to give them all the exact same information you entered into the phone menu.&amp;nbsp; Calling them when you are on your trip means calling a long distance number plus additional charges if you must use your cell phone or hotel phone or pay phone.&amp;nbsp; The customer service agents are not rude though.&amp;nbsp; I&apos;ll give them that.&lt;br /&gt;&lt;br /&gt;Most importantly, the card does not work everywhere that MasterCard is accepted.  If you give it to a merchant with which it is not compatible, their credit card processing machine might decline the card but the money can still be withheld from you for seven days.  If the merchant runs the card through three or four times, each attempt will withhold the funds again.  A nice &amp;euro;60 meal can turn into a &amp;euro;180 disaster.&lt;br /&gt;&lt;br /&gt;If you get into this situation, don&apos;t expect help from anyone.  The merchant can&apos;t do anything because it was declined on his side.  When you call Travelex (paying international long distance or international roaming fees), they will tell you that it isn&apos;t their fault and that the merchant did something wrong. Ignore the fact that you had to turn around and use a different MasterCard with the merchant and that one went through just fine.  Furthermore, Travelex will tell you that your only option is to wait seven business days and see if the hold disappears.  
If it does, move on to the next challenge.  If it doesn&apos;t, then expect the following additional hassle:&lt;ol&gt;&lt;li&gt;request a dispute form from Travelex&lt;/li&gt;&lt;li&gt;wait for it to be delivered by mail&lt;/li&gt;&lt;li&gt;fill it out and send them copies of all the decline receipts and a letter from the merchant stating that they did not receive the money that is disputed and that you have already paid them through other means&lt;/li&gt;&lt;li&gt;wait.  After pulling teeth with one customer service agent, I was able to get the reassurance that it should definitely take less than a year to resolve.  Likely just a few months.&lt;/li&gt;&lt;/ol&gt;Once you have the funds back on your card, it is likely you might be in the same boat as me and your trip to Europe is now over!  So now, how do you get the money back? You have a few options:&lt;ul&gt;&lt;li&gt;Withdraw it from an ATM. You will pay:&lt;ul&gt;&lt;li&gt;&amp;euro;1.75 ATM fee converted to USD at a worse than market rate&lt;/li&gt;&lt;li&gt;$x dollars to the owner of the ATM for using a foreign ATM card&lt;/li&gt;&lt;li&gt;Whatever remainder you can&apos;t take out through the ATM will be taken by Travelex after 12 months&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;Close the card at a Travelex branch. You will pay:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;$20 administration fee&lt;/li&gt;&lt;li&gt;Worse than market conversion rate (note the fine print that the &amp;quot;Currency Return Guarantee&amp;quot; doesn&apos;t apply to the money on the card!)&lt;/li&gt;&lt;li&gt;Another arbitrary fee/commission that varies from branch to branch and (I suspect) the mood of the employee&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;Leave the funds on the card until your next trip. 
You will pay:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&amp;euro;2.30 Monthly inactivity fee each month after a year.&lt;/li&gt;&lt;li&gt;The unavailability of the money for the duration (Travelex will happily take advantage of the money while you aren&apos;t using it though!)&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;a name=&apos;cutid1-end&apos;&gt;&lt;/a&gt;</description>
  <comments>http://daniele.livejournal.com/78239.html</comments>
  <category>personal</category>
  <lj:mood>angry</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>7</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/77979.html</guid>
  <pubDate>Tue, 08 Sep 2009 14:16:12 GMT</pubDate>
  <title>Ubuntu screen-profiles customization</title>
  <link>http://daniele.livejournal.com/77979.html</link>
  <description>I recently loaded Ubuntu server 9.04 onto a new machine and encountered Ubuntu&apos;s screen-profiles.&lt;br /&gt;In general, I like it.  I had one problem and one customization that I wanted to share:&lt;br /&gt;&lt;br /&gt;I use Mac OS X&apos;s Terminal.app to connect to my remote machines, and by default, it has custom mappings for F1 through F4.  I have no idea what those keybindings mean, but they prevent screen-profiles&apos;s keybindings from working.  It took a little fiddling to figure out how to fix them.  Basically, you need to:&lt;ol&gt;&lt;li&gt;Open up the preferences dialog for Terminal.app&lt;/li&gt;&lt;li&gt;Go to the Settings pane&lt;/li&gt;&lt;li&gt;Click on the Keyboard tab button&lt;/li&gt;&lt;li&gt;Edit the action for each of the F1 through F4 keys&lt;/li&gt;&lt;li&gt;When editing, click the &amp;quot;delete one character&amp;quot; button twice to erase the characters currently in there (leave the \033 escape)&lt;/li&gt;&lt;li&gt;Type the following characters: [ 1 1 ~   11 is F1, 12 is F2, 13 is F3, 14 is F4&lt;/li&gt;&lt;li&gt;The new entries should look just like the F5 through F8 actions.&lt;/li&gt;&lt;/ol&gt;Once I was able to use the F2 F3 and F4 keys, I decided that they weren&apos;t that useful to me.  I prefer to use a combination of screen regions and windows.  The window commands are very easy for me, but I&apos;ve always found the split, focus, and remove keybindings to be uncomfortable so I figured those would be great commands to map to F2 F3 and F4.  
Here is how I did that:&lt;ol&gt;&lt;li&gt;sudo cp /usr/share/screen-profiles/keybindings/common /usr/share/screen-profiles/keybindings/regions&lt;/li&gt;&lt;li&gt;sudo vi /usr/share/screen-profiles/keybindings/regions&lt;/li&gt;&lt;li&gt;replace the first four entries with the new entries below&lt;/li&gt;&lt;li&gt;save and close the file&lt;/li&gt;&lt;li&gt;In screen, hit F9 to bring up the menu&lt;/li&gt;&lt;li&gt;Select the option for &amp;quot;Change keybinding set&amp;quot;&lt;/li&gt;&lt;li&gt;Select the new &amp;quot;regions&amp;quot; entry&lt;/li&gt;&lt;li&gt;Hit F5 to reload your screen-profile and pick up the new keybindings.&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;code&gt;register n &amp;quot;^aS^a^i^a^c^aA&amp;quot;                                 #     | Goes with the F2 definition&lt;br /&gt;bindkey -k k2 process n                                 # F2  | Create new region and window (and name it)&lt;br /&gt;bindkey -k k3 focus                                     # F3  | Next region&lt;br /&gt;bindkey -k k4 remove                                    # F4  | Remove region&lt;br /&gt;&lt;/code&gt;</description>
  <comments>http://daniele.livejournal.com/77979.html</comments>
  <category>ubuntu</category>
  <category>screen</category>
  <lj:mood>contemplative</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>3</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/77587.html</guid>
  <pubDate>Thu, 30 Jul 2009 05:55:16 GMT</pubDate>
  <title>Shell script analytics</title>
  <link>http://daniele.livejournal.com/77587.html</link>
  <description>I just made a rather lengthy post on the &lt;a href=&quot;http://blog.mozilla.com/data/&quot; rel=&quot;nofollow&quot;&gt;Mozilla blog of data&lt;/a&gt; about &lt;a href=&quot;http://blog.mozilla.com/data/2009/07/29/shell-script-analytics/&quot; rel=&quot;nofollow&quot;&gt;shell script analytics&lt;/a&gt;.&amp;nbsp; I&apos;ll try hard not to cross post stuff like this too often, but I thought I&apos;d allow myself the spam this time around because using Bash and AWK to do things like this really is an important part of who I am personally as a geek in addition to what I do for Mozilla. :)&lt;br /&gt;</description>
  <comments>http://daniele.livejournal.com/77587.html</comments>
  <category>work</category>
  <category>data</category>
  <category>mozilla</category>
  <category>etl</category>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/77528.html</guid>
  <pubDate>Fri, 29 May 2009 15:10:39 GMT</pubDate>
  <title>I&apos;ve always thought my job was fun.  Now I hear it is sexy too!</title>
  <link>http://daniele.livejournal.com/77528.html</link>
  <description>I just finished reading this lovely little post from the company &lt;a href=&quot;http://dataspora.com&quot; rel=&quot;nofollow&quot;&gt;dataspora&lt;/a&gt; titled &lt;a href=&quot;http://dataspora.com/blog/sexy-data-geeks/&quot; rel=&quot;nofollow&quot;&gt;The Three Sexy Skills of Data Geeks&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;By far, my favorite quote was,&amp;nbsp;&amp;quot;A good data munger excels at turning coffee into regular expressions and parsers&amp;quot;.&amp;nbsp; That certainly describes me to a tee. :)&lt;br /&gt;&lt;br /&gt;I&apos;ve always found each of these three facets of working with data fascinating.&amp;nbsp; One of the comments mentioned that decision making was an important missing trait.&amp;nbsp; I could go either way there.&amp;nbsp; I feel it is good to be able to tell a compelling story with the data that helps others to understand it, and then those people take the understanding you imparted to them and make decisions based on it.&lt;br /&gt;&lt;br /&gt;It is incredibly hard to find a person who is skilled in just one or two of these facets.&amp;nbsp; When you find the data geek who has all three, then you count yourself lucky.&amp;nbsp; Expecting someone who has that caliber of devotion to data to also be capable of making decisions like a CEO is a bit unrealistic in my opinion.&lt;br /&gt;&lt;br /&gt;Anyway, the article is a good, quick read.&amp;nbsp; It also quite nicely summarizes the major passions in my professional life right now.&lt;br /&gt;&lt;br /&gt;</description>
  <comments>http://daniele.livejournal.com/77528.html</comments>
  <category>work</category>
  <category>data</category>
  <category>visualization</category>
  <lj:mood>busy</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/77084.html</guid>
  <pubDate>Wed, 27 May 2009 14:49:42 GMT</pubDate>
  <title>Interesting crowd-sourced solution sites</title>
  <link>http://daniele.livejournal.com/77084.html</link>
  <description>A good friend of mine runs the site &lt;a href=&quot;http://bug.gd&quot; rel=&quot;nofollow&quot;&gt;bug.gd&lt;/a&gt; (and its more professional pseudonym, &lt;a href=&quot;http://errorhelp.com&quot; rel=&quot;nofollow&quot;&gt;errorhelp.com&lt;/a&gt;).&amp;nbsp; This service provides something that is slightly missing from the typical Google search for an error to find a solution.&amp;nbsp; It allows you to enter the full text of the error message or stack trace instead of just a couple of keywords, and it provides rich community feedback on solutions.&amp;nbsp; You can even tip people for their solutions through &lt;a href=&quot;http://tipjoy.com&quot; rel=&quot;nofollow&quot;&gt;tipjoy.com&lt;/a&gt; integration.&lt;br /&gt;&lt;br /&gt;I recently came across two other nice sites created by a different company that provide a similar and complementary service:&lt;br /&gt;&lt;a href=&quot;http://stackoverflow.com&quot; rel=&quot;nofollow&quot;&gt;stackoverflow.com&lt;/a&gt; - A site dedicated to crowd-sourcing answers to programming questions&lt;br /&gt;&lt;a href=&quot;http://serverfault.com&quot; rel=&quot;nofollow&quot;&gt;serverfault.com&lt;/a&gt; - A site dedicated to crowd-sourcing answers to system administration questions&lt;br /&gt;&lt;br /&gt;I think it is very helpful to have a list of these sites that you can visit to post a question and hopefully get an answer that will even be moderated by the community to help you determine the value of the answer.&amp;nbsp; This is something that typically takes a lot longer if you search for a forum or mailing list site and post there.&amp;nbsp; While it is less immediate than IRC, the moderation and ability to leave a question and get an answer &amp;quot;soon&amp;quot; are nice features you are less likely to see in IRC (although I&apos;ve always gotten great results from #java, #sql, #mysql, and #bash).&lt;br /&gt;&lt;br /&gt;As you can tell, I&apos;m a big fan of crowd-sourcing.&amp;nbsp; 
I&amp;nbsp;have run a couple of contests on &lt;a href=&quot;http://99designs.com&quot; rel=&quot;nofollow&quot;&gt;99designs.com&lt;/a&gt; and have been incredibly pleased with the results that came out of that community of freelance graphic designers.&lt;br /&gt;&lt;br /&gt;Check these places out and see if they can help you or if you can help them!&lt;br /&gt;&lt;br /&gt;</description>
  <comments>http://daniele.livejournal.com/77084.html</comments>
  <category>programming</category>
  <category>crowd-sourcing</category>
  <category>errors</category>
  <category>sysadmin</category>
  <lj:mood>cheerful</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/77049.html</guid>
  <pubDate>Fri, 17 Apr 2009 22:40:59 GMT</pubDate>
  <title>Ways to visualize and share data</title>
  <link>http://daniele.livejournal.com/77049.html</link>
  <description>Mozilla needs to be able to provide useful extracts of data such as download trends, etc. and allow the community to perform their own analysis on them, so I&apos;m always keeping a lookout for useful tools to further that goal.&lt;br /&gt;&lt;br /&gt;When Tony Wright posted the blog entry &lt;a href=&quot;http://www.tonywright.com/2009/just-how-important-is-the-valley-lets-look-at-some-data/&quot; rel=&quot;nofollow&quot;&gt;Just How Important is the Valley? Let&amp;rsquo;s Look at some Data&lt;/a&gt; on April 17th 2009, he was kind enough to publish the data set (it needs an attribution / license though) and the data looked interesting so I thought I&apos;d spend a little time playing with it using some tools that I&apos;ve been keeping my eye on.&lt;br /&gt;&lt;br /&gt;First, I slurped the table into &lt;a href=&quot;http://www.dabbledb.com&quot; rel=&quot;nofollow&quot;&gt;DabbleDB&lt;/a&gt;, a website that is very well suited to messing with this type of data (i.e. sourced from the web, might need a bit of cleanup, etc.).  You can view and edit the data I imported to DabbleDB here: &lt;a href=&quot;https://yipyip.dabbledb.com/page/yipyip/uqFxSObU&quot; rel=&quot;nofollow&quot;&gt;Acquired Startups Data&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;DabbleDB does a great job at allowing a user to sort, filter, group, and modify data using a simple interface, but it does not have a large array of visualizations.  
For that, we head over here to the IBM AlphaWorks lab&apos;s project, &lt;a href=&quot;http://manyeyes.alphaworks.ibm.com/wikified&quot; rel=&quot;nofollow&quot;&gt;Many Eyes Wikified&lt;/a&gt;.&amp;nbsp; I created a quick wiki dashboard for throwing together a few visualizations: &lt;a href=&quot;http://manyeyes.alphaworks.ibm.com/wikified/acquired_startups/Main Page&quot; rel=&quot;nofollow&quot;&gt;Acquired Startups Visualizations&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This was just a quick break from real work I&apos;ve been doing, so I spent less than an hour on this.&amp;nbsp; I only took about 20 minutes with DabbleDB: importing the data, cleaning the dollar values, then creating two new views that group the data by country or by state for visualization.&amp;nbsp; Then I moved over to Many Eyes and played with a few visualizations to try to find some interesting views of the data and threw them into the dashboard and two sub pages.&lt;br /&gt;&lt;br /&gt;Being able to quickly extract, transform, and visualize this data is the big win for DabbleDB and Many Eyes in my opinion.&amp;nbsp; With both applications having open licensing of the data and collaboration as a key focus, they are tools that I hope to be able to take advantage of at Mozilla soon.&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;https://yipyip.dabbledb.com/page/yipyip/uqFxSObU&quot; rel=&quot;nofollow&quot;&gt;&lt;img width=&quot;694&quot; height=&quot;265&quot; src=&quot;http://content.screencast.com/users/DEinspanjer/folders/Jing/media/e7f222f2-8953-47f8-a6f7-9409c93c5036/00000068.png&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://manyeyes.alphaworks.ibm.com/wikified/acquired_startups/Main%20Page&quot; rel=&quot;nofollow&quot;&gt;&lt;img width=&quot;733&quot; height=&quot;253&quot; src=&quot;http://content.screencast.com/users/DEinspanjer/folders/Jing/media/3742ed48-cc68-4241-9d41-a1788c68b712/00000067.png&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br 
/&gt;</description>
  <comments>http://daniele.livejournal.com/77049.html</comments>
  <category>manyeyes</category>
  <category>data</category>
  <category>visualization</category>
  <category>dabbledb</category>
  <lj:mood>cheerful</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>2</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/76695.html</guid>
  <pubDate>Mon, 30 Mar 2009 16:17:52 GMT</pubDate>
  <title>Counting unique visitors in SQL</title>
  <link>http://daniele.livejournal.com/76695.html</link>
  <description>A lot of web metrics solutions out there like NetTracker or Omniture allow you to perform analysis on the number of unique visitors over time.  This is a pretty important metric to a lot of companies, and I recently needed to perform such an analysis, but it was on data stored in a SQL database rather than in one of these proprietary solutions&apos; data stores.&lt;br /&gt;&lt;br /&gt;Doing any sort of distinct counting on a large volume of data in SQL can be very costly, both in terms of storage of the raw data (since you can&apos;t aggregate it), and in query performance since there are relatively few optimizations that can be performed on the table or the query.&lt;br /&gt;&lt;br /&gt;&lt;a name=&quot;cutid1&quot;&gt;&lt;/a&gt;&lt;br /&gt;Fortunately for me, our data warehouse is stored in &lt;a href=&quot;http://www.vertica.com&quot; rel=&quot;nofollow&quot;&gt;Vertica&lt;/a&gt;, and while the queries weren&apos;t blindingly fast, I was able to get the analysis done in a very reasonable time frame.&lt;br /&gt;&lt;br /&gt;I was dealing with a week&apos;s worth of traffic (about 80m requests per day), and one of the biggest challenges I had was how to determine what constituted a &amp;quot;unique visitor&amp;quot; (in this case, it is actually more of a unique ping or requestor since there isn&apos;t really a person involved).&lt;br /&gt;&lt;br /&gt;I didn&apos;t have a cookie that I could use, so that left me with the less desirable course of using a combination of IP address and User Agent string.  The problem with this is that the solution will undercount one class of requests, and overcount a different class.  Here are the details:&lt;br /&gt;&lt;br /&gt;1. If a request comes from a host that receives its public IP address via DHCP (e.g. 
a cable modem or DSL) and that service provider has their DHCP configured to force a change of IP addresses when the host renews its lease, then when the IP address changes, the requestor will be considered &amp;quot;new&amp;quot;.  e.g. HostX makes a request on Monday with IP 1.2.3.4 and a request Tuesday with IP 1.2.3.4. Then, on Wednesday, their IP address changes.  Later Wednesday, HostX makes a request with its new IP 2.3.4.5.  When we perform analysis on this week, we will see one distinct requestor on Monday and Tuesday, but a new requestor on Wednesday.  In the worst case, if a new host, HostY is assigned IP 1.2.3.4 which HostX used to have and HostY also makes a request using the same OS version, we will mistakenly believe HostY on Wednesday through Saturday is the same distinct requestor as HostX from Monday and Tuesday.&lt;br /&gt; &lt;br /&gt; 2. If several hosts are on the same LAN network (e.g. an office), then the public IP address will likely be the same for each of those hosts.  I use the partial user agent string to help mitigate this problem.  I am pulling the OS and locale details out of the user agent string and using that in addition to the IP address to determine uniqueness.  Unfortunately, there are a lot of machines running Windows XP with en-US, so this is only partially helpful.  Any host with the same IP + OSversion + locale will be treated as a single distinct requestor in this analysis.&lt;br /&gt; &lt;br /&gt; 3. When I worked on this IP+UA strategy, I originally tested using the full UA (user agent) string which includes the browser version number.  
This might make sense for many other websites, but unfortunately, what I saw in the test cases that I used was that we would &amp;quot;forget&amp;quot; the distinctness whenever the browser is upgraded, or in some cases, even when certain plugins or extensions are installed (ones that modify the UA) [I&apos;m glaring at you, MegaUpload!].&lt;br /&gt; &lt;br /&gt;&lt;br /&gt;So, with this strategy in place, I ran the numbers and while I could see an unfortunate amount of under counting (i.e. multiple requests being counted as the same distinct requestor when they likely should have been separate), it was as good as I was going to get.&lt;br /&gt;&lt;br /&gt;The last thing I needed to do was to write a SQL statement that added up the number of distinct requestors grouped by the number of days in the week that requests were made.  Here is the SQL I wrote to do that.  This was just my first stab at it; I got my answer, and it didn&apos;t take more than a few minutes, so I left it at that.  I&apos;d still be interested in hearing if anyone else has a better way. 
:)&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;SELECT (d15 + d16 + d17 + d18 + d19 + d20 + d21) AS RequestsPerWeek&lt;br /&gt;, COUNT(*) AS NumDistinctRequestors&lt;br /&gt;FROM (&lt;br /&gt;    SELECT&lt;br /&gt;      MAX(CASE WHEN d.date = &apos;2009-03-15&apos; THEN 1 ELSE 0 END) AS d15&lt;br /&gt;    , MAX(CASE WHEN d.date = &apos;2009-03-16&apos; THEN 1 ELSE 0 END) AS d16&lt;br /&gt;    , MAX(CASE WHEN d.date = &apos;2009-03-17&apos; THEN 1 ELSE 0 END) AS d17&lt;br /&gt;    , MAX(CASE WHEN d.date = &apos;2009-03-18&apos; THEN 1 ELSE 0 END) AS d18&lt;br /&gt;    , MAX(CASE WHEN d.date = &apos;2009-03-19&apos; THEN 1 ELSE 0 END) AS d19&lt;br /&gt;    , MAX(CASE WHEN d.date = &apos;2009-03-20&apos; THEN 1 ELSE 0 END) AS d20&lt;br /&gt;    , MAX(CASE WHEN d.date = &apos;2009-03-21&apos; THEN 1 ELSE 0 END) AS d21&lt;br /&gt;    FROM distinct_requests a1&lt;br /&gt;    JOIN dates d ON a1.utc_date_id = d.date_id&lt;br /&gt;    GROUP BY a1.ip_ua_id&lt;br /&gt;) x&lt;br /&gt;GROUP BY (d15 + d16 + d17 + d18 + d19 + d20 + d21)&lt;br /&gt;ORDER BY (d15 + d16 + d17 + d18 + d19 + d20 + d21)&lt;br /&gt;&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name=&apos;cutid1-end&apos;&gt;&lt;/a&gt;</description>
  <comments>http://daniele.livejournal.com/76695.html</comments>
  <category>work</category>
  <category>vertica</category>
  <category>data</category>
  <category>sql</category>
  <lj:mood>working</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>5</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/76336.html</guid>
  <pubDate>Mon, 09 Mar 2009 16:17:37 GMT</pubDate>
  <title>TinyArro.ws URLs</title>
  <link>http://daniele.livejournal.com/76336.html</link>
  <description>A friend just released a URL shrinking service that I enjoy:&amp;nbsp; &lt;a href=&quot;http://tinyarro.ws&quot; rel=&quot;nofollow&quot;&gt;tinyarro.ws&lt;/a&gt; (more nifty when written as &lt;a href=&quot;http://➡.ws&quot; rel=&quot;nofollow&quot;&gt;➡.ws&lt;/a&gt;).&lt;br /&gt;It has a few great features over the current mainstream shrinkers:&lt;br /&gt;&lt;br /&gt;1. Cool/fun URLs (e.g. http://➽.ws/囨 for my website)&lt;br /&gt;2. Very short URLs due to Unicode suffixes (great for Twitter!)&lt;br /&gt;3. Preview by default! (no tweak to the URL to remember)&lt;br /&gt;4. Option to enter your own custom suffix&amp;nbsp;(TinyURL now has this, but it was too useful to not mention).&lt;br /&gt;5. &lt;a href=&quot;http://›.ws/☺&quot; rel=&quot;nofollow&quot;&gt;A Ubiquity command &amp;rsaquo;.ws/☺&lt;/a&gt; (eventually to be integrated directly on the site)&lt;br /&gt;&lt;br /&gt;Some news about the site:&lt;br /&gt;&lt;a href=&quot;http://news.ycombinator.com/item?id=507982&quot; rel=&quot;nofollow&quot;&gt;TinyArro.ws: 10 new unicode domains. Defaulting previews to ON.&lt;/a&gt;&lt;br /&gt;&lt;a href=&quot;http://news.ycombinator.com/item?id=498051&quot; rel=&quot;nofollow&quot;&gt;Ask HN: Thoughts on TinyArro.ws? Tiniest urls in the world (or your money back)&lt;/a&gt;&lt;br /&gt;</description>
  <comments>http://daniele.livejournal.com/76336.html</comments>
  <category>fun</category>
  <lj:mood>amused</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>2</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/76164.html</guid>
  <pubDate>Thu, 12 Feb 2009 19:33:08 GMT</pubDate>
  <title>Willingness to be a little evil</title>
  <link>http://daniele.livejournal.com/76164.html</link>
  <description>I have been a supporter of Firefox and Mozilla for several years now, and while I don&apos;t write patches and fix bugs, a major part of that support is educating people about Mozilla, open source, and user empowerment whenever a conversation about technology allows for it.&lt;br /&gt;&lt;br /&gt;I&apos;ve found that people who use proprietary software and operating systems often fall into two broad categories for rationalizing that choice:&lt;br /&gt;1. They are told to do so by some authority (usually their employer, sometimes their social tech support person, and in some cases, just because they were told it was the right thing to do by an ad or magazine article).&lt;br /&gt;2. They started using it for some reason (typically reason #1 above) a long time ago and are now just accustomed to it.&lt;br /&gt;&lt;br /&gt;I&apos;m sure all this is going to be old news to most people reading this, but I bring it up because of an interesting article I read today.&lt;br /&gt;&lt;br /&gt;In the 1960&apos;s and early 70&apos;s, psychologist Stanley Milgram performed &lt;a href=&quot;http://en.wikipedia.org/wiki/Milgram_experiment&quot; rel=&quot;nofollow&quot;&gt;a series of famous experiments&lt;/a&gt; that tested the willingness of people to do something they would normally object to on moral grounds when they are in a strictly controlled environment and instructed to do so by an authority figure.&lt;br /&gt;&lt;br /&gt;More recently, psychologist Jerry Burger had the opportunity to perform a series of similar experiments.&amp;nbsp; &lt;a href=&quot;http://www.alternet.org/module/printversion/126492&quot; rel=&quot;nofollow&quot;&gt;This alternet article&lt;/a&gt; describes the story and discusses the findings.&amp;nbsp; As I read the results and Dr. 
Burger&apos;s statements regarding the findings, I started thinking about how easy it is for the people to choose to give up their freedom to a piece of proprietary software for reasons similar to the ones described in these experiments.&lt;br /&gt;&lt;br /&gt;In a green field, these people would normally opt for software that provided them with more freedom and in many cases, subjectively better security, but because they are instructed by an authority figure, or because they got started with it a long time ago and just slid deeper and deeper in, those preferences are not enough by themselves to prompt the person to change their behavior.&lt;br /&gt;&lt;br /&gt;Now even this thought in and of itself would not be enough to prompt me to blog about this topic.&amp;nbsp; We&apos;re still well in the territory where the people who haven&apos;t gotten lost in a Wikipedia article about toothbrush hygiene they found when they clicked my first link are saying, &amp;quot;um, DUH!&amp;quot;&amp;nbsp; So here is my point:&lt;br /&gt;&lt;br /&gt;At the end of the article, Dr. Burger focuses on an interesting finding of both experiments.&amp;nbsp; &lt;em&gt;When a person is instructed to do something &amp;quot;wrong&amp;quot;, they are significantly less likely to do so if they are surrounded by peers who object first&lt;/em&gt;.&lt;br /&gt;&lt;br /&gt;So when you talk to someone who is sighing about how much they hate product X but they don&apos;t have a choice, don&apos;t hate on them and don&apos;t deride them for not having a backbone, but just tell them and show them how you chose to stand up for your freedom and your security.&amp;nbsp; An example can go a long way toward giving them the courage to listen to that little voice inside saying, &amp;quot;I want something better!&amp;quot;&lt;br /&gt;&lt;br /&gt;</description>
  <comments>http://daniele.livejournal.com/76164.html</comments>
  <lj:mood>working</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>5</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/76011.html</guid>
  <pubDate>Thu, 05 Feb 2009 21:43:45 GMT</pubDate>
  <title>Bash functions for going up to a directory</title>
  <link>http://daniele.livejournal.com/76011.html</link>
  <description>Sometimes, if I&apos;m in a really deep directory, I don&apos;t want to cd from / nor do I want to cd ../../../..&lt;br /&gt;I just want to either go up 5 directories, or maybe I want to go up to the parent directory &amp;quot;src&amp;quot; when I&apos;m in /home/dre/src/projects/foo/bar/classes/org/apache/blah&lt;br /&gt;&lt;br /&gt;This set of Bash functions lets me do that.&lt;br /&gt;The first, up() will change your directory. The second will instead just print the desired directory name.&amp;nbsp; This makes it easy for you to mv a file up higher or something.&lt;br /&gt;&lt;br /&gt;If you pass no arguments, it just goes up one directory.&lt;br /&gt;If you pass a numeric argument it will go up that number of directories.&lt;br /&gt;If you pass a string argument, it will look for a parent directory with that name and go up to it.&lt;br /&gt;(Note, there is a small display bug there. If you give it an invalid name, cd reports the &amp;quot;No such file or directory&amp;quot; error, which is good, but it has a bogus path.  Since you can&apos;t know what path they were actually trying to go to, it should just say &amp;quot;No such parent directory: ${yourbogusname}&amp;quot;.  I don&apos;t have time to figure that out right now though.)&lt;br /&gt;&lt;br /&gt;Just put these functions in your ~/.bashrc file and don&apos;t forget to source it. (&amp;nbsp; source ~/.bashrc )&lt;br /&gt;&lt;pre&gt;

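function up()
{
    # Keep helper variables out of the caller&apos;s shell.
    local dir=&amp;quot;&amp;quot; x=0
    if [ -z &amp;quot;$1&amp;quot; ]; then
        # No argument: go up one directory.
        dir=..
    elif [[ $1 =~ ^[0-9]+$ ]]; then
        # Numeric argument: go up that many directories.
        while [ &amp;quot;$x&amp;quot; -lt &amp;quot;$1&amp;quot; ]; do
            dir=${dir}../
            x=$((x+1))
        done
    else
        # String argument: strip everything below the nearest parent with that name.
        dir=${PWD%/$1/*}/$1
    fi
    cd &amp;quot;$dir&amp;quot;
}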

function upstr()
{
    echo &amp;quot;$(up &amp;quot;$1&amp;quot; &amp;amp;&amp;amp; pwd)&amp;quot;;
}
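
# Usage sketch (hypothetical paths and file name, shown only for illustration):
#   up                 # one level up, like: cd ..
#   up 3               # three levels up, like: cd ../../../
#   up src             # from /home/dre/src/projects/foo/bar to /home/dre/src
#   mv notes.txt &amp;quot;$(upstr 2)&amp;quot;    # move a file two levels up without leaving the current directory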
&lt;/pre&gt;</description>
  <comments>http://daniele.livejournal.com/76011.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>5</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/75621.html</guid>
  <pubDate>Fri, 19 Dec 2008 16:18:13 GMT</pubDate>
  <title>All hail Ken Kovash!</title>
  <link>http://daniele.livejournal.com/75621.html</link>
  <description>It may be showing my ignorance, but I was unaware until recently of the officially recognized day for celebrating the man, the myth, and the math that is &lt;a href=&quot;http://www.kenkovash.com/&quot; rel=&quot;nofollow&quot;&gt;Ken Kovash&lt;/a&gt;.&amp;nbsp; To think that all the time leading up to this point, I had just been satisfied with the joyous feeling in my heart every day I interacted with him.&lt;br /&gt;&lt;br /&gt;Ken can be a harsh task-master sometimes. &lt;br /&gt;&quot;Daniel, where are my &lt;a href=&quot;http://blog.mozilla.com/metrics/2008/11/19/using-firefox-after-eating-turkey/&quot; rel=&quot;nofollow&quot;&gt;numbers from yesterday&lt;/a&gt;?&quot; &lt;br /&gt;&quot;Daniel, why are the &lt;a href=&quot;http://blog.mozilla.com/metrics/2008/11/20/we-shipped-funnelcake03/&quot; rel=&quot;nofollow&quot;&gt;funnelcake&lt;/a&gt; trends low here and high there? Your data are wrong, go find it and fix it!&quot; &lt;br /&gt;But the pain is worth it when I see him take my crude raw data and masterfully sculpt it into bounteous bevies of &lt;a href=&quot;http://blog.mozilla.com/metrics/2008/09/16/do-ads-driving-firefox-downloads-affect-firefox-downloads/&quot; rel=&quot;nofollow&quot;&gt;tables&lt;/a&gt;, raging rivers of &lt;a href=&quot;http://blog.mozilla.com/metrics/2008/08/21/a-first-look-at-the-uninstall-survey/&quot; rel=&quot;nofollow&quot;&gt;trend lines&lt;/a&gt;, triumphant towers of &lt;a href=&quot;http://blog.mozilla.com/metrics/2008/07/23/where-will-firefox-reach-50-market-share/&quot; rel=&quot;nofollow&quot;&gt;bar charts&lt;/a&gt;, overwhelming ontologies of &lt;a href=&quot;http://blog.mozilla.com/metrics/2008/11/06/firefox-usage-and-europe/&quot; rel=&quot;nofollow&quot;&gt;pie graphs&lt;/a&gt;, and &lt;span&gt;gilt-edged grids &lt;/span&gt;of &lt;a href=&quot;http://blog.mozilla.com/metrics/2008/09/04/visualizing-data-in-new-ways/&quot; rel=&quot;nofollow&quot;&gt;treemaps&lt;/a&gt;.&amp;nbsp; &lt;br 
/&gt;&lt;br /&gt;One must weep to behold it. &lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;Powered by &lt;a href=&quot;http://www.scribefire.com/&quot; rel=&quot;nofollow&quot;&gt;ScribeFire&lt;/a&gt;.&lt;/p&gt;</description>
  <comments>http://daniele.livejournal.com/75621.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>1</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/75318.html</guid>
  <pubDate>Tue, 09 Dec 2008 16:09:55 GMT</pubDate>
  <title>Performance improvements at the cost of complexity</title>
  <link>http://daniele.livejournal.com/75318.html</link>
  <description>I discovered something that I feel is a bit of a bug in the Sun Java implementation.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;If you pass in a string to the method InetAddress.getByName(), it does a bunch of testing to see if it is a domain name or a literal IP address.&lt;br /&gt;If it is an IPv4 address, it will then use String.split() to split the four parts.&amp;nbsp; String.split() uses regexes to do its work.&lt;br /&gt;&lt;br /&gt;That means that if you are querying for hundreds or millions of addresses in a tight loop (as I&apos;ve been doing), the JVM is spawning and compiling hundreds or millions of regex objects, in addition to a String array and four String objects per call.&lt;br /&gt;&lt;br /&gt;So at first, I just worked around it by doing basic substringing instead of splitting.&amp;nbsp; That gave me about 100x performance improvement. But then I realized I was still generating four string objects for every call..&lt;br /&gt;&lt;br /&gt;So I came up with this mapping method and it runs about 1000x faster with a near constant minimal memory footprint.&lt;br /&gt;&lt;br /&gt;I pre-calculate a multidimensional array of shorts where each element is indexed by the literal character value - 48 of the digits making up the number 0 - 255.&lt;br /&gt;&lt;br /&gt;With that array available, at run time, I can do a simple lookup of the short value and then do the math to get the long representation of the IP address.&amp;nbsp; I&apos;m still generating a couple of references and a few intermediate int values, but the JIT optimizer can make quick work of that.&lt;br /&gt;&lt;br /&gt;Linked is the test program I created to play with the different methods:&amp;nbsp; &lt;a href=&quot;http://people.mozilla.com/%7Edeinspanjer/InetAddressParse.zip&quot; rel=&quot;nofollow&quot;&gt;InetAddressParse test&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;Powered by &lt;a href=&quot;http://www.scribefire.com/&quot; 
rel=&quot;nofollow&quot;&gt;ScribeFire&lt;/a&gt;.&lt;/p&gt;</description>
  <comments>http://daniele.livejournal.com/75318.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>8</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/75038.html</guid>
  <pubDate>Fri, 28 Nov 2008 04:16:37 GMT</pubDate>
  <title>Don&apos;t listen to bash, it will lie to you!</title>
  <link>http://daniele.livejournal.com/75038.html</link>
  <description>Remember folks,&amp;nbsp; if you mv a directory, and there is a bash shell currently in that directory, the bash prompt will not update to reflect the new name until you cd out of the directory and then back in.&lt;br /&gt;&lt;br /&gt;I just spent way too long making changes and being frustrated because the changes weren&apos;t having any effect.&amp;nbsp; I was clearing caches and restarting applications and monitoring log files.&amp;nbsp; It wasn&apos;t until I happened to do a :pwd in vim while editing the file for the umpteenth time that I finally noticed that the file I had been editing was actually in a backup of the folder that I had just made.&lt;br /&gt;&lt;br /&gt;Sheesh.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;Powered by &lt;a href=&quot;http://www.scribefire.com/&quot; rel=&quot;nofollow&quot;&gt;ScribeFire&lt;/a&gt;.&lt;/p&gt;</description>
  <comments>http://daniele.livejournal.com/75038.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>3</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/74929.html</guid>
  <pubDate>Sat, 25 Oct 2008 01:37:41 GMT</pubDate>
  <title>The best DHTML date range picker I&apos;ve ever seen</title>
  <link>http://daniele.livejournal.com/74929.html</link>
  <description>&lt;big&gt;&lt;big&gt;&lt;a href=&quot;http://www.filamentgroup.com/lab/update_date_range_picker_with_jquery_ui/&quot; rel=&quot;nofollow&quot;&gt;&lt;strong&gt;Filament Group&apos;s Date Range Picker&lt;/strong&gt;&lt;/a&gt;&lt;/big&gt;&lt;/big&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;It uses &lt;a href=&quot;http://jquery.com&quot; rel=&quot;nofollow&quot;&gt;jQuery&lt;/a&gt; and a JavaScript date parsing library by the name of &lt;a href=&quot;http://www.datejs.com/&quot; rel=&quot;nofollow&quot;&gt;Date.js&lt;/a&gt;.&amp;nbsp; This thing is simply amazing.&amp;nbsp; Some of the reasons I think so:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The developer can configure start and end date limits based on what is valid for the system (e.g. if you only have data going back to 1999, no sense in letting the user choose a date in 3000 BC)&lt;/li&gt;&lt;li&gt;The developer can configure a set of predefined ranges such as &quot;Last week&quot;, &quot;Month to date&quot;, &quot;Year to date&quot;.&lt;/li&gt;&lt;li&gt;If the developer allows it, the user can use any combination of preconfigured ranges, a single date, an arbitrary range of dates, or they can use the back and forward arrows to roll the current date range forward or back.&lt;/li&gt;&lt;li&gt;It is smooth and crisp, able to be easily themed, and seems pretty extensible/tweakable.&lt;/li&gt;&lt;/ul&gt;It is still a work in progress (they just released it today), but I think it is still usable.&amp;nbsp; The only downside that I&apos;ve found so far is that the back and forward arrows in this very first released version can produce some unexpected ranges.&amp;nbsp; They are currently strictly math based, so if you do something like select the current month and then hit the back arrow thinking it will select the previous month, you&apos;ll probably get something slightly different since most adjacent months don&apos;t have the same number of days.&lt;br /&gt;&lt;br /&gt;I&apos;m also pretty sure it 
has an off-by-one error in it that I suspect they&apos;ll fix shortly.&amp;nbsp; If you select Sunday to Saturday of a week and then scroll backward, the next range is actually Monday to Sunday and the next Tuesday to Monday...&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Ignore these nitpicks and go check it out right away if your website needs a date picker though.&amp;nbsp; To get such a fantastic widget in the very first release can only mean that it is going to be the bee&apos;s knees after a little public beta testing.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;Powered by &lt;a href=&quot;http://www.scribefire.com/&quot; rel=&quot;nofollow&quot;&gt;ScribeFire&lt;/a&gt;.&lt;/p&gt;</description>
  <comments>http://daniele.livejournal.com/74929.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>4</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/74668.html</guid>
  <pubDate>Tue, 21 Oct 2008 16:06:31 GMT</pubDate>
  <title>Open Source Hardware</title>
  <link>http://daniele.livejournal.com/74668.html</link>
  <description>I thought that this article in Wired about &lt;a href=&quot;http://www.wired.com/techbiz/startups/magazine/16-11/ff_openmanufacturing?currentPage=all&quot; rel=&quot;nofollow&quot;&gt;Open Source Hardware&lt;/a&gt; was a fun read and worth sharing.&lt;br /&gt;There is an interesting similarity between the way that &lt;a href=&quot;http://www.arduino.cc/&quot; rel=&quot;nofollow&quot;&gt;Arduino&lt;/a&gt; open sources their design but reserves the trademark to preserve brand quality, and the way Mozilla handles the Firefox trademark.&lt;br /&gt;&lt;br /&gt;If you like reading about geeks going against the status quo in their industry and trying to make the world a better place, give the article a read.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;Powered by &lt;a href=&quot;http://www.scribefire.com/&quot; rel=&quot;nofollow&quot;&gt;ScribeFire&lt;/a&gt;.&lt;/p&gt;</description>
  <comments>http://daniele.livejournal.com/74668.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>1</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://daniele.livejournal.com/74481.html</guid>
  <pubDate>Sat, 11 Oct 2008 05:56:56 GMT</pubDate>
  <title>Good bye Mountain View</title>
  <link>http://daniele.livejournal.com/74481.html</link>
  <description>It has been a great two weeks out here in the office.&amp;nbsp; I&apos;ve gotten to see a lot of people face to face and had some useful meetings about my projects.&amp;nbsp; I just kicked off another round of massive data loads to run over the weekend while I&apos;m out of pocket. Hopefully they will run smoothly and deliver me high quality data.&lt;br /&gt;&lt;br /&gt;There are some really exciting things coming up this quarter:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;I&apos;ll be working on one of the largest data sets yet, our AMO data.&amp;nbsp; We have several really cool mechanisms for visualizing individual extension projects hosted on AMO. The developer has control over whether to make the statistics public or not.&amp;nbsp; As an example, you can take a look at the &lt;a href=&quot;https://addons.mozilla.org/en-US/statistics/addon/1865&quot; rel=&quot;nofollow&quot;&gt;statistics for Adblock Plus&lt;/a&gt;.&amp;nbsp; I&apos;ll be working on ways to integrate data across projects so we can get a better understanding of the extension community that means so very much to Mozilla.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;I&apos;ll hopefully be blogging a little more about the complexities of processing the large amount of data that I have to crunch through.&lt;/li&gt;&lt;li&gt;I&apos;ll be making several pieces of my Pentaho Data Integration (Kettle for those of you in the know) ETL scripts available in an open source repository.&amp;nbsp; It will help with the blogging, they might be useful to other people doing similar things, and who knows, maybe some people will even have suggestions for improvements!&lt;/li&gt;&lt;li&gt;Later in the quarter, I&apos;ll be working on an exciting new project to take some of the aggregated data that Mozilla has, such as the number of downloads of Firefox for given time periods, and make it available publicly for the community to explore and visualize.&amp;nbsp; At the moment, I&apos;m leaning toward trying to use 
the &lt;a href=&quot;http://www.many-eyes.com/&quot; rel=&quot;nofollow&quot;&gt;Many-Eyes&lt;/a&gt; project from IBM AlphaWorks.&amp;nbsp; If anyone has any better ideas, please let me know.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;Powered by &lt;a href=&quot;http://www.scribefire.com/&quot; rel=&quot;nofollow&quot;&gt;ScribeFire&lt;/a&gt;.&lt;/p&gt;</description>
  <comments>http://daniele.livejournal.com/74481.html</comments>
  <category>bi</category>
  <category>kettle</category>
  <lj:mood>accomplished</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
</channel>
</rss>
